Description

Retrieve Records from the Urban Institute's Education Data Portal API.

Allows R users to retrieve and parse data from the Urban Institute's Education Data API <https://educationdata.urban.org/> into a 'data.frame' for analysis.

educationdata

Retrieve data from the Urban Institute’s Education Data API as a data.frame for easy analysis.

NOTE: By downloading and using this programming package, you agree to abide by the Data Policy and Terms of Use of the Education Data Portal.

Installation

You can install the released version of educationdata from CRAN with:

install.packages("educationdata")

And the development version from GitHub with:

# install.packages('devtools') # if necessary
devtools::install_github('UrbanInstitute/education-data-package-r')

Usage

library(educationdata)

df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment', 
                         subtopic = list('race', 'sex'),
                         filters = list(year = 2008,
                                        grade = 9:12,
                                        ncessch = '340606000122'),
                         add_labels = TRUE)

str(df)
#> 'data.frame':    96 obs. of  9 variables:
#>  $ year       : int  2008 2008 2008 2008 2008 2008 2008 2008 2008 2008 ...
#>  $ ncessch    : chr  "340606000122" "340606000122" "340606000122" "340606000122" ...
#>  $ ncessch_num: num  3.41e+11 3.41e+11 3.41e+11 3.41e+11 3.41e+11 ...
#>  $ grade      : Factor w/ 19 levels "Pre-K","Kindergarten",..: 11 11 11 11 11 11 11 11 11 11 ...
#>  $ race       : Factor w/ 14 levels "White","Black",..: 2 3 5 5 2 4 6 11 1 7 ...
#>  $ sex        : Factor w/ 7 levels "Male","Female",..: 1 1 2 1 2 2 2 1 2 1 ...
#>  $ enrollment : int  41 39 0 0 46 32 3 270 166 0 ...
#>  $ fips       : Factor w/ 79 levels "Alabama","Alaska",..: 34 34 34 34 34 34 34 34 34 34 ...
#>  $ leaid      : chr  "3406060" "3406060" "3406060" "3406060" ...
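
Because the result is an ordinary data.frame, base R tools apply to it directly. As a minimal sketch, the toy data.frame below only mimics a few of the columns shown above (its values are invented for illustration and are not API output) and is cross-tabulated by grade and sex:

```r
# Toy stand-in for a get_education_data() result; all values are made up.
toy <- data.frame(
  year       = rep(2008L, 6),
  grade      = factor(rep(c("Grade 9", "Grade 10", "Grade 11"), each = 2),
                      levels = c("Grade 9", "Grade 10", "Grade 11")),
  sex        = factor(rep(c("Male", "Female"), times = 3),
                      levels = c("Male", "Female")),
  enrollment = c(41L, 46L, 39L, 32L, 30L, 28L)
)

# Cross-tabulate enrollment counts: one row per grade, one column per sex.
enrollment_table <- xtabs(enrollment ~ grade + sex, data = toy)
print(enrollment_table)

# Row sums give the total enrollment per grade.
grade_totals <- rowSums(enrollment_table)
print(grade_totals)
```

If a real result contains 'Total' rows for a category, drop them before summing to avoid double counting.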

The get_education_data() function will return a data.frame from a call to the Education Data API.

get_education_data(level, source, topic, subtopic, filters, add_labels, csv)

where:

  • level (required) - API data level to query.
  • source (required) - API data source to query.
  • topic (required) - API data topic to query.
  • subtopic (optional) - List of grouping parameters for an API call.
  • filters (optional) - List of query values used to filter the results of an API call.
  • add_labels (optional) - Add variable labels (when applicable)? Defaults to FALSE.
  • csv (optional) - Download the full csv file? Defaults to FALSE.

Available Endpoints

Level | Source | Topic | Subtopic | Main Filters | Years Available
college-university | fsa | 90-10-revenue-percentages | NA | year | 2014–2017
college-university | fsa | campus-based-volume | NA | year | 2001–2017
college-university | fsa | financial-responsibility | NA | year | 2006–2016
college-university | fsa | grants | NA | year | 1999–2018
college-university | fsa | loans | NA | year | 1999–2018
college-university | ipeds | academic-libraries | NA | year | 2013–2019
college-university | ipeds | academic-year-room-board-other | NA | year | 1999–2020
college-university | ipeds | academic-year-tuition-prof-program | NA | year | 1986–2008, 2010–2020
college-university | ipeds | academic-year-tuition | NA | year | 1986–2020
college-university | ipeds | admissions-enrollment | NA | year | 2001–2019
college-university | ipeds | admissions-requirements | NA | year | 1990–2019
college-university | ipeds | completers | NA | year | 2011–2019
college-university | ipeds | completions-cip-2 | NA | year | 1991–2019
college-university | ipeds | completions-cip-6 | NA | year | 1983–2019
college-university | ipeds | directory | NA | year | 1980, 1984–2020
college-university | ipeds | enrollment-full-time-equivalent | NA | year, level_of_study | 1997–2018
college-university | ipeds | enrollment-headcount | NA | year, level_of_study | 1996–2018
college-university | ipeds | fall-enrollment | age, sex | year, level_of_study | 1991, 1993, 1995, 1997, 1999–2020
college-university | ipeds | fall-enrollment | race, sex | year, level_of_study | 1986–2020
college-university | ipeds | fall-enrollment | residence | year | 1986, 1988, 1992, 1994, 1996, 1998, 2000–2020
college-university | ipeds | fall-retention | NA | year | 2003–2020
college-university | ipeds | finance | NA | year | 1979, 1983–2017
college-university | ipeds | grad-rates-200pct | NA | year | 2007–2017
college-university | ipeds | grad-rates-pell | NA | year | 2015–2017
college-university | ipeds | grad-rates | NA | year | 1996–2017
college-university | ipeds | institutional-characteristics | NA | year | 1980, 1984–2020
college-university | ipeds | outcome-measures | NA | year | 2015–2018
college-university | ipeds | program-year-room-board-other | NA | year | 1999–2020
college-university | ipeds | program-year-tuition-cip | NA | year | 1987–2020
college-university | ipeds | salaries-instructional-staff | NA | year | 1980, 1984, 1985, 1987, 1989–1999, 2001–2018
college-university | ipeds | salaries-noninstructional-staff | NA | year | 2012–2018
college-university | ipeds | sfa-all-undergraduates | NA | year | 2007–2017
college-university | ipeds | sfa-by-living-arrangement | NA | year | 2008–2017
college-university | ipeds | sfa-by-tuition-type | NA | year | 1999–2017
college-university | ipeds | sfa-ftft | NA | year | 1999–2017
college-university | ipeds | sfa-grants-and-net-price | NA | year | 2008–2017
college-university | ipeds | student-faculty-ratio | NA | year | 2009–2020
college-university | nacubo | endowments | NA | year | 2012–2018
college-university | nccs | 990-forms | NA | year | 1993–2016
college-university | nhgis | census-1990 | NA | year | 1980, 1984–2017
college-university | nhgis | census-2000 | NA | year | 1980, 1984–2017
college-university | nhgis | census-2010 | NA | year | 1980, 1984–2017
college-university | scorecard | default | NA | year | 1996–2017
college-university | scorecard | earnings | NA | year | 2003–2014
college-university | scorecard | institutional-characteristics | NA | year | 1996–2017
college-university | scorecard | repayment | NA | year | 2007–2016
college-university | scorecard | student-characteristics | aid-applicants | year | 1997–2016
college-university | scorecard | student-characteristics | home-neighborhood | year | 1997–2016
school-districts | ccd | directory | NA | year | 1986–2020
school-districts | ccd | enrollment | NA | year, grade | 1986–2020
school-districts | ccd | enrollment | race | year, grade | 1986–2020
school-districts | ccd | enrollment | race, sex | year, grade | 1986–2020
school-districts | ccd | enrollment | sex | year, grade | 1986–2020
school-districts | ccd | finance | NA | year | 1991, 1994–2018
school-districts | edfacts | assessments | NA | year, grade_edfacts | 2009–2018
school-districts | edfacts | assessments | race | year, grade_edfacts | 2009–2018
school-districts | edfacts | assessments | sex | year, grade_edfacts | 2009–2018
school-districts | edfacts | assessments | special-populations | year, grade_edfacts | 2009–2018
school-districts | edfacts | grad-rates | NA | year | 2010–2018
school-districts | saipe | NA | NA | year | 1995, 1997, 1999–2018
schools | ccd | directory | NA | year | 1986–2020
schools | ccd | enrollment | NA | year, grade | 1986–2020
schools | ccd | enrollment | race | year, grade | 1986–2020
schools | ccd | enrollment | race, sex | year, grade | 1986–2020
schools | ccd | enrollment | sex | year, grade | 1986–2020
schools | crdc | algebra1 | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | algebra1 | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | algebra1 | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | ap-exams | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | ap-exams | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | ap-exams | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | ap-ib-enrollment | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | ap-ib-enrollment | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | ap-ib-enrollment | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | chronic-absenteeism | disability, sex | year | 2013, 2015
schools | crdc | chronic-absenteeism | lep, sex | year | 2013, 2015
schools | crdc | chronic-absenteeism | race, sex | year | 2013, 2015
schools | crdc | credit-recovery | NA | year | 2015, 2017
schools | crdc | directory | NA | year | 2011, 2013, 2015, 2017
schools | crdc | discipline-instances | NA | year | 2015, 2017
schools | crdc | discipline | disability, lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | discipline | disability, race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | discipline | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | dual-enrollment | disability, sex | year | 2013, 2015, 2017
schools | crdc | dual-enrollment | lep, sex | year | 2013, 2015, 2017
schools | crdc | dual-enrollment | race, sex | year | 2013, 2015, 2017
schools | crdc | enrollment | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | enrollment | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | enrollment | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | harassment-or-bullying | allegations | year | 2013, 2015, 2017
schools | crdc | harassment-or-bullying | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | harassment-or-bullying | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | harassment-or-bullying | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | math-and-science | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | math-and-science | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | math-and-science | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | offenses | NA | year | 2015, 2017
schools | crdc | offerings | NA | year | 2011, 2013, 2015, 2017
schools | crdc | restraint-and-seclusion | disability, lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | restraint-and-seclusion | disability, race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | restraint-and-seclusion | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | restraint-and-seclusion | instances | year | 2013, 2015, 2017
schools | crdc | retention | disability, sex | year, grade | 2011, 2013, 2015, 2017
schools | crdc | retention | lep, sex | year, grade | 2011, 2013, 2015, 2017
schools | crdc | retention | race, sex | year, grade | 2011, 2013, 2015, 2017
schools | crdc | sat-act-participation | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | sat-act-participation | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | sat-act-participation | race, sex | year | 2011, 2013, 2015, 2017
schools | crdc | school-finance | NA | year | 2011, 2013, 2015, 2017
schools | crdc | suspensions-days | disability, sex | year | 2015, 2017
schools | crdc | suspensions-days | lep, sex | year | 2015, 2017
schools | crdc | suspensions-days | race, sex | year | 2015, 2017
schools | crdc | teachers-staff | NA | year | 2011, 2013, 2015, 2017
schools | edfacts | assessments | NA | year, grade_edfacts | 2009–2018
schools | edfacts | assessments | race | year, grade_edfacts | 2009–2018
schools | edfacts | assessments | sex | year, grade_edfacts | 2009–2018
schools | edfacts | assessments | special-populations | year, grade_edfacts | 2009–2018
schools | edfacts | grad-rates | NA | year | 2010–2018
schools | meps | NA | NA | year | 2013–2018
schools | nhgis | census-1990 | NA | year | 1986–2020
schools | nhgis | census-2000 | NA | year | 1986–2020
schools | nhgis | census-2010 | NA | year | 1986–2020

Main Filters

Because of the way the API is set up, the variables listed under ‘main filters’ are the fastest way to subset an API call.

In addition to year, the other main filters for certain endpoints accept the following values:

Grade

Filter Argument | Grade
grade = 'grade-pk' | Pre-K
grade = 'grade-k' | Kindergarten
grade = 'grade-1' | Grade 1
grade = 'grade-2' | Grade 2
grade = 'grade-3' | Grade 3
grade = 'grade-4' | Grade 4
grade = 'grade-5' | Grade 5
grade = 'grade-6' | Grade 6
grade = 'grade-7' | Grade 7
grade = 'grade-8' | Grade 8
grade = 'grade-9' | Grade 9
grade = 'grade-10' | Grade 10
grade = 'grade-11' | Grade 11
grade = 'grade-12' | Grade 12
grade = 'grade-13' | Grade 13
grade = 'grade-14' | Adult Education
grade = 'grade-15' | Ungraded
grade = 'grade-99' | Total
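
The examples elsewhere in this README pass grades as integers (grade = 9:12), while the table above lists the explicit string form. As a minimal sketch, the helper below builds those strings from shorthand values and rejects anything not in the table. Note that grade_filter() is hypothetical and not part of educationdata; the set of accepted shorthands is an assumption taken from the table only.

```r
# Hypothetical helper (not part of educationdata): turn shorthand grade
# values into the 'grade-*' strings listed in the table above.
grade_filter <- function(x) {
  x <- tolower(as.character(x))
  # Shorthands derived from the table: pk, k, 1 through 15, and 99 (Total).
  valid <- c("pk", "k", as.character(1:15), "99")
  bad <- setdiff(x, valid)
  if (length(bad) > 0) {
    stop("Unknown grade value(s): ", paste(bad, collapse = ", "))
  }
  paste0("grade-", x)
}

print(grade_filter(9:12))
print(grade_filter(c("pk", "k", 99)))
```

Failing early on an unknown value keeps a typo such as grade 16 from turning into a request that returns nothing.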

Level of Study

Filter Argument | Level of Study
level_of_study = 'undergraduate' | Undergraduate
level_of_study = 'graduate' | Graduate
level_of_study = 'first-professional' | First Professional
level_of_study = 'post-baccalaureate' | Post-baccalaureate
level_of_study = '99' | Total

Examples

Let’s build up some examples, from the following set of endpoints.

Level | Source | Topic | Subtopic | Main Filters | Years Available
schools | ccd | enrollment | NA | year, grade | 1986–2020
schools | ccd | enrollment | race | year, grade | 1986–2020
schools | ccd | enrollment | race, sex | year, grade | 1986–2020
schools | ccd | enrollment | sex | year, grade | 1986–2020
schools | crdc | enrollment | disability, sex | year | 2011, 2013, 2015, 2017
schools | crdc | enrollment | lep, sex | year | 2011, 2013, 2015, 2017
schools | crdc | enrollment | race, sex | year | 2011, 2013, 2015, 2017

The following will return a data.frame across all years and grades:

library(educationdata)
df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment')

Note that this endpoint is also callable by certain subtopic variables:

  • race
  • sex
  • race, sex

These variables can be added to the subtopic argument:

df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment', 
                         subtopic = list('race', 'sex'))

You may also filter the results of an API call. In this case year and grade will provide the most time-efficient subsets, and can be vectorized:

df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment', 
                         subtopic = list('race', 'sex'),
                         filters = list(year = 2008,
                                        grade = 9:12))
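
One way to picture a vectorized filter is as the set of year/grade combinations it covers. The sketch below enumerates them with base R for the filters in the call above. It is an illustration only and makes no claim about how the package batches its requests internally.

```r
# The filters from the call above, as a plain named list.
filters <- list(year = 2008, grade = 9:12)

# Enumerate every year/grade combination the filters cover.
combos <- expand.grid(filters, stringsAsFactors = FALSE)
print(combos)

# One year times four grades gives four combinations.
print(nrow(combos))
```

Widening year to 2008:2010 would triple the number of combinations, which is a quick way to gauge how much data a call is asking for.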

Additional variables can also be passed to filters to subset further:

df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment', 
                         subtopic = list('race', 'sex'),
                         filters = list(year = 2008,
                                        grade = 9:12,
                                        ncessch = '340606000122'))

The add_labels flag converts coded variables to factors, using their labels from the API.

df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment', 
                         subtopic = list('race', 'sex'),
                         filters = list(year = 2008,
                                        grade = 9:12,
                                        ncessch = '340606000122'),
                         add_labels = TRUE)

Finally, the csv flag can be set to download the full .csv file. In general, the csv option is much faster when retrieving a full dataset (or a large subset of it) and much slower when retrieving a small subset, especially one with many filters. In this example, the full csv for 2008 must be downloaded and then subset to the 96 observations.

df <- get_education_data(level = 'schools', 
                         source = 'ccd', 
                         topic = 'enrollment', 
                         subtopic = list('race', 'sex'),
                         filters = list(year = 2008,
                                        grade = 9:12,
                                        ncessch = '340606000122'),
                         add_labels = TRUE,
                         csv = TRUE)

Summary Endpoints

You can access the summary endpoint functionality using the get_education_data_summary() function.

df <- get_education_data_summary(
    level = "schools",
    source = "ccd",
    topic = "enrollment",
    stat = "sum",
    var = "enrollment",
    by = "fips",
    filters = list(fips = 6:8, year = 2004:2005)
)

In this example, we take the schools/ccd/enrollment endpoint and retrieve the sum of enrollment by fips code, filtered to fips codes 6, 7, 8 for the years 2004 and 2005.
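
Conceptually, stat = "sum", var = "enrollment", and by = "fips" describe a grouped sum over filtered rows. As a rough local analogue, the sketch below performs the same computation in base R on a few invented rows (not API data). Whether the service also groups its results by year is not assumed here.

```r
# Invented rows for illustration only; these are not real enrollment counts.
toy <- data.frame(
  fips       = c(6L, 6L, 7L, 8L, 8L, 9L),
  year       = c(2004L, 2005L, 2004L, 2004L, 2005L, 2004L),
  enrollment = c(100L, 110L, 50L, 70L, 75L, 999L)
)

# Apply the same filters as the call above: fips 6 to 8, years 2004 to 2005.
kept <- subset(toy, fips %in% 6:8 & year %in% 2004:2005)

# Sum the enrollment variable within each fips code.
result <- aggregate(enrollment ~ fips, data = kept, FUN = sum)
print(result)
```

The point of the summary endpoint is that this aggregation happens on the server, so only the grouped rows need to be transferred.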

The syntax largely follows that of get_education_data(), with three new arguments:

  • stat is the summary statistic to be retrieved. Valid statistics include: avg, sum, count, median, min, max, stddev, and variance.
  • var is the variable to run the summary statistic on.
  • by is the grouping variable(s) to use. This can be a single string or a vector of multiple variables, e.g., by = c("fips", "race").
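
Since stat only accepts a fixed set of values, a misspelled statistic can be caught before any request is sent. A minimal sketch of such a check follows. Note that check_stat() is hypothetical and not part of educationdata; the list of valid values is copied from the bullet above.

```r
# Valid summary statistics, as listed above.
valid_stats <- c("avg", "sum", "count", "median", "min", "max",
                 "stddev", "variance")

# Hypothetical pre-flight check (not part of educationdata): return the
# statistic unchanged if it is valid, otherwise stop with a clear message.
check_stat <- function(stat) {
  if (!is.character(stat) || length(stat) != 1 || !(stat %in% valid_stats)) {
    stop("stat must be one of: ", paste(valid_stats, collapse = ", "))
  }
  stat
}

print(check_stat("sum"))
```

Note that the API's name for the mean is avg, so check_stat("mean") stops with an error.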

Some endpoints are further broken out by subtopic. These can be specified using the subtopic option.

df <- get_education_data_summary(
    level = "schools",
    source = "crdc",
    topic = "harassment-or-bullying",
    subtopic = "allegations",
    stat = "sum",
    var = "allegations_harass_sex",
    by = "fips"
)

Note that only some endpoints have an applicable subtopic, and this list is slightly different from the syntax of the full data API. Endpoints with subtopics for the summary endpoint functionality include:

  • schools/crdc/harassment-or-bullying/allegations
  • schools/crdc/harassment-or-bullying/students
  • schools/crdc/restraint-and-seclusion/instances
  • schools/crdc/restraint-and-seclusion/students
  • college-university/ipeds/enrollment-full-time-equivalent/summaries
  • college-university/ipeds/fall-enrollment/age/summaries
  • college-university/ipeds/fall-enrollment/race/summaries
  • college-university/ipeds/fall-enrollment/residence/summaries
  • college-university/scorecard/student-characteristics/aid-applicants/summaries
  • college-university/scorecard/student-characteristics/home-neighborhood/summaries

For more information on the summary endpoint functionality, see the full API documentation.

Metadata

Version

0.1.3

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows