Description

Easily Access US Census Bureau Survey and Geographic Data.

The key function 'get_vintage_data()' returns a dataframe and is the window into the Census Bureau API, requiring just a dataset name, a vintage (year), and a vector of variable names for survey estimates/percentages. Other functions assist in searching for available datasets, geographies, and group/variable concepts of interest. Also provided are functions to access and layer (via standard piping) displayable geometries for the US, states, counties, blocks/tracts, roads, landmarks, places, and bodies of water. Joining survey data with many of the geometry functions is built in to produce choropleth maps.

RcensusPkg

Rick Dean 2025-04-10

The goal of RcensusPkg is to provide easy access to the US Census Bureau’s datasets and its collection of TIGER/Line Shapefiles, which supply plot geometries for states, counties, roads, landmarks, water, and enumeration tracts/blocks for the entire United States. The only requirement is that the user apply for and obtain a free access key issued by the Bureau. See Guidance for Developers for additional information.

Three functions (get_vintage_data(), multi_vintage_data(), get_idb_data()) require a key, as do many of the examples and tests. Be sure to read the descriptions of these functions to learn more about incorporating the key.
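One common way to keep a key out of scripts is to store it in an environment variable and read it when needed. The sketch below assumes a variable named CENSUS_KEY and a hypothetical helper read_census_key(); neither is confirmed by this README, so check each function's documentation for how RcensusPkg expects the key to be supplied.

```r
# Hypothetical helper (not part of RcensusPkg): read an API key from an
# environment variable and fail early if it is missing.
# The variable name "CENSUS_KEY" is an assumption made for this sketch.
read_census_key <- function(env_var = "CENSUS_KEY") {
  key <- Sys.getenv(env_var, unset = "")
  if (!nzchar(key)) {
    stop("No key found in environment variable '", env_var, "'.", call. = FALSE)
  }
  key
}

# Set the variable for the current session only.
# "your-key-here" is a placeholder, not a real key.
Sys.setenv(CENSUS_KEY = "your-key-here")
census_key <- read_census_key()
```

For everyday use the variable would normally be set once outside the script (for example in a user-level .Renviron file) rather than with Sys.setenv().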

The example below illustrates a simple workflow for downloading a dataset, merging the data with shapefile geometries, and plotting the merge to create a choropleth map. A more detailed example of the RcensusPkg workflow is available here.

Installation

The package is available for installation from CRAN.

You can install the development version of RcensusPkg from GitHub with:

# install.packages("pak")
pak::pak("deandevl/RcensusPkg")

Using devtools::install_github():

devtools::install_github("deandevl/RcensusPkg")

Example

Setup

We will be using the following packages.

library(httr)
library(jsonlite)
library(stringr)
library(data.table)
library(withr)
library(sf)
library(kableExtra)
library(ggplot2)
library(RplotterPkg)
library(RcensusPkg)

A look at the US Census Bureau’s community resilience estimates (CRE) database.

Among the available APIs is the Community Resilience Estimates (CRE), which are based on factors such as:

  • Income-to-Poverty Ratio (IPR) < 130 percent
  • Single or zero caregiver household
  • Aged 65 years or older
  • No health insurance coverage
  • No vehicle access (Household)
  • Disability, at least one serious constraint to significant life activity
  • No one in the household is employed full-time, year-round
  • Households without broadband internet access
  • Unit-level crowding with >= 0.75 persons per room
  • No one in the household has received a high school diploma
  • No one in the household speaks English “very well”

The factors are used to estimate the number of people with:

  • 0 risk factors (Low risk)
  • 1-2 risk factors (Medium risk)
  • 3 or more risk factors (High risk)
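The grouping rule above can be sketched in a few lines of base R. This only illustrates how a count of risk factors maps to the three groups; it does not reproduce how the Bureau produces its estimates, and classify_risk() is a hypothetical helper, not an RcensusPkg function.

```r
# Map a count of risk factors to the three CRE risk groups listed above.
# classify_risk() is a hypothetical helper for illustration only.
classify_risk <- function(n_factors) {
  cut(
    n_factors,
    breaks = c(-Inf, 0, 2, Inf),          # (-Inf, 0], (0, 2], (2, Inf)
    labels = c("Low", "Medium", "High")   # 0, 1-2, 3 or more
  )
}

# Example counts of risk factors for five hypothetical individuals.
risk_groups <- classify_risk(c(0, 1, 2, 3, 5))
as.character(risk_groups)
```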

The workflow for using RcensusPkg is the following:

  1. Get a database name recognized by the API. Use RcensusPkg::get_dataset_names() filtering for a “resilience” title and vintage of 2022.
datasets_dt <- RcensusPkg::get_dataset_names(
  vintage = 2022,
  filter_title_str = "resilience"
)

name           vintage  title
cre            2022     Community Resilience Estimates
crepuertorico  2022     Community Resilience Estimates for Puerto Rico

Table 1: ‘resilience’ datasets

The returned dataframe shows a dataset name for CRE 2022 as (surprise) “cre”.
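To make the title filter concrete, here is a minimal base R sketch of the same idea applied to a small local data frame. How RcensusPkg matches titles internally is not shown in this README; the sketch assumes a case-insensitive substring match. The first two rows come from Table 1, and the third is a made-up placeholder so the filter has something to exclude.

```r
# A small stand-in for a dataset listing.
# Rows 1-2 are taken from Table 1; row 3 is a made-up placeholder.
datasets_df <- data.frame(
  name = c("cre", "crepuertorico", "example"),
  vintage = c(2022, 2022, 2022),
  title = c(
    "Community Resilience Estimates",
    "Community Resilience Estimates for Puerto Rico",
    "Some Other Survey"
  ),
  stringsAsFactors = FALSE
)

# Keep rows whose title contains "resilience", ignoring case.
matches <- datasets_df[grepl("resilience", datasets_df$title, ignore.case = TRUE), ]
matches$name
```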

  2. Get the variable names available for the “cre” dataset.
cre_var_names_dt <- RcensusPkg::get_variable_names(
  dataset = "cre",
  vintage = 2022
) |> 
  _[, .(name, label)]

name       label
COUNTY     Geography
GEOCOMP    GEO_ID Component
GEO_ID     Geographic Identifier
NATION     Geography
POPUNI     Population Universe
PRED0_E    Estimated number of individuals with zero components of social vulnerability
PRED0_PE   Rate of individuals with zero components of social vulnerability
PRED12_E   Estimated number of individuals with one-two components of social vulnerability
PRED12_PE  Rate of individuals with one-two components of social vulnerability
PRED3_E    Estimated number of individuals with three or more components of social vulnerability
PRED3_PE   Rate of individuals with three or more components of social vulnerability
STATE      Geography
SUMLEVEL   Summary Level code
TRACT      Geography
for        Census API FIPS ‘for’ clause
in         Census API FIPS ‘in’ clause
ucgid      Uniform Census Geography Identifier clause

Table 2: Variable names from the CRE dataset

We are interested in the percentage of individuals with three or more vulnerabilities (“PRED3_PE”).
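In the variable listing, names ending in "_E" are labeled as estimated counts and names ending in "_PE" as rates. A short base R sketch, using only names that appear in Table 2, shows one way to separate the two groups by suffix:

```r
# Prediction variables copied from Table 2.
var_names <- c(
  "PRED0_E", "PRED0_PE",
  "PRED12_E", "PRED12_PE",
  "PRED3_E", "PRED3_PE"
)

# "_PE" at the end of the name marks a rate; "_E" at the end marks a count.
rate_vars  <- grep("_PE$", var_names, value = TRUE)
count_vars <- grep("_E$", var_names, value = TRUE)

rate_vars
count_vars
```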

  3. Get the regions available for the CRE dataset.
cre_regions_dt <- RcensusPkg::get_geography(
  dataset = "cre",
  vintage = 2022
)

name    geoLevelDisplay
us      010
state   040
county  050
tract   140

Table 3: Regions from the CRE dataset

So we can get CRE estimates at the US, state, county, and tract levels. We are interested in the counties of the state of Florida.

  4. Download the data:
florida_fips <- usmap::fips("FL")

florida_cre_dt <- RcensusPkg::get_vintage_data(
  dataset = "cre",
  vintage = 2022,
  vars = "PRED3_PE",
  region = "county",
  regionin = paste0("state:", florida_fips)
) |> 
  _[, PRED3_PE := as.numeric(PRED3_PE)] |> 
  data.table::setnames(old = "PRED3_PE", new = "CRE_GE_3")

NAME                       CRE_GE_3  state  county  GEOID
Alachua County, Florida    19.29     12     001     12001
Baker County, Florida      22.93     12     003     12003
Bay County, Florida        20.05     12     005     12005
Bradford County, Florida   24.77     12     007     12007
Brevard County, Florida    20.67     12     009     12009
Broward County, Florida    22.44     12     011     12011
Calhoun County, Florida    31.52     12     013     12013
Charlotte County, Florida  27.53     12     015     12015

Table 4: Florida county percent risk from the CRE dataset
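For readers curious about what such a download looks like at the HTTP level, the sketch below assembles a request URL in the general form accepted by the Census Data API (get, for, in, and key query parameters). build_census_url() is a hypothetical helper written for this illustration; it is not RcensusPkg code, and the package may build its requests differently. The county wildcard "county:*" is written out explicitly here.

```r
# Hypothetical helper: assemble a Census Data API request URL.
# Not part of RcensusPkg; shown only to illustrate the query structure.
build_census_url <- function(dataset, vintage, vars, region,
                             regionin = NULL, key = NULL) {
  query <- c(
    paste0("get=", paste(vars, collapse = ",")),
    paste0("for=", region),
    if (!is.null(regionin)) paste0("in=", regionin),  # dropped when NULL
    if (!is.null(key)) paste0("key=", key)            # dropped when NULL
  )
  paste0(
    "https://api.census.gov/data/", vintage, "/", dataset,
    "?", paste(query, collapse = "&")
  )
}

# All counties within state FIPS 12 (Florida), without a key appended.
cre_url <- build_census_url(
  dataset = "cre",
  vintage = 2022,
  vars = c("NAME", "PRED3_PE"),
  region = "county:*",
  regionin = "state:12"
)
cre_url
```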

  5. Merge the CRE county data with the Florida county shapefile geographies from the Bureau.
output_dir <- withr::local_tempdir()
if(!dir.exists(output_dir)){
  dir.create(output_dir)
}
express <- parse(text = paste0("STATEFP == ", '"', florida_fips, '"'))
cre_florida_sf <- RcensusPkg::tiger_counties_sf(
  output_dir = output_dir,
  vintage = 2022,
  general = TRUE,
  express = express,
  datafile = florida_cre_dt,
  datafile_key = "county"
)
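
The join performed in this step can be pictured as an ordinary key-based merge between the shapefile's attribute table and the downloaded data. The base R sketch below uses a stand-in attribute table with the geometry column omitted. It assumes the shapefile side is matched on COUNTYFP (the county FIPS column in TIGER/Line county files); which column RcensusPkg actually joins on is not stated in this README.

```r
# Stand-in for the attribute table of a county shapefile (geometry omitted).
# Joining on COUNTYFP is an assumption made for this sketch.
shapes_df <- data.frame(
  STATEFP = c("12", "12", "12"),
  COUNTYFP = c("001", "003", "005"),
  stringsAsFactors = FALSE
)

# Three rows taken from Table 4.
cre_df <- data.frame(
  county = c("001", "003", "005"),
  CRE_GE_3 = c(19.29, 22.93, 20.05),
  stringsAsFactors = FALSE
)

# Left join: keep every shape row and attach the matching CRE value.
merged_df <- merge(
  shapes_df, cre_df,
  by.x = "COUNTYFP", by.y = "county",
  all.x = TRUE
)
merged_df
```

Keeping the county codes as character strings preserves their leading zeros, which matters when the two tables are matched on them.
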
  6. Plot the choropleth map of the percent of individuals with 3 or more risk factors.
RplotterPkg::create_sf_plot(
  sf = cre_florida_sf,
  aes_fill = "CRE_GE_3",
  hide_x_tics = TRUE,
  hide_y_tics = TRUE,
  panel_color = "white",
  panel_border_color = "white",
  caption = "Percent 3 or more risk factors among Florida counties"
)

Metadata

Version

0.1.5

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows