2024-08-15
Workshop aims & objectives
To follow this demo you will need to:
Census Hierarchies
Census Hierarchies example
We are going to start installing the following packages
tidycensus: allows users to interface with a select number of the US Census Bureau’s data APIs and return tidyverse-ready data frames, optionally with simple feature geometry included.
tidyverse: is a collection of open-source R packages that help with data science tasks like importing, tidying, manipulating, and visualizing data
mapview: is a tool for quickly creating interactive maps of spatial data.
ggspatial: is a framework for interacting with spatial data using ggplot2 to create maps.
leafsync: is a plugin for leaflet to produce potentially synchronised small multiples of leaflet web maps wrapping Leaflet.
Allows users to interface with a select number of the US Census Bureau’s data APIs and return tidyverse-ready data frames, optionally with simple feature geometry included.
get_decennial()
, which requests data from the US Decennial Census APIs for 2000, 2010, and 2020.
get_acs()
, which requests data from the 1-year and 5-year American Community Survey samples. Data are available from the 1-year ACS back to 2005 and the 5-year ACS back to 2005-2009.
get_estimates()
, an interface to the Population Estimates APIs. These datasets include yearly estimates of population characteristics by state, county, and metropolitan area, along with components of change demographic estimates like births, deaths, and migration rates.
get_pums()
, which accesses data from the ACS Public Use Microdata Sample APIs. These samples include anonymized individual-level records from the ACS organized by household and are highly useful for many different social science analyses. get_pums() is covered in more depth in Chapters 9 and 10.
get_flows()
, an interface to the ACS Migration Flows APIs. Includes information on in- and out-flows from various geographies for the 5-year ACS samples, enabling origin-destination analyses.
Nationwide survey that collects and publishes information about the US population’s social, economic, housing, and demographic characteristics.
Income, jobs and occupations, educational attainment, veterans, and housing tenure.
Get the number of latino population B03001_003
at the county
level for the state of Pennsylvania PA
# A tibble: 67 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 42001 Adams County, Pennsylvania B03001_003 7688 NA
2 42003 Allegheny County, Pennsylvania B03001_003 29272 NA
3 42005 Armstrong County, Pennsylvania B03001_003 560 NA
4 42007 Beaver County, Pennsylvania B03001_003 3227 NA
5 42009 Bedford County, Pennsylvania B03001_003 598 NA
6 42011 Berks County, Pennsylvania B03001_003 99460 NA
7 42013 Blair County, Pennsylvania B03001_003 1721 NA
8 42015 Bradford County, Pennsylvania B03001_003 946 NA
9 42017 Bucks County, Pennsylvania B03001_003 38195 NA
10 42019 Butler County, Pennsylvania B03001_003 3391 NA
# ℹ 57 more rows
With load_variables()
you can see a list of all variables with their corresponding code. Simply set the year and source.
# A tibble: 6 × 4
name label concept geography
<chr> <chr> <chr> <chr>
1 B01001A_001 Estimate!!Total: Sex by Age (Whi… tract
2 B01001A_002 Estimate!!Total:!!Male: Sex by Age (Whi… tract
3 B01001A_003 Estimate!!Total:!!Male:!!Under 5 years Sex by Age (Whi… tract
4 B01001A_004 Estimate!!Total:!!Male:!!5 to 9 years Sex by Age (Whi… tract
5 B01001A_005 Estimate!!Total:!!Male:!!10 to 14 years Sex by Age (Whi… tract
6 B01001A_006 Estimate!!Total:!!Male:!!15 to 17 years Sex by Age (Whi… tract
latino_table_PA <- get_acs(
geography = "county",
state = "PA",
table = "B03001",
year = 2022,
)
latino_table_PA
# A tibble: 2,077 × 5
GEOID NAME variable estimate moe
<chr> <chr> <chr> <dbl> <dbl>
1 42001 Adams County, Pennsylvania B03001_001 104604 NA
2 42001 Adams County, Pennsylvania B03001_002 96916 NA
3 42001 Adams County, Pennsylvania B03001_003 7688 NA
4 42001 Adams County, Pennsylvania B03001_004 4419 430
5 42001 Adams County, Pennsylvania B03001_005 1607 336
6 42001 Adams County, Pennsylvania B03001_006 243 149
7 42001 Adams County, Pennsylvania B03001_007 405 263
8 42001 Adams County, Pennsylvania B03001_008 385 244
9 42001 Adams County, Pennsylvania B03001_009 4 6
10 42001 Adams County, Pennsylvania B03001_010 91 70
# ℹ 2,067 more rows
latino_table_PA <- get_acs(
geography = "county",
state = "PA",
table = "B03001",
year = 2022,
output = "wide"
)
latino_table_PA
# A tibble: 67 × 64
GEOID NAME B03001_001E B03001_001M B03001_002E B03001_002M B03001_003E
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 42001 Adams Coun… 104604 NA 96916 NA 7688
2 42003 Allegheny … 1245310 NA 1216038 NA 29272
3 42005 Armstrong … 65538 NA 64978 NA 560
4 42007 Beaver Cou… 167629 NA 164402 NA 3227
5 42009 Bedford Co… 47613 NA 47015 NA 598
6 42011 Berks Coun… 428483 NA 329023 NA 99460
7 42013 Blair Coun… 122640 NA 120919 NA 1721
8 42015 Bradford C… 60159 NA 59213 NA 946
9 42017 Bucks Coun… 645163 NA 606968 NA 38195
10 42019 Butler Cou… 194562 NA 191171 NA 3391
# ℹ 57 more rows
# ℹ 57 more variables: B03001_003M <dbl>, B03001_004E <dbl>, B03001_004M <dbl>,
# B03001_005E <dbl>, B03001_005M <dbl>, B03001_006E <dbl>, B03001_006M <dbl>,
# B03001_007E <dbl>, B03001_007M <dbl>, B03001_008E <dbl>, B03001_008M <dbl>,
# B03001_009E <dbl>, B03001_009M <dbl>, B03001_010E <dbl>, B03001_010M <dbl>,
# B03001_011E <dbl>, B03001_011M <dbl>, B03001_012E <dbl>, B03001_012M <dbl>,
# B03001_013E <dbl>, B03001_013M <dbl>, B03001_014E <dbl>, …
Code | Variable |
---|---|
B03001_004 | Mexican |
B03001_005 | Puerto Rican |
B03001_006 | Cuban |
B03001_007 | Dominican |
B03001_009 | Costa Rican |
B03001_010 | Guatemalan |
B03001_011 | Honduran |
B03001_012 | Nicaraguan |
B03001_013 | Panamanian |
B03001_014 | Salvadoran |
B03001_015 | Other Central American |
B03001_017 | Argentinian |
B03001_018 | Bolivian |
B03001_019 | Chilean |
B03001_020 | Colombian |
B03001_021 | Ecuadorian |
B03001_022 | Paraguayan |
B03001_023 | Peruvian |
B03001_024 | Uruguayan |
B03001_025 | Venezuelan |
B03001_026 | Other South American |
B03001_027 | Other Hispanic or Latino |
library("ggplot2")
ggplot(latino_table_PA, aes(x = reorder(NAME, -latino_percentage), y = latino_percentage)) +
geom_bar(stat = "identity", fill = "darkred") +
labs(
x = "County",
y = "Latino Population Percentage",
title = "Percentage of Latino Population by County in Pennsylvania"
) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplot(data = latino_table_PA, aes(fill = latino_percentage)) +
geom_sf() +
scale_fill_distiller(palette = "YlOrRd",
direction = 1) +
labs(title = " Percentage of latino population by county in Pennsylvania, 2022",
caption = "Data source: 2022 5-year ACS, US Census Bureau",
fill = "Percenatge") +
theme_void()
ggplot(data = latino_table_tracts_PA, aes(fill = latino_percentage)) +
geom_sf() +
scale_fill_distiller(palette = "YlOrRd",
direction = 1) +
labs(title = " Percentage of latino population by tract in Pennsylvania, 2022",
caption = "Data source: 2022 5-year ACS, US Census Bureau",
fill = "Percentage") +
theme_void()
library("maps")
ggplot(data = latino_table_tracts_PA, aes(fill = latino_percentage)) +
geom_sf(linewidth=0.01) +
scale_fill_distiller(palette = "YlOrRd",
direction = 1) +
labs(title = " Percentage of latino population by tract in Pennsylvania, 2022",
caption = "Data source: 2022 5-year ACS, US Census Bureau",
fill = "Percentage") +
borders("county","pennsylvania") +
theme_void()
lehigh_latino$latino_percentage <- ifelse(lehigh_latino$B03001_003E>0, 100* (lehigh_latino$B03001_003E / lehigh_latino$B03001_001E),0)
ggplot(data = lehigh_latino, aes(fill = latino_percentage)) +
geom_sf(linewidth=0.01) +
scale_fill_distiller(palette = "YlOrRd",
direction = 1) +
labs(title = " Percentage of latino population by tract in Lehigh County, 2022",
caption = "Data source: 2022 5-year ACS, US Census Bureau",
fill = "Percentage") +
theme_void()
The mapgl R package allows users to create interactive maps in R using the Mapbox GL JS and MapLibre GL JS libraries:
Features Create globe visualizations, layer objects to make filled maps, circle maps, heatmaps, and 3D graphics, and customize map styles and views.
Ease of use Designed to be intuitive for R users while still offering the capabilities of the Mapbox GL JS and MapLibre GL JS libraries
Flexibility Allows for more code to be written when making maps, but also gives users more flexibility in how they design their maps
Shiny web applications Includes utilities to use Mapbox and MapLibre maps in Shiny web applications
pa_map |>
add_fill_layer(
id = "pa_latino",
source = latino_table_PA,
fill_color = interpolate(
column = "latino_percentage",
values = c(1, 40),
stops = c("lightyellow", "darkorange"),
na_color = "lightgrey"
),
fill_opacity = 0.7
) |>
add_legend(
"Percentage of Latino Population, 2022",
values = c(1, 40),
colors = c("lightyellow", "darkorange")
)
latino_table_PA$popup <- glue::glue(
"<strong>County: </strong>{latino_table_PA$NAME}<br><strong>Percentage: </strong>{sprintf('%.2f', latino_table_PA$latino_percentage)}%")
brewer_pal <- RColorBrewer::brewer.pal(6, "YlGnBu")
pa_map |>
add_fill_layer(
id = "pa_latino",
source = latino_table_PA,
fill_color = step_expr(
column = "latino_percentage",
base = brewer_pal[1],
stops = brewer_pal[1:6],
values = seq(0.8, 27.2, length.out = 6),
na_color = "white"
),
fill_opacity = 0.5,
popup = "popup",
tooltip = "latino_percentage",
hover_options = list(
fill_color = "yellow",
fill_opacity = 1
)
) |>
add_legend(
"Latino Population, 2022",
values = c(
"Less than 5%",
"5%-10%",
"10%-15%",
"15%-20%",
"20%-25%",
"More than 25%"
),
colors = brewer_pal,
type = "categorical"
) |>
add_navigation_control()