library(tidycensus)
library(tidyverse)
library(viridis)
library(ggplot2)
library(scales)
year <- 2023

poverty_data <- get_acs(
   geography="state"
   ,variables=c( #ratio of income to poverty level
      below_p5pov="C17002_002" #under 0.50
      ,p5_1pov="C17002_003" #0.50 to 0.99
      ,total_population="C17002_001"
   )
   ,year=year
   ,survey="acs5"
) %>%
   pivot_wider( #one row per state, one column per variable
      id_cols=c(GEOID,NAME)
      ,names_from="variable"
      ,values_from="estimate"
   ) %>%
   filter(!GEOID %in% c(72,11)) %>% #drop Puerto Rico (72) and DC (11)
   mutate(
      prop_below=(below_p5pov+p5_1pov)/total_population
      ,year=.env$year #carry the year along as a parameter
   ) %>%
   arrange(desc(prop_below))
Lists are, in many ways, R’s most powerful objects. They are vectors without type, general and flexible. You can fill them with anything—including other lists—and while this enables some truly useful complexity in our R-based processes, it can also make creating and working with lists daunting, especially for newer R programmers.
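As a quick illustration of that flexibility, here is a minimal base R sketch; the object name mixed and its contents are made up for this example.

```r
# A list can mix types freely, including functions and other lists
mixed <- list(
  count = 3L,
  label = "three",
  nested = list(flag = TRUE, fn = toupper)
)

# $ (or [[ ]]) pulls out a single element and can be chained into nested lists
mixed$nested$fn(mixed$label)
```

The last line reaches into the inner list, pulls out a function, and applies it to a value stored in the outer list.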
With this post I make no attempt to explain (let alone fully explain) lists in R. Instead I just hope to showcase one example of doing something fun (and maybe kinda useful-ish?) with them: storing and retrieving income disparity choropleth maps made with the Census API and ggplot2. Everyone loves maps, right?!
Here’s what we’ll do:
- retrieve a state-level data frame with 2023 ACS 5-year poverty estimates via the Census API
- create the recode variable prop_below that represents the proportion of the state population with household income below the poverty limit
- sort in descending order by prop_below
- using the resulting data frame as a parameter file, iterate a custom function that
  - retrieves county-level median income estimates and polygons for a given state via the Census API
  - ranks the counties by their median income
  - generates clean labels for the top and bottom counties
  - creates a plot object containing the county choropleth map for the state
  - returns a list containing the plot, the state abbreviation, and the state name
At this point, we will have a list with 50 elements, one for each state, each of which is itself a list containing that state's plot, name, and abbreviation. We will explore three ways to extract plots from this list.
First, we’ll load the necessary libraries and create our state-level parameter file.
Let’s look at the parameter file to see what our functional process has to work with.
poverty_data
# A tibble: 50 × 7
GEOID NAME total_population below_p5pov p5_1pov prop_below year
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 28 Mississippi 2851847 244374 299323 0.191 2023
2 22 Louisiana 4494539 390425 458344 0.189 2023
3 35 New Mexico 2073857 174355 201026 0.181 2023
4 54 West Virginia 1728580 131400 156260 0.166 2023
5 21 Kentucky 4382816 325901 381579 0.161 2023
6 05 Arkansas 2944742 209434 262349 0.160 2023
7 01 Alabama 4913932 352821 415364 0.156 2023
8 40 Oklahoma 3872738 276048 317772 0.153 2023
9 45 South Carolina 5072217 340381 379339 0.142 2023
10 48 Texas 29016925 1845809 2159608 0.138 2023
# ℹ 40 more rows
And here we fill our list: maplist. To do that, we iterate over our parameter file with the pmap() functional and an anonymous function containing the guts of our process.
maplist <- poverty_data %>%
   pmap( #use pmap so we can provide df as parameter file
      function(...){ #function takes in all variables in df because of dots param
         parms <- rlang::list2(...) #extract all var values for current iteration into named list

         #save plot to p
         p <- get_acs( #api call returns county-level data with polygons
            geography="county"
            ,variables="B19013_001" #median income
            ,state=parms$GEOID #note use of parms list
            ,geometry=TRUE #include polygons
            ,year=parms$year #again here
         ) %>%
            mutate(
               goodlabel=case_when(
                  #label only the lowest- and highest-income counties
                  #na.last="keep" keeps the ranks the same length as estimate if any are missing
                  rank(estimate,na.last="keep",ties.method="first")==1|
                     percent_rank(estimate)==1
                     ~str_replace(NAME,"(.+)(,.+)","\\1") %>% str_remove(" County")
                  ,TRUE~NA_character_
               )
            ) %>%
            ggplot()+
            #geom for plotting shapefile polygons
            geom_sf(
               size=0.05
               ,color="#000000"
               ,aes(fill=as.numeric(estimate))
            )+
            geom_sf_label(
               aes(label=goodlabel)
               ,color="#000000"
               ,vjust=1
            )+
            coord_sf(crs=4326)+
            scale_fill_viridis_c(
               option="viridis"
               ,breaks=seq(0,200000,by=10000)
               ,labels=dollar
            )+
            labs(
               title=str_glue("{parms$NAME} Median Income by County"),
               subtitle=str_glue(
                  "American Community Survey 5-Year Estimates {parms$year-4}-{parms$year}\n"
                  ,"Highest and Lowest Income Counties Labelled"
               )
            )+
            guides(fill=guide_colorbar("Median\nPast-Year\nHH Income"))+
            theme_bw()+
            #theme() changes this plot only; theme_update() would alter the global theme instead
            theme(legend.key.height=unit(.35,"in"))

         #the parameter file has no state_abb column, so look it up in base R's state.name/state.abb
         list("state_abb"=state.abb[match(parms$NAME,state.name)],"state"=parms$NAME,"plot"=p)
      }
   )
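The row-wise mechanics are easier to see without the API calls. Here is a minimal sketch of the same pmap() pattern on a toy parameter file; toy_parms, toy_list, and the string standing in for a plot object are all hypothetical.

```r
library(purrr)

# Hypothetical stand-in for the parameter file: one row per iteration
toy_parms <- data.frame(
  NAME = c("Alpha", "Beta"),
  year = c(2023, 2023),
  stringsAsFactors = FALSE
)

toy_list <- pmap(
  toy_parms,
  function(...) {
    parms <- rlang::list2(...)  # named list holding the current row's values
    list(
      state = parms$NAME,
      # a string placeholder where the real process stores a ggplot object
      plot = paste0(parms$NAME, " ", parms$year - 4, "-", parms$year)
    )
  }
)

length(toy_list)     # one element per row of toy_parms
toy_list[[2]]$state  # identifiers travel with each element
```

Because each column is passed to the function as a named argument, capturing the dots in a named list means the function never needs to know in advance which columns the parameter file has.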
At this point, maplist has been populated, and we can extract plot objects from it. First, let’s try just returning the first element. Recall that because our parameter file was sorted highest to lowest in terms of the proportion of the state population with household income below the poverty limit, the first element of our list will contain a plot for the most impoverished state.
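A minimal base R sketch of positional extraction, using a hypothetical stand-in list (toy_maplist) with strings in place of ggplot objects. With the real list, the analogous call would presumably be maplist[[1]]$plot.

```r
# Hypothetical stand-in for maplist; strings take the place of ggplot objects
toy_maplist <- list(
  list(state_abb = "MS", state = "Mississippi", plot = "<Mississippi plot>"),
  list(state_abb = "LA", state = "Louisiana", plot = "<Louisiana plot>")
)

first <- toy_maplist[[1]]  # [[ ]] returns the inner list itself
first$plot                 # $ then pulls the plot out of that inner list

# By contrast, [ ] returns a list of length 1 that still wraps the inner list
wrapped <- toy_maplist[1]
length(wrapped)
```

The distinction matters here: single brackets hand back another list, so $plot only works after double brackets have unwrapped the element.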
We can also walk over the list to present ranges of its elements. Here we look at the 5 least impoverished states. Note that in this case we need an explicit print() to force the plots out of the walk functional environment.
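A minimal sketch of that pattern on a hypothetical stand-in list (toy_maplist, with strings in place of plots). Since the real list is sorted from most to least impoverished, the 5 least impoverished states would be its last five elements, for example tail(maplist, 5).

```r
library(purrr)

# Hypothetical stand-in for maplist; strings take the place of ggplot objects
toy_maplist <- list(
  list(state = "Mississippi", plot = "<Mississippi plot>"),
  list(state = "Louisiana", plot = "<Louisiana plot>"),
  list(state = "New Mexico", plot = "<New Mexico plot>")
)

# walk() calls the function for its side effects and discards the return values,
# so each plot has to be print()ed explicitly
walk(tail(toy_maplist, 2), ~print(.x$plot))
```

walk() returns its input invisibly, which is why nothing would appear without the print() call.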
Or we can extract the map corresponding to a specific state of interest. We can do this because we loaded each element of maplist with a list containing both a plot object and state identifiers.
detect(maplist,~.x$state=="New York")$plot
detect(maplist,~.x$state=="Texas")$plot
detect(maplist,~.x$state=="California")$plot
In conclusion… maps are fun, and lists are useful!
Citation
@online{couzens2025,
author = {Couzens, Lance},
title = {Building and {Working} with a {List} of Ggplot {Objects} in
{R}},
date = {2025-03-22},
url = {https://mostlyunoriginal.github.io/posts/2025-03-22-Choropleths-and-LIst-Retrieval-Fun/},
langid = {en}
}