Tutorial 5: CPS Microdata
The Current Population Survey (CPS) supplements provide specialized microdata on topics like tobacco use, voting, and food security.
Goal: Analyze tobacco use patterns across states using multi-year CPS data.
Step 1: Setup
In [1]:
import os
from cendat import CenDatHelper
from dotenv import load_dotenv
load_dotenv()
cdh = CenDatHelper(key=os.getenv("CENSUS_API_KEY"))
# Get multiple years of CPS Tobacco Use Supplement
cdh.list_products(years=[2022, 2023], patterns="/cps/tobacco")
cdh.set_products()
✅ API key loaded successfully.
✅ Product set: 'Current Population Survey: Tobacco Use Supplement (2022/cps/tobacco/sep)' (Vintage: [2022])
✅ Product set: 'Current Population Survey: Tobacco Use Supplement (2023/cps/tobacco/jan)' (Vintage: [2023])
✅ Product set: 'Current Population Survey: Tobacco Use Supplement (2023/cps/tobacco/may)' (Vintage: [2023])
Step 2: Explore and Select Variables
In [2]:
# See available variable groups
cdh.list_groups()
Out[2]:
[]
In [3]:
# Select specific variables
# PEA1, PEA3: Tobacco use questions
# PWNRWGT: Person weight
cdh.set_variables(["PEA1", "PEA3", "PWNRWGT"])
cdh.set_geos("state", "desc")
✅ Variables set:
- Product: Current Population Survey: Tobacco Use Supplement (2022/cps/tobacco/sep) (Vintage: [2022])
Variables: PEA1, PEA3, PWNRWGT
- Product: Current Population Survey: Tobacco Use Supplement (2023/cps/tobacco/jan) (Vintage: [2023])
Variables: PEA1, PEA3, PWNRWGT
- Product: Current Population Survey: Tobacco Use Supplement (2023/cps/tobacco/may) (Vintage: [2023])
Variables: PEA1, PEA3, PWNRWGT
✅ Geographies set: 'state'
Step 3: Get Data
In [4]:
response = cdh.get_data(within={"state": ["06", "48"]})
✅ Parameters created for 3 geo-variable/group combinations.
✅ Data fetching complete. Stacking results.
Step 4: Analyze with Pooled Weights
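Each supplement file is weighted to represent the full target population on its own, so summing weights across pooled files overstates the population. Three files were loaded here (September 2022, January 2023, and May 2023), so divide the weights by 3: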
In [5]:
response.tabulate(
    "PEA1",
    "PEA3",
    strat_by="state",
    weight_var="PWNRWGT",
    weight_div=3,  # Three pooled supplement files, so divide each weight by 3
)
shape: (43, 7)
┌───────┬──────┬──────┬──────────────┬──────┬──────────────┬────────┐
│ state ┆ PEA1 ┆ PEA3 ┆ n            ┆ pct  ┆ cumn         ┆ cumpct │
╞═══════╪══════╪══════╪══════════════╪══════╪══════════════╪════════╡
│ 48    ┆ -3   ┆ -3   ┆ 12,151.4     ┆ 0.1  ┆ 12,151.4     ┆ 0.1    │
│ 48    ┆ -3   ┆ -2   ┆ 7,862.0      ┆ 0.0  ┆ 20,013.4     ┆ 0.1    │
│ 48    ┆ -3   ┆ -1   ┆ 31,283.4     ┆ 0.1  ┆ 51,296.9     ┆ 0.2    │
│ 48    ┆ -3   ┆ 3    ┆ 6,011.1      ┆ 0.0  ┆ 57,308.0     ┆ 0.3    │
│ 48    ┆ -2   ┆ -9   ┆ 7,299.8      ┆ 0.0  ┆ 64,607.8     ┆ 0.3    │
│ 48    ┆ -2   ┆ -2   ┆ 1,871.9      ┆ 0.0  ┆ 66,479.6     ┆ 0.3    │
│ 48    ┆ -2   ┆ -1   ┆ 42,634.2     ┆ 0.2  ┆ 109,113.9    ┆ 0.5    │
│ 48    ┆ -2   ┆ 2    ┆ 2,698.4      ┆ 0.0  ┆ 111,812.3    ┆ 0.5    │
│ 48    ┆ -2   ┆ 3    ┆ 11,474.7     ┆ 0.1  ┆ 123,287.0    ┆ 0.6    │
│ 48    ┆ -1   ┆ -1   ┆ 0.0          ┆ 0.0  ┆ 123,287.0    ┆ 0.6    │
│ 48    ┆ 1    ┆ -9   ┆ 12,524.6     ┆ 0.1  ┆ 135,811.6    ┆ 0.6    │
│ 48    ┆ 1    ┆ -3   ┆ 6,074.9      ┆ 0.0  ┆ 141,886.5    ┆ 0.6    │
│ 48    ┆ 1    ┆ -2   ┆ 2,451.6      ┆ 0.0  ┆ 144,338.1    ┆ 0.6    │
│ 48    ┆ 1    ┆ 1    ┆ 1,230,269.3  ┆ 5.5  ┆ 1,374,607.4  ┆ 6.2    │
│ 48    ┆ 1    ┆ 2    ┆ 521,851.7    ┆ 2.3  ┆ 1,896,459.0  ┆ 8.5    │
│ 48    ┆ 1    ┆ 3    ┆ 2,993,540.7  ┆ 13.4 ┆ 4,889,999.7  ┆ 21.9   │
│ 48    ┆ 2    ┆ -9   ┆ 9,233.3      ┆ 0.0  ┆ 4,899,233.0  ┆ 21.9   │
│ 48    ┆ 2    ┆ -1   ┆ 11,019,066.7 ┆ 49.3 ┆ 15,918,299.7 ┆ 71.2   │
│ 48    ┆ 2    ┆ 2    ┆ 63,562.7     ┆ 0.3  ┆ 15,981,862.4 ┆ 71.5   │
│ 48    ┆ 2    ┆ 3    ┆ 6,364,974.8  ┆ 28.5 ┆ 22,346,837.2 ┆ 100.0  │
│ 6     ┆ -9   ┆ -1   ┆ 2,588.4      ┆ 0.0  ┆ 2,588.4      ┆ 0.0    │
│ 6     ┆ -3   ┆ -3   ┆ 4,518.2      ┆ 0.0  ┆ 7,106.6      ┆ 0.0    │
│ 6     ┆ -3   ┆ -1   ┆ 38,037.9     ┆ 0.1  ┆ 45,144.6     ┆ 0.2    │
│ 6     ┆ -3   ┆ 3    ┆ 3,679.8      ┆ 0.0  ┆ 48,824.4     ┆ 0.2    │
│ 6     ┆ -2   ┆ -9   ┆ 2,126.6      ┆ 0.0  ┆ 50,951.0     ┆ 0.2    │
│ 6     ┆ -2   ┆ -3   ┆ 1,726.2      ┆ 0.0  ┆ 52,677.2     ┆ 0.2    │
│ 6     ┆ -2   ┆ -2   ┆ 13,550.3     ┆ 0.0  ┆ 66,227.5     ┆ 0.2    │
│ 6     ┆ -2   ┆ -1   ┆ 100,758.4    ┆ 0.3  ┆ 166,985.9    ┆ 0.6    │
│ 6     ┆ -2   ┆ 3    ┆ 16,578.8     ┆ 0.1  ┆ 183,564.7    ┆ 0.6    │
│ 6     ┆ -1   ┆ -1   ┆ 0.0          ┆ 0.0  ┆ 183,564.7    ┆ 0.6    │
│ 6     ┆ 1    ┆ -9   ┆ 11,301.9     ┆ 0.0  ┆ 194,866.6    ┆ 0.6    │
│ 6     ┆ 1    ┆ -3   ┆ 14,985.0     ┆ 0.0  ┆ 209,851.7    ┆ 0.7    │
│ 6     ┆ 1    ┆ -2   ┆ 14,227.3     ┆ 0.0  ┆ 224,078.9    ┆ 0.7    │
│ 6     ┆ 1    ┆ 1    ┆ 1,054,186.4  ┆ 3.5  ┆ 1,278,265.3  ┆ 4.3    │
│ 6     ┆ 1    ┆ 2    ┆ 576,434.2    ┆ 1.9  ┆ 1,854,699.5  ┆ 6.2    │
│ 6     ┆ 1    ┆ 3    ┆ 3,532,156.8  ┆ 11.8 ┆ 5,386,856.3  ┆ 18.0   │
│ 6     ┆ 2    ┆ -9   ┆ 18,986.8     ┆ 0.1  ┆ 5,405,843.1  ┆ 18.0   │
│ 6     ┆ 2    ┆ -3   ┆ 4,104.9      ┆ 0.0  ┆ 5,409,947.9  ┆ 18.0   │
│ 6     ┆ 2    ┆ -2   ┆ 7,485.7      ┆ 0.0  ┆ 5,417,433.7  ┆ 18.1   │
│ 6     ┆ 2    ┆ -1   ┆ 16,344,706.7 ┆ 54.5 ┆ 21,762,140.4 ┆ 72.5   │
│ 6     ┆ 2    ┆ 1    ┆ 13,137.2     ┆ 0.0  ┆ 21,775,277.6 ┆ 72.6   │
│ 6     ┆ 2    ┆ 2    ┆ 74,384.9     ┆ 0.2  ┆ 21,849,662.4 ┆ 72.8   │
│ 6     ┆ 2    ┆ 3    ┆ 8,153,180.9  ┆ 27.2 ┆ 30,002,843.3 ┆ 100.0  │
└───────┴──────┴──────┴──────────────┴──────┴──────────────┴────────┘
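To see why the weights are divided when files are pooled, here is a minimal sketch in plain Python that does not use cendat. The records, category labels, and the `pooled_counts` helper are all hypothetical and exist only for illustration. It assumes each sample's weights sum to the same population of 1,000 people.

```python
from collections import defaultdict

# Hypothetical toy records: (sample, category, person_weight).
# Each sample's weights sum to a population of 1,000 on their own.
records = [
    ("2022-sep", "smoker", 100.0),
    ("2022-sep", "non-smoker", 900.0),
    ("2023-jan", "smoker", 130.0),
    ("2023-jan", "non-smoker", 870.0),
    ("2023-may", "smoker", 70.0),
    ("2023-may", "non-smoker", 930.0),
]


def pooled_counts(rows, n_samples):
    """Sum weights per category after dividing each weight by n_samples."""
    totals = defaultdict(float)
    for _sample, category, weight in rows:
        totals[category] += weight / n_samples
    return dict(totals)


naive = pooled_counts(records, n_samples=1)   # weights summed as-is
pooled = pooled_counts(records, n_samples=3)  # weights divided by 3

# Undivided weights count the population once per pooled sample.
print(sum(naive.values()))
# Divided weights bring the total back to roughly one population.
print(sum(pooled.values()))
```

In this toy data the undivided total is three times the population, while the divided total is close to 1,000. Dividing every weight by the same constant leaves the percentage shares unchanged, so only the count columns are affected.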