Possible rounding issue in tidycensus micro data in the wage and earning variables. #593

Open
markbauby opened this issue Dec 3, 2024 · 1 comment

@markbauby

I've noticed a possible issue with the wage data in the microdata returned by tidycensus.

Long story short, I have been working with microdata using the ipumsr and tidycensus packages, and while I can get close to similar results, it looks like some of the variables in tidycensus are rounded, specifically "WAGP" and "PERNP", while their equivalents in ipumsr ("INCWAGE" and "INCEARN") are not.

Is this a bug/error in tidycensus, or is it user error on my part?

My code is below.

### ipumsr segment
library(ipumsr)
library(dplyr)

# Define an IPUMS USA extract for the 2022 ACS sample
ipums_extract_test <- define_extract_micro(
  collection = "usa",
  description = "USA extract for API vignette",
  samples = c("us2022c"),
  variables = c("AGE", "STATEFIP", "EMPSTAT", "INCWAGE", "INCEARN", "us2022c_schl")
)

# Submit the extract, wait for it to finish, then download and read it
ipums_data <- ipums_extract_test %>%
  submit_extract() %>%
  wait_for_extract() %>%
  download_extract() %>%
  read_ipums_micro()

# Michigan (FIPS 26), age 16+, employed, SCHL codes 1 through 21
ipums_test <- ipums_data %>%
  filter(STATEFIP == 26 & AGE >= 16 & EMPSTAT == 1 & US2022C_SCHL %in% 1:21)

### tidycensus segment
library(tidycensus)

# Michigan, age 16+, ESR codes 1, 2, 4, 5, SCHL codes 1 through 21
tidy_test <- get_pums(
  year = 2022,
  survey = "acs5",
  state = "MI",
  variables = c("AGEP", "ESR", "WAGP", "PERNP", "SCHL")
) %>%
  filter(AGEP >= 16 & ESR %in% c(1, 2, 4, 5) & SCHL %in% 1:21)

Thank you.

@walkerke
Owner

walkerke commented Dec 4, 2024

Interesting. We don't do any post-processing of the PUMS data like that in tidycensus, so whatever you're seeing is what's coming through the Census API.

You can take a look here: https://api.census.gov/data/2022/acs/acs5/pums?get=SERIALNO%2CSPORDER%2CWGTP%2CPWGTP%2CAGEP%2CESR%2CWAGP%2CPERNP%2CSCHL&ucgid=0400000US26

Perhaps the IPUMS team does some post-processing of the data?
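
One rough way to check whether a wage column looks rounded is to measure how many of its nonzero values are exact multiples of 10, 100, or 1000. The sketch below is only an illustration: `share_multiples` is a hypothetical helper (not part of tidycensus or ipumsr), and both input vectors are made-up values, not real PUMS or IPUMS records.

```r
# Hypothetical helper (not from tidycensus or ipumsr): for each step size,
# return the share of nonzero, non-missing values that are exact multiples.
share_multiples <- function(x, steps = c(10, 100, 1000)) {
  x <- as.numeric(x)
  x <- x[!is.na(x) & x != 0]
  vapply(steps, function(s) mean(x %% s == 0), numeric(1))
}

# Made-up values for illustration only.
wagp_like    <- c(12000, 35000, 48000, 7600, 0, 91000)
incwage_like <- c(12417, 35103, 48250, 7612, 0, 91388)

share_multiples(wagp_like)     # shares for steps 10, 100, 1000
share_multiples(incwage_like)
```

On the real data, the same call could be applied to `tidy_test$WAGP` and `ipums_test$INCWAGE`, assuming both columns convert cleanly with `as.numeric()`. A much higher share of round multiples in one source than in the other would suggest the values differ before either package touches them.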
