[Question] Is data available through the API the same as in the "qs.Crops" file? #38

Closed
danielreispereira opened this issue Mar 11, 2024 · 3 comments

Comments

@danielreispereira

The following works and returns the data for 2023 and 2024 (if available):

params <- list(commodity_desc = "CORN", year__GE = 2023, agg_level_desc = "AGRICULTURAL DISTRICT")

If I try to get the same data from the files at

[https://www.nass.usda.gov/datasets/](https://www.nass.usda.gov/datasets/qs.crops_20240309.txt.gz)

recent data (after 2022) does not appear to be available, even though the docs suggest it should be.

Is the API querying the same sources as the raw files? Or is this an expected result?

@potterzot
Collaborator

I don't have information on exactly what files QuickStats queries vs the downloaded files. My assumption has been that the queries run against equivalent database versions, which may even directly import the downloadable files, or that the downloadable files are exported directly from the QuickStats database.

But I don't find data for your query in either the downloaded data or via rnassqs:

params <- list(commodity_desc = "CORN", year__GE = 2023, agg_level_desc = "AGRICULTURAL DISTRICT")
rnassqs::record_count(params)

> $count
> [1] = 0
> 
> $Value
> numeric(0)

# Get the data
d <- rnassqs::nassqs(params)

> Error: HTTP Failure: 400
> bad request - invalid query

Is this due to agg_level_desc = AGRICULTURAL DISTRICT? If I run the same query for the state of Iowa:

params <- list(commodity_desc = "CORN",
               year__GE = 2023,
               agg_level_desc = "STATE",
               state_alpha = "IA")
rnassqs::record_count(params)

> $count
> [1] = 653
> 
> $Value
> numeric(0)

I also get 0 records for agg_level_desc = AGRICULTURAL DISTRICT and 653 records for agg_level_desc = STATE and state_alpha = IA if I manually use the Quick Stats interface.

I don't currently have a strong enough computer to load the entirety of the downloadable file, but in 50,000 records I find years only up to 2021. You may want to follow up directly with NASS on this question. The rnassqs package is not associated with NASS and they don't provide information in their documentation on the differences between the downloadable files and the Quick Stats database, so I could only guess as to the differences. If you find out something interesting please let me know!
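For anyone who also cannot load the whole downloadable file, the year coverage can be checked by streaming it in chunks. The following is a minimal sketch in base R, not something from the rnassqs package: `count_years` is a hypothetical helper, and it assumes the file is gzipped, tab-delimited, and has a header row containing a column named `YEAR` (adjust `year_col` if the header differs).

```r
# Sketch: tally records per year in a gzipped, tab-delimited file
# without holding the whole file in memory.
# Assumption: the first line is a header with a column named "YEAR".
count_years <- function(path, chunk_lines = 50000L, year_col = "YEAR") {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))

  # Locate the year column from the header line.
  header <- strsplit(readLines(con, n = 1L), "\t", fixed = TRUE)[[1]]
  idx <- match(year_col, header)
  if (is.na(idx)) stop("Column not found in header: ", year_col)

  counts <- integer(0)
  repeat {
    lines <- readLines(con, n = chunk_lines)
    if (length(lines) == 0L) break

    # Keep only the year field of each line in this chunk.
    years <- vapply(strsplit(lines, "\t", fixed = TRUE),
                    function(fields) fields[idx], character(1))
    chunk <- table(years)
    new <- setNames(as.integer(chunk), names(chunk))

    # Merge this chunk's tallies into the running totals.
    keys <- union(names(counts), names(new))
    merged <- setNames(integer(length(keys)), keys)
    merged[names(counts)] <- counts
    merged[names(new)] <- merged[names(new)] + new
    counts <- merged
  }
  counts[order(names(counts))]
}

# Example call (file name taken from the link earlier in this thread):
# count_years("qs.crops_20240309.txt.gz")
```

The result is a named integer vector of record counts keyed by year, so the latest year present is the last name in the vector. Only one chunk of lines is in memory at a time, at the cost of a slower pass over the file.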

@potterzot
Collaborator

I can confirm that the downloadable file doesn't have any data after 2022, but I couldn't say why. I'm closing this issue but please let me know if something changes that I can incorporate into the rnassqs package.

@danielreispereira
Author

Thanks @potterzot. I also contacted the staff at NASS: data for some crops is not available in the qs file for 2022, 2023, and 2024. The API is therefore the safest way to retrieve recent data. Appreciate your reply.
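Since the HTTP 400 earlier in this thread followed a record count of zero, one way to pull recent data from the API is to check the count before fetching. The sketch below is only an illustration under several assumptions: `fetch_if_any` is a hypothetical wrapper, the default functions are `rnassqs::nassqs_record_count` and `rnassqs::nassqs`, the count may come back either as a bare number or as a list with a `count` element (as in the output shown above), and the 50,000 cap reflects the Quick Stats per-request record limit.

```r
# Sketch: fetch from Quick Stats only when the query matches some records.
# Assumptions: count_fn returns a number or a list with a `count` element;
# the API refuses requests above `max_records` rows.
fetch_if_any <- function(params,
                         count_fn = rnassqs::nassqs_record_count,
                         fetch_fn = rnassqs::nassqs,
                         max_records = 50000L) {
  n <- count_fn(params)
  if (is.list(n)) n <- n[["count"]]
  n <- as.integer(n)

  # Nothing to fetch: return NULL instead of triggering a failed request.
  if (length(n) != 1L || is.na(n) || n == 0L) {
    message("No records match this query; skipping the request.")
    return(NULL)
  }
  # Too many rows for one request: ask the caller to narrow the query.
  if (n > max_records) {
    stop("Query matches ", n, " records; narrow it below ", max_records, ".")
  }
  fetch_fn(params)
}

# Example call (requires an API key set beforehand, e.g. via
# rnassqs::nassqs_auth(key = "<your key>")):
# params <- list(commodity_desc = "CORN", year__GE = 2023,
#                agg_level_desc = "STATE", state_alpha = "IA")
# d <- fetch_if_any(params)
```

Passing the count and fetch functions as arguments keeps the wrapper independent of any one rnassqs version and lets it be exercised with stand-in functions when no API key is available.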
