Adding a field without data breaks larger occurrences download
#930
Comments
Relates to support ticket: https://support.ehelp.edu.au/a/tickets/209037
OK thanks @adam-collins, that makes sense. It also tallies with our workflows; we only allow users to query fields that are listed in While we're doing that it might make sense to have a spring clean of other content too. The first three fields listed are |
Post cleanup of
To differentiate between the two
This is based on an issue identified using galah here. Basically, when we select a field in our occurrence download, for a query where no records have data in that field, the whole download fails. I've put @daxkellie's summary of the problem below.
To walk through the problem, the following query asks for counts of Acacia aneura grouped by scientificName:
https://api.ala.org.au/occurrences/occurrences/facets?fq=%28year%3A%222002%22%29AND%28lsid%3A%22https%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F6707550%22%29&qualityProfile=ALA&facets=scientificName&fsort=count&flimit=10000
It returns this:
[{"fieldName":"scientificName","fieldResult":[{"label":"Acacia aneura","i18nCode":"scientificName.Acacia aneura","count":80,"fq":"scientificName:\"Acacia aneura\""},{"label":"Acacia aneura var. major","i18nCode":"scientificName.Acacia aneura var. major","count":6,"fq":"scientificName:\"Acacia aneura var. major\""},{"label":"Acacia aneura var. aneura","i18nCode":"scientificName.Acacia aneura var. aneura","count":1,"fq":"scientificName:\"Acacia aneura var. aneura\""}],"count":3}]
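As a rough illustration of how this response can be read, the sketch below parses a copy of it (abridged to the label and count keys) and sums the per-name counts. The helper name facet_counts is made up for this example and is not part of galah or the ALA API.

```python
import json

# Facet response from above, abridged to the keys used here.
RESPONSE = """
[{"fieldName": "scientificName",
  "fieldResult": [
    {"label": "Acacia aneura", "count": 80},
    {"label": "Acacia aneura var. major", "count": 6},
    {"label": "Acacia aneura var. aneura", "count": 1}],
  "count": 3}]
"""


def facet_counts(response_text, field):
    """Return {label: count} for one facet field (hypothetical helper).

    Returns an empty dict when the field is absent from the response
    or has no values.
    """
    for facet in json.loads(response_text):
        if facet.get("fieldName") == field:
            return {r["label"]: r["count"] for r in facet.get("fieldResult", [])}
    return {}


counts = facet_counts(RESPONSE, "scientificName")
total = sum(counts.values())
print(counts)
print(total)  # 80 + 6 + 1 = 87
```

The three names together account for 87 records for this query.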
Which is great. By changing facets to location, we get no records, suggesting that this field is empty:
https://api.ala.org.au/occurrences/occurrences/facets?fq=%28year%3A%222002%22%29AND%28lsid%3A%22https%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F6707550%22%29&qualityProfile=ALA&facets=location&fsort=count&flimit=10000
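The same check could be scripted before requesting a download. The sketch below rebuilds the facet URL from its parameters and tests a parsed response for emptiness. The function names are hypothetical, and the exact shape of an empty facet response (an empty list, or a facet with an empty fieldResult) is an assumption, since the issue only says that no records come back.

```python
from urllib.parse import urlencode

FACETS_ENDPOINT = "https://api.ala.org.au/occurrences/occurrences/facets"
# Filter query used throughout this issue (Acacia aneura, year 2002).
FQ = '(year:"2002")AND(lsid:"https://id.biodiversity.org.au/node/apni/6707550")'


def facet_url(field, fq=FQ):
    """Build the facet query URL for one field (hypothetical helper)."""
    params = {
        "fq": fq,
        "qualityProfile": "ALA",
        "facets": field,
        "fsort": "count",
        "flimit": 10000,
    }
    return FACETS_ENDPOINT + "?" + urlencode(params)


def is_empty_facet(payload):
    """True when the parsed facet response carries no values.

    Assumption: an empty field comes back either as [] or as a facet
    whose "fieldResult" list is empty.
    """
    return not any(facet.get("fieldResult") for facet in payload)


print(facet_url("location"))
```

Fetching the URL (for example with urllib.request.urlopen) and passing the parsed JSON to is_empty_facet would flag fields such as location before they are added to a download request.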
Again, fine. We then format the request as an occurrence download, requesting a number of fields including location:
https://biocache-ws.ala.org.au/ws/occurrences/offline/download?fq=%28year%3A%222002%22%29AND%28lsid%3A%22https%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F6707550%22%29&qualityProfile=ALA&fields=recordID%2CscientificName%2CvernacularName%2Ckingdom%2CeventDate%2CsamplingProtocol%2CindividualCount%2CrecordedBy%2Clocation&qa=none&facet=false&emailNotify=false&sourceTypeId=2004&reasonTypeId=4&email=martinjwestgate%40gmail.com&dwcHeaders=true
This runs, stating we expect to receive 87 records:
{"status":"inQueue","totalRecords":87,"queueSize":1,"statusUrl":"https://biocache-ws.ala.org.au/ws/occurrences/offline/status/bb0481b1-8b95-33af-9a6d-16c7aa24f0f1-1729125651839","cancelUrl":"https://biocache-ws.ala.org.au/ws/occurrences/offline/cancel/bb0481b1-8b95-33af-9a6d-16c7aa24f0f1-1729125651839","searchUrl":"https://biocache.ala.org.au/occurrences/search?&q=*%3A*&fq=%28year%3A%222002%22%29AND%28lsid%3A%22https%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F6707550%22%29&disableAllQualityFilters=true&fq=-basisOfRecord%3A%22FOSSIL_SPECIMEN%22+AND+-%28basisOfRecord%3A%22MATERIAL_SAMPLE%22+AND+contentTypes%3A%22Environmental+DNA%22%29&fq=-%28duplicate_status%3A%22ASSOCIATED%22+AND+duplicateType%3A%22DIFFERENT_DATASET%22%29&fq=-assertions%3ATAXON_MATCH_NONE+AND+-assertions%3AINVALID_SCIENTIFIC_NAME+AND+-assertions%3ATAXON_HOMONYM+AND+-assertions%3AUNKNOWN_KINGDOM+AND+-assertions%3ATAXON_SCOPE_MISMATCH&fq=-occurrenceStatus%3AABSENT&fq=-identificationVerificationStatus%3A%22needs_id%22&fq=-userAssertions%3A50001+AND+-userAssertions%3A50005&fq=-year%3A%5B*+TO+1700%5D&fq=-establishmentMeans%3A%22MANAGED%22+AND+-decimalLatitude%3A0+AND+-decimalLongitude%3A0+AND+-assertions%3A%22PRESUMED_SWAPPED_COORDINATE%22+AND+-assertions%3A%22COORDINATES_CENTRE_OF_STATEPROVINCE%22+AND+-assertions%3A%22COORDINATES_CENTRE_OF_COUNTRY%22+AND+-assertions%3A%22PRESUMED_NEGATED_LATITUDE%22+AND+-assertions%3A%22PRESUMED_NEGATED_LONGITUDE%22&fq=-outlierLayerCount%3A%5B3+TO+*%5D&fq=-spatiallyValid%3A%22false%22&fq=-coordinateUncertaintyInMeters%3A%5B10001+TO+*%5D"}
Finally, the resulting Zip file (https://biocache.ala.org.au/biocache-download/bb0481b1-8b95-33af-9a6d-16c7aa24f0f1/1729125651839/data.zip) has no data in it. What we would expect instead is for all the requested fields to be downloaded, but with only NAs in the location column.
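To make the expected behaviour concrete, here is a minimal sketch of a writer that keeps every requested column and fills missing values with NA. It is not how biocache builds its downloads. The record values are illustrative placeholders, and the field list is shortened from the request above.

```python
import csv
import io

# Subset of the requested fields, for brevity.
FIELDS = ["recordID", "scientificName", "eventDate", "location"]


def write_rows(records, fields=FIELDS, missing="NA"):
    """Write records as CSV, keeping every requested column.

    Keys that are absent, None, or empty are written as `missing`
    instead of causing the column (or the whole file) to be dropped.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, restval=missing,
                            lineterminator="\n")
    writer.writeheader()
    for rec in records:
        # Drop empty values so DictWriter substitutes restval for them.
        row = {k: v for k, v in rec.items() if k in fields and v not in (None, "")}
        writer.writerow(row)
    return buf.getvalue()


# Placeholder records: none of them has a value for "location".
records = [
    {"recordID": "example-1", "scientificName": "Acacia aneura",
     "eventDate": "2002-05-01"},
    {"recordID": "example-2", "scientificName": "Acacia aneura var. major",
     "eventDate": "2002-08-13", "location": None},
]

output = write_rows(records)
print(output)
```

Here the location column is still present in the header and every row, with NA in each cell.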