-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Odd behaviour when chunks are used along quantiles and time grouper dimensions in SDBA EQM. #1481
Comments
Hi @isaachcamp ! Thank you for raising this issue, this slipped through the testing... Sadly, the way
I'll make PR that adds an error with a meaningful message. |
That's great, thanks @aulemahal! |
<!--Please ensure the PR fulfills the following requirements! --> <!-- If this is your first PR, make sure to add your details to the AUTHORS.rst! --> ### Pull Request Checklist: - [x] This PR addresses an already opened issue (for bug fixes / features) - This PR fixes #1481 - [x] Tests for the changes have been added (for bug fixes / features) - [ ] (If applicable) Documentation has been added / updated (for bug fixes / features) - [x] CHANGES.rst has been updated (with summary of main changes) - [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added ### What kind of change does this PR introduce? If a `map_blocks`-wrapped function receives input chunked along the dimensions marked as "reduced", it raises an error. This means one cannot chunk the training dataset (`QM.ds`, for example) along the distribution ('quantiles') or group ('month', 'dayofyear') dimensions. Previously, this didn't raise an error, but the adjustment would be done separately for each chunk, yielding incorrect results, silently. ### Does this PR introduce a breaking change? No. ### Other information:
Generic Issue
Description
I'm attempting to bias correct precipitation data using the EmpiricalQuantileMapping class from the sdba module. The code runs with no problems, however, I noticed there was some strange results. When plotting the long-term monthly accumulated mean precipitation of adjusted data over a few grid cells, the corrected series looks considerably different to the reference data still -- in some cases worse than the raw data.
The transfer function was applied directly back on to the same years that were used in the training stage, so I would expect the adjusted data to look almost identical to the reference data (with some a smoothing from the monthly grouper and 3-month window).
I believe the cause stems from loading the QM.ds from a ZARR file and leaving the chunks along the quantiles and month dimensions unchanged. If I remove all chunks, or leave chunks along only latitude and longitude dimensions, I get the expected behaviour.
I've attached an image of the long-term monthly accumulated mean precipitation for a few grid cells. The yellow and purple lines (chunks along lat-lon and no chunks respectively) perfectly match, and they closely follow the reference values (VCSN, red line). The blue line shows the adjusted values with chunks along quantiles and month dimensions, and the green line shows the raw values.
I can recreate this issue by taking two time series as ref and hist, with the following lines:
I get this plot:
Code of Conduct
The text was updated successfully, but these errors were encountered: