Adapt freq in training #1407

Merged · 36 commits · Oct 18, 2023
Changes from 27 commits
52361ae
Adapt_freq called in training (fix)
coxipi Apr 27, 2023
e096c8f
LOCI has adapt_freq_thresh argument
coxipi May 8, 2023
58d5b66
adapt_freq_thresh option in Scaling
coxipi May 29, 2023
d1d96fe
Merge branch 'master' of https://github.com/Ouranosinc/xclim into ada…
coxipi May 29, 2023
0471767
Merge branch 'master' of https://github.com/Ouranosinc/xclim into ada…
coxipi Jun 29, 2023
1be0d1d
CHANGES update
coxipi Jun 29, 2023
b304326
Better description _adapt_freq_s
coxipi Jun 29, 2023
3162c32
more comments
coxipi Jun 29, 2023
55ec622
pull number
coxipi Jun 29, 2023
f6ce69b
Remove adapt_freq_thresh, useless in LOCI and Scaling
coxipi Jun 29, 2023
c95e3c9
remove unused import
coxipi Jun 29, 2023
79bdc32
incorporate changes from _adapt_freq_s to _adapt_freq
coxipi Jun 29, 2023
e9a64ae
Revert useless formatting changes
coxipi Jun 29, 2023
36ef886
Only used stacked ranking if len(dim)>1
coxipi Jun 29, 2023
79f7ca7
Merge branch 'master' of https://github.com/Ouranosinc/xclim into ada…
coxipi Jun 30, 2023
2d118cd
Example explaining use of adapt_freq_thresh
coxipi Aug 2, 2023
341e53d
move rolling_window example the advanced notebook
coxipi Aug 2, 2023
40f07ca
Merge branch 'master' into adapt_freq_in_training
Zeitsperre Aug 2, 2023
09b3fb8
Merge branch 'master' into adapt_freq_in_training
coxipi Aug 17, 2023
e594905
Merge branch 'master' into adapt_freq_in_training
Zeitsperre Aug 25, 2023
ad43e39
Merge branch 'master' into adapt_freq_in_training
Zeitsperre Aug 28, 2023
22f4eb8
Merge branch 'master' into adapt_freq_in_training
Zeitsperre Sep 5, 2023
d3af806
Merge branch 'master' of github.com:Ouranosinc/xclim into adapt_freq_…
coxipi Oct 16, 2023
75eee24
Docstrings for `adapt_freq_thresh`
coxipi Oct 16, 2023
ff97686
rank now supports multi-dimensional ranking
coxipi Oct 16, 2023
94cffa2
add the modifications of rank (duh)
coxipi Oct 16, 2023
3e2e8a3
Merge branch 'adapt_freq_in_training' of github.com:Ouranosinc/xclim …
coxipi Oct 16, 2023
5aa31da
Update CHANGES.rst
coxipi Oct 16, 2023
67f7c95
Merge branch 'master' into adapt_freq_in_training
Zeitsperre Oct 16, 2023
1c84d4a
rank variable assignment forgotten
coxipi Oct 17, 2023
7923083
Merge branch 'master' of github.com:Ouranosinc/xclim into adapt_freq_…
coxipi Oct 17, 2023
944b628
revert changes in test
coxipi Oct 17, 2023
650d118
clean notebook, strip 'metadata.kernelspec.display_name' in pre-commit
coxipi Oct 17, 2023
f932a9c
really revert change in test
coxipi Oct 17, 2023
13e2702
try removing all kernelspec
coxipi Oct 17, 2023
2b7debd
remove kernelspec in nb metadata
coxipi Oct 18, 2023
4 changes: 3 additions & 1 deletion CHANGES.rst
@@ -4,7 +4,7 @@ Changelog

v0.46.0 (unreleased)
--------------------
Contributors to this version: Éric Dupuis (:user:`coxipi`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`) and Pascal Bourgault (:user:`aulemahal`).
Contributors to this version: David Huard (:user:`huard`), Trevor James Smith (:user:`Zeitsperre`), Pascal Bourgault (:user:`aulemahal`), Éric Dupuis (:user:`coxipi`).

New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -13,6 +13,7 @@ New features and enhancements
* The testing suite now offers a means of running tests in "offline" mode (using `pytest-socket <https://github.com/miketheman/pytest-socket>`_ to block external connections). This requires a local copy of `xclim-testdata` to be present in the user's home cache directory and for certain `pytest` options and markers to be set when invoked. For more information, see the contributing documentation section for `Running Tests in Offline Mode`. (:issue:`1468`, :pull:`1473`).
* The `SKIP_NOTEBOOKS` flag to speed up docs builds is now documented. See the contributing documentation section `Get Started!` for details. (:issue:`1470`, :pull:`1476`).
* Refactored the indicators page with the addition of a search bar.
* `adapt_freq_thresh` argument added to `sdba` training functions, allowing frequency adaptation to be performed appropriately in each map block. (:pull:`1407`).

Bug fixes
^^^^^^^^^
@@ -46,6 +47,7 @@ Internal changes

v0.45.0 (2023-09-05)
--------------------
Contributors to this version: David Huard (:user:`huard`), Trevor James Smith (:user:`Zeitsperre`), Pascal Bourgault (:user:`aulemahal`), Juliette Lavoie (:user:`juliettelavoie`), Gabriel Rondeau-Genesse (:user:`RondeauG`), Marco Braun (:user:`vindelico`), Éric Dupuis (:user:`coxipi`).

Announcements
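The `adapt_freq_thresh` entry in the changelog refers to frequency adaptation (Themeßl et al.): a simulation with too many values below a "dry" threshold, compared to the reference, has its excess dry values replaced by small wet values. A minimal pure-Python sketch of the idea — `ecdf` and `adapt_freq_sketch` are hypothetical helpers for illustration, not xclim's implementation:

```python
def ecdf(values, x):
    """Empirical probability that a value is <= x."""
    return sum(v <= x for v in values) / len(values)


def adapt_freq_sketch(ref, sim, thresh):
    """Correct sim so its frequency of values under `thresh` matches ref.

    Returns the adapted series and dP0, the relative excess of "dry"
    values in sim compared to ref.
    """
    p0_ref = ecdf(ref, thresh)  # fraction of ref under the threshold
    p0_sim = ecdf(sim, thresh)  # fraction of sim under the threshold
    if p0_sim <= p0_ref:
        return list(sim), 0.0  # sim is not drier than ref: nothing to do
    dP0 = (p0_sim - p0_ref) / p0_sim
    # pth: reference value at the probability where sim stops being "too dry"
    sref = sorted(ref)
    pth = sref[min(int(p0_sim * len(sref)), len(sref) - 1)]
    # Replace the driest excess fraction of sim by values in (thresh, pth)
    order = sorted(range(len(sim)), key=lambda i: sim[i])
    n_fix = round(dP0 * p0_sim * len(sim))  # = (p0_sim - p0_ref) * len(sim)
    adapted = list(sim)
    for k, i in enumerate(order[:n_fix]):
        # spread the replacement values uniformly between thresh and pth
        adapted[i] = thresh + (pth - thresh) * (k + 1) / (n_fix + 1)
    return adapted, dP0
```

With a reference containing 2 dry days out of 10 and a simulation containing 4, the sketch replaces the two excess dry values, bringing the dry-day frequency of the simulation down to that of the reference.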
121 changes: 119 additions & 2 deletions docs/notebooks/sdba-advanced.ipynb
@@ -692,6 +692,118 @@
"plt.legend()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Frequency adaptation with a rolling window"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the previous example, we performed bias adjustment with a rolling window. Here we show how to include frequency adaptation (see `sdba.ipynb` for the simple case `group=\"time\"`). We first generate the same precipitation dataset used in `sdba.ipynb`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import xarray as xr\n",
"\n",
"t = xr.cftime_range(\"2000-01-01\", \"2030-12-31\", freq=\"D\", calendar=\"noleap\")\n",
"\n",
"vals = np.random.randint(0, 1000, size=(t.size,)) / 100\n",
"vals_ref = (4 ** np.where(vals < 9, vals / 100, vals)) / 3e6\n",
"vals_sim = (\n",
" (1 + 0.1 * np.random.random_sample((t.size,)))\n",
" * (4 ** np.where(vals < 9.5, vals / 100, vals))\n",
" / 3e6\n",
")\n",
"\n",
"pr_ref = xr.DataArray(\n",
" vals_ref, coords={\"time\": t}, dims=(\"time\",), attrs={\"units\": \"mm/day\"}\n",
")\n",
"pr_ref = pr_ref.sel(time=slice(\"2000\", \"2015\"))\n",
"pr_sim = xr.DataArray(\n",
" vals_sim, coords={\"time\": t}, dims=(\"time\",), attrs={\"units\": \"mm/day\"}\n",
")\n",
"pr_hist = pr_sim.sel(time=slice(\"2000\", \"2015\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bias adjustment on a rolling window can be performed in the same way as shown in `sdba.ipynb`, but instead of a single string specifying the time grouping (e.g. `time.month`), the `group` argument is built with the `sdba.Grouper` constructor."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"# adapt_freq with a sdba.Grouper\n",
"from xclim import sdba\n",
"\n",
"group = sdba.Grouper(\"time.dayofyear\", window=31)\n",
"sim_ad, pth, dP0 = sdba.processing.adapt_freq(\n",
" pr_ref, pr_sim, thresh=\"0.05 mm d-1\", group=group\n",
")\n",
"QM_ad = sdba.EmpiricalQuantileMapping.train(\n",
" pr_ref, sim_ad, nquantiles=15, kind=\"*\", group=group\n",
")\n",
"scen_ad = QM_ad.adjust(pr_sim)\n",
"\n",
"pr_ref.sel(time=\"2010\").plot(alpha=0.9, label=\"Reference\")\n",
"pr_sim.sel(time=\"2010\").plot(alpha=0.7, label=\"Model - biased\")\n",
"scen_ad.sel(time=\"2010\").plot(alpha=0.6, label=\"Model - adjusted\")\n",
"plt.legend()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the figure above, `scen` occasionally has small peaks where `sim` is 0, indicating that there are more \"dry days\" (days with almost no precipitation) in `hist` than in `ref`. The frequency adaptation of [Themeßl et al. (2010)](https://doi.org/10.1007/s10584-011-0224-4) performed in the step above has only partially corrected this. \n",
"\n",
"The reason is the following. The first step above groups the precipitation series into 365 overlapping blocks of 31 days * Y years, one block per day of the year. Each block is adapted, and the 16th day-of-year slice (at the center of the block) is assigned to the corresponding day of year in the adapted dataset `sim_ad`. The training step then re-forms those 31 days * Y years blocks, but this does not undo the previous step: there can still be more zeroes in the simulation than in the reference. \n",
"\n",
"To alleviate this issue, frequency adaptation can instead be performed directly on the blocks, and the same blocks can then be reused in the training step, as we show below.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# adapt_freq directly in the training step\n",
"group = sdba.Grouper(\"time.dayofyear\", window=31)\n",
"\n",
"QM_ad = sdba.EmpiricalQuantileMapping.train(\n",
" pr_ref,\n",
" pr_sim,\n",
" nquantiles=15,\n",
" kind=\"*\",\n",
" group=group,\n",
" adapt_freq_thresh=\"0.05 mm d-1\",\n",
")\n",
"scen_ad = QM_ad.adjust(pr_sim)\n",
"\n",
"pr_ref.sel(time=\"2010\").plot(alpha=0.9, label=\"Reference\")\n",
"pr_sim.sel(time=\"2010\").plot(alpha=0.7, label=\"Model - biased\")\n",
"scen_ad.sel(time=\"2010\").plot(alpha=0.6, label=\"Model - adjusted\")\n",
"plt.legend()"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -835,7 +947,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -849,7 +961,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.11.4"
},
"toc": {
"base_numbering": 1,
@@ -863,6 +975,11 @@
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"vscode": {
"interpreter": {
"hash": "7bfb2ff12c8d2ac85e75a42433bff767a429e1b9def8e10da354335359f03dc7"
}
}
},
"nbformat": 4,
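The notebook changes above rely on grouping a multi-year daily series into one overlapping block per day of year, each block holding the `window` days centred on that day across all years. A toy sketch of that blocking — `dayofyear_blocks` is a hypothetical helper for illustration (no-leap calendar, odd window, wrap-around at year edges), not the actual `sdba.Grouper` machinery:

```python
def dayofyear_blocks(values, days_per_year, window):
    """Group a multi-year daily series into one overlapping block per
    day of year: block d holds the `window` days centred on d, for
    every year. Assumes a no-leap calendar and an odd window."""
    n_years = len(values) // days_per_year
    half = window // 2
    blocks = []
    for d in range(days_per_year):
        block = []
        for y in range(n_years):
            for off in range(-half, half + 1):
                # wrap around the year edges, as a circular day-of-year axis
                day = (d + off) % days_per_year
                block.append(values[y * days_per_year + day])
        blocks.append(block)
    return blocks
```

For `group = sdba.Grouper("time.dayofyear", window=31)`, this corresponds to 365 blocks of 31 days * Y years; adapting each block and then training on the same blocks is what keeps the two steps consistent.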
9 changes: 7 additions & 2 deletions docs/notebooks/sdba.ipynb
@@ -677,7 +677,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -691,7 +691,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.11.4"
},
"toc": {
"base_numbering": 1,
@@ -705,6 +705,11 @@
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"vscode": {
"interpreter": {
"hash": "7bfb2ff12c8d2ac85e75a42433bff767a429e1b9def8e10da354335359f03dc7"
}
}
},
"nbformat": 4,
33 changes: 28 additions & 5 deletions xclim/sdba/_adjustment.py
@@ -9,38 +9,56 @@
import numpy as np
import xarray as xr

from xclim.core.units import convert_units_to, infer_context, units
from xclim.indices.stats import _fitfunc_1d # noqa

from . import nbutils as nbu
from . import utils as u
from ._processing import _adapt_freq
from .base import Grouper, map_blocks, map_groups
from .detrending import PolyDetrend
from .processing import escore


def _adapt_freq_hist(ds, adapt_freq_thresh):
"""Adapt frequency of null values of `hist` in order to match `ref`."""
with units.context(infer_context(ds.ref.attrs.get("standard_name"))):
thresh = convert_units_to(adapt_freq_thresh, ds.ref)
dim = ["time"] + ["window"] * ("window" in ds.hist.dims)
return _adapt_freq.func(
xr.Dataset(dict(sim=ds.hist, ref=ds.ref)), thresh=thresh, dim=dim
).sim_ad


@map_groups(
af=[Grouper.PROP, "quantiles"],
hist_q=[Grouper.PROP, "quantiles"],
scaling=[Grouper.PROP],
)
def dqm_train(ds, *, dim, kind, quantiles) -> xr.Dataset:
def dqm_train(ds, *, dim, kind, quantiles, adapt_freq_thresh) -> xr.Dataset:
"""Train step on one group.

Notes
-----
Dataset must contain the following variables:
ref : training target
hist : training data

adapt_freq_thresh : str | None
Threshold for frequency adaptation. See :py:class:`xclim.sdba.processing.adapt_freq` for details.
Default is None, meaning that frequency adaptation is not performed.
"""
hist = _adapt_freq_hist(ds, adapt_freq_thresh) if adapt_freq_thresh else ds.hist

refn = u.apply_correction(ds.ref, u.invert(ds.ref.mean(dim), kind), kind)
histn = u.apply_correction(ds.hist, u.invert(ds.hist.mean(dim), kind), kind)
histn = u.apply_correction(hist, u.invert(hist.mean(dim), kind), kind)

ref_q = nbu.quantile(refn, quantiles, dim)
hist_q = nbu.quantile(histn, quantiles, dim)

af = u.get_correction(hist_q, ref_q, kind)
mu_ref = ds.ref.mean(dim)
mu_hist = ds.hist.mean(dim)
mu_hist = hist.mean(dim)
scaling = u.get_correction(mu_hist, mu_ref, kind=kind)

return xr.Dataset(data_vars=dict(af=af, hist_q=hist_q, scaling=scaling))
@@ -50,15 +68,20 @@ def dqm_train(ds, *, dim, kind, quantiles) -> xr.Dataset:
af=[Grouper.PROP, "quantiles"],
hist_q=[Grouper.PROP, "quantiles"],
)
def eqm_train(ds, *, dim, kind, quantiles) -> xr.Dataset:
def eqm_train(ds, *, dim, kind, quantiles, adapt_freq_thresh) -> xr.Dataset:
"""EQM: Train step on one group.

Dataset variables:
ref : training target
hist : training data

adapt_freq_thresh : str | None
Threshold for frequency adaptation. See :py:class:`xclim.sdba.processing.adapt_freq` for details.
Default is None, meaning that frequency adaptation is not performed.
"""
hist = _adapt_freq_hist(ds, adapt_freq_thresh) if adapt_freq_thresh else ds.hist
ref_q = nbu.quantile(ds.ref, quantiles, dim)
hist_q = nbu.quantile(ds.hist, quantiles, dim)
hist_q = nbu.quantile(hist, quantiles, dim)

af = u.get_correction(hist_q, ref_q, kind)

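The `eqm_train` step above reduces to: compute quantiles of `ref` and `hist` within the group, then take the correction between them. A minimal pure-Python sketch of that train step — `quantile` and `eqm_train_sketch` are hypothetical stand-ins for `nbu.quantile` and the real grouped function, shown for a single 1-D group:

```python
def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def eqm_train_sketch(ref, hist, quantiles, kind="+"):
    """Return (af, hist_q): adjustment factors and hist quantiles."""
    ref_q = [quantile(ref, q) for q in quantiles]
    hist_q = [quantile(hist, q) for q in quantiles]
    if kind == "+":
        # additive correction, e.g. for temperature
        af = [r - h for r, h in zip(ref_q, hist_q)]
    else:
        # multiplicative correction, as used for precipitation
        af = [r / h for r, h in zip(ref_q, hist_q)]
    return af, hist_q
```

With `adapt_freq_thresh` set, the real function first passes `hist` through frequency adaptation (the `_adapt_freq_hist` helper above) before computing `hist_q`.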
14 changes: 4 additions & 10 deletions xclim/sdba/_processing.py
@@ -11,10 +11,11 @@

import numpy as np
import xarray as xr
from xarray.core.utils import get_temp_dimname

from . import nbutils as nbu
from .base import Grouper, map_groups
from .utils import ADDITIVE, apply_correction, ecdf, invert
from .utils import ADDITIVE, apply_correction, ecdf, invert, rank


@map_groups(
@@ -74,16 +75,9 @@ def _adapt_freq(
pth = nbu.vecquantiles(ds.ref, P0_sim, dim).where(dP0 > 0)

# Probabilities and quantiles computed within all dims, but correction along the first one only.
if "window" in dim:
# P0_sim was computed using the window, but only the original time series is corrected.
# Grouper.apply does this step, but if done here it makes the code faster.
sim = ds.sim.isel(window=(ds.sim.window.size - 1) // 2)
else:
sim = ds.sim
dim = dim[0]

sim = ds.sim
# Get the percentile rank of each value in sim.
rank = sim.rank(dim, pct=True)
rank = rank(sim, dim=dim, pct=True)

# Frequency-adapted sim
sim_ad = sim.where(
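The diff above swaps `DataArray.rank` for a `rank` utility that supports multi-dimensional ranking with `pct=True`, i.e. percentile ranks. A pure-Python sketch of a percentile rank with average tie-handling — `pct_rank` is a hypothetical 1-D stand-in for illustration, not the xclim utility:

```python
def pct_rank(values):
    """Percentile rank of each value, ties averaged (the convention of
    `rank(pct=True)`): average 1-based rank divided by the count."""
    n = len(values)
    # collect the 1-based sorted positions of every distinct value
    positions = {}
    for pos, v in enumerate(sorted(values)):
        positions.setdefault(v, []).append(pos + 1)
    return [sum(positions[v]) / len(positions[v]) / n for v in values]
```

In `_adapt_freq`, these percentile ranks are compared against the dry-day probabilities `P0_ref` and `P0_sim` to decide which values of `sim` get replaced.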
10 changes: 10 additions & 0 deletions xclim/sdba/adjustment.py
@@ -328,6 +328,9 @@ class EmpiricalQuantileMapping(TrainAdjust):
group : Union[str, Grouper]
The grouping information. See :py:class:`xclim.sdba.base.Grouper` for details.
Default is "time", meaning an single adjustment group along dimension "time".
adapt_freq_thresh : str | None
Threshold for frequency adaptation. See :py:class:`xclim.sdba.processing.adapt_freq` for details.
Default is None, meaning that frequency adaptation is not performed.

Adjust step:

@@ -352,6 +355,7 @@ def _train(
nquantiles: int | np.ndarray = 20,
kind: str = ADDITIVE,
group: str | Grouper = "time",
adapt_freq_thresh: str | None = None,
):
if np.isscalar(nquantiles):
quantiles = equally_spaced_nodes(nquantiles).astype(ref.dtype)
@@ -363,6 +367,7 @@
group=group,
kind=kind,
quantiles=quantiles,
adapt_freq_thresh=adapt_freq_thresh,
)

ds.af.attrs.update(
@@ -417,6 +422,9 @@ class DetrendedQuantileMapping(TrainAdjust):
group : Union[str, Grouper]
The grouping information. See :py:class:`xclim.sdba.base.Grouper` for details.
Default is "time", meaning a single adjustment group along dimension "time".
adapt_freq_thresh : str | None
Threshold for frequency adaptation. See :py:class:`xclim.sdba.processing.adapt_freq` for details.
Default is None, meaning that frequency adaptation is not performed.

Adjust step:

@@ -445,6 +453,7 @@ def _train(
nquantiles: int | np.ndarray = 20,
kind: str = ADDITIVE,
group: str | Grouper = "time",
adapt_freq_thresh: str | None = None,
):
if group.prop not in ["group", "dayofyear"]:
warn(
@@ -461,6 +470,7 @@
group=group,
quantiles=quantiles,
kind=kind,
adapt_freq_thresh=adapt_freq_thresh,
)

ds.af.attrs.update(
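The `TrainAdjust` classes above pair training with an adjust step that looks up each simulated value among the trained `hist_q` quantiles and applies the interpolated adjustment factor. A minimal additive sketch — `eqm_adjust_sketch` is a hypothetical 1-D illustration; the real adjust step also handles grouping, kinds, and extrapolation options:

```python
def eqm_adjust_sketch(sim, hist_q, af):
    """Adjust sim: bracket each value between trained hist quantiles
    and add the linearly interpolated correction. Additive kind."""
    out = []
    for x in sim:
        # clamp values outside the trained range to the edge correction
        if x <= hist_q[0]:
            out.append(x + af[0])
            continue
        if x >= hist_q[-1]:
            out.append(x + af[-1])
            continue
        # locate the bracketing quantiles and interpolate the factor
        for i in range(len(hist_q) - 1):
            if hist_q[i] <= x <= hist_q[i + 1]:
                w = (x - hist_q[i]) / (hist_q[i + 1] - hist_q[i])
                out.append(x + af[i] + w * (af[i + 1] - af[i]))
                break
    return out
```

Because `adapt_freq_thresh` only changes what the factors are trained on, the adjust step itself is unchanged by this PR.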