Heat wave magnitude indicator (#1926)

Closes #994 ### Pull Request Checklist: - [x] This PR addresses an already opened issue (for bug fixes / features) - This PR fixes #994 - [x] Tests for the changes have been added (for bug fixes / features) - [ ] (If applicable) Documentation has been added / updated (for bug fixes / features) - [x] CHANGELOG.rst has been updated (with summary of main changes) - [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added ### What kind of change does this PR introduce? * Adds a heat_wave_magnitude indicator based on a variable definition on heat waves (consecutive days of maximum temperature above a given threshold), which yields the highest magnitude in the form of accumulated maximum temperature differences above a threshold along a consecutive heat wave * Adds a windowed_max_run_sum function to indices.run_length which yields the maximum accumulated values of runs per `freq` period * Adds a rse (termed run sum encoding) function which sums positive values for runs ### Does this PR introduce a breaking change? No ### TODO: - [x] one of the test doesn't convert units properly - [x] french translation (need help with that one)
Ouranosinc · Oct 7, 2024 · 5263632 · 5263632
2 parents 0644f3a + bfa2576
commit 5263632
Show file tree

Hide file tree

Showing 11 changed files with 235 additions and 5 deletions.
diff --git a/.zenodo.json b/.zenodo.json
@@ -148,6 +148,11 @@
       "name": "Wang, Hui-Min",
       "affiliation": "National University of Singapore, Singapore, Singapore",
       "orcid": "0000-0002-5878-7542"
+    },
+    {
+      "name": "Lehner, Sebastian",
+      "affiliation": "GeoSphere Austria, Vienna, Austria",
+      "orcid": "0000-0002-7562-8172"
     }
   ],
   "keywords": [

diff --git a/AUTHORS.rst b/AUTHORS.rst
@@ -46,3 +46,4 @@ Contributors
 * Javier Diez-Sierra <[email protected]> `@JavierDiezSierra <https://github.com/JavierDiezSierra>`_
 * Hui-Min Wang `@Hem-W <https://github.com/Hem-W>`_
 * Adrien Lamarche `@LamAdr <https://github.com/LamAdr>`_
+* Sebastian Lehner <[email protected]> `@seblehner <https://github.com/seblehner>`_
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -14,12 +14,14 @@ Announcements
 New indicators
 ^^^^^^^^^^^^^^
 * New ``heat_spell_frequency``, ``heat_spell_max_length`` and ``heat_spell_total_length`` : spell length statistics on a bivariate condition that uses the average over a window by default. (:pull:`1885`).
+* New ``hot_spell_max_magnitude`` : yields the magnitude of the most intensive heat wave. (:pull:`1926`).
 * New ``chill_portion`` and ``chill_unit``: chill portion based on the Dynamic Model and chill unit based on the Utah model indicators. (:issue:`1753`, :pull:`1909`).
 
 New features and enhancements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 * New generic ``xclim.indices.generic.spell_mask``  that returns a mask of which days are part of a spell. Supports multivariate conditions and weights. Used in new generic index ``xclim.indices.generic.bivariate_spell_length_statistics`` that extends ``spell_length_statistics`` to two variables.  (:pull:`1885`).
 * Indicator parameters can now be assigned a new name, different from the argument name in the compute function. (:pull:`1885`).
+* ``xclim.indices.run_length.windowed_max_run_sum`` accumulates positive values across runs and yields the the maximum valued run. (:pull:`1926`).
 * Helper function ``xclim.indices.helpers.make_hourly_temperature`` to estimate hourly temperatures from daily min and max temperatures. (:pull:`1909`).
 
 Bug fixes

diff --git a/docs/references.bib b/docs/references.bib
@@ -2232,7 +2232,7 @@ @article{fishman_chill_1987
 	abstract = {A two-step model describing the thermal dependence of the dormancy breaking phenomenon is developed. The model assumes that the level of dormancy completion is proportional to the amount of a certain dormancy breaking factor which accumulates in plants by a two-step process. The first step represents a reversible process of formation of a precursor for the dormancy breaking factor at low temperatures and its destruction at high temperatures. The rate constants of this process are assumed to be dependent upon the temperature according to the Arrhenius law. The second step is an irreversible cooperative transition from the unstable precursor to a stable dormancy breaking factor. The transition is assumed to occur when a critical level of the precursor is accumulated. The two-step scheme is analysed mathematically. This model explains qualitatively the main observations on dormancy completion made under controlled temperature conditions and relates the parameters of the theory to the measurable characteristics of the system.}
 }
 
-@article { richardson_chill_1974,
+@article {richardson_chill_1974,
 	author = "E. Arlo Richardson and Schuyler D. Seeley and David R. Walker",
 	title = "A Model for Estimating the Completion of Rest for ‘Redhaven’ and ‘Elberta’ Peach Trees1",
 	journal = "HortScience",
@@ -2245,3 +2245,33 @@ @article { richardson_chill_1974
 	pages=      "331 - 332",
 	url = "https://journals.ashs.org/hortsci/view/journals/hortsci/9/4/article-p331.xml"
 }
+
+@article{russo_magnitude_2014,
+  title = {Magnitude of Extreme Heat Waves in Present Climate and Their Projection in a Warming World},
+  author = {Russo, Simone and Dosio, Alessandro and Graversen, Rune G. and Sillmann, Jana and Carrao, Hugo and Dunbar, Martha B. and Singleton, Andrew and Montagna, Paolo and Barbola, Paulo and Vogt, J{\"u}rgen V.},
+  year = {2014},
+  journal = {Journal of Geophysical Research: Atmospheres},
+  volume = {119},
+  number = {22},
+  pages = {12,500--12,512},
+  issn = {2169-8996},
+  doi = {10.1002/2014JD022098},
+  urldate = {2024-10-03},
+  langid = {english},
+  keywords = {climate extremes,climate indices,global models,heat waves}
+}
+
+@article{zhang_high_2022,
+  title = {High {{Sensitivity}} of {{Compound Drought}} and {{Heatwave Events}} to {{Global Warming}} in the {{Future}}},
+  author = {Zhang, Qin and She, Dunxian and Zhang, Liping and Wang, Gangsheng and Chen, Jie and Hao, Zengchao},
+  year = {2022},
+  month = nov,
+  journal = {Earth's Future},
+  volume = {10},
+  number = {11},
+  pages = {e2022EF002833},
+  issn = {2328-4277, 2328-4277},
+  doi = {10.1029/2022EF002833},
+  urldate = {2024-10-03},
+  langid = {english}
+}
diff --git a/tests/test_indices.py b/tests/test_indices.py
@@ -1500,6 +1500,18 @@ def test_1d(self, tasmax_series, thresh, window, op, expected):
         np.testing.assert_allclose(hsml.values, expected)
 
 
+class TestHotSpellMaxMagnitude:
+    def test_simple(self, tasmax_series):
+        a = np.zeros(365)
+        a[15:20] += 30  # 5 days
+        a[40:42] += 50  # too short -> 0
+        a[86:96] += 30  # at the end and beginning
+        da = tasmax_series(a + K2C)
+
+        out = xci.hot_spell_max_magnitude(da, thresh="25 C", freq="ME")
+        np.testing.assert_array_equal(out, [25, 0, 30, 20, 0, 0, 0, 0, 0, 0, 0, 0])
+
+
 class TestTnDays:
     def test_above_simple(self, tasmin_series):
         a = np.zeros(365)

diff --git a/tests/test_run_length.py b/tests/test_run_length.py
@@ -370,6 +370,16 @@ def test_simple(self, index):
         ) + len(a[34:45])
 
 
+class TestWindowedMaxRunSum:
+    @pytest.mark.parametrize("index", ["first", "last"])
+    def test_simple(self, index):
+        a = xr.DataArray(np.zeros(50, float), dims=("time",))
+        a[4:6] = 5  # too short
+        a[25:30] = 5  # long enough, but not max
+        a[35:45] = 5  # max sum => yields 10*5
+        assert rl.windowed_max_run_sum(a, 3, dim="time", index=index) == 50
+
+
 class TestLastRun:
     @pytest.mark.parametrize(
         "coord,expected",

diff --git a/tests/test_temperature.py b/tests/test_temperature.py
@@ -1597,6 +1597,46 @@ def test_simple(self, tasmax_series):
         np.testing.assert_array_equal(out, 8)
 
 
+class TestHotSpellMaxMagnitude:
+    def test_simple(self, tasmax_series):
+        tx = np.zeros(366)
+        tx[:5] = np.array([30, 30, 30, 30, 30])
+        tx = tasmax_series(tx + K2C, start="1/1/2000")
+        hwm = atmos.hot_spell_max_magnitude(tx, freq="YS")
+        np.testing.assert_array_equal(hwm, [25])
+
+    def test_small_window_single_day(self, tasmax_series):
+        tx = np.zeros(366)
+        tx[5:8] = np.array([30, 0, 30])
+        tx = tasmax_series(tx + K2C, start="1/1/2000")
+        hwm = atmos.hot_spell_max_magnitude(tx, window=1, freq="YS")
+        np.testing.assert_array_equal(hwm, [5])
+
+    def test_small_window_double_day(self, tasmax_series):
+        tx = np.zeros(366)
+        tx[5:7] = np.array([30, 30])
+        tx = tasmax_series(tx + K2C, start="1/1/2000")
+        hwm = atmos.hot_spell_max_magnitude(tx, window=1, freq="YS")
+        np.testing.assert_array_equal(hwm, [10])
+
+    def test_convert_units(self, tasmax_series):
+        tx = np.zeros(366)
+        tx[:5] = np.array([30, 30, 30, 30, 30])
+        tx = tasmax_series(tx, start="1/1/2000")
+        tx.attrs["units"] = "C"
+        hwm = atmos.hot_spell_max_magnitude(tx, freq="YS")
+        np.testing.assert_array_equal(hwm, [25])
+
+    def test_nan_presence(self, tasmax_series):
+        tx = np.zeros(366)
+        tx[:5] = np.array([30, 30, 30, 30, 30])
+        tx[-1] = np.nan
+        tx = tasmax_series(tx + K2C, start="1/1/2000")
+
+        hwm = atmos.hot_spell_max_magnitude(tx, freq="YS")
+        np.testing.assert_array_equal(hwm, [np.nan])
+
+
 class TestColdSpellFrequency:
     def test_simple(self, tas_series):
         a = np.zeros(366)

diff --git a/xclim/data/fr.json b/xclim/data/fr.json
@@ -377,6 +377,12 @@
     "title": "Indice de vague de chaleur",
     "abstract": "Nombre de jours de vagues de chaleur. Une vague de chaleur se produit lorsque la température maximale quotidienne excède un seuil donné durant un certain nombre de jours."
   },
+  "HOT_SPELL_MAX_MAGNITUDE": {
+    "long_name": "Différence cumulative maximale entre la température maximale quotidienne et {thresh} pour les jours faisant partie d'une vague de chaleur. Une vague de chaleur est définie comme une série d'au moins {window} jours consécutifs où la température maximale quotidien est au-dessus de {thresh}",
+    "description": "Cumul {freq:m} maximal de la différence entre la température et {thresh} pour les jours faisant partie d'une vague de chaleur. Une vague de chaleur est définie comme une série d'au moins {window} jours consécutifs où la température maximale quotidien est au-dessus de {thresh}",
+    "title": "Magnitude de la vague de chaleur {freq:m} la plus intense. Une vague de chaleur est définie comme une série d'au moins {window} jours consécutifs où la température maximale quotidien est au-dessus de {thresh}.",
+    "abstract": "Magnitude de la vague de chaleur {freq:m} la plus intense. Une vague de chaleur est définie comme une série d'au moins {window} jours consécutifs où la température maximale quotidien est au-dessus de {thresh}."
+  },
   "TG": {
     "long_name": "Moyenne des températures maximale et minimale quotidennes",
     "description": "Température moyenne quotidienne moyenne estimée par la moyenne des températures maximale et minimale quotidiennnes.",

diff --git a/xclim/indicators/atmos/_temperature.py b/xclim/indicators/atmos/_temperature.py
@@ -62,6 +62,7 @@
     "heating_degree_days",
     "hot_spell_frequency",
     "hot_spell_max_length",
+    "hot_spell_max_magnitude",
     "hot_spell_total_length",
     "huglin_index",
     "ice_days",
@@ -235,6 +236,21 @@ class TempHourlyWithIndexing(ResamplingIndicatorWithIndexing):
     compute=indices.heat_wave_frequency,
 )
 
+hot_spell_max_magnitude = Temp(
+    title="Hot spell maximum magnitude",
+    identifier="hot_spell_max_magnitude",
+    units="K d",
+    long_name="Maximum cumulative difference between daily maximum temperature and {thresh} for days within a heat wave. "
+    "A heat wave is defined as a series of at least {window} consecutive days with daily maximum temperature above {thresh}.",
+    description="Magnitude of the most intensive heat wave per {freq}. The magnitude is the cumulative exceedance of daily "
+    "maximum temperature over {thresh}. A heat wave is defined as a series of at least {window} consecutive days with daily "
+    "maximum temperature above {thresh}",
+    abstract="Magnitude of the most intensive heat wave per {freq}. A heat wave occurs when daily maximum "
+    "temperatures exceed given thresholds for a number of days.",
+    cell_methods="",
+    compute=indices.hot_spell_max_magnitude,
+)
+
 heat_wave_max_length = Temp(
     title="Heat wave maximum length",
     identifier="heat_wave_max_length",

diff --git a/xclim/indices/_threshold.py b/xclim/indices/_threshold.py
@@ -67,6 +67,7 @@
     "heating_degree_days",
     "hot_spell_frequency",
     "hot_spell_max_length",
+    "hot_spell_max_magnitude",
     "hot_spell_total_length",
     "last_snowfall",
     "last_spring_frost",
@@ -1928,6 +1929,59 @@ def heat_wave_index(
     return to_agg_units(out, tasmax, "count")
 
 
+@declare_units(tasmax="[temperature]", thresh="[temperature]")
+def hot_spell_max_magnitude(
+    tasmax: xarray.DataArray,
+    thresh: Quantified = "25.0 degC",
+    window: int = 3,
+    freq: str = "YS",
+    op: str = ">",
+    resample_before_rl: bool = True,
+) -> xarray.DataArray:
+    """Hot spell maximum magnitude.
+
+    Magnitude of the most intensive heat wave event as sum of differences between tasmax
+    and the given threshold for Heat Wave days, defined as three or more consecutive days
+    over the threshold.
+
+    Parameters
+    ----------
+    tasmax : xarray.DataArray
+        Maximum daily temperature.
+    thresh : xarray.DataArray
+        Threshold temperature on which to designate a heatwave.
+    window : int
+        Minimum number of days with temperature above threshold to qualify as a heatwave.
+    freq : str
+        Resampling frequency.
+    op : {">", ">=", "gt", "ge"}
+        Comparison operation. Default: ">".
+    resample_before_rl : bool
+        Determines if the resampling should take place before or after the run
+        length encoding (or a similar algorithm) is applied to runs.
+
+    References
+    ----------
+    :cite:cts:`russo_magnitude_2014,zhang_high_2022`
+
+    Returns
+    -------
+    DataArray, [time]
+        Hot spell maximum magnitude.
+    """
+    thresh = convert_units_to(thresh, tasmax)
+    over_values = (tasmax - thresh).clip(0)
+
+    out = rl.resample_and_rl(
+        over_values,
+        resample_before_rl,
+        rl.windowed_max_run_sum,
+        window=window,
+        freq=freq,
+    )
+    return to_agg_units(out, tasmax, op="integral")
+
+
 @declare_units(tas="[temperature]", thresh="[temperature]")
 def heating_degree_days(
     tas: xarray.DataArray,

diff --git a/xclim/indices/run_length.py b/xclim/indices/run_length.py
@@ -163,7 +163,15 @@ def rle(
     dim: str = "time",
     index: str = "first",
 ) -> xr.DataArray:
-    """Generate basic run length function.
+    """Run length
+
+    Despite its name, this is not an actual run-length encoder : it returns an array of the same shape
+    as the input with 0 where the input was <= 0, nan where the input was > 0, except on the first (or last) element
+    of each run of consecutive > 0 values, where it is set to the sum of the elements within the run.
+    For an actual run length encoder, see :py:func:`rle_1d`.
+
+    Usually, the input would be a boolean mask and the first element of each run would then be set to the run's length (thus the name).
+    But the function also accepts int and float inputs.
 
     Parameters
     ----------
@@ -178,9 +186,9 @@ def rle(
     Returns
     -------
     xr.DataArray
-        Values are 0 where da is False (out of runs).
     """
-    da = da.astype(int)
+    if da.dtype == bool:
+        da = da.astype(int)
 
     # "first" case: Algorithm is applied on inverted array and output is inverted back
     if index == "first":
@@ -192,7 +200,7 @@ def rle(
     # Keep total length of each series (and also keep 0's), e.g. 100120123 -> 100N20NN3
     # Keep numbers with a 0 to the right and also the last number
     cs_s = cs_s.where(da.shift({dim: -1}, fill_value=0) == 0)
-    out = cs_s.where(da == 1, 0)  # Reinsert 0's at their original place
+    out = cs_s.where(da > 0, 0)  # Reinsert 0's at their original place
 
     # Inverting back if needed e.g. 100N20NN3 -> 3NN02N001. This is the output of
     # `rle` for 111011001 with index == "first"
@@ -409,6 +417,50 @@ def windowed_run_count(
     return out
 
 
+def windowed_max_run_sum(
+    da: xr.DataArray,
+    window: int,
+    dim: str = "time",
+    freq: str | None = None,
+    index: str = "first",
+) -> xr.DataArray:
+    """Return the maximum sum of consecutive float values for runs at least as long as the given window length.
+
+    Parameters
+    ----------
+    da : xr.DataArray
+        Input N-dimensional DataArray (boolean).
+    window : int
+        Minimum run length.
+        When equal to 1, an optimized version of the algorithm is used.
+    dim : str
+      Dimension along which to calculate consecutive run (default: 'time').
+    freq : str
+      Resampling frequency.
+    index : {'first', 'last'}
+        If 'first', the run length is indexed with the first element in the run.
+        If 'last', with the last element in the run.
+
+    Returns
+    -------
+    xr.DataArray, [int]
+        Total number of `True` values part of a consecutive runs of at least `window` long.
+    """
+    if window == 1 and freq is None:
+        out = rle(da, dim=dim, index=index).max(dim=dim)
+
+    else:
+        d_rse = rle(da, dim=dim, index=index)
+        d_rle = rle((da > 0).astype(bool), dim=dim, index=index)
+
+        d = d_rse.where(d_rle >= window, 0)
+        if freq is not None:
+            d = d.resample({dim: freq})
+        out = d.max(dim=dim)
+
+    return out
+
+
 def _boundary_run(
     da: xr.DataArray,
     window: int,
@@ -1203,6 +1255,8 @@ def rle_1d(
 ) -> tuple[np.array, np.array, np.array]:
     """Return the length, starting position and value of consecutive identical values.
 
+    In opposition to py:func:`rle`, this is an actuel run length encoder.
+
     Parameters
     ----------
     arr : Sequence[Union[int, float, bool]]