
Improve SPI performance #1311

Merged: 89 commits into master from fix_spi_performance, Oct 23, 2023
Conversation

@coxipi (Contributor) commented on Feb 28, 2023

Pull Request Checklist:

What kind of change does this PR introduce?

  • Make SPI/SPEI faster.
  • Fit params are now modular and can be computed before computing SPI/SPEI. This gives more options to segment the computations and makes it possible to obtain the fitting params when troubleshooting is needed.
  • Time indexing is now possible.
  • dist_method now avoids vectorize=True in its xr.apply_ufunc. This is the main improvement in SPI/SPEI.
  • Better document the limits of usage of standardized indices. Standardized indices are now capped at the extreme values ±8.21; the upper bound is a limit resulting from the use of float64 (see the short check below).
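As a quick illustration of where that bound comes from (not from the PR, just a sanity check with numpy/scipy):

import numpy as np
from scipy.stats import norm

# the largest float64 probability strictly below 1
p_max = np.nextafter(1.0, 0.0)
print(norm.ppf(p_max))  # ~8.21: the largest finite standardized value that float64 probabilities allow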

Does this PR introduce a breaking change?

Yes.

  • pr_cal or wb_cal will not be input options in the future:

Inputting pr_cal will be deprecated in xclim==0.46.0. If pr_cal is a subset of pr, then instead of
standardized_precipitation_index(pr=pr, pr_cal=pr.sel(time=slice(t0, t1)), ...), one can call
standardized_precipitation_index(pr=pr, cal_range=(t0, t1), ...).
If for some reason pr_cal is not a subset of pr, the following approach will still be possible:
params = standardized_index_fit_params(da=pr_cal, freq=freq, window=window, dist=dist, method=method)
spi = standardized_precipitation_index(pr=pr, params=params)
This approach can be used in both scenarios to break the computation into two steps, i.e. first get params, then compute the standardized index (see the sketch below).
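For illustration, a minimal sketch of the two call patterns described above (keyword names follow this description; the exact signatures in the released version may differ):

import xclim
from xclim.indices.stats import standardized_index_fit_params

# 1) the calibration period is a subset of `pr`: pass the calibration range directly
spi = xclim.indices.standardized_precipitation_index(
    pr=pr, cal_range=("1950", "1980"), freq="MS", window=1, dist="gamma", method="APP"
)

# 2) the calibration data is separate, or the work is split in two steps
params = standardized_index_fit_params(
    da=pr_cal, freq="MS", window=1, dist="gamma", method="APP"
)
spi = xclim.indices.standardized_precipitation_index(pr=pr, params=params)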

I could revert this breaking change if we prefer. This was a first attempt to make the computation faster, but the improvements are now independent of this change. We could also keep the modular structure for params but revert to pr_cal instead of cal_range. That is a bit less efficient when pr_cal is simply a subset of pr, because the resampling/rolling ends up being done twice on the calibration range for nothing. When params are computed first and SPI is then obtained in a second step, it makes no difference.

Other information:

@github-actions bot added the indicators (Climate indices and indicators) label on Feb 28, 2023
@coxipi (Contributor, Author) commented on Mar 2, 2023

I think the computation is still quite heavy, but a few things should help. There is less redundant computation: we can either reuse params, or at least we no longer resample/roll twice (as it used to do for pr and pr_cal).

The user can now dissect the computation more easily:

  • Produce params beforehand and reuse them for many computations.
  • Use an indexer to obtain SPI only for certain periods (see the sketch after this list). This is not like selecting, say, the month of June from the onset: the indexing must be done after the rolling. Said another way, you still need the months January to June even if you only want SPI-6 for June, so the indexer must be applied mid-computation, which is what is done.
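For example, a hypothetical sketch of getting SPI-6 only for June (the exact way the indexer keywords are plumbed through may differ from the merged API):

# `month=[6]` is assumed to be applied internally *after* the 6-month rolling window,
# so data from January to June is still used even though only June values are returned
spi6_june = xclim.indices.standardized_precipitation_index(
    pr=pr, cal_range=("1950", "1980"),
    freq="MS", window=6, dist="gamma", method="APP",
    month=[6],
)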

Comment on lines 1236 to +1239
@declare_units(
pr="[precipitation]",
pr_cal="[precipitation]",
params="[]",
Collaborator:

But yes, params needs an entry in declare_units if it is a Quantified.


def wrap_fit(da):
    # if all values are null, obtain a template of the fit and broadcast
    if indexer != {} and da.isnull().all():
Collaborator:


I'm confused, how does this work with dask? Won't da.isnull().all() as a conditional trigger the computation?

And why indexer != {}? Why would this fastpath only be used with indexing?

Contributor (Author):


Good point about Dask. There is no way, then, to achieve a fastpath with lazy computing, is there?

For the other question: yes, indeed, spatial regions with only NaNs, independently of time selection, would also benefit from this speedup (ignoring the Dask issue). I wanted to reserve this check for cases where I was sure there was a potential for all-NaN slices, e.g. when there was time selection. I'm not sure how costly it is to check .isnull().all(), but it is probably very small in comparison to the whole algorithm.
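A toy illustration of the point about Dask (not from the PR): the `if` coerces the lazy reduction to a Python bool, which forces a compute.

import numpy as np
import xarray as xr

da = xr.DataArray(np.full(10, np.nan), dims="time").chunk({"time": 5})
check = da.isnull().all()  # still lazy: a 0-d dask-backed DataArray
if check:                  # bool() coercion triggers the computation right here
    print("all-NaN fastpath taken, but the graph was evaluated to decide this")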

return np.zeros_like(da) * np.NaN

spi = np.zeros_like(da)
# loop over groups (similar to the numpy-groupies idea, but numpy-groupies can't be used with custom functions)
Collaborator:

According to the docs (https://github.com/ml31415/numpy-groupies/), numpy-groupies accepts a callable as func; would it make sense to use that?

@coxipi (Contributor, Author) commented on Oct 18, 2023:

I had tried, but got a confusing error message leading me to believe this claim was maybe false. Maybe I just messed up. I will retry; it would be worth it to clean up the function.
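For reference, a minimal sketch of the callable-func path in numpy-groupies (illustrative toy data, not the PR's actual grouping):

import numpy as np
import numpy_groupies as npg

values = np.array([0.0, 1.2, 0.0, 3.4, 0.0, np.nan])
group_idx = np.array([0, 0, 0, 1, 1, 1])  # e.g. one group per calendar month

# numpy-groupies accepts an arbitrary callable as `func`, falling back to a generic
# (slower) loop over groups; here: probability of zero among non-NaN values per group
prob_zero = npg.aggregate(group_idx, values, func=lambda x: np.mean(x[~np.isnan(x)] == 0))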

params: xarray.DataArray
    Fit parameters. The `params` can be computed using ``xclim.indices.standardized_index_fit_params`` in advance.
    The output can be given here as input, and it overrides other options.
offset: Quantified
Collaborator:

Suggestion:

Suggested change
offset: Quantified
offset: Quantified | None = None,

# pseudo-code
if offset is None:
    if params is not None:
        offset = params.offset
    elif dist in bounded_distributions:
        offset = 1000 mm / d
    else:
        offset = 0
else:
    if params is not None and params.offset != offset:
        warning

That is: put a default of None and clearly explain in the docstring what the default behaviour is. I know I just suggested something different for the snow thing, but that was in a "temporary" perspective!

Collaborator:

Also, could there be some option where offset = -wb.min()?

@coxipi (Contributor, Author) commented on Oct 18, 2023:

I will check, but I think there is actually no restriction in scipy against using gamma/fisk distributions with zero-bounded data. A parameter can also be fitted for this offset. That was an idea for another PR, though. I think the problem could be: if there are more negative values outside of the calibration period, then the offset could be too weak.

The problem with offset = -wb.min() is that you become sensitive to what data is included. Imagine your reference data is 1980-2010 and you fit two cases:

  1. 1980-2020
  2. 1980-2050

You could have a different minimum in cases 1 and 2, hence a different offset. Ideally, since this is a trick, we would like our computation not to be too sensitive to it. In any case, it would be nice if the results of 1. were an exact subset of 2. when the same calibration data / methods are used for both computations.

@coxipi (Contributor, Author):

Ideally, I would eventually like to get rid of the 1 mm/day default. It was just copied from monocongo's climate_indices. It's a good rule of thumb, but users have found cases where a bigger offset was needed. Maybe we can keep it but warn users about having NaNs? Maybe just in the docstring. Not sure. I could check the R implementation to see if I can dig up ideas.

coxipi and others added 2 commits October 18, 2023 17:40
Co-authored-by: Pascal Bourgault <[email protected]>
@coxipi (Contributor, Author) commented on Oct 19, 2023

Showing problems with dist_method that explain the previous slow behaviour

I think it may be worthwhile to benchmark the previous implementation a bit to see where the bottleneck is. The main culprit, I believe, is dist_method. For instance, let us first obtain some parameters.

from xclim.testing import open_dataset
import xclim
import scipy

pr = open_dataset("sdba/CanESM2_1950-2100.nc").pr
freq, window, dist, method = "MS", 1, "gamma", "APP"
params = xclim.indices.stats.standardized_index_fit_params(
    pr.sel(time=slice("1950", "1980")), freq=freq, window=window, dist=dist, method=method
)
# broadcast params like pr
params = params.rename(month="time").reindex(time=pr.time.dt.month)
params["time"] = pr.time

Now, we either use a numpy/ scipy approach (fast):

# ~0.5 seconds
import xarray as xr
from xclim.indices.stats import get_dist


dist = get_dist(params.attrs["scipy_dist"])
def wrap_cdf(da, pars):
    return dist.cdf(da[:], *pars)

out1 = xr.apply_ufunc(
    wrap_cdf,
    pr.where(pr>0), 
    params,
    input_core_dims=[["time"], ["dparams","time"]],
    output_core_dims=[["time"]],
    vectorize=True,
)
out1.values

or we use dist_method (about 50 times slower):

# about 25 seconds
from xclim.indices.stats import dist_method
out2 = dist_method("cdf", params, pr.where(pr > 0))
out2.values

Both methods give the same results. It remains to be seen if the speed-up persists as we scale things up, but this is already a weird result.
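As a quick sanity check (not part of the PR), the two outputs can be compared directly:

import xarray as xr

# raises if the fast apply_ufunc path and dist_method disagree beyond float tolerance
xr.testing.assert_allclose(out1, out2)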

full spi computation

I can use the approach outlined above for the full SPI computation. It is much faster than the implementation before this PR. It looks like:

    # assumes the imports from the example above, plus `from scipy.stats import norm`;
    # `resample_to_time` broadcasts the grouped (per-month) values back onto the `time` axis
    probs_of_zero = da.groupby(group).map(lambda x: (x == 0).sum("time") / x.notnull().sum("time"))
    params, probs_of_zero = [resample_to_time(dax, da) for dax in [params, probs_of_zero]]

    def wrap_cdf_ppf(da, pars, probs_of_zero):
        dist_probs = get_dist(params.attrs["scipy_dist"]).cdf(da[:], *pars)
        probs = probs_of_zero + ((1 - probs_of_zero) * dist_probs)
        return norm.ppf(probs)

    std_index = xr.apply_ufunc(
        wrap_cdf_ppf,
        da, 
        params,
        probs_of_zero,
        input_core_dims=[["time"], ["dparams","time"],["time"]],
        output_core_dims=[["time"]],
        vectorize=True,
        dask="parallelized",
    )

It remains about 2x slower than the approach with group_idxs, I think mainly because of the probs_of_zero computation. I will try a different approach.

2nd try:

Using flox, I get almost as fast as the weird loop over group_idx that I implemented:

import flox.xarray as floxx
probs_of_zero = floxx.xarray_reduce((da == 0), idxs, func="sum") / floxx.xarray_reduce(da.notnull(), idxs, func="sum")

  • weird loop with group_idx: 45 s
  • using floxx: 53 s
  • no floxx: 80 s

I don't think flox supports custom functions. That would be ideal, because then I could write:

def func(da):
    return (da==0).sum(dim="time")/da.notnull().sum(dim="time")

and have only one flox call. There is this Aggregation thing I'm trying to get the hang of...

3rd try

A little dirty trick that seems performant, but a bit more difficult to understand:

floxx.xarray_reduce((pr + 1) * (pr == 0), idxs, func="mean")

It's equivalent to:

floxx.xarray_reduce((pr == 0), idxs, func="sum") / floxx.xarray_reduce(pr.notnull(), idxs, func="sum")
floxx.xarray_reduce((pr == 0).where(pr.notnull()), idxs, func="mean")

since (pr + 1) * (pr == 0) is 1 where pr == 0, 0 where pr > 0, and NaN where pr is NaN, and the mean ignores the NaNs, but it avoids the double flox call or the "where" call. It doesn't seem much faster than the 2nd try, though.

By the way ...

I should try to improve dist_method instead of having a specific method for SPI/SPEI computations

@coxipi (Contributor, Author) commented on Oct 20, 2023

Sabotaging my previous example to mimic dist_method performance

The performance issue seems to stem from the input_core_dims used in the xr.apply_ufunc call of dist_method. With my little example, I can reproduce the slowness of dist_method:

    # like `dist_method`, about 25 s
    def wrap_cdf(da, pars):
        return dist.cdf(da, *pars)

    out = xr.apply_ufunc(
        wrap_cdf,
        pr.where(pr>0), 
        params,
        input_core_dims=[[], ["dparams"]],
        output_core_dims=[[]],
        vectorize=True,
    )

while using "time" in the core dims goes much faster. This remains true if I try using dask etc., but maybe I'm not using it well. At this point we could chunk the time dimension, but earlier steps in stats usually require an unchunked time dimension. If using "time" as a core dimension is always appropriate in this scenario, this could be a way to improve dist_method easily.
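For example, a hedged sketch of what that could look like, reusing wrap_cdf from the fast example above; it assumes a non-time dimension to chunk over (here called location, as in the test file), with "time" kept unchunked as a core dimension:

out = xr.apply_ufunc(
    wrap_cdf,
    pr.where(pr > 0).chunk({"location": 1}),
    params.chunk({"location": 1}),
    input_core_dims=[["time"], ["dparams", "time"]],
    output_core_dims=[["time"]],
    vectorize=True,
    dask="parallelized",
    output_dtypes=[pr.dtype],
)
out.values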

@Zeitsperre (Collaborator) left a comment:

Well done!

@aulemahal (Collaborator) left a comment:

Two small comments, but it looks good!

coxipi and others added 2 commits October 23, 2023 14:36
@Zeitsperre (Collaborator) commented:

Ignore the cancelled builds, merging when docs clear.

@coxipi merged commit dae1ffd into master on Oct 23, 2023 (9 checks passed)
@coxipi deleted the fix_spi_performance branch on October 23, 2023 at 18:55
Labels: approved (Approved for additional tests), indicators (Climate indices and indicators)
Projects: None yet
4 participants