
bucket resampler get_max, get_min are not optimally parallelised #432

Open

gerritholl opened this issue May 9, 2022 · 1 comment

Labels: enhancement, performance (improves speed or decreases memory consumption, but does not otherwise change functionality)

gerritholl (Collaborator) commented May 9, 2022

Despite impressive improvements in #368, there remains room to improve the parallelisation of the BucketResampler methods get_max and get_min. Most of the runtime is spent in the @dask.delayed-decorated function _get_statistics, which runs as a single task even though it could in principle be parallelised.

Perhaps the resample_blocks function (to be) introduced in #341 could be of help.
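
To illustrate the bottleneck, here is a minimal sketch of the pattern, assuming hypothetical names and a direct ufunc-based reduction rather than pyresample's actual argsort-based internals: a single @dask.delayed call receives the fully realised data and index arrays, so the reduction becomes one task in the graph no matter how the inputs are chunked.

import dask
import dask.array as da
import numpy as np

@dask.delayed
def bucket_max(data, idxs, out_size):
    # By the time this runs, data and idxs are plain NumPy arrays:
    # the whole reduction happens in a single task on a single thread.
    out = np.full(out_size, np.nan)
    np.fmax.at(out, idxs, data)  # per-bucket maximum; fmax ignores the NaN fill
    return out

data = da.random.random(1_000_000, chunks=100_000)
idxs = da.random.randint(0, 1000, size=1_000_000, chunks=100_000)
result = da.from_delayed(bucket_max(data, idxs, 1000), shape=(1000,), dtype=float)
result.compute()  # ten input chunks, but only one reduction task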

Code Sample, a minimal, complete, and verifiable piece of code

"""Test bucket resampler performance."""

from pyresample import create_area_def
from pyresample.bucket import BucketResampler
import dask
import dask.array as da
from dask.diagnostics import Profiler, ResourceProfiler, visualize
import numpy as np
from math import prod
import xarray as xr

src_area = create_area_def(
        "src", 4087, area_extent=[-3_200_000, -3_200_000, 3_200_000, 3_200_000], resolution=1000)

dest_area = create_area_def(
        "dest", 4087, area_extent=[-3_000_000, -3_000_000, 3_000_000, 3_000_000], resolution=5000)

data = xr.DataArray(
       da.arange(prod(src_area.shape)).reshape(src_area.shape)-10000,
       dims=("y", "x"),
       attrs={"area": src_area})

resampler = BucketResampler(dest_area, *src_area.get_lonlats(chunks=1000))

m = resampler.get_max(data)

with Profiler() as prof, ResourceProfiler(dt=0.25) as rprof:
    dask.compute(m)
visualize([prof, rprof], show=False, save=True, filename="br-max-dask.html")

Problem description

The bucket resampler methods get_max and get_min spend most of their wall-clock time in unparallelised code, so on a multi-core machine the computation takes considerably longer than it needs to.

Expected Output

I expect a dask profile visualisation showing that 800% CPU (all eight cores) is used 100% of the time.

Actual Result, Traceback if applicable

In reality, 800% CPU is used less than 40% of the time. More than 60% of the time is spent in the task _get_statistics, which is exactly the @dask.delayed-decorated function used to calculate the maximum:

[Screenshot: dask task/resource profile showing the long single-threaded _get_statistics task]

Versions of Python, package at hand and relevant dependencies

pyresample main: v1.23.0-46-g0cb8914

@gerritholl added the performance label May 9, 2022
djhoese (Member) commented May 9, 2022

The key difficulty in the algorithm, as far as I understand it at least, is that it needs to do an argsort, and it does this over the entire input data array. The reason the resample_blocks function might help is that it runs your function on a single sub-array of the input data that covers the output chunk/block. This is really what the algorithm "wants" in its current implementation, but coding up that logic is complicated in a basic map_blocks call. We can let resample_blocks do that for us in the future.
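
To make the per-block idea concrete, here is a hedged sketch under two simplifying assumptions: it swaps the argsort-based statistic for a direct np.fmax.at reduction, and it skips the part resample_blocks would actually handle, namely selecting only the source pixels that cover each output chunk. Each input chunk is reduced to a row of partial per-bucket maxima, and the rows are then folded together, so both stages parallelise across chunks; the helper names are hypothetical.

import dask.array as da
import numpy as np

def _block_max(data, idxs, n_bins):
    # Per-bucket maximum of one block only; NaN marks empty buckets.
    # Assumes every index in idxs is a valid bin (0 <= i < n_bins).
    out = np.full((1, n_bins), np.nan)
    np.fmax.at(out[0], idxs, data)
    return out

def parallel_bucket_max(data, idxs, n_bins):
    # One partial-maximum row per input chunk, combined with nanmax;
    # both steps run as independent tasks per chunk.
    data = data.ravel()
    idxs = idxs.ravel().rechunk(data.chunks)
    partial = da.map_blocks(
        _block_max, data, idxs, n_bins,
        new_axis=1, chunks=(1, n_bins), dtype=float)
    return da.nanmax(partial, axis=0)

The cost of this naive decomposition is that every chunk materialises a dense n_bins-sized partial result, which is exactly what the per-output-chunk slicing in resample_blocks would avoid.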
