Using dask="allowed" is slightly faster #207
please add these tests to |
apply_ufunc is definitely faster for NumPy ufuncs, not so much for map_blocks.
On Thu, Oct 8, 2020, 3:18 AM Aaron Spring wrote:
Please add these tests to asv, for small 1D and larger 3D arrays, and for chunked and unchunked data. Otherwise the timings are less robust. Would it be possible to set map_blocks or ufunc as a keyword, or even via config?
|
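A rough sketch of what such an asv benchmark could look like. The `params` / `param_names` / `setup` / `time_*` layout follows asv's usual conventions; the class name, array sizes, chunk sizes and the `squared_error` stand-in are all assumptions for illustration, not code from this repository.

```python
import numpy as np
import xarray as xr


def squared_error(a, b):
    # Stand-in metric: plain arithmetic that works on NumPy and dask arrays.
    return (a - b) ** 2


class ApplyUfuncDaskModes:
    """asv-style benchmark, parametrized over size, chunking and dask mode."""

    params = (["small_1d", "large_3d"], [False, True], ["allowed", "parallelized"])
    param_names = ["size", "chunked", "dask_mode"]

    # Hypothetical shapes; adjust to whatever is representative.
    shapes = {"small_1d": (1000,), "large_3d": (100, 90, 180)}

    def setup(self, size, chunked, dask_mode):
        shape = self.shapes[size]
        dims = ["time", "lat", "lon"][: len(shape)]
        rng = np.random.default_rng(0)
        self.a = xr.DataArray(rng.random(shape), dims=dims)
        self.b = xr.DataArray(rng.random(shape), dims=dims)
        if chunked:
            # Chunk along the first dimension only (requires dask).
            chunks = {"time": max(1, shape[0] // 4)}
            self.a = self.a.chunk(chunks)
            self.b = self.b.chunk(chunks)

    def time_squared_error(self, size, chunked, dask_mode):
        result = xr.apply_ufunc(
            squared_error, self.a, self.b, dask=dask_mode, output_dtypes=[float]
        )
        # Force evaluation so lazy graphs are actually timed.
        result.compute()
```

For unchunked inputs the `dask` argument has no effect, so those cases mostly serve as a NumPy baseline.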
Are you sure the second run isn't faster than the first simply because some parts of the data are already in memory? You could reverse the ordering to check. |
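One way to check for that kind of ordering effect is to time both variants in both orders. This is only a minimal sketch using the standard library; `time_in_both_orders` is a hypothetical helper and the two lambdas are placeholder workloads, not the functions benchmarked in this issue.

```python
import timeit


def time_in_both_orders(func_a, func_b, repeat=5, number=10):
    """Return best-of-`repeat` timings for each function, measured in both orders."""

    def best(func):
        return min(timeit.repeat(func, repeat=repeat, number=number))

    a_then_b = {"a": best(func_a), "b": best(func_b)}  # a measured first
    b_then_a = {"b": best(func_b), "a": best(func_a)}  # b measured first
    return a_then_b, b_then_a


# Placeholder workloads standing in for the two implementations.
ab, ba = time_in_both_orders(
    lambda: sum(range(10_000)),
    lambda: sum(x * x for x in range(10_000)),
)
print("a then b:", ab)
print("b then a:", ba)
```

If whichever function runs second is consistently faster regardless of which one it is, caching is the more likely explanation than the implementation itself.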
Any ideas why asv is failing on me?
|
|
FYI I think I found with |
We don’t use factorize anyways. |
Yep was just mentioning that here. We don't have vectorize anywhere in xskillscore so should be fine on that front. But something to keep in mind. |
From """
dask="allowed": Dask arrays are passed to the user function. This is a good choice if your function can handle dask arrays and won’t call compute explicitly.
dask="parallelized": This applies the user function over blocks of the dask array using dask.array.blockwise. This is useful when your function cannot handle dask arrays natively (e.g. scipy API).
Since squared_error can handle dask arrays without computing them, we specify dask="allowed". |
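To make the difference concrete, here is a small sketch calling `xr.apply_ufunc` with both settings. The `squared_error` body, the array shapes and the chunk sizes are assumptions for illustration only; it needs dask installed for `.chunk()`.

```python
import numpy as np
import xarray as xr


def squared_error(a, b):
    # Pure arithmetic, so it works on NumPy and dask arrays alike
    # without triggering a compute.
    return (a - b) ** 2


# Hypothetical test data: two small chunked arrays.
rng = np.random.default_rng(0)
a = xr.DataArray(rng.random((100, 50)), dims=["time", "x"]).chunk({"time": 25})
b = xr.DataArray(rng.random((100, 50)), dims=["time", "x"]).chunk({"time": 25})

# dask="allowed": the dask arrays are handed to squared_error directly.
allowed = xr.apply_ufunc(squared_error, a, b, dask="allowed")

# dask="parallelized": squared_error is applied block by block.
parallelized = xr.apply_ufunc(
    squared_error, a, b, dask="parallelized", output_dtypes=[float]
)

# Both results are still lazy here; compute to compare values.
xr.testing.assert_allclose(allowed.compute(), parallelized.compute())
```

Both calls should give the same values; only how the dask graph is built differs.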
Please ping me if you're waiting for me on a comment or any thoughts regarding this. Clearing up my git notifications and finishing up dissertation writing this week. Don't want any progress impeded here! |
I would implement this with defaults from #315 (comment) |
From:
https://xarray.pydata.org/en/stable/dask.html
Tip
For the majority of NumPy functions that are already wrapped by Dask, it’s usually a better idea to use the pre-existing dask.array function, by using either a pre-existing xarray methods or apply_ufunc() with dask='allowed'. Dask can often have a more efficient implementation that makes use of the specialized structure of a problem, unlike the generic speedups offered by dask='parallelized'.
So, I simply set dask="allowed"
And got a decent speedup
Almost 2x with bigger arrays!