Add a benchmark for dask.array.block comparing it to numpy and straight copy #18

Open
wants to merge 3 commits into
base: main


hmaarrfk
Contributor

@hmaarrfk hmaarrfk commented Sep 18, 2018

@mrocklin I guess it wasn't as slow as I remember :S.

Dask is doing better than numpy, at least for now 👯‍♀️

===== ========== ================= =============== ========================= ============= ========== ==========
--                                                       mode                                                   
----- ----------------------------------------------------------------------------------------------------------
  n     block     block optimized   block persist   block optimized persist   concatenate   np_block   np_copy  
===== ========== ================= =============== ========================= ============= ========== ==========
  1    4.50±0ms       2.59±0ms         1.82±0ms             2.46±0ms            3.20±0ms    139±0μs    62.5±0μs 
  10   3.01±0ms       2.13±0ms         2.69±0ms             2.73±0ms            4.41±0ms    1.41±0ms   427±0μs  
 100   164±0ms        163±0ms          1.84±0ms             1.72±0ms            139±0ms     440±0ms    141±0ms  
===== ========== ================= =============== ========================= ============= ========== ==========
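For readers unfamiliar with the two NumPy columns, here is a minimal sketch of what `np_block` and `np_copy` plausibly compare. The benchmark source is not shown in this thread, so the 2x2 tile layout, the tile size, and the variable names below are assumptions made for illustration only.

```python
import numpy as np

n = 10  # side length of each square tile (hypothetical; mirrors the `n` column)

# A 2x2 nested list of tiles, each filled with a distinct value.
tiles = [[np.full((n, n), 2 * i + j, dtype=float) for j in range(2)]
         for i in range(2)]

# "np_block"-style assembly: let NumPy stitch the nested list together.
blocked = np.block(tiles)

# "np_copy"-style assembly: preallocate the result and copy each tile
# into its slot with plain slice assignment.
copied = np.empty((2 * n, 2 * n))
for i in range(2):
    for j in range(2):
        copied[i * n:(i + 1) * n, j * n:(j + 1) * n] = tiles[i][j]
```

Both paths should produce the same array; the benchmark question is only how long each takes.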

@jakirkham
Member

Is this ready to merge?

self.da_block = da.block(self.block)
self.da_concatenate = da.concatenate(self.arr_list)
if mode.startswith('block optimized'):
    self.da_block.dask, _ = fuse_linear(self.da_block.dask)
Contributor Author


@jakirkham is there a better way to optimize the graph here?

Member


This is a pretty typical strategy for optimizing. It does a few different things for Dask Arrays in particular.

Contributor Author


Thanks. I would like to get a benchmark for each of the following:

  1. Typical dask user (i.e. no optimization called)
  2. Call to block with recommended optimizations to the graph.
  3. Calling persist with the above combos.
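The three cases above could be laid out as one parametrized asv-style benchmark, roughly as sketched here. This is not the code from the PR: the class name `BlockModes`, the 2x2 tile layout, and the use of `dask.optimize` for the "optimized" modes are assumptions; only the mode names and the `n` values come from the table.

```python
import numpy as np
import dask
import dask.array as da


class BlockModes:
    # asv passes every (n, mode) combination to `setup` and to each
    # `time_*` method. The class name is hypothetical.
    params = ([1, 10, 100],
              ["block", "block optimized",
               "block persist", "block optimized persist"])
    param_names = ["n", "mode"]

    def setup(self, n, mode):
        tile = np.ones((n, n))
        # Assumed layout: a 2x2 grid of single-chunk dask arrays.
        nested = [[da.from_array(tile, chunks=tile.shape) for _ in range(2)]
                  for _ in range(2)]
        self.da_block = da.block(nested)
        if mode.startswith("block optimized"):
            # Case 2: explicitly optimize the graph before timing.
            (self.da_block,) = dask.optimize(self.da_block)
        if mode.endswith("persist"):
            # Case 3: persist on top of either of the above.
            self.da_block = self.da_block.persist(scheduler="synchronous")

    def time_compute(self, n, mode):
        # Only the compute step is timed; graph construction is in setup.
        self.da_block.compute(scheduler="synchronous")
```

The synchronous scheduler is used here only to keep the sketch deterministic; a real benchmark might prefer the default scheduler.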

Member


Calling persist will trigger the standard Dask Array graph optimizations in either case.
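A small sketch of the two paths being discussed, for anyone following along: optimizing explicitly with `dask.optimize` versus simply calling `persist`. The tile shapes and variable names are made up for illustration, and the sketch does not inspect which optimizations actually ran.

```python
import numpy as np
import dask
import dask.array as da

# Hypothetical 2x2 grid of 4x4 single-chunk arrays.
tiles = [[da.from_array(np.ones((4, 4)), chunks=(4, 4)) for _ in range(2)]
         for _ in range(2)]
blocked = da.block(tiles)

# Explicit path: dask.optimize returns a tuple of optimized collections
# without computing anything.
(optimized,) = dask.optimize(blocked)

# Implicit path: persist computes the chunks and returns a dask array
# backed by the in-memory results.
persisted = blocked.persist(scheduler="synchronous")

# Both remain lazy dask arrays that evaluate to the same values.
result_optimized = optimized.compute(scheduler="synchronous")
result_persisted = persisted.compute(scheduler="synchronous")
```

If `persist` already applies the array optimizations, a separate "optimized persist" mode would mostly measure the same thing twice.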

Contributor Author


Got it, I'll remove that "mode".


2 participants