Client.map should be using a HLG Layer #8706

Closed
fjetter opened this issue Jun 19, 2024 · 0 comments · Fixed by #8740
fjetter commented Jun 19, 2024

Client.map currently creates a low-level dask graph with one task for every element in the provided iterable. This low-level task graph is then submitted to the scheduler. This is wasteful on many levels, and we should replace it with a more efficient representation.

This is exacerbated by quite severe inefficiencies in our protocol (which is a different subject entirely), as the following comparison shows:

iterables = list(range(100_000))
def func(arg):
    b = b"0" * 1024
    return arg

dsk = {
    f"func-{x}": (func, x)
    for x in iterables
}

import pickle
from dask.utils import format_bytes
from distributed.protocol import dumps

## Plain pickle

%time format_bytes(len(pickle.dumps(dsk)))
CPU times: user 23.2 ms, sys: 7.82 ms, total: 31 ms
Wall time: 31.3 ms
'1.96 MiB'

## Distributed dumps
%time format_bytes(sum(map(len, dumps(dsk))))
CPU times: user 1.72 s, sys: 95.4 ms, total: 1.82 s
Wall time: 1.97 s
'151.56 MiB'

## Efficient encoding
%time format_bytes(sum(map(len, dumps((func, iterables)))))
CPU times: user 3 µs, sys: 1e+03 ns, total: 4 µs
Wall time: 6.91 µs
'361.45 kiB'

The simplest approach for now is to write an HLG Layer that encodes this information, i.e. one that carries the function and the iterables once instead of one materialized task per element.
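To make the idea concrete, here is a minimal sketch of such a compact representation. It is not dask's actual Layer API: `MapLayer` is a hypothetical name, it subclasses the standard library `Mapping` rather than `dask.highlevelgraph.Layer`, the key scheme `"<name>-<position>"` is an assumption, and `operator.neg` merely stands in for a user function. It only illustrates storing the function once and building individual tasks on demand.

```python
import operator
import pickle
from collections.abc import Mapping


class MapLayer(Mapping):
    """Hypothetical compact stand-in for "one task per element".

    Stores the function and its arguments once; individual
    (func, arg) task tuples are only built when a key is looked up.
    """

    def __init__(self, name, func, args):
        self.name = name
        self.func = func
        self.args = list(args)

    def _index(self, key):
        # Keys are assumed to look like "<name>-<position>", e.g. "neg-3".
        if not isinstance(key, str):
            raise KeyError(key)
        prefix, sep, pos = key.rpartition("-")
        if prefix != self.name or not sep or not pos.isdigit():
            raise KeyError(key)
        i = int(pos)
        if str(i) != pos or i >= len(self.args):
            raise KeyError(key)
        return i

    def __getitem__(self, key):
        # Materialize a single task tuple lazily.
        return (self.func, self.args[self._index(key)])

    def __iter__(self):
        return (f"{self.name}-{i}" for i in range(len(self.args)))

    def __len__(self):
        return len(self.args)


layer = MapLayer("neg", operator.neg, range(100_000))

# Compact form: the function once, plus the list of arguments.
compact = pickle.dumps((layer.func, layer.args))
# Fully materialized per-task graph, for comparison.
materialized = pickle.dumps(dict(layer))

print(len(compact), len(materialized))
```

The layer still behaves like a graph mapping (`layer["neg-3"]` yields the task tuple for the fourth element), so it can be expanded into individual tasks wherever they are needed, while only the compact state has to be serialized.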
