Assertion Error #17

Open
furkankaradas opened this issue Nov 13, 2022 · 1 comment

Comments

@furkankaradas

Hi,

When I ran this code (https://saturncloud.io/docs/examples/python/pytorch/qs-03-pytorch-gpu-dask-single-model/), I got this error:

daskcluster-worker-1     | 2022-11-13 17:01:17,386 - distributed.worker - WARNING - Compute Failed
daskcluster-worker-1     | Key:       dispatch_with_ddp-cbbbf432f092a3807b25cc40c48f7660
daskcluster-worker-1     | Function:  dispatch_with_ddp
daskcluster-worker-1     | args:      ()
daskcluster-worker-1     | kwargs:    {'pytorch_function': <function train at 0x7f06b9bba040>, 'master_addr': '172.23.0.4', 'master_port': 12345, 'rank': 1, 'world_size': 2, 'backend': 'nccl'}
daskcluster-worker-1     | Exception: 'AssertionError()'
daskcluster-worker-1     | 
daskcluster-worker-2     | 2022-11-13 17:01:17,387 - distributed.worker - WARNING - Compute Failed
daskcluster-worker-2     | Key:       dispatch_with_ddp-9ce4ce0b9f5f85ff8ead8f6f2e9a9bcf
daskcluster-worker-2     | Function:  dispatch_with_ddp
daskcluster-worker-2     | args:      ()
daskcluster-worker-2     | kwargs:    {'pytorch_function': <function train at 0x7fa21dd38940>, 'master_addr': '172.23.0.4', 'master_port': 12345, 'rank': 0, 'world_size': 2, 'backend': 'nccl'}
daskcluster-worker-2     | Exception: 'AssertionError()'

Why did I get this error? Can you help me?

Thank you.

@pandalanax

It's been a while, but are you by any chance trying to train on CPU?
If so, you have to set the backend to gloo, like so:

futures = dispatch.run(client, train_function, backend='gloo')

By default the backend is nccl, which is for GPU.
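
If you are not sure whether the workers actually see a GPU, the backend can be chosen at runtime instead of hard-coded. This is only a rough sketch: `cuda_available_here` and `pick_backend` are hypothetical helper names, not part of dask-pytorch-ddp, and the commented usage assumes the `client`, `dispatch`, and `train` objects from the linked example.

```python
from typing import Iterable


def cuda_available_here() -> bool:
    """Return True only if PyTorch is importable and sees a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_backend(worker_has_gpu: Iterable[bool]) -> str:
    """Use nccl only when every worker reports a GPU; otherwise fall back to gloo."""
    flags = list(worker_has_gpu)
    return "nccl" if flags and all(flags) else "gloo"


# Assumed usage with the objects from the linked example:
# gpu_flags = client.run(cuda_available_here)   # dict: worker address -> bool
# backend = pick_backend(gpu_flags.values())
# futures = dispatch.run(client, train, backend=backend)
```

The check runs on the workers through `client.run` rather than on the client machine, because the workers are where the training function executes.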
