Model.to_gpu is not usable #713
Labels: feat / ops (Backends and maths)

Comments
Update: the culprit in my case is: which should always call `cupy.asarray`, because that function copies between devices, but only when it has to.
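The "copies between devices, but only when it has to" behavior can be sketched with plain-Python stand-ins (no GPU needed; `DeviceArray` and this `asarray` are illustrative mocks, not cupy's actual classes):

```python
class DeviceArray:
    """Stand-in for a cupy.ndarray that remembers which device holds it."""
    def __init__(self, data, device_id):
        self.data = list(data)
        self.device_id = device_id

def asarray(arr, target_device):
    """Mimic cupy.asarray semantics: copy to the target device only when needed."""
    if isinstance(arr, DeviceArray) and arr.device_id == target_device:
        return arr  # already on the right device: return it unchanged, no copy
    data = arr.data if isinstance(arr, DeviceArray) else list(arr)
    return DeviceArray(data, target_device)  # cross-device (or host-to-device) copy

a = DeviceArray([1.0, 2.0], device_id=0)
same = asarray(a, target_device=0)   # no copy: same object comes back
moved = asarray(a, target_device=1)  # copy onto device 1
```

With real cupy, the equivalent idea is that `cupy.asarray` under a `cupy.cuda.Device` context returns the input array untouched when it already lives on the current device.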
Thanks for reporting this issue! We currently only support using a single GPU with
Closed
Edit: sorry, posted a reply in the wrong issue.
I am attempting to assign individual layers to separate GPUs in order to conserve memory. However, the `Model.to_gpu` function takes an all-or-nothing approach, which prevents this from working.
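The intended per-layer placement might look like the following sketch (the `Layer` class and `to_device` method are hypothetical; thinc's actual `Model.to_gpu` moves the whole model to one device):

```python
class Layer:
    """Hypothetical layer that records which device its parameters live on."""
    def __init__(self, name):
        self.name = name
        self.device_id = None  # None means the parameters stay on the CPU

    def to_device(self, device_id):
        # In a real backend this would copy the layer's parameters to the GPU.
        self.device_id = device_id
        return self

layers = [Layer("embed"), Layer("encode"), Layer("output")]
# Spread layers across two GPUs to conserve per-device memory.
for i, layer in enumerate(layers):
    layer.to_device(i % 2)
```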
While diagnosing the origin of a memory access error during training (`cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered`), I noticed that `CupyOps.device_id` is never used or set.
Ideally, all the CupyOps would run inside a `cp.cuda.Device(device_id)` context, but that is not the case. Instead, the `xp` attribute is (ab)used in many places. That tries to run everything through GPU 0, so errors won't appear until something is moved to another GPU.

Two other difficulties are the initialization step, which doesn't allocate memory in the right places, and the `finish_update` step, where the optimizer does arithmetic on parameters outside of a device context.