Currently, with time-slicing or MPS GPU sharing, multiple processes occupy GPU memory at the same time, so no single process can use all of it. Is there any technology or configuration that lets these GPU-sharing modes swap a process's GPU memory out to host memory while that process is not using the GPU? That way, the process that is currently running on the GPU could use all of the memory.
I want N GPUs to be shared by the containers of M developers, generally with M >= N. The M developers will not use the GPU simultaneously, only intermittently. Ideally, a developer's container would occupy GPU memory only while it actually needs the GPU: even if a debugging process has not exited, it should not hold GPU memory while the GPU is idle for that process, so that the memory is freed for other users. Can the current GPU-sharing technologies support this?
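To make the scenario concrete, here is a minimal sketch of the memory arithmetic for one shared GPU. All numbers and names (`GPU_CAPACITY_GIB`, the `Container` class, the `dev-*` containers and their working-set sizes) are hypothetical and chosen only for illustration. It models the two policies and does not call any real GPU API.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    working_set_gib: float  # GPU memory the process holds while resident (hypothetical)
    active: bool            # True only while the developer is actually running GPU work


def resident_gib(containers, release_when_idle):
    """Total GPU memory held on one GPU under the given policy.

    release_when_idle=False models today's behaviour: every live process
    keeps its allocation, whether or not it is using the GPU.
    release_when_idle=True models the requested behaviour: only active
    processes hold GPU memory; idle ones are swapped out to host memory.
    """
    return sum(
        c.working_set_gib
        for c in containers
        if c.active or not release_when_idle
    )


GPU_CAPACITY_GIB = 80.0  # hypothetical capacity of a single GPU

# M = 3 developer containers sharing N = 1 GPU; only one is active right now.
containers = [
    Container("dev-a", 30.0, active=True),
    Container("dev-b", 30.0, active=False),
    Container("dev-c", 30.0, active=False),
]

held_today = resident_gib(containers, release_when_idle=False)
held_wanted = resident_gib(containers, release_when_idle=True)

print(f"held today:  {held_today} GiB (fits: {held_today <= GPU_CAPACITY_GIB})")
print(f"held wanted: {held_wanted} GiB (fits: {held_wanted <= GPU_CAPACITY_GIB})")
```

With these assumed numbers, the idle containers alone push the total past the GPU's capacity, even though only one developer is doing GPU work. Under the requested behaviour, the active container would have the remaining memory to itself.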