Adding option to disable DCGM when in remote mode #952

Merged (2 commits) on Dec 20, 2024

@nv-braf nv-braf commented Dec 19, 2024

Add a new CLI option to disable DCGM: --disable-dcgm

With DCGM disabled, Model Analyzer (MA) can no longer get info about the GPUs. That prevents us from loading models in the local, docker, or c_api launch modes, and from verifying that the current GPUs match the ones in a checkpoint.

I've added a check to ensure this new option is only available in remote mode, plus a warning message informing the user that MA cannot verify GPUs if a checkpoint is provided.
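
For readers who want a picture of the check and warning described above, here is a minimal sketch. It is not the code from this PR: `AnalyzerConfig`, `ConfigError`, `check_disable_dcgm`, and the field names are hypothetical stand-ins, and the message wording is assumed. Only the `--disable-dcgm` flag and the remote-only rule come from the description.

```python
from dataclasses import dataclass
from typing import List


class ConfigError(Exception):
    """Hypothetical error for a disallowed option combination."""


@dataclass
class AnalyzerConfig:
    # Hypothetical stand-in for the parsed CLI config; not MA's real class.
    launch_mode: str = "local"    # one of: local, docker, c_api, remote
    disable_dcgm: bool = False    # mirrors the --disable-dcgm flag
    has_checkpoint: bool = False  # True when a checkpoint was provided


def check_disable_dcgm(config: AnalyzerConfig) -> List[str]:
    """Return warnings to show; raise if the flag is used outside remote mode."""
    warnings: List[str] = []
    if not config.disable_dcgm:
        return warnings

    # Without DCGM there is no GPU info, so only remote mode can work.
    if config.launch_mode != "remote":
        raise ConfigError(
            "--disable-dcgm is only supported in remote mode "
            f"(got '{config.launch_mode}')"
        )

    # GPUs recorded in a checkpoint cannot be compared against the current ones.
    if config.has_checkpoint:
        warnings.append(
            "DCGM is disabled: cannot verify that the current GPUs "
            "match the ones in the checkpoint."
        )
    return warnings
```

Under these assumptions, `check_disable_dcgm(AnalyzerConfig("remote", True, True))` returns a single warning, while passing `"docker"` as the launch mode raises `ConfigError`.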

@nv-braf nv-braf requested a review from lkomali December 19, 2024 16:46
@nv-braf nv-braf marked this pull request as ready for review December 19, 2024 16:46

@lkomali lkomali left a comment


LGTM

@nv-braf nv-braf merged commit 92c8386 into main Dec 20, 2024
4 checks passed