Experiment version race condition error when using slurm #45
Comments
A small delay would not be a proper fix for a race condition.
I ran into this same problem. The workaround I found is to set the experiment version explicitly. So, for example, in the pytorch_hpc_example, I'd add between lines 41-42:

    parser.add_argument('--hpc_exp_number', type=int)

And then, between lines 18-19:

    version=hparams.hpc_exp_number

There's probably a better way that handles this automatically, but in the meantime this is the solution I found. I'll open a PR if I find a better way to do it. What do you think @williamFalcon? Anyway, I hope this helps!
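For readers who want to see the two changes together, here is a minimal sketch of that workaround. It uses plain `argparse` so it runs on its own; the helper names `build_parser` and `experiment_kwargs` are hypothetical, and the assumption that test-tube's `Experiment` accepts `save_dir`, `name`, and `version` keyword arguments should be checked against the installed version.

```python
import argparse


def build_parser():
    """Parser with the extra flag from the workaround (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Each SLURM job is expected to receive its own experiment number here.
    parser.add_argument('--hpc_exp_number', type=int, default=None)
    return parser


def experiment_kwargs(hparams, save_dir, name):
    """Build keyword arguments that pin the experiment version explicitly.

    Assumption: these names match the Experiment constructor in test-tube.
    """
    return {
        'save_dir': save_dir,
        'name': name,
        # Use the per-job number instead of letting the library pick one.
        'version': hparams.hpc_exp_number,
    }


if __name__ == '__main__':
    hparams = build_parser().parse_args()
    kwargs = experiment_kwargs(hparams, save_dir='./logs', name='demo')
    print(kwargs)
    # With test-tube installed, this would presumably become:
    # exp = Experiment(**kwargs)
```

Because each job passes a distinct number, no two jobs should end up asking for the same version, which sidesteps the automatic version lookup where the race occurs.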
Occasionally, test-tube tries to create an experiment version that already exists. We need to add a small delay to avoid the race condition.
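As an alternative to a delay, the version could be claimed atomically, so that two jobs can never get the same number. The sketch below is not test-tube's implementation: `claim_next_version` and the `version_N` directory layout are assumptions for illustration. It relies on `os.mkdir` failing with `FileExistsError` when the directory already exists, which is atomic on local POSIX filesystems; behavior on shared cluster filesystems should be verified.

```python
import os


def claim_next_version(exp_dir, max_tries=1000):
    """Claim the lowest free version directory under exp_dir.

    Hypothetical helper: the directory is created as part of the claim,
    so a concurrent job attempting the same number gets FileExistsError
    and moves on to the next one.
    """
    os.makedirs(exp_dir, exist_ok=True)
    for version in range(max_tries):
        path = os.path.join(exp_dir, f'version_{version}')
        try:
            os.mkdir(path)  # raises if another job already claimed it
        except FileExistsError:
            continue
        return version, path
    raise RuntimeError(f'no free version found in {exp_dir}')


if __name__ == '__main__':
    import tempfile

    with tempfile.TemporaryDirectory() as root:
        first, _ = claim_next_version(root)
        second, _ = claim_next_version(root)
        print(first, second)
```

The key design choice is that checking for a free version and creating it happen in a single filesystem call, rather than as a "check, then create" pair that another job can slip between.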