example/tensorflow_example.py had an exception (about dir making) when nb_workers > 1 #31

Open
leocnj opened this issue Jul 18, 2018 · 1 comment

leocnj commented Jul 18, 2018

I ran tensorflow_example.py to test multi-GPU tuning. With nb_workers set to more than 1, I hit the exception below. As a result, I got only nb_trials - 1 tuning results instead of the expected nb_trials. Note that I am using Python 3.6.

Caught exception in worker thread [Errno 17] File exists: 'logs/multigpu/test_tube_data/dense_model/version_0'
Traceback (most recent call last):
  File "/home/lchen/.local/lib/python3.6/site-packages/test_tube/argparse_hopt.py", line 30, in optimize_parallel_gpu_private
    results = train_function(trial_params)
  File "test_tube_multigpu.py", line 22, in train
    autosave=False
  File "/home/lchen/.local/lib/python3.6/site-packages/test_tube/log.py", line 58, in __init__
    self.__init_cache_file_if_needed()
  File "/home/lchen/.local/lib/python3.6/site-packages/test_tube/log.py", line 121, in __init_cache_file_if_needed
    os.makedirs(exp_cache_file)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
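
For reference, the error itself is easy to reproduce outside test_tube. The sketch below assumes (I have not confirmed this in the library source) that two workers end up with the same version directory and each call a plain os.makedirs on it. The helper name make_version_dir and the temporary base directory are made up for illustration; only the relative layout mirrors the path in the traceback.

```python
import errno
import os
import tempfile


def make_version_dir(path):
    """Call plain os.makedirs; return the errno on failure, or None on success."""
    try:
        os.makedirs(path)
    except OSError as exc:
        return exc.errno
    return None


# Hypothetical stand-in for the log directory from the traceback.
base = tempfile.mkdtemp()
version_dir = os.path.join(base, "test_tube_data", "dense_model", "version_0")

first = make_version_dir(version_dir)   # "worker A" creates the directory
second = make_version_dir(version_dir)  # "worker B" targets the same path

print("first call errno:", first)
print("second call hit EEXIST:", second == errno.EEXIST)
```

The second call fails with the same errno 17 (EEXIST) as in the log above, which would explain why exactly one trial is lost.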

leocnj commented Jul 18, 2018

I also tested on Python 2.7 and hit the same issue when using more than one GPU.
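
Since this happens on both 2.7 and 3.6, a possible workaround is to make the directory creation tolerate an already existing directory. This is only a sketch, not the library's code: makedirs_if_missing and run_workers are hypothetical names, and it assumes the failing call is the plain os.makedirs shown in the traceback. It avoids exist_ok=True because that argument is not available on Python 2.7.

```python
import errno
import os
import tempfile
import threading


def makedirs_if_missing(path):
    """Create path, ignoring the error if another worker created it first."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Re-raise anything that is not "directory already exists".
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise


def run_workers(path, nb_workers=4):
    """Start several threads that all try to create the same directory."""
    errors = []

    def worker():
        try:
            makedirs_if_missing(path)
        except Exception as exc:  # collect failures instead of losing them
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(nb_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


# Hypothetical directory, only for the demonstration.
target = os.path.join(tempfile.mkdtemp(), "dense_model", "version_0")
errors = run_workers(target)
print("errors:", errors)
```

On Python 3 only, os.makedirs(path, exist_ok=True) does the same thing in one call. Note that this only prevents the crash. If the workers really do share one version directory, their logs could still overwrite each other, so giving each trial its own version number would likely be the proper fix.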
