The command line "python launchers/run_mnist_exp.py" couldn't be executed successfully #19
Comments
Perhaps the versions of tensorflow and prettytensor are different. Update (06.09.2017): The TF and PT versions below work for me: (Thanks to @tornadomeet )
|
I have the same issue. Just followed the instructions from the readme. |
I have the same issue, using the docker instructions. An additional problem with the docker instructions is that I needed to:
in order to find the |
Same issue. Seems like the infogan code was written for a slightly different tensorflow api.
$ git clone https://github.com/openai/InfoGAN
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:r0.9rc0-devel
root@0de0ea9fa724:/InfoGAN# pip install -r requirements.txt
root@0de0ea9fa724:/InfoGAN# python launchers/run_mnist_exp.py
root@0de0ea9fa724:/InfoGAN# python launchers/run_mnist_exp.py
root@0de0ea9fa724:/InfoGAN# ls
root@0de0ea9fa724:/InfoGAN# PYTHONPATH='.' python launchers/run_mnist_exp.py
root@0de0ea9fa724:/InfoGAN# export PYTHONPATH=
I can then change all instances of tf.zeros_initializer() to tf.constant_initializer(0.0) per tensorflow/tensorflow#6202 but then a different error occurs:
root@0de0ea9fa724:/InfoGAN# python launchers/run_mnist_exp.py |
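For anyone applying the same substitution across several files, here is a minimal sketch of the text replacement described above. The helper name `replace_zeros_initializer` and the sample line are hypothetical, made up for illustration; only the two initializer spellings come from the comment.

```python
import re

# Matches the call exactly as quoted in the comment above,
# allowing optional whitespace inside the parentheses.
_ZEROS_CALL = re.compile(r"tf\.zeros_initializer\(\s*\)")


def replace_zeros_initializer(source: str) -> str:
    """Return source with tf.zeros_initializer() swapped for tf.constant_initializer(0.0)."""
    return _ZEROS_CALL.sub("tf.constant_initializer(0.0)", source)


# Hypothetical sample line, not taken from the InfoGAN sources.
example = "b = tf.get_variable('b', [10], initializer=tf.zeros_initializer())"
print(replace_zeros_initializer(example))
```

This only rewrites text; it does not check that the result runs against any particular tensorflow version.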
Thanks so much @tlbtlbtlb! I can go for the current devel branch that has these api changes, or I could change the code back to use the api for the docker image I ran. Below, I try the first then the second. Even further below, I try tensorflow 1.0.0-devel-gpu and latest-devel-gpu, which appear to have api and cuda issues, respectively. If nothing else I'd just like to document this.
tf.stack and tf.unstack are more recent than tf.pack and tf.unpack, so changing stack to pack fixes the AttributeError I reported in the previous comment. See also https://www.tensorflow.org/install/migration
Then I can reach:
ipdb>
Control+D out of the debugger leaves me with some MNIST data.
Now I'll try getting a tensorflow 1.0 instead, since prettytensor's readme suggests some changes were recently made 'in anticipation of TF1.0' google/prettytensor@75daa0b
Not sure what I'm doing, but https://hub.docker.com/r/tensorflow/tensorflow/tags/ suggests a variety of tags, including 1.0.0-devel-gpu
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:1.0.0-devel-gpu
root@7611f5c869c3:/InfoGAN# pip install -r requirements.txt
root@7611f5c869c3:/InfoGAN# export PYTHONPATH=
Trying again, moving the prior (and possibly incomplete) MNIST download off to the side.
root@7611f5c869c3:/InfoGAN# ls
root@7611f5c869c3:/InfoGAN# mv MNIST MNIST.00
root@7611f5c869c3:/InfoGAN# ls -hal MNIST.00
Trying again just downloads MNIST again and reproduces the problem.
root@7611f5c869c3:/InfoGAN# python launchers/run_mnist_exp.py
root@7611f5c869c3:/InfoGAN# ls
root@7611f5c869c3:/InfoGAN# ls -hal MNIST
The bit about "TypeError: Expected int32, got list containing Tensors of type '_Message' instead." strikes me as either a bug or another old-software problem, e.g. https://stackoverflow.com/questions/37098155/tensorflow-typeerror-expected-int32-got-list-containing-tensors-of-type-me mentions that upgrading Keras versions solves the problem. So, perhaps the nightly dev gpu build of tensorflow is the solution.
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:nightly-devel-gpu
Ok, how about latest dev gpu build?
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:latest-devel-gpu
root@b5dc9d6adc93:/InfoGAN# ls
root@b5dc9d6adc93:/InfoGAN# mv MNIST MNIST.01
root@b5dc9d6adc93:/InfoGAN# pip install -r requirements.txt
root@b5dc9d6adc93:/InfoGAN# export PYTHONPATH=
root@b5dc9d6adc93:/InfoGAN# python launchers/run_mnist_exp.py
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/install_sources#common_installation_problems for some common reasons and solutions. Include the entire stack trace
So latest-devel-gpu seems to have a CUDA problem, whereas 1.0.0-devel-gpu found CUDA but presumably had api problems. |
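The tf.stack / tf.pack renaming mentioned above could also be handled with a small lookup instead of editing every call site. This is only a sketch: `resolve_stack_ops` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for a tensorflow module so the snippet runs without tensorflow installed.

```python
from types import SimpleNamespace


def resolve_stack_ops(tf_module):
    """Return (stack, unstack) callables from whichever names tf_module provides.

    Prefers the newer stack/unstack names and falls back to pack/unpack.
    Raises AttributeError if the module has neither spelling.
    """
    stack = getattr(tf_module, "stack", None) or getattr(tf_module, "pack")
    unstack = getattr(tf_module, "unstack", None) or getattr(tf_module, "unpack")
    return stack, unstack


# Stand-in for an older module that only exposes pack/unpack (illustration only).
old_tf = SimpleNamespace(
    pack=lambda values: ("pack", values),
    unpack=lambda value: ("unpack", value),
)
stack, unstack = resolve_stack_ops(old_tf)
print(stack([1, 2]))
```

With a real install, one would pass the imported tensorflow module in place of `old_tf`. Whether the rest of the code then runs still depends on the other api differences reported in this thread.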
Ideally, use the particular commit ID referenced in the README. But latest-devel-gpu should work too. Installing |
Thanks @tlbtlbtlb! It's really great to get some guidance from you! I tried checking out that specific commit but hit a v2 registry error, and then got api errors when trying various cpu-only images.
This page https://docs.docker.com/engine/reference/commandline/pull/ mentions how to check out a specific commit by sha256, and the mentioned commit is tensorflow/tensorflow@79174af so:
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow@sha256:79174afa30046ecdc437b531812f2cb41a32695e
My read of openshift/origin#4567 is that this error means I need to register with some authentication system to do the pull. My read of https://bugzilla.redhat.com/show_bug.cgi?id=1255502 is that this means my docker client should be updated.
For completeness, variations of the pull command don't work either.
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow@sha256:79174a
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:79174a
How about a CPU-only attempt with latest-devel?
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:latest-devel
root@40bdfdb1318b:/InfoGAN# ls
root@40bdfdb1318b:/InfoGAN# pip install -r requirements.txt
root@40bdfdb1318b:/InfoGAN# export PYTHONPATH=
root@40bdfdb1318b:/InfoGAN# ls
So latest-devel and latest-devel-gpu both have some sort of api issue. Let's try 1.0.0-devel rather than 1.0.0-devel-gpu.
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:1.0.0-devel
root@48269e969ce3:/InfoGAN# pip install -r requirements.txt
root@48269e969ce3:/InfoGAN# export PYTHONPATH=
Another api issue. Maybe I should try this all on a different system. |
On a different system, the api issue reproduces (per comment above) for this docker image.
root# docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:1.0.0-devel
On this different system, when trying to check out a specific commit, the v2 registry error does not reproduce, but I get 'tag cannot be found' instead. Maybe I can't check out specific commits?
root# docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow@sha256:79174afa30046ecdc437b531812f2cb41a32695e
Could someone please provide a working docker command line? |
Where did you get |
Many thanks @tlbtlbtlb for such sustained help! The InfoGAN README says "As of the release, the latest commit is 79174a.", which links to tensorflow/tensorflow@79174af so I used that sha256. I'm not sure how to read the github page tensorflow/tensorflow@79174af for that commit, but maybe this suggests the commit is in tagged releases v1.1.0-rc1 through 0.12.0-rc0.
Should I try the latest tag with this commit:
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:1.1.0-rc1-devel
How about rc0 instead of rc1?
$ docker run -v $(pwd)/InfoGAN:/InfoGAN -w /InfoGAN -it -p 8888:8888 gcr.io/tensorflow/tensorflow:1.1.0-rc0-devel
root@e3c5e3fbc391:/InfoGAN# pip install -r requirements.txt
root@e3c5e3fbc391:/InfoGAN# export PYTHONPATH=
Same issue. |
That sha is for the source on github. So you have to build from source, following the instructions at https://www.tensorflow.org/install/install_sources, replacing |
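To illustrate the distinction above: a full git commit ID is a SHA-1 written as 40 hex characters, while a sha256 digest is 64 hex characters, so the two can be told apart by length. A minimal sketch, with `classify_hex_id` as a hypothetical helper name:

```python
import re


def classify_hex_id(value: str) -> str:
    """Classify a lowercase hex identifier by its length."""
    if not re.fullmatch(r"[0-9a-f]+", value):
        return "not lowercase hex"
    if len(value) == 40:
        return "git commit SHA-1 (40 hex chars)"
    if len(value) == 64:
        return "sha256 digest (64 hex chars)"
    return "abbreviated or unknown"


# The ID used in the docker commands earlier in this thread.
print(classify_hex_id("79174afa30046ecdc437b531812f2cb41a32695e"))
```

The ID from the README has 40 characters, which is consistent with it being a git commit rather than something `@sha256:` can resolve.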
Thanks again @tlbtlbtlb! Sorry for all my questions. I now checked out tensorflow and that specific commit as you suggest. Worked through these instructions for a CPU-only build:
Installed bazel via Google's apt repo:
Configured Tensorflow:
root:/path/to/tensorflow # ./configure
root:/path/to/tensorflow # bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
There is this outstanding issue with "name 'DATA_CFG' is not defined.", but if this Tensorflow version was indeed used for InfoGAN it must not be an issue. I have a typescript and can provide more info if desired.
Separately, is there a Docker image of this specific version of Tensorflow with an InfoGAN build already at OpenAI? Should I try building a GPU-accelerated version instead of CPU-only? |
Could someone help me with the following problem? Thank you so much! During handling of the above exception, another exception occurred: Traceback (most recent call last): originally defined at: |
@aday00 This is really a tensorflow problem, try asking on their issue tracker. As a guess, it may require a different version of Bazel. |
@howardgriffin Your version of tensorflow doesn't match the version specified in the README |
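A quick way to check for this kind of mismatch is to compare the installed version string against the one you expect. The sketch below is an assumption-laden illustration: `version_prefix` and `same_release_line` are hypothetical helpers, and the version strings in the example are tags mentioned earlier in this thread, not a statement of what the README requires.

```python
import re


def version_prefix(version: str, parts: int = 2):
    """Return the first `parts` numeric components, e.g. '0.12.0-rc0' -> (0, 12)."""
    numbers = re.findall(r"\d+", version)
    return tuple(int(n) for n in numbers[:parts])


def same_release_line(installed: str, expected: str, parts: int = 2) -> bool:
    """True when both versions share the same leading numeric components."""
    return version_prefix(installed, parts) == version_prefix(expected, parts)


# Example strings only; substitute the version you actually need.
print(same_release_line("1.1.0-rc1", "0.12.0-rc0"))
```

With tensorflow installed, the installed string is available as `tf.__version__` after `import tensorflow as tf`.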
solved this problem by: |
I fixed this problem by making several updates to this code and one to prettytensor to accommodate the new TensorFlow API. You can get those changes in my fork of this repo. |
@zjost I have tried your fork. It still gives me an error: This is the same as what I get after I changed a few TensorFlow API calls in the code. How did you change the prettytensor code? Thanks! |
@hope-yao Traceback (most recent call last): This appears to be a tensorflow version issue, but I thought the point of your tip is to obtain a correct version of tensorflow and prettytensor? Thanks. |
@metatl here's what you need to change in prettytensor: https://github.com/google/prettytensor/pull/57/files This is linked to in my fork's ReadMe. |
root@1b0611ffe472:/InfoGAN# PYTHONPATH='.' python launchers/run_mnist_exp.py
Extracting MNIST/train-images-idx3-ubyte.gz
Extracting MNIST/train-labels-idx1-ubyte.gz
Extracting MNIST/t10k-images-idx3-ubyte.gz
Extracting MNIST/t10k-labels-idx1-ubyte.gz
--Return--
None
ipdb>
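The `PYTHONPATH='.'` prefix in the run above puts the repository root on Python's import path. A rough in-script equivalent is sketched below, assuming the launcher sits one directory below the repository root (as the path launchers/run_mnist_exp.py suggests). `add_repo_root` is a hypothetical helper and is not part of the InfoGAN code.

```python
import os
import sys


def add_repo_root(launcher_path: str) -> str:
    """Put the parent of the launcher's directory at the front of sys.path."""
    launcher_dir = os.path.dirname(os.path.abspath(launcher_path))
    repo_root = os.path.dirname(launcher_dir)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# Illustrative path matching the docker mount used in this thread.
print(add_repo_root("/InfoGAN/launchers/run_mnist_exp.py"))
```

Inside a launcher script one would call it with `__file__` before importing the project's own packages.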