Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dockerfile setup #30

Merged
merged 6 commits into from
Apr 15, 2019
Merged

Add dockerfile setup #30

merged 6 commits into from
Apr 15, 2019

Conversation

sallamander
Copy link
Collaborator

This PR adds two dockerfiles (a base dockerfile and an additional one where the conda environment is installed), along with a script that can be used to have the dockerfiles assembled (there a couple of useful flags that can be used during assembly). I've updated the main README to incorporate these changes, although I'd like to update it a bit more before merging.

I'm not sure if this is exactly what we had in mind with #18, so I'm open to using this as a starting point and discussing it further to see if we want to change this up. This is just pretty close to the setup I use in my personal repository, so I figured I'd adjust it for this repo and we could use it as a starting point. Depending on how this develops, we eventually might want to move to something a bit more intricate, maybe a services based approach (e.g. where one container handles the DB, one handles inference, etc.) But, I don't think we're there yet (nor do I have much experience with stitching multiple Docker containers together).

@typicalTYLER
Copy link
Collaborator

Nice work!

Getting an error when running exactly what was in the readme, not sure if I need to be in setup as my pwd or a path inside the dockerfile needs to be more specific.

(venv) tyler@tyler-MS-7821:~/PycharmProjects/SolarPanelDataWrangler$ bash setup/build_docker_images.sh --docker_user tyler
--branch not specified, using the default of master.
--conda_version not specified, using the default of 4.5.11.
Creating images with docker username tyler and miniconda 
version 4.5.11...
unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /home/tyler/PycharmProjects/SolarPanelDataWrangler/Dockerfile.base: no such file or directory
unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /home/tyler/PycharmProjects/SolarPanelDataWrangler/Dockerfile: no such file or directory

I'm building it with my pwd being setup now though and it seems to be going okay. I'll let you know if I run into any other problems. I had to use sudo to build it but I think I may have just installed docker incorrectly a while ago.

@typicalTYLER
Copy link
Collaborator

Got this error at the end:

Step 12/14 : RUN cd $HOME/repos/SolarPanelDataWrangler &&    git checkout $branch &&     cd $HOME/repos/SolarPanelDataWrangler/setup &&     conda install conda=$conda_version &&     conda env create -f $fname_environment_yml &&     cd $HOME
 ---> Running in 9579d233b598
Already on 'master'
Your branch is up-to-date with 'origin/master'.
/bin/sh: 1: cd: can't cd to /home/tyler/repos/SolarPanelDataWrangler/setup
The command '/bin/sh -c cd $HOME/repos/SolarPanelDataWrangler &&    git checkout $branch &&     cd $HOME/repos/SolarPanelDataWrangler/setup &&     conda install conda=$conda_version &&     conda env create -f $fname_environment_yml &&     cd $HOME' returned a non-zero code: 2

@sallamander
Copy link
Collaborator Author

Ahh, both of these are my bad. I should add to the README that it has to be run from the setup directory. I've never run into that first error, but I think that's because I've just by default always run it from the setup directory.

For the second one, there is an additional flag to the build_docker_images.sh script that is needed when there are setup changes needed that aren't merged to master (by default the script uses the master branch to build everything, and in this particular case setup doesn't exist there). I think the following should resolve that issue:

bash setup/build_docker_images.sh --docker_user tyler --branch docker-setup

@typicalTYLER
Copy link
Collaborator

Ahh, thanks! I had a hunch it was something like that but figured it was better to ask you than spend time trying to debug if it was something obvious. I'll retry the process again this evening.

@typicalTYLER
Copy link
Collaborator

It built successfully but I'm still having a little trouble switching over to using docker for my entire workflow. A couple questions:

  • How do you activate the conda environment? I was just trying to run this command sudo docker run spdw/final:latest python repos/SolarPanelDataWrangler/run_entire_process.py --state California --city "Daly City" but this doesn't work before activating the conda environment.

  • How would I get uncommitted changes into my docker image? Or do I need to somehow point PyCharm at the docker container?

  • How would I get an existing sqlite db file into the docker filesystem?

I can work on figuring these out on my own if you don't immediately know the answer though. I just figured maybe you might! The PR seems good to go though if you want to merge!

@sallamander
Copy link
Collaborator Author

sallamander commented Apr 12, 2019

Good questions! I think I can answer two of them. For context, I actually just work inside the docker container itself. So I do something like docker run -it [image ID] bash, and then I'm in a bash session inside a container created from the image and can work as if I was just logged into a remote instance. There will be a container saved when you do this, and it will persist changes just like a remote instance would. After the container is created, in the future you'll have to start / stop it and then log in, which consists of something like:

docker start [container ID]
docker exec -it [container ID] bash
docker stop [container ID]  # stop when you're finished

RE your questions...

(1) could be solved by using that workflow, but alternatively I think we could just add a source activate spdw inside the Dockerfile so that the environment is always active in the image (I'm not 100% sure this would work, but I think it might).

(2) I'm a little unsure of this one, but I think with the current setup the only way to do it would be to use the workflow described above and connect PyCharm to the container via some protocol (which I'm not sure PyCharm exposes, but would assume it does). Then hopefully you could simply load the repository that is present in the container. Alternatively, though (and I think this is the common way to do it), you could mount your code as a volume and then changes outside of the image / container should be mirrored inside the image / container.

(3) I think this is one where you'd have to mount it as a volume, too.

I'm going to touch up the README a tad to try to clarify instructions and then merge this. Happy to change around this workflow in the future! I don't actually know the most common workflow when using docker containers for development (or deployment for that matter), so this may very well not be the best way to do this.

@typicalTYLER
Copy link
Collaborator

  1. Either sounds good to me! I'm just unfamiliar with conda so I wasn't sure. Your bash example plus activating the environment worked!

  2. I think mounting is the better way to go here, pycharm requires professional edition to attach to docker containers and a) I don't have professional b) building a workflow around a professional edition seems antithetical to open source design. I'll have to try to figure out the volume mounting for this and 3.

I also have a change to the run_entire_process.py file to make the experience a bit better when using a docker container. (changing the polygon preview to print out a link instead of attempting to open a window, which doesn't work inside a docker container) Would you prefer I add it to this PR or just push to master?

@typicalTYLER
Copy link
Collaborator

Also, followup issue:

I realized that I had built the cpu version of the image so I was going to rebuild as the gpu version to see how the performance was, but it seems to just be hanging with no output when I attempt it

tyler@tyler-MS-7821:~/PycharmProjects/SolarPanelDataWrangler/setup$ sudo bash build_docker_images.sh --rebuild_image --docker_user tyler --branch docker-setup --gpu_build

Does this combination of flags not do what I think it does? Or do I need to maybe delete the existing images if I'm using the gpu image? Thanks and sorry for having so much trouble with it!

@sallamander
Copy link
Collaborator Author

I think mounting is the better way to go here, pycharm requires professional edition to attach to docker containers and a) I don't have professional b) building a workflow around a professional edition seems antithetical to open source design. I'll have to try to figure out the volume mounting for this and 3.

I agree! I didn't know the pycharm requires professional edition to attach to a container - that's too bad 😿 . I think mounting the code might be as simple as adding something like --volume [path to local repo]:[path to repo location in the container] when you're running the docker run command for the first time. Then you should be able to load the code in your local PyCharm and edit it there.

I also have a change to the run_entire_process.py file to make the experience a bit better when using a docker container. (changing the polygon preview to print out a link instead of attempting to open a window, which doesn't work inside a docker container) Would you prefer I add it to this PR or just push to master?

I'm good with you just pushing it to master!

RE the --gpu_build flag... this is also my fault, and I should have clarified this in the PR description and labeled the PR as preliminary (next time I'll do that, sorry!). That needs a separate GPU environment yaml file (where the only real difference is the use of tensorflow-gpu instead of tensorflow). I'll add that in here, too, and test it tonight before merging. Since we're only running inference right now, which is working reasonably fast with the CPU, I was holding off on the GPU YML. But with the --gpu_build flag exposed, we should have the GPU YML available. Sorry again for this!

@typicalTYLER
Copy link
Collaborator

Sounds good! No worries about the gpu-build, I was just trying to help test everything out as a naive docker user :) Your plan sounds good! And no rush on the merge or anything, I have a solid workflow for the time being.

@sallamander
Copy link
Collaborator Author

Oh, actually using the --gpu_build flag also requires installing nvidia-docker, and it might end up needing some alignment between the host GPU drivers and the driver in the container. I'll dig into this more, though, because I'm not 100% sure.

@typicalTYLER
Copy link
Collaborator

typicalTYLER commented Apr 12, 2019

Whoops just accidentally pushed to your branch, I'll back out my changes. (But I guess I'll just leave my docker related changes in if they're already there)

@sallamander
Copy link
Collaborator Author

sallamander commented Apr 15, 2019

I've added an environment_gpu.yml file that can be used to create an environment that makes use of the GPU. I noted it in the main README, but setup of the drivers and such is required before hand, even if using the docker containers (although there is less setup, there is still some). I tested setting up the environment using the environment_gpu.yml (by passing the --gpu_build flag to the build_docker_images.sh script) and running python run_entire_process.py --city Austin --state Texas and it seem to behave as expected.

I've updated the main README, too, to make a distinction between the GPU and no GPU builds, as well as specifying that the build_docker_images.sh script needs to be run from the setup directory itself.

@sallamander
Copy link
Collaborator Author

I think this is good to merge, so I'm going to go ahead and do so 👍

@sallamander sallamander merged commit 9d0d56d into master Apr 15, 2019
@sallamander sallamander deleted the docker-setup branch April 15, 2019 15:47
@typicalTYLER
Copy link
Collaborator

Thanks again for getting this all set up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants