Custom notebooks #150

Merged
merged 24 commits into from
May 11, 2021
Changes from 21 commits
Commits
3d4d9cf
remove notebook volume
cwcummings Mar 15, 2021
8c8f715
bump images version
cwcummings Mar 31, 2021
1d5164f
revert change on jupyterhub_config
cwcummings Apr 7, 2021
85b50a0
new deploy script to download notebooks in specific images
cwcummings Apr 9, 2021
e33f841
add log file output and var replacement in deployment script
cwcummings Apr 12, 2021
914bdc6
improve logging in new deploy script
cwcummings Apr 14, 2021
fc2e504
only install yq if necessary in deploy script
cwcummings Apr 16, 2021
04b46a9
add custom tutorial notebooks volume with a subfolder of the same nam…
cwcummings Apr 19, 2021
a3d5a24
fix typo
cwcummings Apr 19, 2021
27e5c7f
update doc
cwcummings Apr 19, 2021
a91a145
change dest_dir option from config to a env var
cwcummings Apr 21, 2021
a081a7c
fix volume mounts for notebooks for backward compatibility
cwcummings Apr 21, 2021
f19af5e
fix scheduler log variable
cwcummings Apr 21, 2021
51846fd
reorder variables in env.local file
cwcummings Apr 21, 2021
f8e2b12
multiple fixes on script and env.local after PR feedback
cwcummings Apr 22, 2021
55234f3
add cronjob generation and moved related deploy script to external repo
cwcummings Apr 23, 2021
31d5027
fix logdir for script and remove unnecessary volume mount
cwcummings May 3, 2021
fe7348c
remove common directory volume mount
cwcummings May 3, 2021
cfe4634
move env file for deploy script to external repo + minor fixes for pr…
cwcummings May 4, 2021
5c1e56a
rename mount directory to avoid conflict with other deploy jobs
cwcummings May 11, 2021
c934e3d
add todo comment
cwcummings May 11, 2021
927f090
fix error commented line in config
cwcummings May 11, 2021
fe79bc5
Merge branch 'master' of https://github.com/bird-house/birdhouse-depl…
cwcummings May 11, 2021
8990073
bump pavics jupyter images versions
cwcummings May 11, 2021
33 changes: 24 additions & 9 deletions birdhouse/config/jupyterhub/jupyterhub_config.py.template
@@ -3,6 +3,8 @@ from os.path import join
import logging
import subprocess

from dockerspawner import DockerSpawner

c = get_config() # noqa # can be called directly without import because injected by IPython

c.JupyterHub.bind_url = 'http://:8000/jupyter'
@@ -20,7 +22,27 @@ c.JupyterHub.db_url = '/persist/jupyterhub.sqlite'

c.JupyterHub.template_paths = ['/custom_templates']

c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
class CustomDockerSpawner(DockerSpawner):
    def start(self):
        if(os.environ['MOUNT_IMAGE_SPECIFIC_NOTEBOOKS'] == 'true'):
            host_dir = join(os.environ['JUPYTERHUB_USER_DATA_DIR'], 'tutorial-notebooks-specific-images')

            # Mount a volume with a tutorial-notebook subfolder corresponding to the image name, if it exists
            image_name = self.user_options.get('image')
            if(os.path.isdir(join(host_dir, image_name))):
                self.volumes[join(host_dir, image_name)] = {
                    "bind": '/notebook_dir/tutorial-notebooks',
                    "mode": "ro"
                }
        else:
            # Mount the entire tutorial-notebooks directory
            self.volumes[join(os.environ['JUPYTERHUB_USER_DATA_DIR'], "tutorial-notebooks")] = {
                "bind": "/notebook_dir/tutorial-notebooks",
                "mode": "ro"
            }
        return super().start()

c.JupyterHub.spawner_class = CustomDockerSpawner
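
The mounting rule in `CustomDockerSpawner.start` can be read in isolation as a small function that returns the volume mapping. The following is a minimal sketch, not the code from this PR: `select_tutorial_volumes` is a hypothetical helper, it assumes the `else` branch pairs with the `MOUNT_IMAGE_SPECIFIC_NOTEBOOKS` check, and it additionally treats a missing image selection as "nothing to mount", which goes beyond what the diff does.

```python
import os
import tempfile
from os.path import join


def select_tutorial_volumes(user_data_dir, image_name, image_specific):
    """Return the tutorial-notebooks volume mapping for one spawn (hypothetical helper)."""
    bind = {"bind": "/notebook_dir/tutorial-notebooks", "mode": "ro"}
    if image_specific:
        host_dir = join(user_data_dir, "tutorial-notebooks-specific-images")
        # Assumption: without a selected image there is nothing to mount.
        if image_name and os.path.isdir(join(host_dir, image_name)):
            return {join(host_dir, image_name): bind}
        return {}
    # Flag disabled: mount the whole shared tutorial-notebooks directory.
    return {join(user_data_dir, "tutorial-notebooks"): bind}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(join(root, "tutorial-notebooks-specific-images", "eo-crim"))
        print(select_tutorial_volumes(root, "eo-crim", True))   # image-specific folder
        print(select_tutorial_volumes(root, "nlp-crim", True))  # no folder, no mount
        print(select_tutorial_volumes(root, "eo-crim", False))  # shared folder
```

Keeping the decision separate from the spawner makes it possible to check the three cases without starting JupyterHub or Docker.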

# Selects the first image from the list by default
c.DockerSpawner.image = os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[0]
@@ -49,13 +71,6 @@ if len(host_gdrive_settings_path) > 0:
"mode": "ro"
}

host_tutorial_notebooks_dir = join(jupyterhub_data_dir, "tutorial-notebooks")
c.DockerSpawner.volumes[host_tutorial_notebooks_dir] = {
"bind": join(notebook_dir, "tutorial-notebooks"),
"mode": "ro"
}


readme = os.environ['JUPYTERHUB_README']
if os.path.exists(readme):
c.DockerSpawner.volumes[readme] = {
@@ -86,7 +101,7 @@ c.Spawner.disable_user_config = True
c.DockerSpawner.default_url = '/lab'
c.DockerSpawner.remove = True # delete containers when servers are stopped
${ENABLE_JUPYTERHUB_MULTI_NOTEBOOKS} # noqa
c.DockerSpawner.pull_policy = "always" # for images not using pinned version
# c.DockerSpawner.pull_policy = "always" # for images not using pinned version
c.DockerSpawner.debug = True
c.JupyterHub.log_level = logging.DEBUG

12 changes: 12 additions & 0 deletions birdhouse/default.env
@@ -4,13 +4,25 @@
# Jupyter single-user server images, can be overridden in env.local to have a space separated list of multiple images
export DOCKER_NOTEBOOK_IMAGES="pavics/workflow-tests:210216"

# Name of the image displayed on the JupyterHub image selection page
# Can be overridden in env.local to have a space separated list of multiple images, the name order must correspond
# to the order of the DOCKER_NOTEBOOK_IMAGES variable
export JUPYTERHUB_IMAGE_SELECTION_NAMES="pavics"

export FINCH_IMAGE="birdhouse/finch:version-0.7.1"

export THREDDS_IMAGE="unidata/thredds-docker:4.6.15"

# Folder on the host to persist Jupyter user data (notebooks, HOME settings)
export JUPYTERHUB_USER_DATA_DIR="/data/jupyterhub_user_data"

# Log directory used for the various scheduler tasks
# TODO: use this variable for other references of the log path (only used in the pavics-jupyter-base's .env file for now)
export PAVICS_LOG_DIR=/var/log/PAVICS

# Activates mounting a tutorial-notebooks subfolder that has the same name as the spawned image on JupyterHub
export MOUNT_IMAGE_SPECIFIC_NOTEBOOKS=false

# Path to the file containing the clientID for the google drive extension for jupyterlab
export JUPYTER_GOOGLE_DRIVE_SETTINGS=""

2 changes: 2 additions & 0 deletions birdhouse/docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -350,6 +350,7 @@ services:
- "8800:8000"
environment:
DOCKER_NOTEBOOK_IMAGES: ${DOCKER_NOTEBOOK_IMAGES}
JUPYTERHUB_IMAGE_SELECTION_NAMES: ${JUPYTERHUB_IMAGE_SELECTION_NAMES}
DOCKER_NETWORK_NAME: jupyterhub_network
JUPYTERHUB_USER_DATA_DIR: ${JUPYTERHUB_USER_DATA_DIR}
JUPYTERHUB_ADMIN_USERS: ${JUPYTERHUB_ADMIN_USERS}
@@ -358,6 +359,7 @@ services:
JUPYTER_DEMO_USER_CPU_LIMIT: ${JUPYTER_DEMO_USER_CPU_LIMIT}
JUPYTER_GOOGLE_DRIVE_SETTINGS: ${JUPYTER_GOOGLE_DRIVE_SETTINGS}
JUPYTERHUB_README: ${JUPYTERHUB_README}
MOUNT_IMAGE_SPECIFIC_NOTEBOOKS: ${MOUNT_IMAGE_SPECIFIC_NOTEBOOKS}
volumes:
- ./config/jupyterhub/jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py:ro
- ./config/jupyterhub/custom_templates:/custom_templates:ro
37 changes: 33 additions & 4 deletions birdhouse/env.local.example
@@ -211,12 +211,19 @@ export POSTGRES_MAGPIE_PASSWORD=postgres-qwerty
# pavics/crim-jupyter-eo:0.1.0 \
# pavics/crim-jupyter-nlp:0.1.0"

# Name of the images displayed on the JupyterHub image selection page
# The name order must correspond to the order of the DOCKER_NOTEBOOK_IMAGES variable,
# and both variables should have the same number of entries.
#export JUPYTERHUB_IMAGE_SELECTION_NAMES="pavics \
# eo-crim \
# nlp-crim"
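
The requirement that both variables list their entries in the same order and have the same length can be checked explicitly on the Python side. The sketch below is illustrative only: `build_image_whitelist` is a hypothetical helper that is not part of this PR, and the example values are taken from the comments in this file.

```python
def build_image_whitelist(names_value, images_value):
    """Pair display names with images by position (hypothetical helper)."""
    names = names_value.split()
    images = images_value.split()
    if len(names) != len(images):
        raise ValueError(
            f"{len(names)} names for {len(images)} images; the lists must match"
        )
    return dict(zip(names, images))


if __name__ == "__main__":
    # Example values only; a real deployment would read them from the environment.
    names = "pavics eo-crim nlp-crim"
    images = (
        "pavics/workflow-tests:210216 "
        "pavics/crim-jupyter-eo:0.1.0 "
        "pavics/crim-jupyter-nlp:0.1.0"
    )
    print(build_image_whitelist(names, images))
```

Because `str.split()` without arguments splits on any run of whitespace, the extra spaces left by the backslash line continuations in the shell value do not matter.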

# allow jupyterhub user selection of which notebook image to run
# see https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html
#export ENABLE_JUPYTERHUB_MULTI_NOTEBOOKS="
#c.DockerSpawner.image_whitelist = {'pavics': os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[0],
# 'eo-crim': os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[1],
# 'nlp-crim': os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[2],
#c.DockerSpawner.image_whitelist = {os.environ['JUPYTERHUB_IMAGE_SELECTION_NAMES'].split()[0]: os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[0],
# os.environ['JUPYTERHUB_IMAGE_SELECTION_NAMES'].split()[1]: os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[1],
# os.environ['JUPYTERHUB_IMAGE_SELECTION_NAMES'].split()[2]: os.environ['DOCKER_NOTEBOOK_IMAGES'].split()[2],
Collaborator Author

I am aware these lines are not very clean. I haven't found an easier way to implement this yet, since the variables involved have to work in both shell script and Python.

For DOCKER_NOTEBOOK_IMAGES and JUPYTERHUB_IMAGE_SELECTION_NAMES, we could have used a single variable instead of two, with a string "formatted similarly to a dictionary" such as "pavics;pavics/workflow-tests:210216 eo-crim;pavics/crim-jupyter-eo:0.1.0 ....", but that would require additional string splitting, so I don't think it would be cleaner.

We could also probably simplify this by using the same name everywhere instead of two different name formats. That would mean using the full image name, such as pavics/crim-jupyter-eo:0.1.0, both in the JupyterHub list and as the name of the tutorial-notebooks directories, instead of the shorter eo-crim. I don't think that would necessarily be cleaner either. (There is also a potential problem with the '/' in image names clashing with their use as directory names.)
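
For comparison, the two alternatives discussed in this comment could look roughly like the following on the Python side. This is a sketch under stated assumptions, not something the PR implements: the `name;image` entry format is the one proposed in the comment, and `parse_image_entries` and `directory_safe` are hypothetical helpers.

```python
def parse_image_entries(value):
    """Parse 'name;image name;image ...' into a {name: image} dict (hypothetical format)."""
    mapping = {}
    for entry in value.split():
        name, sep, image = entry.partition(";")
        if not sep or not name or not image:
            raise ValueError(f"malformed entry: {entry!r}")
        mapping[name] = image
    return mapping


def directory_safe(image):
    """Replace characters that are awkward in a directory name (illustrative only)."""
    return image.replace("/", "_").replace(":", "_")


if __name__ == "__main__":
    value = "pavics;pavics/workflow-tests:210216 eo-crim;pavics/crim-jupyter-eo:0.1.0"
    print(parse_image_entries(value))
    print(directory_safe("pavics/crim-jupyter-eo:0.1.0"))
```

The extra parsing and the name mangling needed for directories illustrate why neither alternative is obviously simpler than keeping two parallel variables.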

Collaborator

Agreed it's ugly, but I also agree that the other alternatives either complicate the implementation a lot or look worse. Open to other alternatives as well, if someone sees something we both missed ...

# 'jupyter/scipy-notebook': 'jupyter/scipy-notebook',
# 'jupyter/r-notebook': 'jupyter/r-notebook',
# 'jupyter/tensorflow-notebook': 'jupyter/tensorflow-notebook',
@@ -226,6 +233,28 @@ export POSTGRES_MAGPIE_PASSWORD=postgres-qwerty
# }
#"

# Load jobs to automatically deploy the custom notebooks from the specific images
#
# Ensure we always use the "latest" version of the "cronjob generation code"
# Path to a checked out repo of "pavics-jupyter-base" (https://github.com/bird-house/pavics-jupyter-base)
# which contains the config required for the cronjob generation
#CHECKOUT_PAVICS_JUPYTER_BASE="/path/to/checkout/pavics-jupyter-base"
#export AUTODEPLOY_EXTRA_REPOS="$AUTODEPLOY_EXTRA_REPOS $CHECKOUT_PAVICS_JUPYTER_BASE"

# Config for the generation of cronjobs, found on external repo
#DEPLOY_DATA_PAVICS_JUPYTER_ENV="$CHECKOUT_PAVICS_JUPYTER_BASE/scheduler-jobs/deploy_data_pavics_jupyter.env"

# Generates a cronjob for each image found in DOCKER_NOTEBOOK_IMAGES
#if [ -f "$DEPLOY_DATA_PAVICS_JUPYTER_ENV" ]; then
# . $DEPLOY_DATA_PAVICS_JUPYTER_ENV
#fi

# Activates mounting a tutorial-notebooks subfolder that has the same name as the spawned image on JupyterHub
# This variable is only useful if there is more than one image in DOCKER_NOTEBOOK_IMAGES
# and ENABLE_JUPYTERHUB_MULTI_NOTEBOOKS is set with a proper c.DockerSpawner.image_whitelist
# matching the images in DOCKER_NOTEBOOK_IMAGES and their corresponding JUPYTERHUB_IMAGE_SELECTION_NAMES.
# export MOUNT_IMAGE_SPECIFIC_NOTEBOOKS=true

# The parent folder where all the user notebooks will be stored.
# For example, a user named "bob" will have his data in $JUPYTERHUB_USER_DATA_DIR/bob
# and this folder will be mounted when he logs into JupyterHub.
@@ -307,4 +336,4 @@ export POSTGRES_MAGPIE_PASSWORD=postgres-qwerty
# To enable emu: add './optional-components/emu' to EXTRA_CONF_DIRS above.

# Emu WPS service image if that testing component is enabled
#export EMU_IMAGE="tlvu/emu:watchdog"
#export EMU_IMAGE="tlvu/emu:watchdog"