Analytical Platform Jupyter Notebook • This repository is defined and managed in Terraform

ministryofjustice/analytics-platform-jupyter-notebook

analytics-platform-jupyter-notebook

JupyterLab Docker images for Analytical Platform.

CI/CD:

themselves. This releases the helm chart above.

About Jupyter Notebook

From Jupyter:

"The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more."

Docker images

We currently have 3 flavours of JupyterLab:

  • datascience-notebook is the standard image, currently used by most of our users.
  • allspark-notebook is similar to datascience-notebook but also includes Spark. It is currently used mainly by the Data Engineer team and is deployed manually. Long term we may use this image by default instead of datascience-notebook.
  • oracle-datascience-notebook is a temporary image that contains drivers for connecting to Oracle databases, needed for some niche, short-lived work. We hope to retire this image very soon.

These images are derived from jupyter/docker-stacks.

NOTE: The docker-stacks repository has a page of recipes that may be useful: Jupyter Docker Recipes.

Build and Run

From the sub-directory of the image you want to build, run:

make build
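
The build step can also be wrapped in a small helper; this is a minimal sketch, not part of the repository. It assumes each flavour lives in a sub-directory named after the image (the flavour names come from the list above, the directory layout is an assumption). `build_flavour` and `DRY_RUN` are hypothetical names introduced for illustration. `make -C <dir> build` has the same effect as running `make build` from inside that directory.

```shell
# Build one image flavour by running the `build` target of its Makefile.
# ASSUMPTION: the sub-directory name matches the flavour name.
build_flavour() {
  case "$1" in
    datascience-notebook|allspark-notebook|oracle-datascience-notebook) ;;
    *) echo "unknown flavour: $1" >&2; return 1 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "make -C $1 build"
  else
    make -C "$1" build
  fi
}
```

Running `DRY_RUN=1 build_flavour allspark-notebook` from the repository root prints the command without building anything, which is a cheap way to check the flavour name first.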

Releasing a new image version

Once your changes are approved and merged into the main branch, create a new release tag from the GitHub interface.

This will trigger a new run of the GitHub Actions workflow, which builds the images and pushes them to our private AWS ECR registry.

Once the image has been pushed to this registry, you can update the relevant JupyterLab helm chart values and/or update a specific user's Kubernetes Deployment.
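
For the single-Deployment case, `kubectl set image` is one way to point a running Deployment at a new tag. The sketch below only prints the command so it can be reviewed first. Every value in it (namespace, Deployment name, container name, registry host, tag) is a made-up placeholder, and `set_image_cmd` is a hypothetical helper; none of these names come from this repository.

```shell
# Print a kubectl command that points one container of a Deployment at a new image.
# ASSUMPTION: all four arguments are placeholders chosen by the caller.
set_image_cmd() {
  if [ "$#" -ne 4 ]; then
    echo "usage: set_image_cmd NAMESPACE DEPLOYMENT CONTAINER IMAGE" >&2
    return 2
  fi
  printf 'kubectl --namespace %s set image deployment/%s %s=%s\n' "$1" "$2" "$3" "$4"
}

# Example with made-up values; review the printed command before running it.
set_image_cmd user-alice jupyter-lab-alice jupyter-lab registry.example.com/datascience-notebook:v1.2.3
```

Updating the helm chart values remains the better route when the change should reach every user; a direct image change on one Deployment may be overwritten by the next chart upgrade.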

If you're releasing a new version of the JupyterLab helm chart, talk with a Control Panel admin to be sure this new version is added to the Tools catalogue and users can deploy/upgrade JupyterLab and use it.

Disabling authentication

To disable authentication, append --NotebookApp.token='' as an argument:

docker container run -d --rm -p 8888:8888 jupyter/datascience-notebook start.sh jupyter lab --NotebookApp.token=''

Grant sudo with disabled authentication

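The docker-stacks start script only grants sudo when the container starts as root, so --user root is passed as well:

docker container run -d --rm -p 8888:8888 --user root -e GRANT_SUDO=yes jupyter/datascience-notebook start.sh jupyter lab --NotebookApp.token=''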

Known issues

These images work, but some questions remain open:

  • The image creates a 'jovyan' user with UID 1000. We still need to figure out how this will work with the NFS home (rename the user or change its UID?).
  • anaconda/jupyter dashboard are installed as this user; this may need to change.
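
One possible answer to the UID question, sketched under assumptions: the upstream docker-stacks start script can change the UID of the notebook user at start-up when the container is started as root with NB_UID set. Whether the images in this repository keep that start script, and which UID the NFS home expects, are both assumptions. `run_with_uid` is a hypothetical helper and 1500 is a made-up UID. The helper only prints the command.

```shell
# Print a docker command that starts the upstream image with a different UID
# for the notebook user. Nothing is executed.
# ASSUMPTION: the image still uses the docker-stacks start script, which
# honours NB_UID only when the container starts as root.
run_with_uid() {
  case "$1" in
    ''|*[!0-9]*) echo "UID must be a non-negative integer" >&2; return 1 ;;
  esac
  echo "docker container run -d --rm -p 8888:8888 --user root -e NB_UID=$1 jupyter/datascience-notebook"
}

# Example with a made-up UID.
run_with_uid 1500
```

This changes the UID at run time and leaves the image untouched, so files that the image build installed under UID 1000 would still need separate handling.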