Add support for Singularity/Apptainer container #516

Merged 4 commits on Sep 28, 2023
24 changes: 22 additions & 2 deletions README.md
@@ -33,7 +33,7 @@


## Installation
You can install this package and dependencies in a [Python virtual environment](#virtual-environment) or use a [Docker container](#install-in-docker) (recommended).
You can install this package and dependencies in a [Python virtual environment](#virtual-environment) or use a [Docker/Singularity/Apptainer container](#install-in-docker) (recommended).

*TL;DR to install the Jax version for GPU run:*

@@ -89,7 +89,8 @@ pip3 install -e '.[full]'
</details>

## Docker
We recommend using a Docker container to ensure a similar environment to our scoring and testing environments.
Alternatively, a Singularity/Apptainer container can also be used (see instructions below).


**Prerequisites for NVIDIA GPU set up**: You may have to install the NVIDIA Container Toolkit so that the containers can locate the NVIDIA drivers and GPUs.
@@ -133,6 +134,25 @@ To use the Docker container as an interactive virtual environment, you can run a
### Running Docker Container (End-to-end)
To run a submission end-to-end in a containerized environment see [Getting Started Document](./getting_started.md#run-your-submission-in-a-docker-container).

### Using Singularity/Apptainer instead of Docker
Since many compute clusters don't allow the usage of Docker due to security concerns and instead encourage the use of [Singularity/Apptainer](https://github.com/apptainer/apptainer) (formerly Singularity, now called Apptainer), we also provide instructions on how to build an Apptainer container based on the Dockerfile provided here.

To convert the Dockerfile into an Apptainer definition file, we will use [spython](https://github.com/singularityhub/singularity-cli):
```bash
pip3 install spython
cd algorithmic-efficiency/docker
spython recipe Dockerfile &> Singularity.def
```
Now we can build the Apptainer image by running
```bash
singularity build --fakeroot <singularity_image_name>.sif Singularity.def
```
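On some clusters only the newer `apptainer` command is installed and the `singularity` name may be missing (Apptainer often provides it as a compatibility alias, but that is not guaranteed everywhere). The following is a minimal sketch, assuming the same subcommands and flags work under either name; the `runtime` variable and `my_image.sif` are illustrative names, not part of this repository:

```bash
# Pick whichever CLI is installed; prefer apptainer, fall back to singularity.
runtime=""
for candidate in apptainer singularity; do
  if command -v "$candidate" >/dev/null 2>&1; then
    runtime="$candidate"
    break
  fi
done

if [ -z "$runtime" ]; then
  echo "Neither apptainer nor singularity found on PATH" >&2
else
  # Print the build command instead of running it; my_image.sif is a placeholder.
  echo "$runtime build --fakeroot my_image.sif Singularity.def"
fi
```

Printing the command first makes it easy to check which CLI was picked before starting a potentially long build.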
To start a shell session with GPU support (by using the `--nv` flag), we can run
```bash
singularity shell --nv <singularity_image_name>.sif
```
Similar to Docker, Apptainer lets you bind specific paths on the host system into the container with the `--bind` flag, as explained [here](https://docs.sylabs.io/guides/3.7/user-guide/bind_paths_and_mounts.html).
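As a rough illustration of the `--bind` flag combined with `--nv`, the sketch below assembles a shell command that mounts a host data directory into the container. The host path, the `/data` mount point, and the image name are all hypothetical placeholders, so adjust them to your cluster:

```bash
# Hypothetical paths and image name; replace with your own.
host_data_dir="/path/on/host/data"   # directory on the host (assumption)
container_data_dir="/data"           # where it appears inside the container (assumption)
image="my_image.sif"                 # placeholder for the image built above

# --bind expects src:dest
bind_spec="${host_data_dir}:${container_data_dir}"
cmd=(singularity shell --nv --bind "$bind_spec" "$image")

# Print the command for inspection; run it with "${cmd[@]}" once the paths exist.
printf '%s\n' "${cmd[*]}"
```

Keeping the command in an array avoids quoting problems if a path contains spaces.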

# Getting Started
For instructions on developing and scoring your own algorithm in the benchmark see [Getting Started Document](./getting_started.md).
## Running a workload
23 changes: 11 additions & 12 deletions docker/Dockerfile
@@ -6,13 +6,12 @@

# To build Docker image
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
ARG DEBIAN_FRONTEND=noninteractive

# Installing machine packages
RUN echo "Setting up machine"
RUN apt-get update
RUN apt-get install -y curl tar
RUN apt-get install -y git python3 pip wget ffmpeg
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y git python3 pip wget ffmpeg
RUN apt-get install libtcmalloc-minimal4
RUN apt-get install unzip
RUN apt-get install pigz
@@ -34,38 +33,38 @@ RUN echo "Setting up algorithmic_efficiency repo"
ARG branch="main"
ARG framework="both"
ARG git_url=https://github.com/mlcommons/algorithmic-efficiency.git
RUN git clone $git_url && cd algorithmic-efficiency
RUN cd algorithmic-efficiency && git checkout $branch
RUN git clone $git_url && cd /algorithmic-efficiency
RUN cd /algorithmic-efficiency && git checkout $branch

RUN cd algorithmic-efficiency && pip install -e '.[full]'
RUN cd /algorithmic-efficiency && pip install -e '.[full]'

RUN if [ "$framework" = "jax" ] ; then \
echo "Installing Jax GPU" \
&& cd algorithmic-efficiency \
&& cd /algorithmic-efficiency \
&& pip install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html' \
&& pip install -e '.[pytorch_cpu]' -f 'https://download.pytorch.org/whl/torch_stable.html'; \
elif [ "$framework" = "pytorch" ] ; then \
echo "Installing Pytorch GPU" \
&& cd algorithmic-efficiency \
&& cd /algorithmic-efficiency \
&& pip install -e '.[jax_cpu]' \
&& pip install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/torch_stable.html'; \
elif [ "$framework" = "both" ] ; then \
echo "Installing Jax GPU and Pytorch GPU" \
&& cd algorithmic-efficiency \
&& cd /algorithmic-efficiency \
&& pip install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html' \
&& pip install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/torch_stable.html'; \
else \
echo "Invalid build-arg $framework: framework should be either jax, pytorch or both." >&2 \
&& exit 1 ; \
fi

RUN cd algorithmic-efficiency && pip install -e '.[wandb]'
RUN cd /algorithmic-efficiency && pip install -e '.[wandb]'

RUN cd algorithmic-efficiency && git fetch origin
RUN cd algorithmic-efficiency && git pull
RUN cd /algorithmic-efficiency && git fetch origin
RUN cd /algorithmic-efficiency && git pull

# Todo: remove this, this is temporary for developing
COPY scripts/startup.sh /algorithmic-efficiency/docker/scripts/startup.sh
RUN chmod a+x /algorithmic-efficiency/docker/scripts/startup.sh

ENTRYPOINT ["bash", "algorithmic-efficiency/docker/scripts/startup.sh"]
ENTRYPOINT ["bash", "/algorithmic-efficiency/docker/scripts/startup.sh"]