TEST: use Github based GPU instance for CI #183

Closed · wants to merge 22 commits

83 changes: 43 additions & 40 deletions .github/workflows/ci.yml
@@ -1,35 +1,38 @@
name: Build Project [using jupyter-book]
on: [pull_request]
jobs:
  deploy-runner:
    runs-on: ubuntu-latest
    steps:
      - uses: iterative/setup-cml@v2
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Deploy runner on EC2
        env:
          REPO_TOKEN: ${{ secrets.QUANTECON_SERVICES_PAT }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          cml runner launch \
            --cloud=aws \
            --cloud-region=us-west-2 \
            --cloud-type=p3.2xlarge \
            --labels=cml-gpu \
            --cloud-hdd-size=40
  preview:
    needs: deploy-runner
    runs-on: [self-hosted, cml-gpu]
    container:
      image: docker://mmcky/quantecon-lecture-python:cuda-12.3.1-anaconda-2024-02-py311
      options: --gpus all
    runs-on: ubuntu-latest-gpu
    env:
      DOCKER_IMG_NAME: mmcky/quantecon-lecture-python:cuda-12.3.1-anaconda-2024-02-py311
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Restore Docker Image Cache if it exists
        id: cache-docker-mmcky
        uses: actions/cache@v4
        with:
          path: ci/cache/docker/cuda
          key: cache-docker-${{ env.DOCKER_IMG_NAME }}

      - name: Update Docker Image Cache if cache miss
        if: steps.cache-docker-mmcky.outputs.cache-hit != 'true'
        run: |
          docker pull $DOCKER_IMG_NAME
          mkdir -p ci/cache/docker/cuda
          docker image save $DOCKER_IMG_NAME --output ./ci/cache/docker/cuda/docker-img.tar

      - name: Use Docker Image Cache if cache hit
        if: steps.cache-docker-mmcky.outputs.cache-hit == 'true'
        run: docker image load --input ./ci/cache/docker/cuda/docker-img.tar

      # - name: Install JAX[CUDA] and Numpyro[CUDA]
      #   shell: bash -l {0}
      #   run: |
      #     pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
      #     pip install --upgrade "numpyro[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

      - name: Check nvidia drivers
        shell: bash -l {0}
        run: |
@@ -38,9 +41,9 @@ jobs:
        shell: bash -l {0}
        run: |
          python --version
      - name: Display Conda Environment Versions
        shell: bash -l {0}
        run: conda list
      # - name: Display Conda Environment Versions
      #   shell: bash -l {0}
      #   run: conda list
      - name: Display Pip Versions
        shell: bash -l {0}
        run: pip list
@@ -52,18 +55,18 @@ jobs:
          name: build-cache
          path: _build
      # Build Assets (Download Notebooks and PDF via LaTeX)
      - name: Build Download Notebooks (sphinx-tojupyter)
        shell: bash -l {0}
        run: |
          jb build lectures --path-output ./ --builder=custom --custom-builder=jupyter -n -W --keep-going
          mkdir -p _build/html/_notebooks
          cp -u _build/jupyter/*.ipynb _build/html/_notebooks
      - name: Build PDF from LaTeX
        shell: bash -l {0}
        run: |
          jb build lectures --builder pdflatex --path-output ./ -n -W --keep-going
          mkdir _build/html/_pdf
          cp -u _build/latex/*.pdf _build/html/_pdf
      # - name: Build Download Notebooks (sphinx-tojupyter)
      #   shell: bash -l {0}
      #   run: |
      #     jb build lectures --path-output ./ --builder=custom --custom-builder=jupyter -n -W --keep-going
      #     mkdir -p _build/html/_notebooks
      #     cp -u _build/jupyter/*.ipynb _build/html/_notebooks
      # - name: Build PDF from LaTeX
      #   shell: bash -l {0}
      #   run: |
      #     jb build lectures --builder pdflatex --path-output ./ -n -W --keep-going
      #     mkdir _build/html/_pdf
      #     cp -u _build/latex/*.pdf _build/html/_pdf
      # Final Build of HTML
      - name: Build HTML
        shell: bash -l {0}
@@ -90,4 +93,4 @@ jobs:
          deploy-message: "Preview Deploy from GitHub Actions"
        env:
          NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
          NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}
          NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}
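
Note on the caching steps above: the workflow pulls the Docker image once, serializes it to a tarball that `actions/cache` can store under `ci/cache/docker/cuda`, and on later runs loads the tarball instead of pulling from the registry again. A minimal local sketch of the same pattern (the image tag and cache path are taken from the workflow; `CACHE_DIR` is just a local shorthand):

```bash
#!/bin/bash
# Sketch of the image-caching pattern used in the workflow above.
DOCKER_IMG_NAME="mmcky/quantecon-lecture-python:cuda-12.3.1-anaconda-2024-02-py311"
CACHE_DIR="ci/cache/docker/cuda"

if [ ! -f "$CACHE_DIR/docker-img.tar" ]; then
    # Cache miss: pull once and save the image so it can be cached as a file
    docker pull "$DOCKER_IMG_NAME"
    mkdir -p "$CACHE_DIR"
    docker image save "$DOCKER_IMG_NAME" --output "$CACHE_DIR/docker-img.tar"
else
    # Cache hit: restore the image without contacting the registry
    docker image load --input "$CACHE_DIR/docker-img.tar"
fi
```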
23 changes: 11 additions & 12 deletions environment.yml
@@ -1,27 +1,26 @@
name: lecture-jax
name: quantecon
channels:
  - default
dependencies:
  - python=3.11
  - anaconda=2023.09
  - anaconda=2024.02
  - pip
  - pip:
    - jupyter-book==0.15.1
    - docutils==0.17.1
    - quantecon-book-theme==0.7.1
    - sphinx-tojupyter==0.3.0
    - sphinxext-rediraffe==0.2.7
    - sphinx-reredirects==0.1.3
    - sphinx-exercise==0.4.1
    - ghp-import==1.1.0
    - sphinxcontrib-youtube==1.1.0
    - sphinx-togglebutton==0.3.1
    - arviz==0.13.0
    - array-to-latex
    - prettytable
    - kaleido
    # Sandpit Requirements
    # - quantecon
    # - array-to-latex
    # - PuLP
    # - cvxpy
    # - cvxopt
    # - cylp
    # - prettytable
    - arviz
    # Docker Requirements
    - pytz
    # Docutils Issue (https://github.com/mcmtroffaes/sphinxcontrib-bibtex/issues/322)
    - docutils==0.17.1
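
Usage note (not part of the diff): with the rename from `lecture-jax` to `quantecon`, the environment would typically be rebuilt as follows — a sketch that assumes a working conda install:

```bash
# Recreate and activate the renamed environment (name taken from environment.yml above)
conda env create -f environment.yml
conda activate quantecon

# Quick check that the pinned build tooling is present
pip list | grep -i -E "jupyter-book|docutils"
```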

5 changes: 2 additions & 3 deletions lectures/status.md
@@ -18,9 +18,8 @@ This table contains the latest execution statistics.

(status:machine-details)=

These lectures are built on `linux` instances through `github actions` and `amazon web services (aws)` to
enable access to a `gpu`. These lectures are built on a [p3.2xlarge](https://aws.amazon.com/ec2/instance-types/p3/)
that has access to `8 vcpu's`, a `V100 NVIDIA Tesla GPU`, and `61 Gb` of memory.
These lectures are built on `linux` instances through `github actions` that have
access to a `gpu`. These lectures make use of an NVIDIA `T4` card.

You can check the backend used by JAX using:

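The check itself is truncated in this diff view; a typical way to ask JAX which backend and devices it sees (a sketch, not necessarily the exact code used in the lecture) is:

```bash
# Ask JAX which backend it is using (assumes jax is installed in the active environment)
python -c "import jax; print(jax.default_backend()); print(jax.devices())"
```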
51 changes: 51 additions & 0 deletions setup_cuda.sh
@@ -0,0 +1,51 @@
#!/bin/bash

# Exit script on any error
set -e

# Define variables
NV_CUDA_LIB_VERSION="12.4.1-1"
NV_CUDA_CUDART_DEV_VERSION="12.4.127-1"
NV_NVML_DEV_VERSION="12.4.127-1"
NV_LIBCUSPARSE_DEV_VERSION="12.3.1.170-1"
NV_LIBNPP_DEV_VERSION="12.2.5.30-1"
NV_LIBNPP_DEV_PACKAGE="libnpp-dev-12-4=${NV_LIBNPP_DEV_VERSION}"
NV_LIBCUBLAS_DEV_PACKAGE_NAME="libcublas-dev-12-4"
NV_LIBCUBLAS_DEV_VERSION="12.4.5.8-1"
NV_LIBCUBLAS_DEV_PACKAGE="${NV_LIBCUBLAS_DEV_PACKAGE_NAME}=${NV_LIBCUBLAS_DEV_VERSION}"
NV_CUDA_NSIGHT_COMPUTE_VERSION="12.4.1-1"
NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE="cuda-nsight-compute-12-4=${NV_CUDA_NSIGHT_COMPUTE_VERSION}"
NV_NVPROF_VERSION="12.4.127-1"
NV_NVPROF_DEV_PACKAGE="cuda-nvprof-12-4=${NV_NVPROF_VERSION}"
NV_LIBNCCL_DEV_PACKAGE_NAME="libnccl-dev"
NV_LIBNCCL_DEV_PACKAGE_VERSION="2.21.5-1"
NCCL_VERSION="2.21.5-1"
NV_LIBNCCL_DEV_PACKAGE="${NV_LIBNCCL_DEV_PACKAGE_NAME}=${NV_LIBNCCL_DEV_PACKAGE_VERSION}+cuda12.4"

# Update package lists
sudo apt-get update

# Install CUDA development packages
sudo apt-get install -y --no-install-recommends \
    cuda-cudart-dev-12-4=${NV_CUDA_CUDART_DEV_VERSION} \
    cuda-command-line-tools-12-4=${NV_CUDA_LIB_VERSION} \
    cuda-minimal-build-12-4=${NV_CUDA_LIB_VERSION} \
    cuda-libraries-dev-12-4=${NV_CUDA_LIB_VERSION} \
    cuda-nvml-dev-12-4=${NV_NVML_DEV_VERSION} \
    ${NV_NVPROF_DEV_PACKAGE} \
    ${NV_LIBNPP_DEV_PACKAGE} \
    libcusparse-dev-12-4=${NV_LIBCUSPARSE_DEV_VERSION} \
    ${NV_LIBCUBLAS_DEV_PACKAGE} \
    ${NV_LIBNCCL_DEV_PACKAGE} \
    ${NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE}

# Clean up
sudo rm -rf /var/lib/apt/lists/*

# Prevent auto-upgrade of specific packages
sudo apt-mark hold ${NV_LIBCUBLAS_DEV_PACKAGE_NAME} ${NV_LIBNCCL_DEV_PACKAGE_NAME}

# Set environment variable
export LIBRARY_PATH=/usr/local/cuda/lib64/stubs

echo "CUDA development environment setup is complete."