add PR CI for cugraph-pyg and cugraph-dgl #59

Merged Oct 30, 2024 (20 commits)
Changes from 18 commits
Commits
74b5c95
add PR CI for cugraph-pyg and cugraph-dgl
jameslamb Oct 21, 2024
8b80338
remove some duplication in dependencies.yaml
jameslamb Oct 21, 2024
fb5e9e0
fix env for conda builds, stopping have builds depend on tests
jameslamb Oct 21, 2024
84fccbc
fix wheel metadata
jameslamb Oct 21, 2024
74ecbe5
cugraph-pyg and cugraph-dgl have runtime dependencies on pandas... th…
jameslamb Oct 21, 2024
2ed9905
remove test_wheel.sh script
jameslamb Oct 21, 2024
802e46c
declare torch testing dependency for cugraph-dgl, combine pip install…
jameslamb Oct 21, 2024
4033210
use pytorch_geometric instead of pyg
jameslamb Oct 21, 2024
fad3e8b
the pip package is called 'torch-geometric' not 'pytorch_geometric'
jameslamb Oct 21, 2024
887e32f
update dgl doc, include karata.csv
jameslamb Oct 21, 2024
4344fd0
update to pytorch and dgl pins used in cugraph repo
jameslamb Oct 21, 2024
b86bdd3
update files
alexbarghi-nv Oct 22, 2024
7897f2f
update view.py
alexbarghi-nv Oct 22, 2024
2c0e654
Merge branch 'branch-24.12' into add-pyg-and-dgl-ci
jameslamb Oct 28, 2024
e718840
set RAPIDS_DATASET_ROOT_DIR
jameslamb Oct 28, 2024
7442922
re-enable all CI, remove unintentionally-included cufile.log
jameslamb Oct 28, 2024
6cc3dec
print sccache stats in builds
jameslamb Oct 29, 2024
a421fc9
Merge branch 'add-pyg-and-dgl-ci' of github.com:jameslamb/cugraph-gnn…
jameslamb Oct 29, 2024
1202c36
remove unnecessary channel for builds, use CI instead of CI_RUN, cons…
jameslamb Oct 29, 2024
e6c07fa
more dataset to datasets
jameslamb Oct 29, 2024
36 changes: 36 additions & 0 deletions .github/workflows/pr.yaml
@@ -20,6 +20,10 @@ jobs:
- conda-python-tests
- wheel-build-pylibwholegraph
- wheel-tests-pylibwholegraph
- wheel-build-cugraph-dgl
- wheel-tests-cugraph-dgl
- wheel-build-cugraph-pyg
- wheel-tests-cugraph-pyg
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
if: always()
@@ -104,3 +108,35 @@ jobs:
build_type: pull-request
script: ci/test_wheel_pylibwholegraph.sh
matrix_filter: map(select(.ARCH == "amd64"))
wheel-build-cugraph-dgl:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
with:
build_type: pull-request
script: ci/build_wheel_cugraph-dgl.sh
wheel-tests-cugraph-dgl:
needs: [wheel-build-pylibwholegraph, wheel-build-cugraph-dgl, changed-files]
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_python
with:
build_type: pull-request
script: ci/test_wheel_cugraph-dgl.sh
matrix_filter: map(select(.ARCH == "amd64"))
wheel-build-cugraph-pyg:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
with:
build_type: pull-request
script: ci/build_wheel_cugraph-pyg.sh
wheel-tests-cugraph-pyg:
needs: [wheel-build-pylibwholegraph, wheel-build-cugraph-pyg, changed-files]
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/[email protected]
if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_python
with:
build_type: pull-request
script: ci/test_wheel_cugraph-pyg.sh
matrix_filter: map(select(.ARCH == "amd64"))
4 changes: 3 additions & 1 deletion .gitignore
@@ -70,9 +70,11 @@ cpp/thirdparty/googletest/
*.iws

## Datasets
datasets/*
dataset/
datasets/
Contributor:
Do we really need both of these paths? Why?

Member Author:
Good point, there's an opportunity to simplify here.

I added dataset/ because I saw files created at python/cugraph-pyg/cugraph_pyg/dataset/ when running the cugraph-pyg examples locally.

Looks like that comes from here: https://github.com/snap-stanford/ogb/blob/f631af76359c9687b2fe60905557bbb241916258/ogb/nodeproppred/dataset.py#L13

I just pushed changes writing all those datasets to datasets/ instead, so we don't need to have both rules in gitignore.

!datasets/cyber.csv
!datasets/get_test_data.sh
!datasets/karate.csv
!datasets/karate-data.csv
!datasets/karate_undirected.csv
!datasets/karate-disjoint.csv
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -54,7 +54,7 @@ repos:
setup[.]cfg$
- id: verify-alpha-spec
- repo: https://github.com/rapidsai/dependency-file-generator
rev: v1.15.1
rev: v1.16.0
hooks:
- id: rapids-dependency-file-generator
args: ["--clean"]
4 changes: 4 additions & 0 deletions ci/build_cpp.sh
@@ -17,6 +17,10 @@ version=$(rapids-generate-version)

rapids-logger "Begin cpp build"

sccache --zero-stats

RAPIDS_PACKAGE_VERSION=${version} rapids-conda-retry mambabuild conda/recipes/libwholegraph

sccache --show-adv-stats

rapids-upload-conda-to-s3 cpp
16 changes: 16 additions & 0 deletions ci/build_python.sh
@@ -17,6 +17,8 @@ CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)

rapids-generate-version > ./VERSION

sccache --zero-stats

# TODO: Remove `--no-test` flags once importing on a CPU
# node works correctly
rapids-logger "Begin pylibwholegraph build"
@@ -25,4 +27,18 @@ RAPIDS_PACKAGE_VERSION=$(head -1 ./VERSION) rapids-conda-retry mambabuild \
--channel "${CPP_CHANNEL}" \
conda/recipes/pylibwholegraph

sccache --show-adv-stats

RAPIDS_PACKAGE_VERSION=$(head -1 ./VERSION) rapids-conda-retry mambabuild \
--no-test \
--channel "${CPP_CHANNEL}" \
--channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
Contributor:
Do we need ${RAPIDS_CONDA_BLD_OUTPUT_DIR} here? What packages is it supplying? I don't see pylibwholegraph in the dependencies of cugraph-pyg or cugraph-dgl.

Same for cugraph-dgl below.

Member Author:
You're totally right. pylibwholegraph is a testing dependency of both cugraph-pyg and cugraph-dgl, but not a build-time dependency, so --channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" is unnecessary. I just pushed 1202c36 removing it.

conda/recipes/cugraph-pyg

RAPIDS_PACKAGE_VERSION=$(head -1 ./VERSION) rapids-conda-retry mambabuild \
--no-test \
--channel "${CPP_CHANNEL}" \
--channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
conda/recipes/cugraph-dgl

rapids-upload-conda-to-s3 python
2 changes: 2 additions & 0 deletions ci/build_wheel.sh
@@ -32,6 +32,8 @@ RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

cd "${package_dir}"

sccache --zero-stats

rapids-logger "Building '${package_name}' wheel"
python -m pip wheel \
-w dist \
2 changes: 1 addition & 1 deletion ci/run_cugraph_dgl_pytests.sh
@@ -4,6 +4,6 @@
set -euo pipefail

# Support invoking run_cugraph_dgl_pytests.sh outside the script directory
cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../python/cugraph-dgl/tests
cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../python/cugraph-dgl/cugraph_dgl

pytest --cache-clear --ignore=mg "$@" .
48 changes: 19 additions & 29 deletions ci/test_wheel_cugraph-dgl.sh
@@ -4,24 +4,15 @@
set -eoxu pipefail

package_name="cugraph-dgl"
package_dir="python/cugraph-dgl"

python_package_name=$(echo ${package_name}|sed 's/-/_/g')

mkdir -p ./dist
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

# Download wheels built during this job.
RAPIDS_PY_WHEEL_NAME="pylibcugraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-deps
RAPIDS_PY_WHEEL_NAME="cugraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-deps
python -m pip install ./local-deps/*.whl

# use 'ls' to expand wildcard before adding `[extra]` requires for pip
# Download the pylibwholegraph and cugraph-dgl built in the previous step
RAPIDS_PY_WHEEL_NAME="pylibwholegraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-deps
RAPIDS_PY_WHEEL_NAME="${package_name}_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE="1" rapids-download-wheels-from-s3 ./dist
# pip creates wheels using python package names
python -m pip install $(ls ./dist/${python_package_name}*.whl)[test]


# determine pytorch and DGL sources
PKG_CUDA_VER="$(echo ${CUDA_VERSION} | cut -d '.' -f1,2 | tr -d '.')"
PKG_CUDA_VER_MAJOR=${PKG_CUDA_VER:0:2}
if [[ "${PKG_CUDA_VER_MAJOR}" == "12" ]]; then
@@ -30,20 +21,19 @@ else
PYTORCH_CUDA_VER=$PKG_CUDA_VER
fi
PYTORCH_URL="https://download.pytorch.org/whl/cu${PYTORCH_CUDA_VER}"
DGL_URL="https://data.dgl.ai/wheels/cu${PYTORCH_CUDA_VER}/repo.html"

# Starting from 2.2, PyTorch wheels depend on nvidia-nccl-cuxx>=2.19 wheel and
# dynamically link to NCCL. RAPIDS CUDA 11 CI images have an older NCCL version that
# might shadow the newer NCCL required by PyTorch during import (when importing
# `cupy` before `torch`).
if [[ "${NCCL_VERSION}" < "2.19" ]]; then
PYTORCH_VER="2.1.0"
else
PYTORCH_VER="2.3.0"
fi

rapids-logger "Installing PyTorch and DGL"
rapids-retry python -m pip install "torch==${PYTORCH_VER}" --index-url ${PYTORCH_URL}
rapids-retry python -m pip install dgl==2.0.0 --find-links ${DGL_URL}

python -m pytest python/cugraph-dgl/tests
DGL_URL="https://data.dgl.ai/wheels/torch-2.3/cu${PYTORCH_CUDA_VER}/repo.html"

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
-v \
--extra-index-url "${PYTORCH_URL}" \
--find-links "${DGL_URL}" \
"$(echo ./local-deps/pylibwholegraph_${RAPIDS_PY_CUDA_SUFFIX}*.whl)" \
"$(echo ./dist/cugraph_dgl_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
'dgl==2.4.0' \
'torch>=2.0,<2.4.0a0'

# RAPIDS_DATASET_ROOT_DIR is used by test scripts
export RAPIDS_DATASET_ROOT_DIR="$(realpath datasets)"

python -m pytest python/cugraph-dgl/cugraph_dgl/tests
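Note on the `$(echo ...)[test]` idiom in the install command above: the following is a minimal sketch of why the wildcard is expanded before the extras suffix is appended. The temporary directory and the wheel file name are made up for illustration; only the quoting pattern mirrors the script.

```shell
# Illustrative setup: a throwaway directory containing one fake wheel.
# The file name is hypothetical, not a real build artifact.
workdir="$(mktemp -d)"
touch "${workdir}/cugraph_dgl_cu12-0.0.0-py3-none-any.whl"

# Glob written directly against "[test]": the shell reads "[test]" as a
# bracket expression, nothing matches, and the pattern is left unexpanded.
unexpanded="$(printf '%s' "${workdir}"/cugraph_dgl_*.whl[test])"

# Glob expanded first inside $(echo ...), extras suffix appended afterwards.
expanded="$(echo "${workdir}"/cugraph_dgl_*.whl)[test]"

echo "unexpanded: ${unexpanded}"
echo "expanded:   ${expanded}"
```

With the second form, pip receives a real wheel path followed by `[test]`, which is what lets it install the wheel together with its `test` extra.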
jameslamb marked this conversation as resolved.
48 changes: 19 additions & 29 deletions ci/test_wheel_cugraph-pyg.sh
@@ -4,46 +4,36 @@
set -eoxu pipefail

package_name="cugraph-pyg"
package_dir="python/cugraph-pyg"

python_package_name=$(echo ${package_name}|sed 's/-/_/g')

mkdir -p ./dist
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

# Download wheels built during this job.
RAPIDS_PY_WHEEL_NAME="pylibcugraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-deps
RAPIDS_PY_WHEEL_NAME="cugraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-deps
python -m pip install ./local-deps/*.whl

# use 'ls' to expand wildcard before adding `[extra]` requires for pip
# Download the pylibwholegraph and cugraph-pyg built in the previous step
RAPIDS_PY_WHEEL_NAME="pylibwholegraph_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-deps
RAPIDS_PY_WHEEL_NAME="${package_name}_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE="1" rapids-download-wheels-from-s3 ./dist
# pip creates wheels using python package names
python -m pip install $(ls ./dist/${python_package_name}*.whl)[test]

# RAPIDS_DATASET_ROOT_DIR is used by test scripts
export RAPIDS_DATASET_ROOT_DIR="$(realpath datasets)"

# Used to skip certain examples in CI due to memory limitations
export CI_RUN=1

# determine pytorch and pyg sources
if [[ "${CUDA_VERSION}" == "11.8.0" ]]; then
PYTORCH_URL="https://download.pytorch.org/whl/cu118"
PYG_URL="https://data.pyg.org/whl/torch-2.1.0+cu118.html"
PYG_URL="https://data.pyg.org/whl/torch-2.3.0+cu118.html"
else
PYTORCH_URL="https://download.pytorch.org/whl/cu121"
PYG_URL="https://data.pyg.org/whl/torch-2.1.0+cu121.html"
PYG_URL="https://data.pyg.org/whl/torch-2.3.0+cu121.html"
fi
rapids-logger "Installing PyTorch and PyG dependencies"
rapids-retry python -m pip install torch==2.1.0 --index-url ${PYTORCH_URL}
rapids-retry python -m pip install "torch-geometric>=2.5,<2.6"
rapids-retry python -m pip install \
ogb \
pyg_lib \
torch_scatter \
torch_sparse \
tensordict \
-f ${PYG_URL}

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
-v \
--extra-index-url "${PYTORCH_URL}" \
--find-links "${PYG_URL}" \
"$(echo ./local-deps/pylibwholegraph_${RAPIDS_PY_CUDA_SUFFIX}*.whl)" \
"$(echo ./dist/cugraph_pyg_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]"

# RAPIDS_DATASET_ROOT_DIR is used by test scripts
export RAPIDS_DATASET_ROOT_DIR="$(realpath datasets)"

# Used to skip certain examples in CI due to memory limitations
export CI_RUN=1
Contributor:
Typically the environment variable CI is set to true in CI contexts. https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/store-information-in-variables#default-environment-variables

We could use that to avoid introducing a separate variable.

Member Author:
Yeah great point! Looks to me like this was carried over from cugraph, but there it's only used for the narrow purpose of cugraph-pyg, so we can just simplify it here.

Did that in 1202c36
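
A minimal sketch of the suggestion above, assuming the `CI` variable is set to `true` as GitHub Actions does by default. The function name `running_in_ci` is hypothetical and is not taken from the cugraph-pyg examples; it only illustrates gating memory-heavy work on `CI` rather than on a custom `CI_RUN` variable.

```shell
# Hypothetical helper (not from the repository): succeeds when the standard
# CI environment variable is set to "true", and fails otherwise.
running_in_ci() {
    [ "${CI:-false}" = "true" ]
}

if running_in_ci; then
    echo "CI detected: skipping memory-intensive example"
else
    echo "not in CI: running all examples"
fi
```

The `${CI:-false}` default keeps the check safe under `set -u` when the variable is unset, as it would be on a developer machine.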


rapids-logger "pytest cugraph-pyg (single GPU)"
pushd python/cugraph-pyg/cugraph_pyg
9 changes: 2 additions & 7 deletions ci/test_wheel_pylibwholegraph.sh
@@ -25,13 +25,8 @@ mkdir -p "${RAPIDS_TESTS_DIR}" "${RAPIDS_COVERAGE_DIR}"
rapids-logger "Installing Packages"
rapids-retry python -m pip install \
--extra-index-url ${INDEX_URL} \
"$(echo ./dist/pylibwholegraph*.whl)[test]"

# install torch separately, to be sure we get a CUDA build
python -m pip install \
--index-url "${INDEX_URL}" \
-v \
'torch>=2.0,<2.4.0a0'
"$(echo ./dist/pylibwholegraph*.whl)[test]" \
'torch>=2.0,<2.4.0a0'

rapids-logger "pytest pylibwholegraph"
cd python/pylibwholegraph/pylibwholegraph/tests
10 changes: 6 additions & 4 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -4,7 +4,6 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- pyg
- conda-forge
- nvidia
dependencies:
@@ -18,7 +17,7 @@ dependencies:
- cupy>=12.0.0
- cython>=3.0.0
- dask-cudf==24.12.*,>=0.0.0a0
- dglteam/label/th21_cu118::dgl
- dglteam/label/th23_cu118::dgl>=2.4.0.th23.cu*
- doxygen
- graphviz
- ipython
@@ -33,7 +32,6 @@ dependencies:
- pre-commit
- pydantic
- pydata-sphinx-theme
- pyg::pyg
- pylibcugraphops==24.12.*,>=0.0.0a0
- pylibraft==24.12.*,>=0.0.0a0
- pytest
@@ -42,16 +40,20 @@ dependencies:
- pytest-forked
- pytest-xdist
- pytorch-cuda=11.8
- pytorch::pytorch>=2.0,<2.4.0a0
- pytorch::pytorch>=2.3,<2.4.0a0
- pytorch_geometric>=2.5,<2.6
- raft-dask==24.12.*,>=0.0.0a0
- rapids-build-backend>=0.3.0,<0.4.0.dev0
- recommonmark
- rmm==24.12.*,>=0.0.0a0
- scikit-build-core>=0.10.0
- scipy
- setuptools>=61.0.0
- sphinx-copybutton
- sphinx-markdown-tables
- sphinx<6
- sphinxcontrib-websupport
- torchdata
- wget
- wheel
name: all_cuda-118_arch-x86_64
10 changes: 6 additions & 4 deletions conda/environments/all_cuda-121_arch-x86_64.yaml
@@ -4,7 +4,6 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- pyg
- conda-forge
- nvidia
dependencies:
@@ -19,7 +18,7 @@ dependencies:
- cupy>=12.0.0
- cython>=3.0.0
- dask-cudf==24.12.*,>=0.0.0a0
- dglteam/label/th21_cu121::dgl
- dglteam/label/th23_cu121::dgl>=2.4.0.th23.cu*
- doxygen
- graphviz
- ipython
@@ -38,7 +37,6 @@ dependencies:
- pre-commit
- pydantic
- pydata-sphinx-theme
- pyg::pyg
- pylibcugraphops==24.12.*,>=0.0.0a0
- pylibraft==24.12.*,>=0.0.0a0
- pytest
@@ -47,16 +45,20 @@ dependencies:
- pytest-forked
- pytest-xdist
- pytorch-cuda=12.1
- pytorch::pytorch>=2.0,<2.4.0a0
- pytorch::pytorch>=2.3,<2.4.0a0
- pytorch_geometric>=2.5,<2.6
- raft-dask==24.12.*,>=0.0.0a0
- rapids-build-backend>=0.3.0,<0.4.0.dev0
- recommonmark
- rmm==24.12.*,>=0.0.0a0
- scikit-build-core>=0.10.0
- scipy
- setuptools>=61.0.0
- sphinx-copybutton
- sphinx-markdown-tables
- sphinx<6
- sphinxcontrib-websupport
- torchdata
- wget
- wheel
name: all_cuda-121_arch-x86_64