Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: introduce libcugraph wheels #4804

Draft
wants to merge 38 commits into
base: branch-25.02
Choose a base branch
from

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Dec 5, 2024

Replaces #4340, contributes to rapidsai/build-planning#33.

Blocked by rapidsai/raft#2531

Proposes packaging libcugraph as a wheel, which is then re-used by cugraph-cu{11,12} and pylibcugraph-cu{11,12} wheels.

Notes for Reviewers

If you see this note, that means this is not ready for review.

Dependency Flows

---
title: Build & runtime dependencies
---
flowchart TD
    A[libcugraph] --> B[pylibcugraph]
    A --> C[cugraph]
    B --> C
Loading

Size changes (CUDA 12, Python 3.12, x86_64)

wheel num files (before) num files (after) size (before) size (this PR)
libcugraph --- --- --- ---
pylibcugraph 190 --- 900M ---
cugraph 314 --- 901M ---
TOTAL 504 --- 1.8G ---

NOTES: size = compressed, "before" = 2024-12-06 nightlies (5c8f850)

how I calculated those (click me)
docker run \
    --rm \
    -v $(pwd):/opt/work:ro \
    -w /opt/work \
    --network host \
    --env RAPIDS_NIGHTLY_DATE=2024-12-06 \
    --env RAPIDS_NIGHTLY_SHA=5c8f850 \
    --env RAPIDS_PR_NUMBER=4804 \
    --env RAPIDS_PY_CUDA_SUFFIX=cu12 \
    --env RAPIDS_REPOSITORY=rapidsai/cugraph \
    --env WHEEL_DIR_BEFORE=/tmp/wheels-before \
    --env WHEEL_DIR_AFTER=/tmp/wheels-after \
    -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.12 \
    bash

mkdir -p "${WHEEL_DIR_BEFORE}"
mkdir -p "${WHEEL_DIR_AFTER}"

cpp_projects=(
    libcugraph
)
py_projects=(
    cugraph
    pylibcugraph
)

for project in "${py_projects[@]}"; do
    # before
    RAPIDS_BUILD_TYPE=nightly \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="branch-25.02" \
    RAPIDS_SHA=${RAPIDS_NIGHTLY_SHA} \
        rapids-download-wheels-from-s3 python "${WHEEL_DIR_BEFORE}"

    # after
    #RAPIDS_BUILD_TYPE=pull-request \
    #RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    #RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
    #    rapids-download-wheels-from-s3 python "${WHEEL_DIR_AFTER}"
done

for project in "${cpp_projects[@]}"; do    
    # after
    RAPIDS_BUILD_TYPE=pull-request \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
        rapids-download-wheels-from-s3 cpp "${WHEEL_DIR_AFTER}"
done

pip install pydistcheck
pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ${WHEEL_DIR_BEFORE}/*.whl

pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ${WHEEL_DIR_AFTER}/*.whl

This comment was marked as resolved.

@jameslamb
Copy link
Member Author

/ok to test

@jameslamb
Copy link
Member Author

/ok to test

TARGETS raft
EXPORT cugraph-exports
FILE_SET raft_headers
DESTINATION ${CMAKE_INSTALL_PREFIX}
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, this is getting closer! But I need some CMake help. Here's what I'm trying to do in libcugraph wheels:

  • build static RAFT (libraft.a) and statically link it into libcugraph.so
  • do not install libraft.a in libcugraph wheels
  • do not built libraft.so at all (or at least, don't install it into libcugraph wheels)
  • install the RAFT headers into the libcugraph wheels, and export them
  • use libcugraph wheels in the build env for pylibcugraph and cugraph wheels, and have libcugraph's PUBLIC raft::raft dependency be served by the RAFT headers it vendors

I've got all the RAFT headers installing into include/raft/ in the wheel the way I want, thanks to the combination of:

But I can't figure out the exporting piece.

The current state of this branch results in stuff like this at configure time for libcugraph:

CMake Error: install(EXPORT "raft-compiled-exports" ...) includes target "raft_compiled" which requires target "raft" that is not in this export set, but in multiple other export sets: lib64/cmake/cugraph/cugraph-targets.cmake, lib64/cmake/raft/raft-targets.cmake.
An exported target cannot depend upon another target which is exported multiple times. Consider consolidating the exports of the "raft" target to a single export.

CMake Error: install(EXPORT "raft-compiled-exports" ...) includes target "raft_compiled_static" which requires target "raft" that is not in this export set, but in multiple other export sets: lib64/cmake/cugraph/cugraph-targets.cmake, lib64/cmake/raft/raft-targets.cmake.

CMake Error: install(EXPORT "raft-compiled-lib-exports" ...) includes target "raft_lib" which requires target "raft" that is not in this export set, but in multiple other export sets: lib64/cmake/cugraph/cugraph-targets.cmake, lib64/cmake/raft/raft-targets.cmake.

If I remove the EXPORT line, libcugraph builds successfully but then the pylibcugraph builds cannot find those RAFT headers (because no raft-config.cmake is written out):

-- Configuring done (3.9s)
CMake Error at /pyenv/versions/3.12.7/lib/python3.12/site-packages/libcugraph/lib64/cmake/cugraph/cugraph-targets.cmake:61 (set_target_properties):
The link interface of target "cugraph::cugraph" contains:
raft::raft
but the target was not found. Possible reasons include:

  • There is a typo in the target name.
  • A find_package call is missing for an IMPORTED target.
  • An ALIAS target is missing.
    Call Stack (most recent call first):
    /pyenv/versions/3.12.7/lib/python3.12/site-packages/libcugraph/lib64/cmake/cugraph/cugraph-config.cmake:73 (include)
    CMakeLists.txt:30 (find_package)

If I remove this install() and remove EXCLUDE_FROM_ALL TRUE from the rapids_cpm_find(raft) call, then the RAFT headers, libraft.a, and libraft.so are all packaged in the libcugraph wheel and exported successfully, such that pylibcugraph and cugraph wheel builds find them and succeed (without runtime issues, at least in the one test job I ran).

I think I'm just fundamentally misunderstanding some things about dependency tracking in CMake / rapids-cmake. I've tried various forms of:

  • rapids_export()
  • rapids_export_package()
  • install(EXPORT

And haven't yet found the right combination.

If what I'm trying to do is not possible or just undesirable for something other reason, then other approaches to try (some of which are from an offline discussion with @bdice and @vyasr) include:

  • omit EXCLUDE_FROM_ALL TRUE and just add a hack to the libcugraph CMakeLists.txt that manually deletes libraft.a and libraft.so
  • have pyblibcugraph and cugraph wheel builds re-download all the RAFT headers via CPM at build time
  • stop on this and first get libraft wheels set up, then use them at build time for libcugraph, pylibcugraph, and cugraph

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started exploring whether we can avoid needing to worry about this by just creating libraft wheels: rapidsai/raft#2531

We can come back to this thread if that doesn't work out.

@github-actions github-actions bot removed the cuGraph label Dec 17, 2024
@jameslamb
Copy link
Member Author

/ok to test

rapids-bot bot pushed a commit that referenced this pull request Dec 20, 2024
In #4804, I've started working on adding `libcugraph` wheels.

This includes a few fixes for things I noticed while doing that:

* removes `wget` from `test_notebook` environment
  - *our CI images already have this system-installed, and removing it helps to remove it from the environment solve for the unified RAPIDS devcontainers*
* `dependencies.yaml` re-organization:
  - breaks `librmm` dependency out into a `depends_on_librmm` to reduce duplication, and for consistency with other RAPIDS dependencies
  - uses `depends_on_pylibwholegraph` group everywhere instead of repeating `pylibwholegraph` in multiple places
  - removes unused YAML anchors
  - alphabetizes lists

## Notes for Reviewers

I'd originally wanted to also add a `librmm` wheel dependency for wheel builds here, but looks like doing that resulted in a lot more `sccache` misses, I guess because of building with build isolation. That change will happen in #4804 .

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Ralph Liu (https://github.com/nv-rliu)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #4805
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci CMake DO NOT MERGE Hold off on merging; see PR for details python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants