Support CUDA 12.6. #286

Merged
merged 1 commit into from
Sep 5, 2024
paulstansifer
Contributor

I updated build.rs and Cargo.toml and ran the bindgen script.

Local testing mostly passed; all of the failures report the same error, which probably indicates that I still haven't set everything up correctly on my machine.


---- driver::safe::launch::tests::test_launch_with_views stdout ----
thread 'driver::safe::launch::tests::test_launch_with_views' panicked at src/lib.rs:98:5:
Unable to dynamically load the "cuda" shared library - searched for library names: ["cuda", "nvcuda"]. Ensure that `LD_LIBRARY_PATH` has the correct path to the installed library. If the shared library is present on the system under a different name than one of those listed above, please open a GitHub issue.

---- driver::safe::launch::tests::test_par_launch stdout ----
thread 'driver::safe::launch::tests::test_par_launch' panicked at src/lib.rs:98:5:
Unable to dynamically load the "cuda" shared library - searched for library names: ["cuda", "nvcuda"]. Ensure that `LD_LIBRARY_PATH` has the correct path to the installed library. If the shared library is present on the system under a different name than one of those listed above, please open a GitHub issue.

failures:
    cublas::safe::tests::test_dgemm
    cublas::safe::tests::test_dgemv
    cublas::safe::tests::test_sgemm
    cublas::safe::tests::test_sgemv
    cublaslt::safe::tests::test_matmul_f32
    cudnn::safe::tests::test_conv1d
    cudnn::safe::tests::test_conv2d_pick_algorithms
    cudnn::safe::tests::test_conv3d
    cudnn::safe::tests::test_create_descriptors
    cudnn::safe::tests::test_reduction
    curand::safe::tests::test_different_seeds_neq
    curand::safe::tests::test_log_normal_f32
    curand::safe::tests::test_log_normal_f64
    curand::safe::tests::test_normal_f32
    curand::safe::tests::test_normal_f64
    curand::safe::tests::test_rc_counts
    curand::safe::tests::test_seed_reproducible
    curand::safe::tests::test_set_offset
    curand::safe::tests::test_uniform_f32
    curand::safe::tests::test_uniform_f64
    curand::safe::tests::test_uniform_u32
    driver::safe::alloc::tests::test_copy_uses_correct_context
    driver::safe::alloc::tests::test_device_copy_to_views
    driver::safe::alloc::tests::test_leak_and_upgrade
    driver::safe::alloc::tests::test_post_alloc_arc_counts
    driver::safe::alloc::tests::test_post_build_arc_count
    driver::safe::alloc::tests::test_post_clone_arc_slice_counts
    driver::safe::alloc::tests::test_post_clone_counts
    driver::safe::alloc::tests::test_post_release_counts
    driver::safe::alloc::tests::test_post_take_arc_counts
    driver::safe::alloc::tests::test_slice_is_freed_with_correct_context
    driver::safe::core::tests::test_transmutes
    driver::safe::launch::tests::test_large_launches
    driver::safe::launch::tests::test_launch_with_16bit
    driver::safe::launch::tests::test_launch_with_32bit
    driver::safe::launch::tests::test_launch_with_64bit
    driver::safe::launch::tests::test_launch_with_8bit
    driver::safe::launch::tests::test_launch_with_floats
    driver::safe::launch::tests::test_launch_with_mut_and_ref_cudarc
    driver::safe::launch::tests::test_launch_with_views
    driver::safe::launch::tests::test_mut_into_kernel_param_no_inc_rc
    driver::safe::launch::tests::test_par_launch
    driver::safe::launch::tests::test_ref_into_kernel_param_inc_rc
    driver::safe::threading::tests::test_threading
    nccl::result::tests::multi_thread
    nccl::result::tests::single_thread
    nccl::safe::tests::test_all_reduce

test result: FAILED. 137 passed; 47 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.36s.
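
The panic message above points at `LD_LIBRARY_PATH`, so one quick check is whether any directory on that path actually contains a `libcuda` or `libnvcuda` file. The following is a minimal Linux-oriented sketch of that check using only the standard library. The function name `find_shared_libs` is hypothetical, and this is not how cudarc itself loads libraries: the real dynamic loader also consults other locations (such as the system linker cache), so an empty result here does not prove the library is missing.

```rust
use std::env;
use std::fs;
use std::path::PathBuf;

/// Scans every directory in `search_path` (a colon-separated list on Linux,
/// in the same format as `LD_LIBRARY_PATH`) and returns the files whose
/// names start with `lib<name>.so` for any of the given library names.
/// Hypothetical helper for illustration only.
fn find_shared_libs(search_path: &str, names: &[&str]) -> Vec<PathBuf> {
    let mut found = Vec::new();
    for dir in env::split_paths(search_path) {
        // Skip entries that are missing or unreadable instead of failing.
        let Ok(entries) = fs::read_dir(&dir) else { continue };
        for entry in entries.flatten() {
            let file_name = entry.file_name();
            let file_name = file_name.to_string_lossy();
            let matches = names
                .iter()
                .any(|name| file_name.starts_with(&format!("lib{name}.so")));
            if matches {
                found.push(entry.path());
            }
        }
    }
    found
}
```

Calling `find_shared_libs(&std::env::var("LD_LIBRARY_PATH").unwrap_or_default(), &["cuda", "nvcuda"])` would list the candidates for the two names the error message says were searched.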

@paulstansifer
Contributor Author

This should fix #280

@paulstansifer paulstansifer marked this pull request as ready for review August 11, 2024 20:49
Owner

@coreylowman coreylowman left a comment

This looks good (thanks for the contribution!). Looks like the NCCL/CUDNN 12060 versions weren't included (seems like the 12.5 files were modified). Seems like you should be able to generate them using this docker image: `docker pull nvidia/cuda:12.6.0-cudnn-devel-ubuntu22.04`
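
For readers unfamiliar with the "12060" naming in the comment above: it appears to follow the numeric encoding used by CUDA's `CUDA_VERSION` macro, which is `major * 1000 + minor * 10`, so 12.6 becomes 12060 and 12.5 becomes 12050. A small sketch of that mapping, with hypothetical function names that are not part of this repository:

```rust
/// Encodes a CUDA toolkit version the way the `CUDA_VERSION` macro does:
/// major * 1000 + minor * 10 (for example, 12.6 -> 12060).
/// Hypothetical helper for illustration only.
fn cuda_version_code(major: u32, minor: u32) -> u32 {
    major * 1000 + minor * 10
}

/// Parses a version string such as "12.6" or "12.6.0" into its numeric code.
/// Any patch component is ignored; returns `None` if the major or minor
/// component is missing or not a number.
fn parse_cuda_version(version: &str) -> Option<u32> {
    let mut parts = version.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some(cuda_version_code(major, minor))
}
```

With this encoding, the `12.6.0` in the Docker image tag corresponds to 12060, which is the version whose NCCL and CUDNN files the review says were missing.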

@carlos-verdes

@paulstansifer it looks like there are a couple of formatting issues to fix before this PR can be merged. Are you working on them?

@paulstansifer
Contributor Author

I'm currently the parent on parental leave, so my availability is irregular, but I'll probably be able to get back to it in a couple days.

@carlos-verdes

I have two kids, second is 5 months so I feel you, take your time

@paulstansifer
Contributor Author

I've updated cudnn and nccl. Unfortunately, I think that doing so obliterated the other failing checks that GitHub was reporting. I can see cargo clippy is complaining, but it's about files that this PR doesn't touch, and I tried running cargo fmt, but it didn't seem to modify anything. Should I just roll fixes for the clippy complaints into this? Seems like it's a few minor formatting things.

@coreylowman
Owner

This looks good now! Thanks for the update & tagging me with the re-review

@coreylowman coreylowman merged commit 3eaff86 into coreylowman:main Sep 5, 2024
13 checks passed
@paulstansifer paulstansifer deleted the patch-1 branch September 5, 2024 19:31