GPU Batch 8 #1189

Merged
mborland merged 1 commit into develop from GPU8
Sep 3, 2024

Conversation

mborland
Member

This PR continues removing workarounds to expand GPU functionality. Now that all of the gamma functions have been marked up and tested in #1187, the whole beta function family can be enabled: beta, betac, ibeta, ibetac, ibeta_derivative, ibeta_inv, ibeta_inva, ibeta_invb, ibetac_inv, ibetac_inva, ibetac_invb, and finally the beta distribution.
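
For readers unfamiliar with what "marking up" a function means here, the following is a minimal sketch and not code from this PR. The macro name `SKETCH_GPU_ENABLED` and the function `sketch_beta` are hypothetical, chosen only for illustration; the library's actual markers and its beta implementation are different and considerably more careful. The sketch assumes an NVCC-style compiler, where the marker expands to `__host__ __device__`, and falls back to an empty marker on a plain host compiler.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical marker macro (an assumption for this sketch, not the library's name).
// Under NVCC it makes the function callable from both host and device code;
// on an ordinary host compiler it expands to nothing.
#if defined(__CUDACC__)
#  define SKETCH_GPU_ENABLED __host__ __device__
#else
#  define SKETCH_GPU_ENABLED
#endif

// Naive complete beta function, B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b),
// evaluated in log space to avoid overflowing the intermediate gamma values.
// Illustrative only: it ignores policies, error handling, and accuracy concerns.
template <typename T>
SKETCH_GPU_ENABLED T sketch_beta(T a, T b)
{
    assert(a > 0 && b > 0);  // the formula above is only used for positive arguments
    return std::exp(std::lgamma(a) + std::lgamma(b) - std::lgamma(a + b));
}
```

With the marker in place, the same `sketch_beta(2.0, 3.0)` call compiles both in ordinary host code and, under NVCC, inside a `__global__` kernel, which is the property the workaround removal in this PR relies on for the real functions.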

One thing worth noting: most of the ibeta functionality appears to be expensive for NVRTC to parse, as seen in the timing here: https://github.com/cppalliance/cuda-math/actions/runs/10635959725/job/29486716049?pr=19#step:9:202 The corresponding gap with NVCC is not nearly as large: https://github.com/cppalliance/cuda-math/actions/runs/10635959725/job/29486715730?pr=19#step:9:208

The complete on-device CI runs can be found here: cppalliance/cuda-math#19

CC: @steppi, @dschmitz89, @izaid.

Remove NVRTC workaround

Apply GPU markers to ibeta_inverse

Apply GPU markers to t_dist_inv

Fix warning suppression

Add dispatch function and remove workaround

Move disabling block

Make binomial GPU enabled

Add SYCL testing of ibeta

Add SYCL testing of ibeta_inv

Add SYCL testing of ibeta_inv_ab

Add SYCL testing of full beta suite

Add markers to fwd decls

Add special forward decls for NVRTC

Add betac nvrtc testing

Add betac CUDA testing

Add ibeta CUDA testing

Add ibeta NVRTC testing

Add ibetac NVRTC testing

Add ibeta_derivative testing to nvrtc

Add ibeta_derivative CUDA testing

Add cbrt policy overload for NVRTC

Fix NVRTC definition of BOOST_MATH_IF_CONSTEXPR

Add ibeta_inv and ibetac_inv NVRTC testing

Fix make pair helper on device

Add CUDA testing of ibeta_inv* and ibetac_inv*

Move location so that it also works on NVRTC

Add NVRTC testing of ibeta_inv* and ibetac_inv*

Fixup test sets since they ignore the policy

Make the beta dist GPU compatible

Add beta dist SYCL testing

Add beta dist CUDA testing

Add beta dist NVRTC testing
@mborland
Member Author

mborland commented Sep 3, 2024

The only CI failure is the fairly typical s390x failure to clone, so this should be safe to merge.

@mborland mborland merged commit ab57b20 into develop Sep 3, 2024
76 of 77 checks passed
@mborland mborland deleted the GPU8 branch September 3, 2024 14:38