
Switch to using native support for CUDA instead of FindCUDA macros. #859

Merged

91 commits merged into Qiskit:master on Nov 12, 2020

Conversation

@MarcusGDaniels
Contributor

Summary

This switches to native support in CMake (versions >= 3.8) so that CUDA-specific CMake macros aren't used to set up the build. It is to address issue #843 and make it possible to do per-CUDA file build options.
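A minimal sketch of what the switch looks like in CMake (target and source names here are illustrative, not the PR's exact diff): with native support, CUDA becomes a first-class language, and ordinary add_library/target_* commands replace the cuda_add_library / cuda_add_executable macros from FindCUDA.

```cmake
# Illustrative sketch only, assuming CMake >= 3.8 and a hypothetical
# "aer_simulator" target; not the exact change in this PR.
cmake_minimum_required(VERSION 3.8)
project(aer_cuda_demo LANGUAGES CXX CUDA)

# Plain add_library handles .cu files once CUDA is an enabled language;
# no cuda_add_library macro is needed.
add_library(aer_simulator simulator.cu)

# Per-target and per-file CUDA options become possible, which is the
# point of this change (see issue #843).
target_compile_options(aer_simulator PRIVATE
    $<$<COMPILE_LANGUAGE:CUDA>:--expt-extended-lambda>)
set_target_properties(aer_simulator PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
```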

Details and comments

@MarcusGDaniels
Contributor Author

I reopened this pull request because the original one had a bad e-mail in it (one that I no longer have).
Also, a small change to this will address issue #787. It amounts to including the current pybind11 headers in the source tree and adding a conditional in the bindings CMake file to reference them. It appears that Qiskit's license and pybind11's are compatible, perhaps at least for the purposes of a temporary workaround until there is a newer Conda install for pybind11.

@vvilpas
Contributor

vvilpas commented Nov 11, 2020

@MarcusGDaniels can you please merge MarcusGDaniels#3? verify_tools should be working now with a small change I've made.

I did this merge, but now I'm having trouble with urllib3 being the incorrect version. Not sure what happened.

There is a compatibility problem between two conan dependencies: urllib3 and requests. They are fixing it, and I've prepared a PR with a workaround in the meantime (#1031).

@vvilpas
Contributor

vvilpas commented Nov 11, 2020

I have limited CUDA arch to just 5.2 to check that CI works. It seems we are running out of time with all the archs enabled.

@MarcusGDaniels
Contributor Author

I have limited CUDA arch to just 5.2 to check that CI works. It seems we are running out of time with all the archs enabled.

Hmm, it's been running for hours. I imagine it is stuck somehow but there is no log reported yet.

@MarcusGDaniels
Contributor Author

MarcusGDaniels commented Nov 12, 2020

I have limited CUDA arch to just 5.2 to check that CI works. It seems we are running out of time with all the archs enabled.

Hmm, it's been running for hours. I imagine it is stuck somehow but there is no log reported yet.

Yeah, timed out on verify_wheels.py again.

There is a bug in the code that makes the verify_wheel.py script run forever when we build the gpu wheel. The problem is the check of memory available on CUDA devices when there are no CUDA devices available, as is the case for our GPU wheel build action.
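The fix described here can be sketched as a simple guard (the function and field names below are hypothetical, not the actual verify_wheels code): treat "no CUDA devices" and "CUDA runtime error" as terminal states rather than something to keep polling.

```python
# Hypothetical sketch of the fix described above: when the CUDA runtime
# reports no devices (as in the GPU wheel-build CI, which has no GPU),
# return an empty result instead of retrying the memory check forever.
def available_gpu_memory_mb(query_devices):
    """Return per-device free memory in MB, or an empty list when the
    CUDA runtime reports no devices or raises an error."""
    try:
        devices = query_devices()
    except RuntimeError:
        devices = []  # treat a CUDA runtime error as "no devices present"
    if not devices:
        return []     # previously this case could loop forever
    return [d["free_mb"] for d in devices]
```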
@vvilpas
Contributor

vvilpas commented Nov 12, 2020

Yeah, I've already found it and hopefully fixed it (#1035). It was a change already introduced in master.

@MarcusGDaniels
Contributor Author

Yeah, I've already found it and hopefully fixed it (#1035). It was a change already introduced in master.

Looked like it worked except for a Mac build that ran out of time. Is there an automatic heuristic that retriggers the build or is there a way to force one without doing a commit?

@vvilpas
Contributor

vvilpas commented Nov 12, 2020

Looked like it worked except for a Mac build that ran out of time. Is there an automatic heuristic that retriggers the build or is there a way to force one without doing a commit?

Yeah, that was weird, but the full gpu build wheel action worked at last!!!
I have disabled it so it works again only for releases. If CI passes, I think we can merge for the upcoming release.

@vvilpas vvilpas mentioned this pull request Nov 12, 2020
@MarcusGDaniels
Contributor Author

Looked like it worked except for a Mac build that ran out of time. Is there an automatic heuristic that retriggers the build or is there a way to force one without doing a commit?

Yeah, that was weird, but the full gpu build wheel action worked at last!!!
I have disabled it so it works again only for releases. If CI passes, I think we can merge for the upcoming release.

Reminder: There's still a small patch to get Windows working. It involves copying a modified pybind header file into the distribution, though.

@vvilpas
Contributor

vvilpas commented Nov 12, 2020

Reminder: There's still a small patch to get Windows working. It involves copying a modified pybind header file into the distribution, though.

No problem. I think we can address Windows later, as we don't officially support it yet for GPUs. pybind11 2.6 is already released and fixes the error when compiling with nvcc.

@vvilpas
Contributor

vvilpas left a comment

LGTM!

@mtreinish mtreinish merged commit 796a12f into Qiskit:master Nov 12, 2020
@mtreinish
Member

We probably should add a release note about this because it bumps the minimum CMake version. But we can do that after it's added to #1036.

mtreinish pushed a commit to chriseclectic/qiskit-aer that referenced this pull request Nov 12, 2020
…iskit#859)

* Switch to using native support for CUDA instead of FindCUDA macros.

* Move check for AER_THRUST_BACKEND being communicated via an environment variable to the top of the file.  That is because there is a check added in this branch to see if CUDA support should be enabled.  This check is needed to avoid probing for nvcc unless needed, because that check may fail and abort the build.

* Add x86_64 checks for issue 856, pull Qiskit#864.

* Avoid compiling SIMD_SOURCE_FILE if there is not one.

* Use AMD64 as a synonym for x86_64.

* Add AMD64 as a synonym for x86_64.

* Pull fixes from vvilpas's issue-856-build-ppc

* Unconditionally use add_library even with CUDA (not cuda_add_library).

* Unconditionally use add_library, not cuda_add_library, for CUDA. Transform arguments for --compiler-options.

* pulled from revive_CUDA_windows

* no-op commit to force rebuild

* note some places to refactor

* Fix all but the last suggested changes by vvilpas

* Still need AER_COMPILER_FLAGS initialized.

* Use functions for prepending --compiler-options to pass default inherited options to NVCC.  One is for when a list is prepared, and one is for when a list is not prepared.

* Move include of nvcc_add_compiler_options to be adjacent to the FindCUDA call.

* Likewise include nvcc_add_compiler_options inside the CUDA conditional.

* Include nvcc_add_compiler_options.cmake after CMAKE environment variable assignments.

* Added file.

* space -> tab tweaks for local consistency

* fix typo in argument name

* Use an intermediate temporary variable, and an explicit set.

* Remove an indirect

* Follow change to remove an indirect from nvcc functions.

* More quoting needed for list conversion.

* More quoting needed for list conversion.

* Enable gpu wheel to check that wheel building works

* Do not include the whole FindCUDA, just the architecture selection bits.  Assume that CUDA will be found.

* force CMAKE_CUDA_COMPILER to nvcc.  Trying to figure out what is wrong with qiskit-aer-gpu builds.

* force CMAKE_CUDA_COMPILER to /usr/local/cuda/nvcc.  Trying to figure out what is wrong with qiskit-aer-gpu builds.

* remove hardcoded /usr/local/cuda/bin/nvcc in CMakeLists.txt and put a path extension in the wheels.yml workflow.

* Nope, try CUDACXX instead.

* Nope, try adding -- -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc

* Nope, try adding PATH in CIBW_ENVIRONMENT.

* do not run verify_wheels for GPU build

* return to standard file

* Fix terminate call. Enable CI test

* disable the CONAN patch to see if that is the cause of the build failure

* reapply the CONAN patch -- there is something else wrong

* add this back

* Set CUDACXX for CUDA wheel build

* Set AER_CUDA_ARCH to 5.2 to make a quick test on CI wheel build

* Fix gpu wheel build

There is a bug in the code that makes the verify_wheel.py script
run forever when we build the gpu wheel. The problem is the check
of memory available on CUDA devices when there are no CUDA devices
available, as is the case for our GPU wheel build action.

* Enable gpu wheel build temporarily to check that it works

* Revert "Enable gpu wheel build temporarily to check that it works"

This reverts commit 7a9b5f6.

* Set cuda devices to 0 in case of error for consistency

* Recover cmake-optiona-conan release note

* Default CUDA archs to Auto again

* disable gpu build test

Co-authored-by: vvilpas <[email protected]>
Co-authored-by: Victor Villar <[email protected]>
Co-authored-by: Christopher J. Wood <[email protected]>
(cherry picked from commit 796a12f)
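Several of the commits above add helper functions that prepend --compiler-options so that inherited host-compiler flags survive being passed through nvcc. A plausible minimal version of such a helper follows; only the function's name and purpose come from the commit messages above, and the body is a sketch, not the PR's actual nvcc_add_compiler_options.cmake.

```cmake
# Sketch of an nvcc_add_compiler_options-style helper (the real file in
# the PR may differ).  Prefixes each host flag with --compiler-options
# so nvcc forwards it to the underlying host C++ compiler.
function(nvcc_add_compiler_options in_list out_var)
    set(result "")
    foreach(flag IN LISTS ${in_list})
        string(APPEND result " --compiler-options ${flag}")
    endforeach()
    string(STRIP "${result}" result)
    set(${out_var} "${result}" PARENT_SCOPE)
endfunction()

# Example: wrap the inherited C++ flags before handing them to nvcc.
# nvcc_add_compiler_options(AER_COMPILER_FLAGS AER_CUDA_FLAGS)
```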
Labels
build system, stable-backport-potential (the issue or PR might be minimal and/or important enough to backport to stable)