Build modules in new project folder on Frontier #119

Merged 1 commit into develop from nicholson/frontier-eng145 on Feb 12, 2024

nkoukpaizan
Collaborator

Merge request type

  • New feature
  • Resolves bug
  • Documentation
  • Other

Relates to

  • OPFLOW
  • SOPFLOW
  • SCOPFLOW
  • TCOPFLOW
  • CMake build system
  • Spack configuration
  • Manual
  • Web docs
  • Other

This MR updates

  • Header files
  • Source code
  • CMake build system
  • Spack configuration
  • Web docs
  • Manual
  • Other

Summary

New build of ExaGO and dependencies on Frontier in the new project shared folder.

@cameronrutherford merged commit 97f5d3c into develop on Feb 12, 2024
7 of 8 checks passed
Collaborator

@pelesh left a comment

While we are at it, I suggest:

  • Build everything with ROCm >= 5.6.
  • Build Ginkgo >= 1.7.0; the experimental branch is not supported and will likely be deleted.
  • Consider moving the modules to the world-shared area. This will allow collaborators outside the project to build our code (e.g. at hackathons).
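
As a rough illustration of the version minimums suggested above, the following sketch checks a module name against a required ROCm version. The helper names `version_ge` and `rocm_version_of` are hypothetical and not part of ExaGO or Spack, and the comparison assumes a `sort` that supports `-V` (GNU coreutils). The module name is the one quoted from this PR's diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is >= minimum $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: pull the ROCm version out of a Spack-generated
# module name by taking the digits and dots that follow the last "rocm".
rocm_version_of() {
  printf '%s\n' "$1" | sed -n 's/.*rocm\([0-9][0-9.]*\).*/\1/p'
}

# Example using the hip module name from this PR.
mod='hip/5.2.0-clang-14.0.0-rocm5.2.0-mixed-6ftqihk'
ver="$(rocm_version_of "$mod")"
if version_ge "$ver" 5.6; then
  echo "ROCm $ver meets the 5.6 minimum"
else
  echo "ROCm $ver is older than 5.6"
fi
```

The same `version_ge` check could be applied to a Ginkgo version string against 1.7.0; parsing the version out of other module names is left out because their naming may differ.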

Comment on lines +25 to +26
module load hip/5.2.0-clang-14.0.0-rocm5.2.0-mixed-6ftqihk
# hsa-rocr-dev@=5.2.0%clang@=14.0.0-rocm5.2.0-mixed~asan+image+shared build_system=cmake build_type=Release generator=make patches=9267179 arch=linux-sles15-zen3
Collaborator

We need ROCm >= 5.6. Also, I suggest using Frontier modules rather than building our own.

Collaborator Author

@nkoukpaizan, Feb 13, 2024

See #89 regarding rocm >=5.6.
For the ROCm modules, I am using the system modules (see https://github.com/pnnl/ExaGO/blob/nicholson/frontier-eng145/buildsystem/spack/crusher/spack.yaml#L208), but Spack adds its own wrapper module around them.

Collaborator

Thanks for clarifying these are Spack wrappers.

As for #89, this looks like a bug in the PETSc Spack package.

Collaborator Author

There is a workaround for the PETSc error. I am more concerned about the failing tests, especially in Debug mode.

Comment on lines +49 to +50
module load ginkgo/1.5.0.glu_experimental-clang-14.0.0-rocm5.2.0-mixed-6bembpq
# gmake@=4.4.1%gcc@=12.2.0-mixed~guile build_system=generic arch=linux-sles15-zen3
Collaborator

We need Ginkgo >= 1.7. I think the glu_experimental branch will be deleted soon anyway.

Collaborator Author

@nkoukpaizan, Feb 13, 2024

I am not aware that HiOp's interface to Ginkgo 1.7 has been updated.

Collaborator

It has not. However, glu_experimental is an unsupported Ginkgo branch that is incompatible with mainstream Ginkgo, so it is not clear which option is worse.

Collaborator

Perhaps we should build Ginkgo >= 1.7, but build HiOp without Ginkgo support enabled. This would at least create an environment in which developers can complete the HiOp interface updates. Just my $0.02.

Comment on lines +1 to +2
module use -a /lustre/orion/eng145/proj-shared/nkouk/spack-install/modules/linux-sles15-zen3
# exago@=develop%clang@=14.0.0-rocm5.2.0-mixed~cuda+hiop~ipo+ipopt+logging+mpi~python+raja+rocm amdgpu_target=gfx90a build_system=cmake build_type=Release dev_path=/lustre/orion/scratch/nkouk/eng145/ExaGO generator=make arch=linux-sles15-zen3
Collaborator

Could we install these modules in the "world shared" folder?

Collaborator Author

Sure!

Collaborator Author

I'll make a new pull request.

@nkoukpaizan deleted the nicholson/frontier-eng145 branch on September 24, 2024, 16:36
3 participants