Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Kokkos version to 4.4.1 #1191

Open
wants to merge 10 commits into
base: develop
Choose a base branch
from
Open

Update Kokkos version to 4.4.1 #1191

wants to merge 10 commits into from

Conversation

Yurlungur
Copy link
Collaborator

PR Summary

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Code is formatted
  • Changes are summarized in CHANGELOG.md
  • Change is breaking (API, behavior, ...)
    • Change is additionally added to CHANGELOG.md in the breaking section
    • PR is marked as breaking
    • Short summary API changes at the top of the PR (plus optionally with an automated update/fix script)
  • CI has been triggered on Darwin for performance regression tests.
  • Docs build
  • (@lanl.gov employees) Update copyright on changed files

@Yurlungur Yurlungur linked an issue Oct 16, 2024 that may be closed by this pull request
Copy link
Collaborator

@pgrete pgrete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely in favor (plus the bump to 4.4.1).

CHANGELOG.md Outdated
@@ -11,6 +11,7 @@
- [[PR 1161]](https://github.com/parthenon-hpc-lab/parthenon/pull/1161) Make flux field Metadata accessible, add Metadata::CellMemAligned flag, small perfomance upgrades

### Changed (changing behavior/API/variables/...)
- [[PR1191]](https://github.com/parthenon-hpc-lab/parthenon/pull/1191) Update Kokkos version to 4.2.01
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [[PR1191]](https://github.com/parthenon-hpc-lab/parthenon/pull/1191) Update Kokkos version to 4.2.01

And instead update the following line below to point to this PR:

- [[Issue 1165]](https://github.com/parthenon-hpc-lab/parthenon/issues/1165) Bump Kokkos submodule to 4.4.1

->

- [[PR 1191]](https://github.com/parthenon-hpc-lab/parthenon/pull/1191) Bump Kokkos submodule to 4.4.1

which should fix the mess I left ;)

@pdmullen pdmullen changed the title [trivial] bump Kokkos version to 4.2.01 [trivial] Update Kokkos version to 4.4.1 Oct 17, 2024
@BenWibking
Copy link
Collaborator

Just wanted to note that updating to Kokkos 4.4.x causes finalization of parthenon::Mesh to fail: #1193.

@BenWibking
Copy link
Collaborator

@pgrete It looks like that didn't fix it. Are there other view-of-views usage elsewhere in the code?

@pgrete pgrete mentioned this pull request Oct 21, 2024
@pgrete
Copy link
Collaborator

pgrete commented Oct 21, 2024

@pgrete It looks like that didn't fix it. Are there other view-of-views usage elsewhere in the code?

Potentially. But the fix didn't work because I put the view_alloc info to the device view (which triggered a static assert fail).
With the latest version here, I don't see finalize issues in AthenaPK any more (but I still need to check with the CI here does still fail).

@pgrete
Copy link
Collaborator

pgrete commented Oct 21, 2024

$ export KOKKOS_TOOLS_LIBS=/home/pgrete/src/kokkos-tools/build/debugging/vov-bug-finder/libkp_view_of_views_bug_finder.so
[18:02:47][pgrete@haerke: ~/src/parthenon/build-skx]
$ /home/pgrete/src/parthenon/build-skx/tst/unit/unit_tests "Can pull variables from containers based on Metadata"
Filters: Can pull variables from containers based on Metadata
view of views "MakePack::cv" not properly cleared this fence labelled "HostSpace::impl_deallocate before free" will hang

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
unit_tests is a Catch v2.13.8 host application.
Run with -? for options

-------------------------------------------------------------------------------
Can pull variables from containers based on Metadata
      Given: A Container with a set of variables initialized to zero
-------------------------------------------------------------------------------
/home/pgrete/src/parthenon/tst/unit/test_meshblock_data_iterator.cpp:76
...............................................................................

/home/pgrete/src/parthenon/tst/unit/test_meshblock_data_iterator.cpp:364: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases:  1 |  0 passed | 1 failed
assertions: 48 | 47 passed | 1 failed

I guess all our packs are broken (so the corresponding unit tests fail correctly).

@pgrete
Copy link
Collaborator

pgrete commented Oct 22, 2024

Ugh, this gets uglier with the minute...
The host tests now fail because the create_mirror_view is a noop so the new SequentialHostInit is not used and the ViewOfView on host fails to deallocate.

I'm removing the "trivial" from the PR and adding the bug label as we're leaking memory...

@pgrete pgrete changed the title [trivial] Update Kokkos version to 4.4.1 Update Kokkos version to 4.4.1 Oct 22, 2024
@pgrete pgrete added the bug Something isn't working label Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working build configuration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update to Kokkos 4.2?
4 participants