{2023.06}[2023a,zen4] LAMMPS 2Aug2023 for zen4 #705

Merged
boegel merged 5 commits into EESSI:2023.06-software.eessi.io on Sep 18, 2024

Conversation

laraPPr
Collaborator

@laraPPr laraPPr commented Sep 17, 2024

Building LAMMPS 2Aug2023 requires https://github.com/easybuilders/easybuild-easyblocks/pull/3336/files, which is part of EasyBuild 4.9.3.

eessi-bot bot commented Sep 17, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software, eessi.io-2023.06-compat

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

eessi-bot bot commented Sep 17, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi-hpc.org-2023.06-compat, eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

bot: build repo:eessi.io-2023.06-software arch:zen4

eessi-bot bot commented Sep 17, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

    • no jobs were submitted

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account laraPPr has NO permission to send commands to the bot
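
The bot output above shows the short `repo:` and `arch:` keys being expanded, and only the instance configured for zen4 going on to submit a job. A minimal sketch of that behaviour follows. The function names and the suffix-matching rule are hypothetical and are not taken from the bot's actual code; the architecture lists are shortened for the example.

```python
# Hypothetical sketch: expand a bot command and decide whether an instance builds.
SHORT_KEYS = {"repo": "repository", "arch": "architecture"}


def expand_command(comment):
    """Turn 'bot: build repo:X arch:Y' into ('build', {'repository': X, 'architecture': Y})."""
    prefix, _, rest = comment.partition(":")
    if prefix.strip() != "bot":
        raise ValueError("not a bot command")
    action, *args = rest.split()
    filters = {}
    for arg in args:
        key, _, value = arg.partition(":")
        # Replace short keys by their long form, keep unknown keys as they are.
        filters[SHORT_KEYS.get(key, key)] = value
    return action, filters


def matching_targets(filters, architectures, repositories):
    """Return (architecture, repository) pairs this instance would build for."""
    wanted = filters["architecture"]
    # Assumption: 'zen4' selects any configured architecture path ending in '/zen4'.
    archs = [a for a in architectures if a == wanted or a.endswith("/" + wanted)]
    repos = [r for r in repositories if r == filters["repository"]]
    return [(a, r) for a in archs for r in repos]


action, filters = expand_command("bot: build repo:eessi.io-2023.06-software arch:zen4")
aws = matching_targets(filters, ["x86_64/generic", "x86_64/amd/zen3"],
                       ["eessi.io-2023.06-software"])
azure = matching_targets(filters, ["x86_64/amd/zen4"], ["eessi.io-2023.06-software"])
print(action, filters)
print(aws)    # empty list, which corresponds to "no jobs were submitted"
print(azure)
```

Under these assumptions, the AWS-like instance has no matching architecture and the Azure-like instance yields one build target.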

eessi-bot bot commented Sep 17, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

eessi-bot bot commented Sep 17, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_705/702

| date | job status | comment |
| --- | --- | --- |
| Sep 17 11:47:24 UTC 2024 | submitted | job id 702 awaits release by job manager |
| Sep 17 11:47:54 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Sep 17 11:51:57 UTC 2024 | running | job 702 is running |
| Sep 17 13:04:35 UTC 2024 | finished | 😢 FAILURE (details below) |
Details
✅ job output file slurm-702.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1726574495.tar.gz
size: 43 MiB (45438068 bytes)
entries: 672
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
MDI/1.4.26-gompi-2023a.lua
ScaFaCoS/1.0.4-foss-2023a.lua
Voro++/0.4.6-GCCcore-12.3.0.lua
archspec/0.2.1-GCCcore-12.3.0.lua
kim-api/2.3.0-GCC-12.3.0.lua
tbb/2021.11.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
MDI/1.4.26-gompi-2023a
ScaFaCoS/1.0.4-foss-2023a
Voro++/0.4.6-GCCcore-12.3.0
archspec/0.2.1-GCCcore-12.3.0
kim-api/2.3.0-GCC-12.3.0
tbb/2021.11.0-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/amd/zen4
no other files in tarball
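
The artefact summary above lists the module files contained in the tarball. A minimal sketch of how such a listing could be produced with Python's `tarfile` module is shown here. It assumes the directory layout seen in the report; the demo archive, the `log.txt` path, and the helper names are made up for illustration.

```python
import io
import tarfile


def list_modules(tar, marker="/modules/all/"):
    """Return 'Name/version.lua' for every module file found under `marker`."""
    modules = []
    for name in tar.getnames():
        if marker in name and name.endswith(".lua"):
            modules.append(name.split(marker, 1)[1])
    return sorted(modules)


def build_demo_tarball(paths):
    """Create an in-memory .tar.gz with empty files, standing in for a real artefact."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for path in paths:
            tar.addfile(tarfile.TarInfo(path), io.BytesIO(b""))
    buffer.seek(0)
    return buffer


prefix = "2023.06/software/linux/x86_64/amd/zen4"
demo = build_demo_tarball([
    prefix + "/modules/all/MDI/1.4.26-gompi-2023a.lua",
    prefix + "/modules/all/tbb/2021.11.0-GCCcore-12.3.0.lua",
    prefix + "/software/tbb/2021.11.0-GCCcore-12.3.0/easybuild/log.txt",  # hypothetical file
])
with tarfile.open(fileobj=demo, mode="r:gz") as tar:
    found = list_modules(tar)
print(found)
```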
Sep 17 13:04:35 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 13/13 test case(s) from 13 check(s) (1 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-702.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case
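
The build verdict above is reported as a list of pattern checks on the job output file. A minimal sketch of such a check follows. The patterns are copied from the report, but the rule for combining them into SUCCESS or FAILURE is an assumption, and the sample log lines are invented.

```python
import re

# Patterns taken from the bot's report; how they combine into a verdict is an assumption.
MUST_NOT_MATCH = [r"ERROR:", r"FAILED:", r"required modules missing:"]
MUST_MATCH = [r"No missing installations", r"\.tar\.gz created!"]


def check_build_log(text):
    """Return (ok, report), where report maps each pattern to whether it was found."""
    report = {p: re.search(p, text) is not None for p in MUST_NOT_MATCH + MUST_MATCH}
    ok = (not any(report[p] for p in MUST_NOT_MATCH)
          and all(report[p] for p in MUST_MATCH))
    return ok, report


# Invented log excerpts, only to exercise the function.
good_log = "No missing installations\n/tmp/example.tar.gz created!\n"
bad_log = "ERROR: build failed\nrequired modules missing:\n/tmp/example.tar.gz created!\n"
good_ok, _ = check_build_log(good_log)
bad_ok, bad_report = check_build_log(bad_log)
print(good_ok, bad_ok)
```

Returning the per-pattern report alongside the verdict mirrors the ✅/❌ list in the bot comment, so a failing job can show which pattern triggered.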

laraPPr added the 2023.06-software.eessi.io (2023.06 version of software.eessi.io) and zen4 labels Sep 17, 2024
@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

@boegel You can also try to build for A64FX; it has the same number of missing dependencies as the zen4 stack.

@boegel
Contributor

boegel commented Sep 17, 2024

> @boegel You can also try to build for A64FX; it has the same number of missing dependencies as the zen4 stack.

I can, but if I do this for this PR, it will first try to install everything else already included in easystacks/software.eessi.io/2023.06/zen4/*.yml, so we should open a dedicated PR for easystacks/software.eessi.io/2023.06/a64fx/*.yml

@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

> > @boegel You can also try to build for A64FX; it has the same number of missing dependencies as the zen4 stack.
>
> I can, but if I do this for this PR, it will first try to install everything else already included in easystacks/software.eessi.io/2023.06/zen4/*.yml, so we should open a dedicated PR for easystacks/software.eessi.io/2023.06/a64fx/*.yml

Ah, I'll open a separate PR for easystacks/software.eessi.io/2023.06/a64fx/*.yml

@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

Since I started a new stack for EasyBuild 4.9.3, it will only pick up the changes in that easystack, right? So in this case it would only pick up a64fx. But for clarity it might be best to keep them separate to some extent.

@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

== 2024-09-17 12:01:03,066 run.py:689 DEBUG cmd "python -c 'from archspec.cpu import host; print(host())'" exited with exit code 0 and output:

x86_64_v4

The output should be zen4. I'm not sure why this happened; it did not happen when building at Gent on zen4.

@boegel
Contributor

boegel commented Sep 17, 2024

> Since I started a new stack for EasyBuild 4.9.3, it will only pick up the changes in that easystack, right? So in this case it would only pick up a64fx. But for clarity it might be best to keep them separate to some extent.

That's true, I overlooked that.
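
The point agreed on here is that only the easystack files changed by the PR are picked up, per architecture directory. A rough sketch of that selection is shown below. It assumes the build works from the PR's list of changed files; the file names and the helper are hypothetical, and only the directory layout comes from the discussion.

```python
from fnmatch import fnmatch


def easystacks_to_build(changed_files, arch_dir):
    """Select changed easystack files that belong to one architecture directory."""
    pattern = "easystacks/software.eessi.io/2023.06/" + arch_dir + "/*.yml"
    return [path for path in changed_files if fnmatch(path, pattern)]


# Hypothetical file names, standing in for the files touched by a PR.
changed = [
    "easystacks/software.eessi.io/2023.06/zen4/example-eb-4.9.3-2023a.yml",
    "easystacks/software.eessi.io/2023.06/a64fx/example-eb-4.9.3-2023a.yml",
    "eb_hooks.py",
]
zen4_files = easystacks_to_build(changed, "zen4")
a64fx_files = easystacks_to_build(changed, "a64fx")
print(zen4_files)
print(a64fx_files)
```

With separate PRs per architecture, each list would contain only the files of that PR, which is the clarity argument made above.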

@boegel
Contributor

boegel commented Sep 17, 2024

> == 2024-09-17 12:01:03,066 run.py:689 DEBUG cmd "python -c 'from archspec.cpu import host; print(host())'" exited with exit code 0 and output:
>
> x86_64_v4
>
> The output should be zen4. I'm not sure why this happened; it did not happen when building at Gent on zen4.

This probably indicates that one of the CPU features expected by archspec for zen4 can't be found, so it "falls back" to x86_64_v4; see also https://github.com/archspec/archspec-json/blob/master/cpu/microarchitectures.json#L2140

This isn't the first time this has happened on AWS or Azure; see for example archspec/archspec-json#38
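
To find out which feature triggers the fallback, one could compare the features archspec expects for zen4 with what it detects on the host. The sketch below does the comparison as a plain set difference on toy data, then optionally runs it against archspec if that package is installed. The helper names and toy feature names are made up; the live check assumes `archspec.cpu.TARGETS` has a `zen4` entry and that each target exposes a `features` set.

```python
def missing_features(expected, detected):
    """Features a target requires that the host did not report."""
    return sorted(set(expected) - set(detected))


# Toy feature sets for illustration; these are not the real zen4 definitions.
expected_toy = {"avx2", "avx512f", "example_feature"}
detected_toy = {"avx2", "avx512f"}
toy_missing = missing_features(expected_toy, detected_toy)
print(toy_missing)


def report_host_vs(target_name):
    """Compare the detected host with a named archspec target, if archspec is available."""
    try:
        from archspec.cpu import TARGETS, host
    except ImportError:
        return None  # archspec not installed, nothing to report
    if target_name not in TARGETS:
        return None  # this archspec version does not know the target
    current = host()
    return current.name, missing_features(TARGETS[target_name].features, current.features)


if __name__ == "__main__":
    print(report_host_vs("zen4"))
```

On a machine where detection falls back, a non-empty list of missing features would point at the flag that the virtual machine does not expose.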

@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

So then, should we set it in a hook or add this to the easyblock?

@ocaisa
Member

ocaisa commented Sep 17, 2024

In this case it would need to be a hook. Perhaps we should be doing this anyway, given that we use archdetect rather than archspec.

You can use a parse hook that sets the correct value for kokkos_arch, with a table that maps between the environment variable EESSI_SOFTWARE_SUBDIR and the value LAMMPS/Kokkos expects.
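
A minimal sketch of such a hook is given below. It is not the code that ended up in eb_hooks.py: the table values are assumptions that would need checking against the Kokkos version in use, and `FakeEasyConfig` is a stand-in so the example runs without EasyBuild.

```python
import os

# Illustrative table: keys follow EESSI_SOFTWARE_SUBDIR, values are assumptions.
KOKKOS_ARCH_BY_SUBDIR = {
    "x86_64/amd/zen2": "ZEN2",
    "x86_64/amd/zen3": "ZEN3",
    "x86_64/amd/zen4": "ZEN3",  # assumption: fall back if Kokkos has no zen4 target
}


def parse_hook(ec, *args, **kwargs):
    """Override kokkos_arch for LAMMPS when the table has an entry for this CPU target."""
    if ec.name != "LAMMPS":
        return
    subdir = os.environ.get("EESSI_SOFTWARE_SUBDIR", "")
    if subdir in KOKKOS_ARCH_BY_SUBDIR:
        ec["kokkos_arch"] = KOKKOS_ARCH_BY_SUBDIR[subdir]


class FakeEasyConfig(dict):
    """Stand-in for an EasyBuild easyconfig object, just enough for this demo."""

    def __init__(self, name):
        super().__init__()
        self.name = name


os.environ["EESSI_SOFTWARE_SUBDIR"] = "x86_64/amd/zen4"
lammps = FakeEasyConfig("LAMMPS")
other = FakeEasyConfig("MDI")
parse_hook(lammps)
parse_hook(other)
print(lammps, other)
```

Keying the table on EESSI_SOFTWARE_SUBDIR means the value no longer depends on what archspec detects inside the build environment.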

@laraPPr
Collaborator Author

laraPPr commented Sep 17, 2024

Yeah, we did the same in the pilot, so I can copy and update that.

@laraPPr
Collaborator Author

laraPPr commented Sep 18, 2024

bot: build repo:eessi.io-2023.06-software arch:zen4

eessi-bot bot commented Sep 18, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

    • no jobs were submitted

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account laraPPr has NO permission to send commands to the bot

eessi-bot bot commented Sep 18, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

eessi-bot bot commented Sep 18, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_705/801

| date | job status | comment |
| --- | --- | --- |
| Sep 18 07:59:17 UTC 2024 | submitted | job id 801 awaits release by job manager |
| Sep 18 07:59:22 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Sep 18 08:03:25 UTC 2024 | running | job 801 is running |
| Sep 18 08:29:54 UTC 2024 | finished | 😢 FAILURE (details below) |
Details
✅ job output file slurm-801.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1726647983.tar.gz
size: 43 MiB (45446312 bytes)
entries: 673
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
MDI/1.4.26-gompi-2023a.lua
ScaFaCoS/1.0.4-foss-2023a.lua
Voro++/0.4.6-GCCcore-12.3.0.lua
archspec/0.2.1-GCCcore-12.3.0.lua
kim-api/2.3.0-GCC-12.3.0.lua
tbb/2021.11.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
MDI/1.4.26-gompi-2023a
ScaFaCoS/1.0.4-foss-2023a
Voro++/0.4.6-GCCcore-12.3.0
archspec/0.2.1-GCCcore-12.3.0
kim-api/2.3.0-GCC-12.3.0
tbb/2021.11.0-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/amd/zen4
2023.06/init/easybuild/eb_hooks.py
Sep 18 08:29:54 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-801.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eb_hooks.py: review thread (outdated, resolved)
@laraPPr
Collaborator Author

laraPPr commented Sep 18, 2024

Made a stupid mistake; it should be fixed now.

@laraPPr
Collaborator Author

laraPPr commented Sep 18, 2024

bot: build repo:eessi.io-2023.06-software arch:zen4

eessi-bot bot commented Sep 18, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

    • no jobs were submitted

eessi-bot bot commented Sep 18, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account laraPPr has NO permission to send commands to the bot

eessi-bot bot commented Sep 18, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_705/802

| date | job status | comment |
| --- | --- | --- |
| Sep 18 10:21:20 UTC 2024 | submitted | job id 802 awaits release by job manager |
| Sep 18 10:22:09 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Sep 18 10:26:12 UTC 2024 | running | job 802 is running |
| Sep 18 10:38:29 UTC 2024 | finished | 😢 FAILURE (details below) |
Details
✅ job output file slurm-802.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1726655715.tar.gz
size: 43 MiB (45441588 bytes)
entries: 673
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
MDI/1.4.26-gompi-2023a.lua
ScaFaCoS/1.0.4-foss-2023a.lua
Voro++/0.4.6-GCCcore-12.3.0.lua
archspec/0.2.1-GCCcore-12.3.0.lua
kim-api/2.3.0-GCC-12.3.0.lua
tbb/2021.11.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
MDI/1.4.26-gompi-2023a
ScaFaCoS/1.0.4-foss-2023a
Voro++/0.4.6-GCCcore-12.3.0
archspec/0.2.1-GCCcore-12.3.0
kim-api/2.3.0-GCC-12.3.0
tbb/2021.11.0-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/amd/zen4
2023.06/init/easybuild/eb_hooks.py
Sep 18 10:38:29 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-802.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

eb_hooks.py: review thread (outdated, resolved)
@laraPPr
Collaborator Author

laraPPr commented Sep 18, 2024

bot: build repo:eessi.io-2023.06-software arch:zen4

eessi-bot bot commented Sep 18, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

    • no jobs were submitted

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account laraPPr has NO permission to send commands to the bot

eessi-bot bot commented Sep 18, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen4 from laraPPr

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen4
  • handling command build repository:eessi.io-2023.06-software architecture:zen4 resulted in:

eessi-bot bot commented Sep 18, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_705/804

| date | job status | comment |
| --- | --- | --- |
| Sep 18 10:56:05 UTC 2024 | submitted | job id 804 awaits release by job manager |
| Sep 18 10:56:35 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Sep 18 10:57:38 UTC 2024 | running | job 804 is running |
| Sep 18 11:18:15 UTC 2024 | finished | 😁 SUCCESS (details below) |
Details
✅ job output file slurm-804.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1726658008.tar.gz
size: 195 MiB (205467400 bytes)
entries: 5134
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
LAMMPS/2Aug2023_update2-foss-2023a-kokkos.lua
MDI/1.4.26-gompi-2023a.lua
ScaFaCoS/1.0.4-foss-2023a.lua
Voro++/0.4.6-GCCcore-12.3.0.lua
archspec/0.2.1-GCCcore-12.3.0.lua
kim-api/2.3.0-GCC-12.3.0.lua
tbb/2021.11.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
LAMMPS/2Aug2023_update2-foss-2023a-kokkos
MDI/1.4.26-gompi-2023a
ScaFaCoS/1.0.4-foss-2023a
Voro++/0.4.6-GCCcore-12.3.0
archspec/0.2.1-GCCcore-12.3.0
kim-api/2.3.0-GCC-12.3.0
tbb/2021.11.0-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/amd/zen4
2023.06/init/easybuild/eb_hooks.py
Sep 18 11:18:15 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 5/5 test case(s) from 5 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-804.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Sep 18 12:14:19 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen4-1726658008.tar.gz to S3 bucket succeeded

@laraPPr
Collaborator Author

laraPPr commented Sep 18, 2024

Success!!

laraPPr added the ready-to-deploy (Mark a PR as ready to deploy) label Sep 18, 2024
Contributor

@boegel boegel left a comment

lgtm

@boegel boegel merged commit 105d191 into EESSI:2023.06-software.eessi.io Sep 18, 2024
35 checks passed
boegel added the bot:deploy (Ask bot to deploy missing software installations to EESSI) label Sep 18, 2024
@boegel
Contributor

boegel commented Sep 18, 2024

Ah snap, I merged before doing the deploy 🤦

Should still work though, deploy triggered...

Labels
2023.06-software.eessi.io (2023.06 version of software.eessi.io), bot:deploy (Ask bot to deploy missing software installations to EESSI), ready-to-deploy (Mark a PR as ready to deploy), zen4
3 participants