Add OpenFOAM HPC Motorbike performance test #27

Open · wants to merge 13 commits into main
Conversation

stevenvdb
Contributor

@stevenvdb commented Jun 1, 2022

This pull request adds an OpenFOAM benchmark test. The test case is taken from the OpenFOAM HPC Technical Committee (https://develop.openfoam.com/committees/hpc/-/wikis/home), assuming they know how to construct decent test cases.

Compared to the other tests included in this repository so far, it requires considerably more resources: even the single-node case takes a few hours to complete. If you think this is outside the scope of vsc-test-suite, I could try to reduce the resource usage. The problem is that the meshing (an initialization step that runs before the actual simulation) takes very long for the Motorbike example. One solution would be to store the (large) meshing files somewhere; another would be to consider a different example than the Motorbike.
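To make the first option concrete, here is a minimal sketch of what reusing a stored mesh could look like. Everything in it is an assumption: `$MESH_CACHE` is a hypothetical shared location that would still need to be agreed on, and `./Allmesh` merely stands in for whatever meshing sequence (blockMesh, decomposePar, snappyHexMesh, ...) the committee's scripts actually run.

```bash
# Sketch only: cache the decomposed mesh produced by the meshing step
# and reuse it on later runs. Assumes the same decomposition (and hence
# the same processor* layout) on every run.
if [ -d "$MESH_CACHE/processor0" ]; then
    cp -r "$MESH_CACHE"/processor* .
else
    ./Allmesh                      # stand-in for the expensive meshing step
    mkdir -p "$MESH_CACHE"
    cp -r processor* "$MESH_CACHE/"
fi
```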

I ran the tests on genius, hortense, hydra and vaughan, and added performance reference values. On hortense (@boegel) there are still two issues:

  • This test requires a few GB of (temporary) storage space. On most clusters, this is available in ${VSC_SCRATCH}. On hortense however, the quota on ${VSC_SCRATCH} is too restrictive. For now, I hard-coded a project scratch directory to which only I have access. Not sure how this can be dealt with so it would work for everybody.
  • The scripts from the OpenFOAM HPC committee use functions such as runParallel that ship with OpenFOAM (see $WM_PROJECT_DIR/bin/tools/RunFunctions) and call mpirun under the hood. With an intel toolchain things seem to be OK, but with a foss toolchain, processes do not end up on the correct nodes. I think it would be necessary to ditch the runParallel function and use mympirun directly? Is that the advice you give to OpenFOAM users on the clusters in Ghent? Doing that in the proposed test would require modifying the scripts from the OpenFOAM HPC committee, so another workaround would be nice; one possibility is sketched below.
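A minimal sketch of such a workaround, assuming the job script sources RunFunctions itself before calling the committee's scripts, that those scripts pick up the exported function, and that mympirun can infer the process count from the job environment (none of this is verified here):

```bash
# Sketch: shadow OpenFOAM's runParallel with a mympirun-based version,
# so the committee's scripts themselves stay untouched. Only the simple
# calling convention (application + args) is handled, not the -np /
# -decomposeParDict options of the original.
. "$WM_PROJECT_DIR/bin/tools/RunFunctions"

runParallel()
{
    local app="$1"; shift
    # mympirun normally derives the number of ranks from the scheduler
    # environment, so no explicit process count is passed here.
    mympirun "$app" -parallel "$@" > "log.$app" 2>&1
}
export -f runParallel   # bash-specific; whether the committee's scripts
                        # see this instead of re-sourcing their own copy
                        # of RunFunctions would need checking
```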

@stevenvdb marked this pull request as draft June 1, 2022 14:49
@stevenvdb changed the title from "WIP: Add OpenFOAM HPC Motorbike performance test" to "Add OpenFOAM HPC Motorbike performance test" Jun 10, 2022
@stevenvdb marked this pull request as ready for review June 10, 2022 14:46
@Lewih
Collaborator

Lewih commented Jun 13, 2022

The size of the test is not a big issue as long as it is correctly tagged (e.g. "big").

@Lewih
Collaborator

Lewih commented Jun 13, 2022

Tests in the range of an hour are easier to handle anyway, but some compute hogs are also necessary.

@Lewih
Collaborator

Lewih commented Jun 23, 2022

> This test requires a few GB of (temporary) storage space. On most clusters, this is available in ${VSC_SCRATCH}. On hortense however, the quota on ${VSC_SCRATCH} is too restrictive. For now, I hard-coded a project scratch directory to which only I have access. Not sure how this can be dealt with so it would work for everybody.

'astaff', 'badmin', 'gadminforever', 'l_sysadmin' have a project folder in /dodrio/scratch/projects.
For astaff there is a 1TB quota limit; one could dynamically point to them if they all have sufficient space.
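A minimal sketch of what that dynamic selection could look like, assuming write access to the folder is an adequate membership test and that falling back to ${VSC_SCRATCH} is acceptable on the other clusters:

```bash
# Sketch: pick the first project scratch folder we can write to,
# falling back to $VSC_SCRATCH. Folder names come from the comment above.
scratch_dir="$VSC_SCRATCH"
for project in astaff badmin gadminforever l_sysadmin; do
    candidate="/dodrio/scratch/projects/$project"
    if [ -d "$candidate" ] && [ -w "$candidate" ]; then
        scratch_dir="$candidate"
        break
    fi
done
echo "Using scratch directory: $scratch_dir"
```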

@stevenvdb
Contributor Author

> The size of the test is not a big issue as long as it is correctly tagged (e.g. "big").

Fixed in c351137. Also, 12120fc ensures that tests with the resource-intensive tag are not picked up automatically by the run.sh script. To run those specific tests, edit run.sh and move the resource-intensive tag from the --exclude-tag option to the --tag option.
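For illustration, the toggle in run.sh then amounts to something like the following (the option names come from the comment above; the rest of the script is assumed, not copied from the actual run.sh):

```bash
# Default: heavy tests are excluded.
reframe_tag_opts="--exclude-tag resource-intensive"
# To run them instead, move the tag to --tag:
#reframe_tag_opts="--tag resource-intensive"
```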

@stevenvdb
Contributor Author

> This test requires a few GB of (temporary) storage space. On most clusters, this is available in ${VSC_SCRATCH}. On hortense however, the quota on ${VSC_SCRATCH} is too restrictive. For now, I hard-coded a project scratch directory to which only I have access. Not sure how this can be dealt with so it would work for everybody.

> 'astaff', 'badmin', 'gadminforever', 'l_sysadmin' have a project folder in /dodrio/scratch/projects. For astaff there is a 1TB quota limit; one could dynamically point to them if they all have sufficient space.

Fixed in 9802675.

@Lewih
Collaborator

Lewih commented Jul 6, 2022

  1. Updated the Vaughan setting to use a full node (only 36 of the 64 cores were used before).
  2. Added more tags.
  3. Added Leibniz.

Issues:

  • On Vaughan the 4-node job keeps failing in the middle of the simulation. The problem does not show up on Leibniz.
  • The run on Hortense shows the known performance problem with foss. Will try to fix it, I have an idea.
  • The test generates a lot of files which are not cleaned up after a successful execution (TODO; a possible fix is sketched below).
  • The error stream contains:
    rm: cannot remove 'log.simpleFoam': No such file or directory
    rm: cannot remove 'log.potentialFoam': No such file or directory
    cp: cannot stat 'log.checkMesh': No such file or directory
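A minimal sketch of a cleanup step that would address the last two points at once (the file names are taken from the messages above; the results directory and the processor* layout are assumptions):

```bash
# Sketch: tolerate missing log files during cleanup. -f silences rm
# when a file does not exist, and the cp is guarded because
# log.checkMesh is apparently not always produced.
rm -f log.simpleFoam log.potentialFoam
if [ -f log.checkMesh ]; then
    cp log.checkMesh "$results_dir/"   # $results_dir: hypothetical target
fi
# Remove the large per-rank case directories left behind by a run.
rm -rf processor*
```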

@Lewih
Collaborator

Lewih commented Jul 7, 2022

Tried to run with the exclusive flag on Hortense with foss on 1 node, but no performance benefit was observed.
