Prefer dnf5 for fedora CI #2742
* support sudo without password
* actually set a password
* Use --build-args rather than bash script to generate Dockerfile
* Previously the second cmake invocation failed, e.g. after some source changes or after changing a configure-time option
* It also generated many files, some of which overwrite git-tracked files from autotools, and others that were not git-ignored.
Squash optional
Fix some warnings and deprecated headers
Not comprehensive, still fails to run successfully on basic input file (zero pivot in cyclic reduce).

Fixes:
- unused variables
- wrong variable names (including case)
- creating laplacians from wrong input sections
- some issues with staggering
MPI_DOUBLE is a macro, so this check can break, even if there is no mismatch.
Fix versions after release
Should be faster and more consistent
Both OpenMPI and MPICH are failing with "bus error", but OpenMPI gives a bit more of a clue:
Bus errors can apparently happen due to not having enough physical memory for requested mallocs. The VM specs are:
which I would've thought would be plenty. OpenMPI has a stack trace too:
which looks to be basically the same for each test failure. You can see it's crashing in a call to
I tried it a bit, and I can reproduce this when I run in a container without a tty. That of course adds to the fun of debugging ...
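When comparing a local run against CI, it can help to confirm whether the tests actually ran with a terminal attached. A minimal sketch using the POSIX `test -t` check; the function name `tty_state` is hypothetical and not part of the project.

```shell
# Hypothetical helper: report whether stdin is attached to a terminal.
tty_state() {
    # [ -t 0 ] is true only when file descriptor 0 (stdin) is a terminal
    if [ -t 0 ]; then
        echo tty
    else
        echo no-tty
    fi
}
```

Calling `tty_state` at the start of the test script would show which case a given run falls into. Container runtimes such as docker and podman allocate a pseudo-TTY only when `-t` is passed, so a direct, non-interactive run would be expected to report `no-tty`.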
Ouch. Was that just a matter of running a shell in the container and then running the tests, vs running the tests directly via the container?
Running
I'm at a bit of a dead end here: MPICH in the container works locally for me. OpenMPI doesn't, but I think that's a bug with Fedora, and it fails differently from the way it fails in CI. I don't really have time to debug this further, so I think we might have to disable the Fedora CI for the time being until someone has time to investigate and fix it.
This is the error you probably saw: Once that is fixed, the other error might return, and then I can go about debugging it.
Should be fixed now. The failing run was PETSc #2755. The issue was caused by libfabric; downgrading it resolves the issue for now.
This isn't really a blocker for 5.1, so I'll wait for the CI to finish on
Fedora builds finally working again, merging this into next
🤦 Need to wait till we've actually merged master into next first
Shouldn't the fixes / improvements for CI also go to master?
@ZedThree Any idea why this is still 200 commits?
dnf5 is faster, and is sufficiently feature complete for our needs
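A minimal sketch of how a CI setup script could prefer dnf5 and fall back to dnf where dnf5 is not installed. The function name `pick_dnf` is hypothetical and not taken from this PR; the actual change may simply call dnf5 directly.

```shell
# Hypothetical helper: print the package manager command to use.
# Prefers dnf5 when it is found on PATH, otherwise falls back to dnf.
pick_dnf() {
    if command -v dnf5 >/dev/null 2>&1; then
        echo dnf5
    else
        echo dnf
    fi
}
```

A script could then run something like `"$(pick_dnf)" install -y cmake`, where the package name is only an illustration.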