{2023.06} foss/2022b #309
Conversation
bot: build repo:eessi-2023.06-software arch:x86_64/generic |
bot: build repo:eessi-2023.06-software arch:x86_64/intel/haswell |
The failure on
So apparently the increase to 300 here is not enough for this version. Do we increase it a bit more? I don't know how much sense it makes to just keep ignoring more and more tests.
I don't know either. I guess we can, but it does make one wonder: if numerical results are different, how different are they, and is that still acceptable? We should at the very least check that all failures are numerical failures, I guess. I think the more fundamental question is: what should our tests guarantee?
It's probably good to at least report an issue upstream and see what they say; I'd assume the devs are more adept at judging whether these numerical inconsistencies should be considered problematic or not. One issue with just raising the threshold is that the same failures may also pop up in other packages, such as #306.
More detailed error:
So it is all numerical failures. That is a little bit encouraging...
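Since this "are they all numerical?" check comes up for every rebuild, it can be scripted. A minimal sketch, assuming the summary layout that LAPACK's lapack_testing.py prints at the end of a test run (per-precision counts of "numerical error" vs "other error" plus an "ALL PRECISIONS" total); the regex, log path, and the 300 threshold wiring are assumptions to adjust against the real log:

```python
import re
import sys

# Sketch: pull the failure totals out of the LAPACK testing summary.
# The layout assumed here is the "--> ALL PRECISIONS" line with
# "<tests run> <numerical failures> (x%) <other failures> (y%)";
# adjust the regex if the actual log differs.
TOTALS = re.compile(
    r'ALL PRECISIONS\s+\d+\s+(\d+)\s+\(\s*[\d.]+%\)\s+(\d+)\s+\(\s*[\d.]+%\)'
)

def check(log_path, max_numerical=300):
    with open(log_path) as log:
        match = TOTALS.search(log.read())
    if match is None:
        raise RuntimeError(f'no LAPACK testing summary found in {log_path}')
    numerical, other = int(match.group(1)), int(match.group(2))
    print(f'numerical failures: {numerical}, other failures: {other}')
    # Tolerate numerical failures up to the (debatable) threshold,
    # but treat any non-numerical failure as a hard error.
    return other == 0 and numerical <= max_numerical

if __name__ == '__main__':
    sys.exit(0 if check(sys.argv[1]) else 1)
```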
I managed to go into the Singularity container by unpacking the tarball at , doing a , and found the detailed testing results at
Now, I'm no expert in these tests and have no idea what this means. It does not look like small numerical errors to my untrained eye, but I might be completely wrong...
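For anyone retracing those steps, this is roughly what that amounts to. The tarball name, unpack directory, bind-mount target, and container image below are all assumptions; substitute whatever the build job actually produced:

```python
import subprocess
import tarfile

# Hypothetical names: substitute the actual result tarball from the
# build job and a scratch directory of your choosing.
tarball = 'eessi-2023.06-software-linux-x86_64-generic.tar.gz'
unpack_dir = '/tmp/eessi-inspect'

# Unpack the job's result tarball so the test logs can be browsed in place.
with tarfile.open(tarball) as tf:
    tf.extractall(unpack_dir)

# Drop into a build container with the unpacked tree bind-mounted;
# the image name is an assumption, use whatever the bot builds in.
subprocess.run([
    'singularity', 'shell',
    '--bind', f'{unpack_dir}:/eessi',
    'docker://ghcr.io/eessi/build-node:debian11',
])
```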
@casparvl To avoid getting stuck on this even longer, I think we should:
We should take a similar approach to unblock other PRs. We can't reasonably expect to figure out each and every failing test for all the software we install, especially because we know that some test suites are quite buggy themselves (PyTorch comes to mind, but this also applies to LAPACK; see @bartoldeman's PR which dealt with a non-numerical failure in the LAPACK test suite, see also issue #18017). This procedure should probably be documented as well, and even become part of the contribution policy, with some rules of thumb on when this approach is acceptable (how many failing tests we see, on how many CPU targets, etc.).
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_v1 |
Created issue for the OpenBLAS test failures:
remove Lmod cache update