Fix CMake test setup #112

Closed
wants to merge 5 commits into from

wo80
Contributor

@wo80 wo80 commented Aug 8, 2023

WARNING: this pull request will make the tests fail due to the currently unresolved issue #108.

That being said, I think it's important to have tests that reflect reality, and I think this should be merged sooner rather than later (even if the issue remains unresolved for now).

As explained in the above-mentioned issue, the current CMake test setup is rather complex, mainly because of the logging it implements. This pull request removes all of those logging features, which greatly simplifies the setup. After all, I think logging should not be part of automated tests, at least not on the test-subject side; the testing framework should handle that. If anybody wants verbose output, ctest can be called like

ctest -VV -C Debug

Additionally, the test executables can still be run by hand to get the test output:

./d_test -t "SP" -s 5 -l 20000000 -f "../../EXAMPLE/g20.rua"

@wo80
Contributor Author

wo80 commented Aug 8, 2023

For completeness, here's the current output of ctest -C Debug (run on Windows):

Test project /projects/superlu/build
      Start  1: s_test_9_2_0_LA
 1/24 Test  #1: s_test_9_2_0_LA ..................   Passed    0.02 sec
      Start  2: s_test_19_2_0_LA
 2/24 Test  #2: s_test_19_2_0_LA .................   Passed    0.03 sec
      Start  3: s_test_2_0_SP
 3/24 Test  #3: s_test_2_0_SP ....................   Passed    0.06 sec
      Start  4: s_test_9_2_10000000_LA
 4/24 Test  #4: s_test_9_2_10000000_LA ...........***Exception: SegFault  0.02 sec
      Start  5: s_test_19_2_10000000_LA
 5/24 Test  #5: s_test_19_2_10000000_LA ..........***Exception: SegFault  0.02 sec
      Start  6: s_test_2_10000000_SP
 6/24 Test  #6: s_test_2_10000000_SP .............***Exception: SegFault  0.01 sec
      Start  7: d_test_9_2_0_LA
 7/24 Test  #7: d_test_9_2_0_LA ..................   Passed    0.02 sec
      Start  8: d_test_19_2_0_LA
 8/24 Test  #8: d_test_19_2_0_LA .................   Passed    0.03 sec
      Start  9: d_test_2_0_SP
 9/24 Test  #9: d_test_2_0_SP ....................   Passed    0.06 sec
      Start 10: d_test_9_2_10000000_LA
10/24 Test #10: d_test_9_2_10000000_LA ...........***Exception: SegFault  0.01 sec
      Start 11: d_test_19_2_10000000_LA
11/24 Test #11: d_test_19_2_10000000_LA ..........***Exception: SegFault  0.01 sec
      Start 12: d_test_2_10000000_SP
12/24 Test #12: d_test_2_10000000_SP .............***Exception: SegFault  0.01 sec
      Start 13: c_test_9_2_0_LA
13/24 Test #13: c_test_9_2_0_LA ..................   Passed    0.02 sec
      Start 14: c_test_19_2_0_LA
14/24 Test #14: c_test_19_2_0_LA .................   Passed    0.06 sec
      Start 15: c_test_2_0_SP
15/24 Test #15: c_test_2_0_SP ....................   Passed    0.11 sec
      Start 16: c_test_9_2_10000000_LA
16/24 Test #16: c_test_9_2_10000000_LA ...........***Exception: SegFault  0.01 sec
      Start 17: c_test_19_2_10000000_LA
17/24 Test #17: c_test_19_2_10000000_LA ..........***Exception: SegFault  0.01 sec
      Start 18: c_test_2_10000000_SP
18/24 Test #18: c_test_2_10000000_SP .............***Exception: SegFault  0.01 sec
      Start 19: z_test_9_2_0_LA
19/24 Test #19: z_test_9_2_0_LA ..................   Passed    0.03 sec
      Start 20: z_test_19_2_0_LA
20/24 Test #20: z_test_19_2_0_LA .................   Passed    0.07 sec
      Start 21: z_test_2_0_SP
21/24 Test #21: z_test_2_0_SP ....................   Passed    0.15 sec
      Start 22: z_test_9_2_10000000_LA
22/24 Test #22: z_test_9_2_10000000_LA ...........***Exception: SegFault  0.01 sec
      Start 23: z_test_19_2_10000000_LA
23/24 Test #23: z_test_19_2_10000000_LA ..........***Exception: SegFault  0.01 sec
      Start 24: z_test_2_10000000_SP
24/24 Test #24: z_test_2_10000000_SP .............***Exception: SegFault  0.01 sec

50% tests passed, 12 tests failed out of 24

Total Test time (real) =   0.80 sec

The following tests FAILED:
          4 - s_test_9_2_10000000_LA (SEGFAULT)
          5 - s_test_19_2_10000000_LA (SEGFAULT)
          6 - s_test_2_10000000_SP (SEGFAULT)
         10 - d_test_9_2_10000000_LA (SEGFAULT)
         11 - d_test_19_2_10000000_LA (SEGFAULT)
         12 - d_test_2_10000000_SP (SEGFAULT)
         16 - c_test_9_2_10000000_LA (SEGFAULT)
         17 - c_test_19_2_10000000_LA (SEGFAULT)
         18 - c_test_2_10000000_SP (SEGFAULT)
         22 - z_test_9_2_10000000_LA (SEGFAULT)
         23 - z_test_19_2_10000000_LA (SEGFAULT)
         24 - z_test_2_10000000_SP (SEGFAULT)
Errors while running CTest
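
The pattern in this output can be tallied mechanically. Below is a minimal sketch, assuming the second-to-last underscore-separated field of each test name (0 or 10000000 here) is the value the test passes as `-l`; `summarize_ctest` is a hypothetical helper, not part of SuperLU or CTest.

```shell
#!/bin/sh
# Count passed/failed ctest result lines, grouped by the second-to-last
# underscore-separated field of the test name (assumed to be the -l value).
summarize_ctest() {
    awk '
        # Result lines look like: " 4/24 Test  #4: s_test_9_2_10000000_LA ....   Passed    0.02 sec"
        /^ *[0-9]+\/[0-9]+ Test +#[0-9]+:/ {
            # The test name is the first token after the "#N:" marker.
            name = $0
            sub(/^[^:]*: */, "", name)
            sub(/ .*/, "", name)
            n = split(name, part, "_")
            key = part[n - 1]
            if ($0 ~ / Passed /) passed[key]++; else failed[key]++
            seen[key] = 1
        }
        END {
            for (key in seen)
                printf "%s passed=%d failed=%d\n", key, passed[key], failed[key]
        }
    ' "$@" | sort -n
}
```

Fed the log above (for example `summarize_ctest ctest.log`, where `ctest.log` is a saved copy), it prints `0 passed=12 failed=0` and `10000000 passed=0 failed=12`: every segfault in this run comes from a test whose name carries 10000000.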

@wo80
Contributor Author

wo80 commented Aug 8, 2023

I should also mention that this does not affect the GitHub workflow. Those tests will still be reported as passing, even though the segfaults occur.
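
For context, on POSIX systems a crash is visible in the exit status of the process, which is one way a script can tell a crashed test from a passing one. The following is only a sketch of that convention, not the project's workflow and not an explanation of why the workflow reports success; `run_and_report` is a hypothetical helper, and the `128 + signal` encoding is what bash and dash use (it does not apply to the Windows run above).

```shell
#!/bin/sh
# Run a command and describe how it ended, based on its exit status.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        # bash and dash report death by signal N as exit status 128 + N.
        echo "killed by signal $((status - 128))"
    else
        echo "failed with exit code $status"
    fi
    return "$status"
}
```

It could wrap the manual invocation from the first comment, e.g. `run_and_report ./d_test -t "SP" -s 5 -l 20000000 -f "../../EXAMPLE/g20.rua"`; on Linux, signal 11 is SIGSEGV.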

@gruenich
Contributor

gruenich commented Aug 12, 2023

Overall I support your approach to simplify the code around testing. Logging should not be part of these tests.

  1. Can you please rebase your changes instead of merging commits into your branch? Your approach clutters the merge request and makes it unnecessarily difficult to review your changes.
  2. Can you split your commit into smaller chunks? For example, one commit for removing dead code (like the function cat), another one to drop ALINTST, one for white-space changes, and one to get rid of runtest.cmake.

@wo80
Contributor Author

wo80 commented Aug 12, 2023

Let us not start this discussion again. The author of this repo merged #107 without any comment, which indicates to me that your best practices regarding commits and pull requests might not be the same as hers.

Overall, the messed up commit timeline was a result of merging your PR #109 which conflicted with mine made earlier, and then later merging #107, though I thought it was made clear that it shouldn't be merged. This is not my fault!

@xiaoyeli
Owner

I agree with @gruenich: it's better to set up individual PRs rather than one big chunk.
It's difficult for me to review a big chunk containing many commits.

@wo80
Contributor Author

wo80 commented Aug 12, 2023

it's better to set up individual PRs rather than one big chunk. It's difficult for me to review a big chunk containing many commits.

And we totally agree on that. But

  1. I don't think this is a big pull request.
  2. You merged #107 (CMake build with Visual Studio on Windows) even though I had already agreed to split it up into smaller ones. So the handling of the subject matter isn't very consistent.

If you think this PR can't be merged as is, feel free to close it.

@gruenich
Contributor

You are right, this merge request is not too big. Nevertheless, splitting it up into smaller commits with descriptive commit messages makes it easier to understand what happens. Further, if someone is looking for regressions, small commits help with bisecting the change.
If you want, I can split this MR to show what I mean.
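
To illustrate the bisecting point: with one logical change per commit, `git bisect run` can name the exact commit that introduced a regression. Below is a self-contained sketch on a throwaway repository; the file name, commit messages, and the `grep` check are all made up for the demo and have nothing to do with SuperLU's history.

```shell
#!/bin/sh
# Throwaway demo: build a tiny history in which one small commit breaks a
# check, then let `git bisect run` find that commit automatically.
bisect_demo() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        # Identity is passed per command so the demo ignores global git config.
        commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -a -m "$1"; }

        echo "value=1" > config.txt
        git add config.txt
        commit "Add config"
        good=$(git rev-parse HEAD)

        echo "# cosmetic change" >> config.txt
        commit "Whitespace cleanup"

        # The hypothetical regression, isolated in its own commit.
        sed 's/value=1/value=0/' config.txt > config.tmp && mv config.tmp config.txt
        commit "Change default value"

        echo "# more comments" >> config.txt
        commit "Add comments"

        # HEAD is known bad, $good is known good; the check exits 0 when good.
        git bisect start HEAD "$good" > /dev/null
        git bisect run grep -q 'value=1' config.txt > /dev/null

        # After bisecting, refs/bisect/bad points at the first bad commit.
        git log -1 --format=%s refs/bisect/bad
    )
    rm -rf "$repo"
}
```

Running `bisect_demo` prints the subject of the offending commit. Had the whitespace cleanup and the value change been squashed into one commit, bisect would stop at that larger commit and the remaining search would be manual.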

@wo80
Contributor Author

wo80 commented Aug 12, 2023

If you want, I can split this MR to show what I mean.

I know exactly what you mean. I just think it's a case of overengineering.

What you are basically asking for is not rebasing, but resetting and completely redoing the PR. But in the end it's just a commit fixing a completely broken CMake test setup by removing the part that is broken. To me, this doesn't need more dissection. It's not affecting the actual C code base, it's not affecting other parts of the build system - just a very simple fix for a broken setup.

If @xiaoyeli doesn't agree, I'm fine with that. Close it.

@wo80
Contributor Author

wo80 commented Aug 12, 2023

I've cleaned up the commit timeline. This eliminates the previous merge with upstream. I'm not going to dissect this further, reasons explained above.

@wo80
Contributor Author

wo80 commented Aug 12, 2023

To finally come to a conclusion:

In case @xiaoyeli thinks that logging should not be removed, I'd totally accept that as a reason to reject this PR. Since it wouldn't make sense to rework this PR, a new one should be opened to fix the test setup including the logging.

In case this PR gets merged, the readme should be updated, since ctest will no longer produce the *.out log files.

@gruenich
Contributor

#114 was merged, containing your changes to the test setup. Thanks for providing the code!

I lost track of the changes in this pull request that were not part of #114. I still support your idea of having tests that reflect reality. I wouldn't enable failing tests, but once fixed, they should be properly reported.

Let's join forces to achieve this goal and salvage the remaining part from here.

@wo80
Contributor Author

wo80 commented Nov 18, 2023

Dear @gruenich, dear @xiaoyeli. This has probably been the worst experience contributing to an open source project - EVER! Consider this comment to be the last interaction with this repo.

@wo80 wo80 closed this Nov 18, 2023
@gruenich
Contributor

I am sorry to hear that your contribution experience feels so bad. I would hate to lose your input and your code improvements to the SuperLU project!

I tried to pass on what I have learned from contributing to SuperLU. I hope this helps you understand my actions, which were not meant to be hostile:

  1. SuperLU is short on resources; there is no team of multiple experts developing, supporting, and reviewing. Given how important SuperLU is for the scientific computing, simulation, and HPC communities, this might be surprising. It is a sad reality.
  2. SuperLU is a massively used dependency. Many people out there rely on a working SuperLU, and they are probably not aware of this crucial dependency.
  3. Given 1) and 2), the pace of development is slow. Only small changes are reviewed at once, and it might take quite some time. Look up my past contributions: little by little I improve SuperLU and get my changes included.
  4. By chopping your changes from #112 (Fix CMake test setup) into smaller commits, I wanted to help you get the changes included, which finally worked: #114 ([cmake] Simplify creation of tests) is now part of the code.
  5. Several of your changes were accepted and are part of SuperLU. You improved SuperLU for yourself and thousands of users.
  6. I support your goal to have real regression tests in the CI/CD. Again, this will not happen overnight.
  7. I agree with all the other points you mentioned. Let's tackle them one by one. If you turn your back on SuperLU, I will try to continue working on these issues.
  8. Over time, the situation will only improve when people contribute, gain trust, and take some of the load off Sherry: a real team, a real community.

@xiaoyeli
Owner

xiaoyeli commented Nov 21, 2023

Regarding bullet 1 above, our funding and efforts in recent years have gone into SuperLU_DIST, distributed memory and GPU development.
For serial superlu and superlu_mt, I am mainly doing maintenance. I am relying on the community to help improve it. Given the limited bandwidth, I only look at PRs and issues in my spare time. If the PRs are too complicated, I may not have time to comb through them. So a simple PR structure is preferred.

@wo80
Contributor Author

wo80 commented Nov 22, 2023

I really tried to leave all this mess alone, but I just can't...

@xiaoyeli No, no, no, no, no ... there's no excuse for completely ignoring the things I have said in all those discussions (well, rather monologues, or dialogs with @gruenich). You have been active here during the last 3 months, answering issues, merging pull requests (introducing bugs), pushing code to master (introducing nonsense). I understand that this project has no priority for you, but that cannot justify acting so irresponsibly or ignoring, for example, my honest and repeated advice to get the CI tests working. No excuse for that!

In #108 on August 5th (yes, nearly 4 months ago now) I first mentioned that the tests are - quote - fundamentally flawed (meaning they aren't testing anything) - and that's the correct assessment. On the same day I laid out the path to fix that, opening a pull request that fixed the problems. Unfortunately, @gruenich jumped in complaining about the formal look of the PR and you supported that (the one and only comment I ever got from you), while again completely ignoring my arguments against @gruenich's complaints.

So, why am I writing all this? Because I just downloaded the master branch and tried to build with Visual Studio - and it failed...

Reason: e87529f

Revert that and it compiles.

But surprise: ctest reports ... segfaults.

Reason: not sure, maybe #108 (comment) or #108 (comment), but, hey @xiaoyeli, let's ignore that...

In #108 (comment) I laid out the path to fix the GitHub workflow, and back then I would have been happy to help out. But not anymore. So, @gruenich, since you are so keen to help here, do your thing...

Final comment.

EVER!

(maybe not)
