Test suite has started SEGFAULTING locally #784

Open
MatthewDaggitt opened this issue Mar 26, 2024 · 13 comments
@MatthewDaggitt
Collaborator

MatthewDaggitt commented Mar 26, 2024

Since I was last hacking on Marabou two weeks ago, the test suite has started to fail for me whenever I try to build Marabou. I haven't changed anything in my environment apart from pulling the latest version of master. Is anyone else experiencing this problem, or does anyone have any idea why it might be happening?

20/76 Test #36: Test_PermutationMatrix .............***Exception: SegFault  0.29 sec
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL

Total Test time (real) =   5.05 sec

The following tests FAILED:
	  8 - Test_DegradationChecker (SEGFAULT)
	  9 - Test_DisjunctionConstraint (SEGFAULT)
	 10 - Test_DnCWorker (SEGFAULT)
	 11 - Test_Engine (SEGFAULT)
	 12 - Test_Equation (SEGFAULT)
	 17 - Test_MILPEncoder (SEGFAULT)
	 34 - Test_LUFactorization (SEGFAULT)
	 36 - Test_PermutationMatrix (SEGFAULT)
	 43 - Test_SparseUnsortedList (SEGFAULT)
	 49 - Test_GurobiWrapper (SEGFAULT)
	 53 - Test_LinearExpression (SEGFAULT)
	 55 - Test_MString (SEGFAULT)
	 56 - Test_MStringf (SEGFAULT)
	 57 - Test_Map (SEGFAULT)
	 62 - Test_Vector (SEGFAULT)
	 63 - Test_MatrixMultiplication (SEGFAULT)
	 73 - Test_OnnxParser (SEGFAULT)
	 75 - Test_QueryLoader (SEGFAULT)
	 77 - Test_NetworkLevelReasoner (SEGFAULT)
	 78 - Test_WsLayerElimination (SEGFAULT)
	 81 - Test_Checker (SEGFAULT)
	 83 - Test_UnsatCertificateNode (SEGFAULT)
	 84 - Test_UnsatCertificateUtils (SEGFAULT)
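
To narrow this down, the failing test names can be pulled out of the CTest summary and rerun one at a time with full output. This is only a sketch: `failed_tests` and the file name `ctest.log` are hypothetical, and it assumes summary lines shaped like the ones above. `ctest -R` and `--output-on-failure` are standard CTest options.

```shell
# failed_tests: read a CTest summary on stdin and print one failing test
# name per line. Assumes lines shaped like "  8 - Test_Foo (SEGFAULT)":
# a test number, a dash, then the test name.
failed_tests() {
    awk '$1 ~ /^[0-9]+$/ && $2 == "-" { print $3 }'
}

# Hypothetical usage, run from the build directory:
#   ctest 2>&1 | tee ctest.log
#   failed_tests < ctest.log | while read -r name; do
#       ctest -R "^${name}\$" --output-on-failure
#   done
```

Anchoring the regular expression with `^` and `$` keeps `Test_MString` from also selecting `Test_MStringf`.
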
@wu-haoze
Collaborator

@MatthewDaggitt, to understand the issue better:

  1. After pulling from master and rebuilding, do you recall it downloading and recompiling external packages?
  2. What if you remove all downloaded packages from the tools/ directory and recompile? I wonder whether the packages in the tools directory were built for C++11 instead of C++17.

@MatthewDaggitt
Collaborator Author

  1. After pulling from master and rebuilding, do you recall it downloading and recompiling external packages?

No I didn't.

  2. What if you remove all downloaded packages from the tools/ directory and recompile?

No, I've recloned the entire Marabou repo and built from scratch again, downloading everything anew. Still segfaults in exactly the same way. I'll try to pinpoint the commit where this problem starts.
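
Pinpointing the commit can be automated with `git bisect run`. The sketch below is an assumption about how one might wire it up, not something taken from this thread: `bisect_step` and the file name `bisect_step.sh` are hypothetical, the build directory is assumed to sit directly under the source root, and a plain CMake configure, build, and `ctest` run is assumed to be enough to reproduce the segfaults. `git bisect run` treats exit status 0 as good, 125 as "skip this commit", and any other status from 1 to 127 as bad.

```shell
# bisect_step: configure, build, and test the current checkout.
# Returns 0 if the suite passes, 1 if any test fails, and 125 (git bisect's
# "skip" status) if the checkout cannot be configured or built at all.
# The build directory must be a direct child of the source root, because
# the configure step points CMake at "..".
bisect_step() {
    build_dir=${1:-build}
    mkdir -p "$build_dir" || return 125
    (
        cd "$build_dir" || exit 125
        cmake .. >/dev/null 2>&1 || exit 125
        cmake --build . >/dev/null 2>&1 || exit 125
        ctest >/dev/null 2>&1 || exit 1
    )
}

# Hypothetical usage, with the function saved as bisect_step.sh in the
# source root and <known-good-commit> standing in for a commit that passed:
#   git bisect start
#   git bisect bad
#   git bisect good <known-good-commit>
#   git bisect run sh -c '. ./bisect_step.sh && bisect_step build'
```

Bisecting only helps if the failure is tied to a commit; if a previously good commit also fails when rebuilt, the cause is more likely in the environment.
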

@wu-haoze
Collaborator

wu-haoze commented Mar 27, 2024

@MatthewDaggitt This fix seems to resolve the issue you encountered: f93fb3e

The issue does indeed seem to lie with an external dependency. Could you please try this fix and see whether it resolves the issue locally?

UPDATE:

Please try this instead: 47e920b

@MatthewDaggitt
Collaborator Author

MatthewDaggitt commented Apr 2, 2024

Hi @wu-haoze, unfortunately that doesn't fix the error for me. Even when I'm using Boost 1.84 I still get the same error...

Trying 1.74 now... Yup, same problem with 1.74. So it doesn't seem to be connected to Boost for me.

@MatthewDaggitt
Collaborator Author

Okay, so I've now tried building commit f9c12ca, which I know was good, but that now fails too...

So as you say @wu-haoze, it must be something non-deterministic in our environments that has changed. You mentioned that the CI was failing? Do you have a link to the failed run?

@MatthewDaggitt
Collaborator Author

I guess the next step would be actually to go in and find where the segfault is and why...

@wu-haoze
Collaborator

wu-haoze commented Apr 2, 2024

@wu-haoze
Collaborator

@MatthewDaggitt could this line be the culprit?

./b2 cxxflags=-fPIC link=static cxxflags=-std=c++11 install >> /dev/null

The flag is still C++11. Could you please check whether changing it to C++17 and recompiling Boost fixes the problem?
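
To check whether any other build script still passes the old standard flag, a recursive search is usually enough. This is a sketch: `find_std_flag` is a hypothetical helper name, and `tools/` is the dependency directory mentioned earlier in this thread.

```shell
# find_std_flag: list every line under a directory that still passes the
# given -std= flag (default: -std=c++11). Prints file:line:text and, like
# grep itself, returns non-zero when nothing matches.
find_std_flag() {
    dir=${1:-.}
    flag=${2:--std=c++11}
    # -F treats the flag as a fixed string, so the "+" signs are literal;
    # -e keeps grep from parsing the leading "-" as an option.
    grep -rnF -e "$flag" "$dir"
}

# Hypothetical usage from the repository root:
#   find_std_flag tools
```
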

@MatthewDaggitt
Collaborator Author

No, unfortunately it doesn't....

@GirardR1006

GirardR1006 commented Jun 26, 2024

Confirmed that the segfault in the tests is still present (commit 3c8e105) when building locally on Ubuntu 22.04 with g++ 11.4.0 and Boost 1.84 installed.

Note that rebuilding in a clean Ubuntu Docker image results in the whole test suite passing.
To reproduce with Docker:

docker run -it ubuntu:latest
apt update
apt install g++ wget git cmake python3 python3-dev
git clone https://github.com/NeuralNetworkVerification/Marabou.git
cd Marabou
mkdir build
cd build
cmake ..
cmake --build .

Inside of this docker image, g++ is version 13.2.0.
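
Since the suite passes in one environment and segfaults in another, it may help to record the toolchain in each and diff the results. A small sketch, with `env_report` as a hypothetical helper; which tools matter here is an assumption, as the thread has not identified the cause.

```shell
# env_report: print one "name: version" line per tool so that reports from
# two machines (or a machine and a container) can be compared with diff.
env_report() {
    printf 'kernel: %s\n' "$(uname -sr)"
    for tool in g++ cmake python3; do
        if command -v "$tool" >/dev/null 2>&1; then
            # Keep only the first line of each tool's version banner.
            printf '%s: %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
        else
            printf '%s: not found\n' "$tool"
        fi
    done
}

# Hypothetical usage:
#   env_report > host.txt          # on the failing machine
#   env_report > container.txt     # inside the Docker container
#   diff host.txt container.txt
```

A container shares the host's kernel, so the kernel line only differs when the two reports come from different machines.
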

@MatthewDaggitt
Collaborator Author

Hmm, interesting that it works on a clean Docker build. I have tried nuking every cache and setting I can think of but am still getting the error on my development machine. It's very irritating; it has meant I can't make any progress on unifying the parsers...

@GirardR1006

GirardR1006 commented Aug 20, 2024

For the record, I'm currently writing a Nix derivation for Marabou to bundle with the CAISAR platform. I hope to make it public soon, and it may help provide another angle to pinpoint the problem.

@GirardR1006

It seems you succeeded in fixing your CI :)

For the record, the aforementioned flake is here
