terminate called recursively -- Aborted (core dumped) #152
Could you please try to re-run with
Ok, it ran for a while, then there was an unrelated node failure (a few of my other unrelated jobs on other nodes also failed at the same time, so I think there was a cluster issue at the time). I restarted the run and about a day later raxml-ng fails again with the same error. Here is the second log:
I didn't set up a way of tracking the memory usage throughout the job, but at least in the beginning it was using around a quarter of the available memory (~200G) on the system. (The very first run I did had access to 1 TB of memory, but I ran out of my CPU-hour allocation on that system.)
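For anyone who wants to track memory over the course of a long job like this, here is a minimal sketch of one way to do it on Linux. It is not part of raxml-ng; the helper names `parse_vm_hwm_kb` and `run_and_log_peak_memory` are made up for illustration, and the approach assumes the job runs as a single process on a node where `/proc/<pid>/status` is readable.

```python
import subprocess
import sys
import time


def parse_vm_hwm_kb(status_text):
    """Return the peak resident set size (VmHWM, in kB) found in the text
    of /proc/<pid>/status, or None if the field is absent."""
    for line in status_text.splitlines():
        if line.startswith("VmHWM:"):
            return int(line.split()[1])
    return None


def run_and_log_peak_memory(cmd, interval_s=60.0, out=sys.stderr):
    """Run cmd as a child process and periodically print its peak RSS.

    Hypothetical helper: Linux only, and it only sees the child process
    itself (threads share its address space, separate MPI ranks do not).
    Returns (exit_code, peak_kb).
    """
    proc = subprocess.Popen(cmd)
    peak_kb = 0
    while True:
        try:
            with open(f"/proc/{proc.pid}/status") as fh:
                hwm = parse_vm_hwm_kb(fh.read())
        except OSError:
            hwm = None  # no /proc entry (non-Linux, or process already reaped)
        if hwm is not None:
            peak_kb = max(peak_kb, hwm)
            print(f"peak RSS so far: {peak_kb / 1024**2:.1f} GiB",
                  file=out, flush=True)
        try:
            proc.wait(timeout=interval_s)  # doubles as the sampling delay
            break
        except subprocess.TimeoutExpired:
            continue
    return proc.returncode, peak_kb
```

Wrapping the usual command line in `run_and_log_peak_memory([...])` inside the job script would leave a trail of peak-memory readings in the job's stderr, which makes it easier to tell whether a crash coincides with memory approaching the node's limit.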
Hi, I know this overlapped with the holiday season, but I wanted to make sure it isn't forgotten. My allocation on the cluster expired and I'll see if we can get a new one, but I did manage to get that debug output for you before that happened. Many thanks for all your help.
Thank you, this debug output was very helpful. Here are a couple of things you can try:
As a side note, according to the log file, your system has 128 cores and 250GB of memory. This means that: a) 200GB would be pretty close to the limit, b) you can probably use more than 24 threads (if, despite the above, it turns out memory is not the constraint).
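To put numbers on point (a), here is a small illustrative calculation using only the figures quoted above (250GB total, roughly 200GB in use). The function name `memory_headroom` is hypothetical and exists only for this sketch.

```python
def memory_headroom(total_gb, used_gb):
    """Return (fraction_used, free_gb) for a node with total_gb of RAM."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return used_gb / total_gb, total_gb - used_gb


# Figures taken from the comment above: 250GB node, ~200GB in use.
fraction_used, free_gb = memory_headroom(250, 200)
print(f"{fraction_used:.0%} used, {free_gb} GB free")
```

With these inputs about 80% of the node's memory is already committed, leaving roughly 50GB for everything else on the node, which is why a moderate increase in usage later in the run could plausibly hit the limit.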
Sorry for the delay, but I've gotten renewed access to the cluster and just ran it with these three settings. Now I've gotten a new error right before the core dump:
Sorry for the delay! Since I can't see any obvious reason for this error, could you please send me your input file? (Even better if you could reproduce this error on a smaller alignment.)
Hi, I also see the same problem. Did you solve it?
Please send me your input files and full raxml log, and I will have a look.
I'm trying to infer a tree using concatenated SNPs and consistently get this core dump when multithreading. I do not have this problem with the original `raxmlHPC`, but it does not complete within the 2-day maximum walltime of the cluster and doesn't do checkpointing either (as far as I know), so I can't restart it from where it left off. So I'm trying raxml-ng (latest version 1.1.0):
I have also tried with `--threads 1 --workers 1` and it timed out at the 2-day maximum, but the log doesn't indicate that it got anywhere past
so even if it worked with 1 thread, it looks like it'd be painstakingly slow.