-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kraken2_build step stalling #27
Comments
I've (thankfully) never experienced that issue. How many genomes are included in the build? |
The database to be updated is the full GTDB_release207, and the sample TSV I'm trying to add includes ~4,000 genomes |
A related question: if we instead passed the reads that were unclassified from GTDB into a second database (db-create with only the non-GTDB genomes), should that give similar results as a single database via the db-update workflow? There are methods for combining outputs for the same sample from different databases, though I imagine there could be downstream effects on Bracken estimates. |
The downside of a 2-step classification approach versus a 1-step is that there is no direct "competition" during classification across the 2 steps. So, some reads could be falsely classified in the 1st step when they would actually be classified as something in the 2nd step if the 2 reference databases were combined. |
Same problem here. I ran the kraken2 database building using 40 cores (7 GB each), and after 24 hours the process stalled at this point:
|
@MixalisSn do you think that the stalling could be due to limited memory? |
@nick-youngblut I thought the 120 GB were enough. Any way, I added the --fast-build flag, using the same resources, and the build was completed successfully. |
Hi there,
I'm continuing to troubleshoot the db-update process for a kraken2 database, and I've hit a wall at the kraken2_build step. The pipeline doesn't throw any errors; it just continues to run indefinitely (12+ hours without failure or completion). It seems similar to the problem described here: DerrickWood/kraken2#428
So far, I've tried to implement the workaround mentioned in the comments of that issue I linked, where you add the --fast-build flag to the kraken2 call in the db-update snakefile, but it doesn't seem to have solved the issue. Any chance you've seen this before and/or have any thoughts on what might be causing it? I definitely have enough RAM. I'm using 28 cores with 16 Gb per core.
Thanks!
The text was updated successfully, but these errors were encountered: