gather and manysearch benchmarks for plugin v0.9.6 and v0.9.8 #479
With #498 included: on a large/complex metagenome with 45k gather matches, at k=31 with a scaled of 1000 and a threshold of 50 kb, fmg (fastmultigather) on RocksDB is kind of ridiculously fast?! This is with the GTDB database in
The remaining question is how well it parallelizes, I think...
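For reference, a fastmultigather invocation matching the parameters above might look like this. This is a hedged sketch, not the actual command from the run: the file names and database path are placeholders, and the exact flag spellings should be checked against the plugin version in use.

```shell
# Hypothetical sketch; metagenome.sig.zip and gtdb-rs220.rocksdb are placeholders.
# k=31, scaled=1000, 50 kb threshold, as described above.
sourmash scripts fastmultigather metagenome.sig.zip gtdb-rs220.rocksdb \
    -k 31 --scaled 1000 --threshold-bp 50000 \
    -o metagenome.gather.csv
```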
scripts and code in
Tried doing a host contamination analysis on a bunch of the small/simple metagenomes that Jean is analyzing: at k=31, scaled=10_000, -c 4, -t 0, against 8 host genomes (human, cow, dog, cat, chicken, mouse, goat, and pig). List of metagenomes:
Time:
Next, a host contamination analysis on the 10 biggest/most annoying metagenomes that Jean is analyzing: at k=31, scaled=10_000, -c 4, -t 0, against the same 8 host genomes (human, cow, dog, cat, chicken, mouse, goat, and pig). List of metagenomes:
Time:
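The thread doesn't show the exact command for these host-contamination runs, but assuming they used the plugin's manysearch command, the parameters above (k=31, scaled=10_000, -c 4, -t 0) would translate to something like the following sketch. All file names are placeholders.

```shell
# Hypothetical sketch; metagenomes.sig.zip and host-genomes.sig.zip are
# placeholders for the query metagenomes and the 8 host genome sketches.
sourmash scripts manysearch metagenomes.sig.zip host-genomes.sig.zip \
    -k 31 --scaled 10000 -c 4 -t 0 \
    -o host-contamination.csv
```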
Tried out fmg + RocksDB on the same list here ("small-list") -
so, very much NOT parallel 😅, but about 7 minutes per metagenome. Next up: can I parallelize with snakemake?
snakemake results. I think the reason for the lack of parallelization here is that all but one of the jobs finished ~instantly, with ERR2241825 taking up all the time. This was done with fmg
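A Snakefile for running fmg per metagenome might be sketched as below. This is an assumption-laden illustration, not the workflow actually used: the sample list, paths, and the use of a per-sample fastmultigather rule are all placeholders.

```python
# Snakefile sketch (hypothetical). SAMPLES, paths, and flags are placeholders;
# ERR2241825 is the one accession named in the thread.
SAMPLES = ["ERR2241825"]  # ...plus the rest of the small-list accessions

rule all:
    input:
        expand("gather/{sample}.gather.csv", sample=SAMPLES)

rule fmg:
    input:
        sig="sigs/{sample}.sig.zip",
        db="gtdb-rs220.rocksdb",
    output:
        "gather/{sample}.gather.csv",
    threads: 4
    shell:
        "sourmash scripts fastmultigather {input.sig} {input.db} "
        "-k 31 --scaled 1000 -c {threads} -o {output}"
```

With one job per metagenome, snakemake can schedule runs concurrently (e.g. `snakemake -j 8`), though as noted above that only helps when the per-sample runtimes are comparable.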
branchwater plugin v0.9.9, indexing GTDB rs220 at k=21: 7 hours and 20 GB RAM, 25 GB on disk. (located on farm:
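Assuming this index was built with the plugin's index command, the invocation might look roughly like the following. The input zip name, scaled value, and output path are placeholders, not the actual ones used.

```shell
# Hypothetical sketch of building the RocksDB inverted index at k=21.
# gtdb-rs220-k21.sig.zip is a placeholder for the GTDB rs220 sketches.
sourmash scripts index gtdb-rs220-k21.sig.zip \
    -k 21 --scaled 1000 -o gtdb-rs220-k21.rocksdb
```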
gather benchmarks
CURRENT: v0.9.8 (link)
Cost of RocksDB indexing: 2:40:31 / 9632s / 14.4 GB.
PREVIOUS: v0.9.6 (link)
Cost of RocksDB indexing: 4:47:34 / 17255s / 14.0 GB.
manysearch benchmarks
* This run used sig.zip files and manifest CSVs in the benchmarking.
(more description to follow.)