fastgather is faster than fastmultigather in loading the database #268
Ok, thinking through the different strategies used by
So perhaps what you're seeing is the effect of
There's something else going on here... even when using the code from #292,
I guess it could be the need to store stuff in memory, or something. Maybe that consumes a lot of time. But it seems strange to me.
I'm seeing this issue too -- my benchmarks over in #298 were ~ 2 mins for The thing that confuses me is that I don't think any code in The utils have changed more recently, but really only the recent parallelization of
ref #312
looking at sourmash-bio/sourmash#3232, it still kind of blows my mind how much faster
Per the benchmarks in #479, this is still true - Now that I'm way more read into the codebase, I am pretty sure there is no simple bug that is slowing down fastmultigather. I would guess that the slowdown is one or more of:
but in any case the next step here is to do profiling, I think.
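Before reaching for a full profiler, a coarse wall-clock comparison of the two commands on the same database can confirm the size of the gap. The following is only a sketch: `time_command` and `compare` are hypothetical helper names, and the commands in the demo are harmless placeholders, not the real fastgather or fastmultigather invocations, which would need to be substituted in with whatever arguments were used in the benchmarks.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run `cmd` `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are never timed.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


def compare(commands, repeats=3):
    """Map each label to the fastest of its `repeats` runs."""
    return {label: min(time_command(cmd, repeats)) for label, cmd in commands.items()}


if __name__ == "__main__":
    # Placeholder commands (assumption): replace each list with the actual
    # fastgather / fastmultigather command line being benchmarked.
    placeholder = [sys.executable, "-c", "pass"]
    results = compare({"fastgather": placeholder, "fastmultigather": placeholder})
    for label, seconds in results.items():
        print(f"{label}: {seconds:.3f} s")
```

Taking the minimum over a few repeats reduces noise from disk caching, which matters here because the question is specifically about database loading time.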
version info
fastgather
fastmultigather
I killed it at 15 minutes
I suspect the in-memory zip decompression from htop tracking, but I am unsure.
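One rough way to check the zip decompression suspicion in isolation is to time how long it takes to read every member of the database zip into memory. This is a sketch under assumptions: it uses Python's standard `zipfile` module as a proxy and does not reproduce how either command actually reads the database, and `time_zip_decompression` is a hypothetical helper name.

```python
import time
import zipfile


def time_zip_decompression(zip_source):
    """Read every file member of a zip fully into memory.

    `zip_source` may be a path or a binary file-like object.
    Returns (number_of_members, total_decompressed_bytes, elapsed_seconds).
    """
    n_members = 0
    total_bytes = 0
    start = time.perf_counter()
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            data = zf.read(info)  # decompresses the whole member in memory
            n_members += 1
            total_bytes += len(data)
    elapsed = time.perf_counter() - start
    return n_members, total_bytes, elapsed
```

If the elapsed time for the database is small compared with the observed load time, decompression alone is unlikely to explain the difference, and the profiling would need to look elsewhere.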