MRG: refactor fastgather/fastmultigather CSV output to use mpsc channels #567
Ready for review @bluegenes!
a couple documentation qualms - otherwise lgtm 🎉
…water into refactor_gather_csv
Here's the complete updated doc section: Running
Note: PR into #568
This PR adds support for a single output file with `-o` for `fastmultigather` on a non-RocksDB database. It also refactors `fastgather` to use the same underlying mpsc mechanism, so all three gather output mechanisms share more code in common. The ultimate goal is to enable a better internal API, ref #569 and #552.

In particular, this means that now `-o` works the same way on a RocksDB database and on a non-RocksDB database :).

This PR also disables the current functionality of creating individual output files for each query, which simplifies matters greatly, but does break backwards compatibility. See long-ranging discussions over in sourmash-bio/sourmash#2722 and sourmash-bio/sourmash#2328.

Finally, one last breakage: `--create-empty-results` now only creates empty prefetch results files, and not gather results files.

This PR also:
- `Collection::min_max_scaled` method (ref update code to use `min_max_scaled` #527);
- `utils::prefetch(...)` only yields non-zero overlaps, preventing infinite loops;

This PR:

- `fastmultigather` `--output` is only being used when searching against a rocksdb #239
- update code to use `min_max_scaled` #527

TODO:

- `fastmultigather` #446
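
For readers unfamiliar with the pattern, the shared mpsc output mechanism mentioned in the description can be sketched roughly as below: many worker threads send result rows into one channel, and a single writer thread drains it into one CSV file. This is a minimal illustration using only `std::sync::mpsc`, not the plugin's actual code; `GatherRow`, `spawn_csv_writer`, and `gather_all` are hypothetical names, and the three-column row is a stand-in for the real record type.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Hypothetical result row; a real gather record would carry many more fields.
struct GatherRow {
    query_name: String,
    match_name: String,
    intersect_bp: u64,
}

/// Spawn the single writer thread. It writes a header, then one CSV line per
/// received row, and finishes once every sender has been dropped.
fn spawn_csv_writer<W: Write + Send + 'static>(
    rx: mpsc::Receiver<GatherRow>,
    mut out: W,
) -> thread::JoinHandle<std::io::Result<W>> {
    thread::spawn(move || {
        writeln!(out, "query_name,match_name,intersect_bp")?;
        for row in rx {
            writeln!(
                out,
                "{},{},{}",
                row.query_name, row.match_name, row.intersect_bp
            )?;
        }
        out.flush()?;
        Ok(out)
    })
}

/// Run one worker per query. All workers share the same channel, so all rows
/// end up in the same output regardless of which query produced them.
fn gather_all(queries: Vec<String>) -> std::io::Result<String> {
    // Bounded channel (capacity chosen arbitrarily for this sketch).
    let (tx, rx) = mpsc::sync_channel::<GatherRow>(64);
    let writer = spawn_csv_writer(rx, Vec::new());

    let workers: Vec<_> = queries
        .into_iter()
        .map(|query_name| {
            let tx = tx.clone();
            thread::spawn(move || {
                // Placeholder for the real search: each query "matches" once.
                let match_name = format!("match_for_{query_name}");
                let row = GatherRow {
                    query_name,
                    match_name,
                    intersect_bp: 1000,
                };
                tx.send(row).expect("writer thread hung up");
            })
        })
        .collect();

    // Drop the original sender so the writer sees the channel close once the
    // workers (which hold the clones) are done.
    drop(tx);
    for worker in workers {
        worker.join().expect("worker panicked");
    }
    let buf = writer.join().expect("writer panicked")?;
    Ok(String::from_utf8(buf).expect("CSV output is valid UTF-8"))
}
```

Calling `gather_all(vec!["q1".to_string(), "q2".to_string()])` returns the header line followed by one row per query; the row order depends on thread scheduling. Because only the writer thread touches the output, the workers need no locking around the file, which is the main appeal of funnelling everything through one channel.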