EXP: try out lower level pyo3 API #552

Open
wants to merge 116 commits into
base: main
fcb699f
steal code from branch_api
ctb Dec 21, 2024
1bf7d77
bring in latest main
ctb Dec 21, 2024
96b6fea
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Dec 30, 2024
a08d76d
compiles
ctb Dec 30, 2024
6023aa8
re-enable
ctb Dec 30, 2024
45ea7ac
better error
ctb Dec 30, 2024
1822118
attempt
ctb Dec 30, 2024
6c16066
it's aliiiiive
ctb Dec 30, 2024
392bdc8
w00t
ctb Dec 30, 2024
dcf80fb
w00t
ctb Dec 30, 2024
6e981fb
w00t**2
ctb Dec 30, 2024
846e768
more granular
ctb Dec 30, 2024
e1af3e5
...compiles
ctb Dec 30, 2024
7f83e88
w00t**4
ctb Dec 30, 2024
b3499cd
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 2, 2025
3715c22
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 2, 2025
2dbd2b9
stub test for diff ksize in multigather
ctb Jan 3, 2025
5cd6218
cargo fmt
ctb Jan 3, 2025
217e69e
Rename test file
ctb Jan 3, 2025
04ea44b
refactor CSV output for fastgather/fastmultigather to use mpsc
ctb Jan 3, 2025
656f870
cargo fmt
ctb Jan 3, 2025
ae10d0b
Merge branch 'refactor_gather_csv' into branch_api_2
ctb Jan 3, 2025
fd3ce53
tests mostly pass
ctb Jan 3, 2025
7a0b78c
Merge branch 'refactor_gather_csv' into branch_api_2
ctb Jan 3, 2025
1fa4093
update API
ctb Jan 3, 2025
ca6c1e7
update for revised API
ctb Jan 3, 2025
8bc9d33
fix skipmer test
ctb Jan 4, 2025
cc17722
upd comment
ctb Jan 4, 2025
c6a34f8
Merge branch 'fix_skip_test' into refactor_gather_csv
ctb Jan 4, 2025
88a6466
black
ctb Jan 4, 2025
e755b0b
Merge branch 'fix_skip_test' into refactor_gather_csv
ctb Jan 4, 2025
ec91bc1
switch to min_max_scaled for rocksdb
ctb Jan 4, 2025
42ecb2e
black
ctb Jan 4, 2025
5ef15cc
Merge branch 'refactor_gather_csv' into branch_api_2
ctb Jan 4, 2025
a83ee1b
comment
ctb Jan 4, 2025
3f40c6b
ensure overlap is > 0
ctb Jan 4, 2025
fad1e60
Merge branch 'refactor_gather_csv' into branch_api_2
ctb Jan 4, 2025
def6300
rm comment
ctb Jan 4, 2025
4fb6d7b
rm print
ctb Jan 4, 2025
0e483ce
rm print
ctb Jan 4, 2025
ff40d6b
cleanup
ctb Jan 4, 2025
41f1b07
fix clippy messages about too-complex returns
ctb Jan 4, 2025
6295d8d
Merge branch 'refactor_gather_csv' into branch_api_2
ctb Jan 4, 2025
b400638
implement to_collection for BranchRevIndex
ctb Jan 4, 2025
2f05442
cargo fmt
ctb Jan 4, 2025
fab979b
Merge branch 'refactor_gather_csv' into branch_api_2
ctb Jan 4, 2025
526d817
rename BranchCollection to BranchMultiCollection
ctb Jan 4, 2025
d29a70a
combine/collapse
ctb Jan 5, 2025
68daf23
add loaded_sketches
ctb Jan 5, 2025
686e8ff
foo
ctb Jan 5, 2025
2084827
start implementing in Python
ctb Jan 5, 2025
f70a665
upgrade x rocksdb
ctb Jan 5, 2025
06205cc
more
ctb Jan 5, 2025
9cabf5e
fix scaled
ctb Jan 5, 2025
cc3c5a1
punt
ctb Jan 5, 2025
a285671
refactor out to obj fn
ctb Jan 5, 2025
32ced0e
start making 'index'
ctb Jan 5, 2025
7d81663
update error handling, add function & test
ctb Jan 5, 2025
96bf5be
refactor manysearch out into manysearch_obj
ctb Jan 5, 2025
b542478
better refactor
ctb Jan 5, 2025
c64a27c
cargo fmt
ctb Jan 5, 2025
ea4e19b
refactor manysearch_rocksdb
ctb Jan 5, 2025
7ba32a5
refactor multisearch
ctb Jan 5, 2025
3f3cc2b
cargo fmt and black
ctb Jan 5, 2025
d501ed7
disable pyexample stuff
ctb Jan 5, 2025
8c8dead
cargo fmt and black
ctb Jan 5, 2025
50be8a7
add manysearch x rocksdb
ctb Jan 5, 2025
1f44422
separate out Rust API changes from #552
ctb Jan 5, 2025
6a1b92d
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 5, 2025
7e5a141
rm obsolete code
ctb Jan 5, 2025
46da554
upd overlap
ctb Jan 5, 2025
052d551
Merge branch 'refactor_gather_csv' into factor_out_rust_api
ctb Jan 5, 2025
436dd7a
fix
ctb Jan 5, 2025
4a3ccaf
fix
ctb Jan 5, 2025
746ea88
fix
ctb Jan 5, 2025
88430b0
remove remaining Box/error stuff
ctb Jan 5, 2025
b74f9d8
remove remaining join => Err
ctb Jan 5, 2025
8fded7a
remove remaining Box/error stuff
ctb Jan 5, 2025
f6019b1
remove remaining join => Err
ctb Jan 5, 2025
86e1afc
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 5, 2025
ef6671a
add multisearch_against
ctb Jan 5, 2025
a79da94
add pairwise
ctb Jan 5, 2025
f6b423c
add basic pairwise and multisearch tests
ctb Jan 5, 2025
95f8492
fmt
ctb Jan 5, 2025
8c9b73f
clean up error handling a bit
ctb Jan 5, 2025
67d1468
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 5, 2025
b5c4e50
upd fastmultigather
ctb Jan 6, 2025
aadc9ac
refactor out loading of sketches
ctb Jan 6, 2025
5d5cc45
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 6, 2025
ebbd67b
fix tests
ctb Jan 6, 2025
547484a
update docs
ctb Jan 6, 2025
bf9a5b7
upd
ctb Jan 6, 2025
089f6ab
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 7, 2025
6091cf8
break test again
ctb Jan 7, 2025
92d634a
do heinous dev stuff
ctb Jan 7, 2025
8fea8a7
fix fix comment
ctb Jan 7, 2025
4734806
Merge branch 'fix_skip_test' into refactor_gather_csv
ctb Jan 7, 2025
ea52473
upd
ctb Jan 7, 2025
38326e3
upd
ctb Jan 7, 2025
a1a646d
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 7, 2025
401562c
do not require -o after all
ctb Jan 7, 2025
6293ced
Merge branch 'refactor_gather_csv' into factor_out_rust_api
ctb Jan 7, 2025
55c810a
fix for crate_empty_results
ctb Jan 7, 2025
bb82cf0
fmt
ctb Jan 7, 2025
4168c75
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 7, 2025
39e2f10
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 7, 2025
660b17f
remove @CTB notes
ctb Jan 7, 2025
b778f52
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 7, 2025
520e4ff
Revert "remove @CTB notes"
ctb Jan 7, 2025
465d466
fix fix
ctb Jan 7, 2025
6a41221
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 7, 2025
442ce72
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 7, 2025
442af9d
remove obligatory clone from Rust layer
ctb Jan 7, 2025
73c06a0
Merge branch 'factor_out_rust_api' into branch_api_2
ctb Jan 7, 2025
9d3eaea
fix
ctb Jan 7, 2025
13b03d3
Merge branch 'main' of github.com:sourmash-bio/sourmash_plugin_branch…
ctb Jan 7, 2025
fix clippy messages about too-complex returns
ctb committed Jan 4, 2025
commit 41f1b07d90e96f9d55ace489b8973639ffa52548
18 changes: 8 additions & 10 deletions src/manysearch.rs
@@ -19,6 +19,13 @@ use sourmash::signature::SigsTrait;
 use sourmash::sketch::minhash::KmerMinHash;
 use sourmash::storage::SigStore;
 
+type AbundanceStats = (Option<u64>,
+Option<u64>,
+Option<f64>,
+Option<f64>,
+Option<f64>);
+
+
 pub fn manysearch(
     query_filepath: String,
     against_filepath: String,
@@ -174,16 +181,7 @@ pub fn manysearch(
 fn inflate_abundances(
     query: &KmerMinHash,
     against: &KmerMinHash,
-) -> Result<
-    (
-        Option<u64>,
-        Option<u64>,
-        Option<f64>,
-        Option<f64>,
-        Option<f64>,
-    ),
-    SourmashError,
-> {
+) -> Result<AbundanceStats, SourmashError> {
     let abunds: Vec<u64>;
     let sum_weighted: u64;
     let sum_all_abunds: u64 = against.sum_abunds();
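The change to `inflate_abundances` follows the usual remedy for Clippy's `type_complexity` lint: name the tuple once with a `type` alias and use the alias in the signature. The sketch below shows the same pattern in isolation. `DemoError` and `summarize_abundances` are hypothetical stand-ins invented for illustration; they are not part of the plugin, and the meaning given to each tuple slot here is an assumption, not the meaning used in `manysearch.rs`.

```rust
// Hypothetical error type standing in for SourmashError.
#[derive(Debug, PartialEq)]
struct DemoError(String);

// Same shape as the alias added in the diff; naming the tuple keeps
// function signatures short.
type AbundanceStats = (
    Option<u64>,
    Option<u64>,
    Option<f64>,
    Option<f64>,
    Option<f64>,
);

// Hypothetical helper: returns (count, sum, mean, median, population std dev).
fn summarize_abundances(abunds: &[u64]) -> Result<AbundanceStats, DemoError> {
    if abunds.is_empty() {
        return Err(DemoError("no abundances".to_string()));
    }
    let n = abunds.len() as u64;
    let total: u64 = abunds.iter().sum();
    let mean = total as f64 / n as f64;

    // Median of the sorted values.
    let mut sorted = abunds.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    let median = if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) as f64 / 2.0
    } else {
        sorted[mid] as f64
    };

    // Population variance (divides by n, not n - 1).
    let variance = abunds
        .iter()
        .map(|&a| (a as f64 - mean).powi(2))
        .sum::<f64>()
        / n as f64;

    Ok((Some(n), Some(total), Some(mean), Some(median), Some(variance.sqrt())))
}

fn main() {
    // Callers destructure the alias exactly as they would the bare tuple.
    let (n, total, mean, median, std) = summarize_abundances(&[1, 2, 3, 6]).unwrap();
    println!("{:?} {:?} {:?} {:?} {:?}", n, total, mean, median, std);
}
```

Because the alias is only a name for the tuple, existing call sites that destructure the return value need no changes.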
14 changes: 7 additions & 7 deletions src/multisearch.rs
@@ -17,6 +17,12 @@ use crate::utils::multicollection::SmallSignature;
 use crate::utils::{csvwriter_thread, load_collection, MultiSearchResult, ReportType};
 use sourmash::ani_utils::ani_from_containment;
 
+type OverlapStatsReturn = (f64,
+HashMap<u64, f64>,
+HashMap<u64, f64>,
+HashMap<String, HashMap<u64, f64>>,
+HashMap<u64, f64>);
+
 #[derive(Default, Clone, Debug)]
 struct ProbOverlapStats {
     prob_overlap: f64,
@@ -71,13 +77,7 @@ fn compute_single_prob_overlap(
 fn compute_prob_overlap_stats(
     queries: &Vec<SmallSignature>,
     againsts: &Vec<SmallSignature>,
-) -> (
-    f64,
-    HashMap<u64, f64>,
-    HashMap<u64, f64>,
-    HashMap<String, HashMap<u64, f64>>,
-    HashMap<u64, f64>,
-) {
+) -> OverlapStatsReturn {
     let n_comparisons = againsts.len() as f64 * queries.len() as f64;
 
     // Combine all the queries and against into a single signature each
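A `type` alias such as `OverlapStatsReturn` does not create a new type, so it is interchangeable with the spelled-out tuple wherever one is expected. The minimal sketch below illustrates that property with a smaller alias. `FreqStats` and `hash_frequencies` are hypothetical names made up for this example and do not appear in `multisearch.rs`.

```rust
use std::collections::HashMap;

// Hypothetical two-element alias; the alias in the diff has five elements.
type FreqStats = (f64, HashMap<u64, f64>);

// Hypothetical helper: returns the number of sketches and, for each hash,
// its number of occurrences divided by the number of sketches.
fn hash_frequencies(sketches: &[Vec<u64>]) -> FreqStats {
    let n = sketches.len() as f64;
    let mut freqs: HashMap<u64, f64> = HashMap::new();
    for sketch in sketches {
        for &h in sketch {
            *freqs.entry(h).or_insert(0.0) += 1.0;
        }
    }
    // Turn raw counts into fractions; the loop body never runs when
    // there are no sketches, so there is no division by zero.
    for v in freqs.values_mut() {
        *v /= n;
    }
    (n, freqs)
}

fn main() {
    // The binding is annotated with the full tuple type, not the alias;
    // this compiles because both names refer to the same type.
    let spelled_out: (f64, HashMap<u64, f64>) =
        hash_frequencies(&[vec![1, 2], vec![2, 3]]);
    println!("{} sketches, {} distinct hashes", spelled_out.0, spelled_out.1.len());
}
```

This is why a refactor of this kind can be limited to signatures: no caller has to be rewritten to adopt the alias.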