Commit

Merge branch 'latest' of github.com:sourmash-bio/sourmash into propagate_zip_errors
ctb committed Dec 14, 2024
2 parents 3f4bcae + 7362b43 commit 810c880
Showing 4 changed files with 40 additions and 33 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -10,7 +10,7 @@ repos:
- id: check-toml
- id: debug-statements
- repo: https://github.com/astral-sh/ruff-pre-commit
-rev: v0.8.1
+rev: v0.8.2
hooks:
- id: ruff-format
- id: ruff
46 changes: 17 additions & 29 deletions Cargo.lock

Some generated files are not rendered by default.

19 changes: 19 additions & 0 deletions doc/databases.md
@@ -30,6 +30,25 @@ The databases do not need to be unpacked or prepared in any way after download.

You can verify that they've been successfully downloaded (and view database properties such as `ksize` and `scaled`) with `sourmash sig summarize <output>`.

## Sketches for human and animal genomes

These sketches are of the latest releases of a number of animal
genomes. Among other uses, they can help detect host
contamination in microbial metagenomes.

Each file includes sketches at k=21, k=31, and k=51, at a `scaled` value of
1000, and is about 110 MB.

* Human (hg38) - [hg38.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/hg38.sig.zip)
* Cow (bosTau9) - [bosTau9.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/bosTau9.sig.zip)
* Dog (canFam6) - [canFam6.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/canFam6.sig.zip)
* Horse (equCab3) - [equCab3.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/equCab3.sig.zip)
* Cat (felCat9) - [felCat9.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/felCat9.sig.zip)
* Chicken (galGal6) - [galGal6.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/galGal6.sig.zip)
* Mouse (mm39) - [mm39.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/mm39.sig.zip)
* Sheep (oviAri4) - [oviAri4.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/oviAri4.sig.zip)
* Pig (susScr11) - [susScr11.sig.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host/susScr11.sig.zip)

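The download links above all follow one pattern, so fetching several host sketches can be scripted. The following is a minimal sketch in Python using only the standard library; the base URL and assembly names are taken from the list above, while the names `HOST_ASSEMBLIES`, `sketch_url`, and `fetch_sketch` are hypothetical helpers introduced for illustration and are not part of sourmash.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and assembly names copied from the list above.
BASE_URL = "https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/host"

# Hypothetical helper table: the assemblies offered as host sketches.
HOST_ASSEMBLIES = [
    "hg38", "bosTau9", "canFam6", "equCab3", "felCat9",
    "galGal6", "mm39", "oviAri4", "susScr11",
]


def sketch_url(assembly: str) -> str:
    """Return the download URL for one host sketch file."""
    if assembly not in HOST_ASSEMBLIES:
        raise ValueError(f"unknown host assembly: {assembly!r}")
    return f"{BASE_URL}/{assembly}.sig.zip"


def fetch_sketch(assembly: str, dest_dir: str = ".") -> Path:
    """Download one sketch file unless it is already present locally."""
    dest = Path(dest_dir) / f"{assembly}.sig.zip"
    if not dest.exists():
        # Each file is about 110 MB, so this may take a while.
        urlretrieve(sketch_url(assembly), dest)
    return dest


if __name__ == "__main__":
    # Print the URLs only; call fetch_sketch(name) to actually download.
    for name in HOST_ASSEMBLIES:
        print(sketch_url(name))
```

After downloading, a file such as `hg38.sig.zip` can be checked with `sourmash sig summarize hg38.sig.zip`, as described earlier in this document.
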
## GTDB R08-RS214 - DNA databases

[GTDB R08-RS214](https://forum.gtdb.ecogenomic.org/t/announcing-gtdb-r08-rs214/456) consists of 402,709 genomes organized into 85,205 species clusters.
6 changes: 3 additions & 3 deletions src/core/Cargo.toml
@@ -43,7 +43,7 @@ log = "0.4.22"
md5 = "0.7.0"
memmap2 = "0.9.5"
murmurhash3 = "0.0.5"
-needletail = { version = "0.6.0", default-features = false }
+needletail = { version = "0.6.1", default-features = false }
niffler = { version = "2.4.0", default-features = false, features = [ "gz" ] }
nohash-hasher = "0.2.0"
num-iter = "0.1.45"
@@ -53,11 +53,11 @@ piz = "0.5.0"
primal-check = "0.3.4"
rayon = { version = "1.10.0", optional = true }
rkyv = { version = "0.7.44", optional = true }
-roaring = "0.10.7"
+roaring = "0.10.8"
roots = "0.0.8"
serde = { version = "1.0.215", features = ["derive"] }
serde_json = "1.0.133"
-statrs = "0.17.1"
+statrs = "0.18.0"
streaming-stats = "0.2.3"
thiserror = "2.0"
twox-hash = "1.6.0"