MRG: revise documentation structure; add internals page. #2184

Merged on Oct 15, 2023 (66 commits)
Changes from 9 commits
Commits
77e4f49
first try at revised index
ctb Aug 6, 2022
9f9f046
Merge branch 'latest' into update_doc_structure
ctb Aug 15, 2022
33eeb06
lots of details
ctb Aug 16, 2022
68fe6d7
much docs, so wow
ctb Aug 16, 2022
0bf170d
even more docs
ctb Aug 16, 2022
f3ccf11
even more docs
ctb Aug 16, 2022
4c60cd7
mo'
ctb Aug 16, 2022
471ddf4
add issue links
ctb Aug 16, 2022
419e433
fix more links
ctb Aug 16, 2022
8e9248f
notes
ctb Aug 17, 2022
91fbf7c
update text
ctb Aug 17, 2022
b67d38f
Apply suggestions from code review
ctb Aug 18, 2022
fbbcee5
address many of @ccbaumler suggestions
ctb Aug 18, 2022
87ab0dc
Update doc/sourmash-internals.md
ctb Aug 18, 2022
ee17cf3
misc changes
ctb Aug 18, 2022
bb4a89d
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Aug 19, 2022
6d4a397
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Aug 20, 2022
8fce224
whups, add faq.md
ctb Aug 20, 2022
9e1fbfc
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Aug 21, 2022
df8d0e2
Merge branch 'latest' into update_doc_structure
ctb Aug 22, 2022
a967534
Merge branch 'latest' into update_doc_structure
ctb Aug 26, 2022
a1ec24b
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Aug 30, 2022
2b6e3df
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Aug 31, 2022
e468205
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Sep 23, 2023
2f1eb4b
update some text
ctb Sep 23, 2023
6576d07
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Sep 26, 2023
87ae6a2
more FAQ
ctb Sep 26, 2023
1d5b754
update / add verify notes
ctb Sep 26, 2023
7a254ee
more writing
ctb Sep 26, 2023
651ca6e
more
ctb Sep 26, 2023
01ed183
more
ctb Sep 27, 2023
8b2fac3
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Sep 27, 2023
d99d7c9
fix links etc
ctb Sep 27, 2023
c60da96
add faq
ctb Sep 27, 2023
0120b78
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Sep 28, 2023
47aa759
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Sep 29, 2023
b21282e
add funding acks
ctb Sep 29, 2023
de2b5a4
detailed gather output
ctb Sep 30, 2023
b789b73
minor updates
ctb Sep 30, 2023
fb45b6c
add information about save formats and memory usage
ctb Sep 30, 2023
69dfbc0
add section on choosing a reference for read mapping
ctb Sep 30, 2023
e828ff9
add section on retrieving reads to FAQ
ctb Sep 30, 2023
4d2a91b
collection load order
ctb Sep 30, 2023
4e97e8e
update author/copyright
ctb Sep 30, 2023
571c865
minor corrections
ctb Sep 30, 2023
c6251d3
close to done with a first pass
ctb Oct 1, 2023
1d1e460
finish off internals
ctb Oct 1, 2023
4ed0367
finish things off?
ctb Oct 1, 2023
a4c1244
clean up missing refs
ctb Oct 3, 2023
3967b15
add publications
ctb Oct 3, 2023
30ed122
Merge branch 'latest' into update_doc_structure
ctb Oct 3, 2023
362a48f
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Oct 13, 2023
1f44043
Merge branch 'update_doc_structure' of https://github.com/sourmash-bi…
ctb Oct 13, 2023
535d7eb
update new -> index
ctb Oct 13, 2023
6026ecb
Apply suggestions from code review
ctb Oct 13, 2023
247d4e4
Merge branch 'update_doc_structure' into update_doc_structure_index
ctb Oct 14, 2023
82aa9ed
update
ctb Oct 14, 2023
14e9a51
add global toc
ctb Oct 14, 2023
c30aea6
remove ToC at top
ctb Oct 14, 2023
a08a36f
try unhiding
ctb Oct 14, 2023
d128c65
add sidebar
ctb Oct 14, 2023
13fde86
add TOC to sidebar
ctb Oct 14, 2023
8144c3c
upd sidebar
ctb Oct 14, 2023
6a5bb1a
rename internals page
ctb Oct 14, 2023
817092f
clean up
ctb Oct 14, 2023
b944b69
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Oct 15, 2023
76 changes: 76 additions & 0 deletions doc/new.md
@@ -0,0 +1,76 @@
# Welcome to sourmash!

sourmash is a command-line tool and Python/Rust library for
**metagenome analysis** and **genome comparison** with k-mers. It
supports the compositional analysis of metagenomes, rapid search of
large sequence databases, and flexible taxonomic analysis with both
NCBI and GTDB taxonomies. sourmash works well with sequences 30 kb or
larger, including bacterial and viral genomes.

You might try sourmash if you want to:

* identify which reference genomes are present in a metagenome
* search all GenBank microbial genomes with a genome query
* build an annotation-free clustering of many genomes using k-mers or average nucleotide identity (ANI)
* taxonomically classify genomes or metagenomes against NCBI and/or GTDB
* search all available reference genomes
* search thousands of metagenomes with a query genome or sequence

Underneath, sourmash uses [FracMinHash sketches](@@) for fast and
lightweight sequence comparison; FracMinHash builds on
[MinHash sketching](@@wikipedia) to support both Jaccard similarity
_and_ containment analyses with k-mers. This significantly expands
the range of operations that can be done quickly and in low
memory. sourmash also implements a number of new and powerful analysis
techniques, including minimum metagenome covers and alignment-free ANI
estimation.
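
To make the idea concrete, here is a minimal pure-Python sketch of FracMinHash-style subsampling, written for illustration only. It does not use the sourmash library: the function names, the `blake2b` stand-in hash, and the toy random sequences are all assumptions, and it skips details that sourmash handles (such as treating a DNA k-mer and its reverse complement as the same k-mer). The key point it tries to show is that keeping every hash below a fixed threshold means a genome's sketch is a subset of the sketch of any sample that contains that genome, which is what makes containment estimates possible.

```python
import hashlib
import random

MAX_HASH = 2**64  # size of the 64-bit hash space used in this sketch


def kmers(seq, k):
    """Yield every overlapping k-mer in seq (forward strand only)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (blake2b is a stand-in hash)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=4):
    """Keep hashes below MAX_HASH / scaled: roughly 1/scaled of distinct k-mers."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def jaccard(a, b):
    """Jaccard similarity of two sketches: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of sketch a that is also present in sketch b."""
    return len(a & b) / len(a) if a else 0.0


# Toy data (hypothetical): a random "genome" embedded in a larger "metagenome".
rng = random.Random(42)
genome = "".join(rng.choice("ACGT") for _ in range(2000))
other = "".join(rng.choice("ACGT") for _ in range(2000))
metagenome = genome + other

g = frac_minhash(genome)
m = frac_minhash(metagenome)
print("containment of genome in metagenome:", containment(g, m))
print("Jaccard similarity:", jaccard(g, m))
```

In this toy setup the containment of the genome in the metagenome is exactly 1.0, because every retained genome hash is also retained for the metagenome, while the Jaccard similarity is lower because the metagenome contributes additional k-mers. A larger `scaled` value keeps fewer hashes, trading resolution for smaller sketches.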

sourmash is inspired by [mash](https://mash.readthedocs.io), and
supports most mash analyses. sourmash also implements an expanded set
of functionality for metagenome and taxonomic analysis.

## Using sourmash

### Tutorials and examples

These command-line tutorials should work on macOS and Linux. They
require about 5 GB of disk space and 5 GB of RAM.

* [The first sourmash tutorial - making signatures, comparing, and searching](tutorial-basic.md)

* [Using sourmash LCA to do taxonomic classification](tutorials-lca.md)

* [Analyzing the genomic and taxonomic composition of an environmental genome using GTDB and sample-specific MAGs with sourmash](tutorial-lemonade.md)

* [Some sourmash command line examples!](sourmash-examples.md)

### How-To Guides

* Installing sourmash
Review comment (Contributor), suggested change:
- * Installing sourmash
+ * [Installing sourmash (The first three sections here)](tutorial-basic.md)

Reply (Contributor Author): hmm... have to dig!

* [Classifying genome sketches](classifying-signatures.md)

* [Working with private collections of genome sketches](sourmash-collections.md)

* [Using the `LCA_Database` API](using-LCA-database-API.ipynb)

* [Building plots from `sourmash compare` output](plotting-compare.md)

* [A short guide to using sourmash output with R](other-languages.md)

### How sourmash works under the hood

* [An introduction to k-mers for genome comparison and analysis](kmers-and-minhash.md)
* [Support, versioning, and migration between versions](support.md)

### Reference material

* [UNIX command-line documentation](command-line.md)
* [GenBank and GTDB databases and taxonomy files](databases.md)
* [Python examples using the API](api-example.md)
* [Publications about sourmash](publications.md)
* [A guide to the internals of sourmash](sourmash-internals.md)

## Developing and extending sourmash

* [Releasing a new version of sourmash](release.md)