Library for out-of-memory sorting of large datasets which need to be processed in multiple map / sort / reduce passes.
You write a stream of items of type `T` implementing `Serialize` and `Deserialize` to a `ShardWriter`. The items are buffered, sorted according to a customizable sort key, then serialized to disk in chunks with serde + lz4, while maintaining an index of the position and key range of each chunk. You use a `ShardReader` to stream through the items in a selected interval of the key space, in sorted order.
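A minimal sketch of this write/read cycle is shown below. It assumes the crate is imported as `shardio` and follows the calls from the crate's documentation (`ShardWriter::new(path, sender_buffer, disk_chunk_size, item_buffer)`, `get_sender`, `finish`, `ShardReader::open`, `iter_range` with `Range::all()`); the `Rec` struct, file name, and buffer sizes are illustrative, and exact signatures may vary between versions.

```rust
use serde::{Deserialize, Serialize};
use shardio::*;

// Items are sorted by the derived `Ord`, i.e. by `a`, then by `b`.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Serialize, Deserialize)]
struct Rec {
    a: u64,
    b: u32,
}

fn main() {
    let filename = "example.shardio";

    {
        // Buffer and chunk sizes trade memory use against how many disk
        // chunks a later range query must touch; these values are illustrative.
        let mut writer: ShardWriter<Rec> =
            ShardWriter::new(filename, 64, 256, 1 << 16).expect("create writer");

        // A sender is a handle for feeding items to the writer.
        let mut sender = writer.get_sender();
        for i in 0u64..10_000 {
            sender.send(Rec { a: i % 25, b: (i % 100) as u32 }).expect("send item");
        }
        // Flush this sender, then finalize the file (sort, compress, index).
        sender.finished();
        writer.finish().expect("finish write");
    }

    // Stream everything back in sorted order over the full key space.
    let reader = ShardReader::<Rec>::open(filename).expect("open reader");
    let items: Vec<Rec> = reader
        .iter_range(&Range::all())
        .expect("start iteration")
        .collect::<Result<_, _>>()
        .expect("read items");
    assert!(items.windows(2).all(|w| w[0] <= w[1]));
}
```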
See the Docs for the API and examples.
Note: enable the `full-test` feature in release mode to turn on some long-running stress tests.
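With Cargo's standard flags this is, for example, `cargo test --release --features full-test`.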