update readme
jeff-k committed May 12, 2024
1 parent 3abf81b commit 1c54ae4
Showing 3 changed files with 85 additions and 4 deletions.
22 changes: 22 additions & 0 deletions .github/workflows/rust.yml
@@ -0,0 +1,22 @@
name: Rust

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

env:
  CARGO_TERM_COLOR: always

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v3
    - name: Build
      run: cargo build --verbose
    - name: Run tests
      run: cargo test --verbose
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -6,12 +6,12 @@ edition = "2021"
[dependencies]
futures = "0.3"
futures-test = "0.3"
-bio-seq = { path="../bio-seq/bio-seq" }
+bio-seq = "0.12"

[dev-dependencies]
flate2 = "1"
clap = { version="4", features=["derive"] }
-bio-seq = { path="../bio-seq/bio-seq" }
+bio-seq = "0.12"

[[example]]
name = "fqcheck"
63 changes: 61 additions & 2 deletions README.md
@@ -1,6 +1,65 @@
# bio-streams
[![Docs.rs](https://docs.rs/bio-streams/badge.svg)](https://docs.rs/bio-streams)
[![CI status](https://github.com/jeff-k/bio-streams/actions/workflows/rust.yml/badge.svg)](https://github.com/jeff-k/bio-streams/actions/workflows/rust.yml)

## examples
<div class="title-block" style="text-align: center;" align="center">

# bio-streams

### Types and datastructures for streaming genomics data

#### This crate is in early development. Contributions are very welcome.

WebAssembly examples: [Remove non-M. TB reads from streaming fastqs](https://jeff-k.github.io/fqdemo/), [amplicon-based SARS-CoV-2 assembly](https://jeff-k.github.io/amplicon-tiling/)
</div>

## Features

`Fastq` and `Fasta` streams share a single `Record` type:

```rust
pub struct Record<T: for<'a> TryFrom<&'a [u8]> = Vec<u8>> {
    pub fields: Vec<u8>,
    pub seq: T,
    pub quality: Option<Vec<Phred>>, // fasta records set quality to `None`
}
```

Records can be read into custom types: `pub struct Fastq<R: BufRead, T = Seq<Dna>>`
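
As a rough sketch of what the `T: for<'a> TryFrom<&'a [u8]>` bound asks of a custom sequence type, the block below defines a type that accepts only upper-case `A`, `C`, `G` and `T`. The names `AcgtSeq`, `InvalidBase` and `convert` are hypothetical and are not part of bio-streams; `convert` only stands in for the conversion a parser would perform, under the assumption that it applies the same bound.

```rust
use std::convert::TryFrom;

/// Hypothetical sequence type (not part of bio-streams): upper-case ACGT only.
#[derive(Debug, PartialEq)]
pub struct AcgtSeq(pub Vec<u8>);

/// Hypothetical error type: the first rejected byte and its offset.
#[derive(Debug, PartialEq)]
pub struct InvalidBase {
    pub position: usize,
    pub byte: u8,
}

impl<'a> TryFrom<&'a [u8]> for AcgtSeq {
    type Error = InvalidBase;

    fn try_from(raw: &'a [u8]) -> Result<Self, Self::Error> {
        // Reject the whole sequence at the first byte outside A, C, G, T.
        match raw.iter().position(|b| !matches!(b, b'A' | b'C' | b'G' | b'T')) {
            Some(position) => Err(InvalidBase { position, byte: raw[position] }),
            None => Ok(AcgtSeq(raw.to_vec())),
        }
    }
}

/// Hypothetical stand-in for a parser's conversion step. It uses the same
/// bound as `Record<T>`, so any type accepted here satisfies that bound.
pub fn convert<T: for<'a> TryFrom<&'a [u8]>>(raw: &[u8]) -> Option<T> {
    T::try_from(raw).ok()
}
```

The default `Vec<u8>` satisfies the same bound through the standard library's `From<&[u8]>` implementation, so `convert::<Vec<u8>>(b"ACGT")` also compiles and simply copies the bytes.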

## Examples

```rust
// Open a pair of gzipped fastq files as streams of `Record`s with `Seq<Dna>` sequences

let fq1: Fastq<BufReader<MultiGzDecoder<File>>> = Fastq::new(BufReader::new(
    MultiGzDecoder::new(File::open(&file1).unwrap()),
));

let fq2: Fastq<BufReader<MultiGzDecoder<File>>> = Fastq::new(BufReader::new(
    MultiGzDecoder::new(File::open(&file2).unwrap()),
));

for zipped in fq1.zip(fq2) {
    match zipped {
        (Ok(r1), Ok(r2)) => {
            // check that the last characters of the name strings are 1 and 2
            if r1.fields[r1.fields.len() - 1] != b'1' || r2.fields[r2.fields.len() - 1] != b'2'
            {
                eprintln!("paired records do not end in 1/2");
            }

            // check that the description fields are equal up to the last character
            if r1.fields[..r1.fields.len() - 1] != r2.fields[..r2.fields.len() - 1] {
                eprintln!("reads do not have the same names");
                exit(1);
            }
        }
        _ => {
            eprintln!("Parse error in fastq files");
        }
    }
}
```
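
The name check in the loop above indexes `fields.len() - 1`, which panics if a record's `fields` is empty. A minimal sketch of the same comparison as a standalone function that cannot panic might look like the following. `PairCheck` and `check_pair` are hypothetical names, not part of bio-streams, and the sketch reports a name mismatch ahead of a bad suffix because the loop above treats the mismatch as the fatal case.

```rust
/// Hypothetical result of comparing the name fields of two paired reads.
#[derive(Debug, PartialEq)]
pub enum PairCheck {
    /// Names agree up to the final byte, which is `1` and `2` respectively.
    Matched,
    /// Names agree, but the final bytes are not `1` and `2`.
    BadSuffix,
    /// Names differ before the final byte.
    NameMismatch,
    /// At least one name is empty.
    Empty,
}

/// Compare two name fields without indexing, so empty input cannot panic.
pub fn check_pair(name1: &[u8], name2: &[u8]) -> PairCheck {
    // `split_last` yields `None` for an empty slice instead of panicking.
    match (name1.split_last(), name2.split_last()) {
        (Some((last1, stem1)), Some((last2, stem2))) => {
            if stem1 != stem2 {
                PairCheck::NameMismatch
            } else if *last1 != b'1' || *last2 != b'2' {
                PairCheck::BadSuffix
            } else {
                PairCheck::Matched
            }
        }
        _ => PairCheck::Empty,
    }
}
```

Inside the loop this could be called as `check_pair(&r1.fields, &r2.fields)`, leaving the caller to decide which outcomes warrant a warning and which an exit.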

To run the `fqcheck` example program with read files `r1.fq.gz` and `r2.fq.gz`:

