Samtools idxstats component (viash-hub#32)
* idxstats component

* Update samtools version

* Apply suggestions for PR

* Update src/samtools/samtools_idxstats/test.sh

Add a slash to path to test data

Co-authored-by: Robrecht Cannoodt <[email protected]>

---------

Co-authored-by: Robrecht Cannoodt <[email protected]>
emmarousseau and rcannood authored Apr 3, 2024
1 parent 2433679 commit 1200bc3
Showing 12 changed files with 141 additions and 1 deletion.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -37,7 +37,7 @@

* `samtools`:
- `samtools/flagstat`: Counts the number of alignments in SAM/BAM/CRAM files for each FLAG type (PR #31).

- `samtools/idxstats`: Reports alignment summary statistics for a SAM/BAM/CRAM file (PR #32).

## MAJOR CHANGES

53 changes: 53 additions & 0 deletions src/samtools/samtools_idxstats/config.vsh.yaml
@@ -0,0 +1,53 @@
name: samtools_idxstats
namespace: samtools
description: Reports alignment summary statistics for a BAM file.
keywords: [stats, mapping, counts, chromosome, bam, sam, cram]
links:
  homepage: https://www.htslib.org/
  documentation: https://www.htslib.org/doc/samtools-idxstats.html
  repository: https://github.com/samtools/samtools
references:
  doi: [10.1093/bioinformatics/btp352, 10.1093/gigascience/giab008]
license: MIT/Expat

argument_groups:
  - name: Inputs
    arguments:
      - name: "--bam"
        type: file
        description: BAM input file.
      - name: "--bai"
        type: file
        description: BAM index file.
      - name: "--fasta"
        type: file
        description: Reference file the CRAM was created with (optional).
  - name: Outputs
    arguments:
      - name: "--output"
        type: file
        description: |
          File containing samtools idxstats output in tab-delimited format.
        required: true
        must_exist: false
        example: output.idxstats

resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - type: docker
    image: quay.io/biocontainers/samtools:1.19.2--h50ea8bc_1
    setup:
      - type: docker
        run: |
          samtools --version 2>&1 | grep -E '^(samtools|Using htslib)' | \
          sed 's#Using ##;s# \([0-9\.]*\)$#: \1#' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
12 changes: 12 additions & 0 deletions src/samtools/samtools_idxstats/help.txt
@@ -0,0 +1,12 @@
```
samtools idxstats
```

Usage: samtools idxstats [options] <in.bam>
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
  -@, --threads INT
               Number of additional threads to use [0]
      --verbosity INT
               Set level of verbosity
8 changes: 8 additions & 0 deletions src/samtools/samtools_idxstats/script.sh
@@ -0,0 +1,8 @@
#!/bin/bash

## VIASH START
## VIASH END

set -e

samtools idxstats "$par_bam" > "$par_output"
49 changes: 49 additions & 0 deletions src/samtools/samtools_idxstats/test.sh
@@ -0,0 +1,49 @@
#!/bin/bash

test_dir="${meta_resources_dir}/test_data"
echo ">>> Testing $meta_functionality_name"

"$meta_executable" \
--bam "$test_dir/a.sorted.bam" \
--bai "$test_dir/a.sorted.bam.bai" \
--output "$test_dir/a.sorted.idxstats"

echo ">>> Checking whether output exists"
[ ! -f "$test_dir/a.sorted.idxstats" ] && echo "File 'a.sorted.idxstats' does not exist!" && exit 1

echo ">>> Checking whether output is non-empty"
[ ! -s "$test_dir/a.sorted.idxstats" ] && echo "File 'a.sorted.idxstats' is empty!" && exit 1

echo ">>> Checking whether output is correct"
diff "$test_dir/a.sorted.idxstats" "$test_dir/a_ref.sorted.idxstats" || \
  { echo "Output file a.sorted.idxstats does not match expected output"; exit 1; }

rm "$test_dir/a.sorted.idxstats"

############################################################################################

echo ">>> Testing $meta_functionality_name with singletons in the input"

"$meta_executable" \
--bam "$test_dir/test.paired_end.sorted.bam" \
--bai "$test_dir/test.paired_end.sorted.bam.bai" \
--output "$test_dir/test.paired_end.sorted.idxstats"

echo ">>> Checking whether output exists"
[ ! -f "$test_dir/test.paired_end.sorted.idxstats" ] && \
echo "File 'test.paired_end.sorted.idxstats' does not exist!" && exit 1

echo ">>> Checking whether output is non-empty"
[ ! -s "$test_dir/test.paired_end.sorted.idxstats" ] && \
echo "File 'test.paired_end.sorted.idxstats' is empty!" && exit 1

echo ">>> Checking whether output is correct"
diff "$test_dir/test.paired_end.sorted.idxstats" "$test_dir/test_ref.paired_end.sorted.idxstats" || \
  { echo "Output file test.paired_end.sorted.idxstats does not match expected output"; exit 1; }

rm "$test_dir/test.paired_end.sorted.idxstats"

############################################################################################

echo "All tests succeeded!"
exit 0
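
Both test cases in test.sh run the same three checks: the output exists, is non-empty, and matches a reference file. These could be factored into a single function. The following is only a sketch and not part of the commit; the name `check_output` is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper (not part of the commit): verify that an output file
# exists, is non-empty, and is identical to a reference file.
# Returns 0 if all checks pass, 1 on the first check that fails.
check_output() {
  local output="$1" expected="$2"

  if [ ! -f "$output" ]; then
    echo "File '$output' does not exist!"
    return 1
  fi
  if [ ! -s "$output" ]; then
    echo "File '$output' is empty!"
    return 1
  fi
  # diff prints the differing lines, which helps when debugging a failure.
  if ! diff "$output" "$expected"; then
    echo "Output file '$output' does not match expected output"
    return 1
  fi
  return 0
}
```

A test case could then call it as `check_output "$test_dir/a.sorted.idxstats" "$test_dir/a_ref.sorted.idxstats" || exit 1`, which stops the script on failure without relying on `set -e`.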
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,2 @@
xx 20 6 0
* 0 0 0
14 changes: 14 additions & 0 deletions src/samtools/samtools_idxstats/test_data/script.sh
@@ -0,0 +1,14 @@
#!/bin/bash

# download test data from snakemake wrapper
if [ ! -d /tmp/idxstats_source ]; then
git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers.git /tmp/idxstats_source
fi

cp -r /tmp/idxstats_source/bio/samtools/idxstats/test/mapped/* src/samtools/samtools_idxstats/test_data
# samtools idxstats a.sorted.bam > a.sorted.idxstats

# download test data from nf-core module
wget https://github.com/nf-core/test-datasets/raw/modules/data/genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam
wget https://github.com/nf-core/test-datasets/raw/modules/data/genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai
# samtools idxstats test.paired_end.sorted.bam > test_ref.paired_end.sorted.idxstats
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,2 @@
MT192765.1 29829 197 3
* 0 0 0
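
According to the samtools idxstats documentation, each line of the output has four tab-separated columns: reference sequence name, sequence length, number of mapped read-segments, and number of unmapped read-segments. A downstream step could total the mapped column with awk. This is a minimal sketch, not part of the commit; the function name `sum_mapped` is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper (not part of the commit): sum column 3 (mapped
# read-segments) over every line of a tab-delimited idxstats file.
# "total + 0" forces numeric output, so an empty file prints 0.
sum_mapped() {
  awk -F'\t' '{ total += $3 } END { print total + 0 }' "$1"
}
```

For example, `sum_mapped test_ref.paired_end.sorted.idxstats` prints the total as a single number on standard output.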
