Merge branch 'viash-hub:main' into fix_configs
emmarousseau authored Apr 13, 2024
2 parents 2f72500 + cd3cd7a commit 8324bd9
Showing 33 changed files with 5,476 additions and 0 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -38,6 +38,9 @@
* `samtools`:
  - `samtools/flagstat`: Counts the number of alignments in SAM/BAM/CRAM files for each FLAG type (PR #31).
  - `samtools/idxstats`: Reports alignment summary statistics for a SAM/BAM/CRAM file (PR #32).
  - `samtools/samtools_index`: Index SAM/BAM/CRAM files (PR #35).
  - `samtools/samtools_sort`: Sort SAM/BAM/CRAM files (PR #36).
  - `samtools/samtools_stats`: Reports alignment summary statistics for a BAM file (PR #39).

## MAJOR CHANGES

67 changes: 67 additions & 0 deletions src/samtools/samtools_index/config.vsh.yaml
@@ -0,0 +1,67 @@
name: samtools_index
namespace: samtools
description: Index SAM/BAM/CRAM files.
keywords: [index, bam, sam, cram]
links:
  homepage: https://www.htslib.org/
  documentation: https://www.htslib.org/doc/samtools-index.html
  repository: https://github.com/samtools/samtools
references:
  doi: [10.1093/bioinformatics/btp352, 10.1093/gigascience/giab008]
license: MIT/Expat

argument_groups:
  - name: Inputs
    arguments:
      - name: --input
        type: file
        description: Input file name
        required: true
        must_exist: true
  - name: Outputs
    arguments:
      - name: --output
        alternatives: -o
        type: file
        description: Output file name
        required: true
        direction: output
        example: out.bam.bai
  - name: Options
    arguments:
      - name: --bai
        alternatives: -b
        type: boolean_true
        description: Generate BAI-format index for BAM files (default).
      - name: --csi
        alternatives: -c
        type: boolean_true
        description: |
          Create a CSI index for BAM files instead of the traditional BAI
          index. This is required for genomes with larger chromosome
          sizes.
      - name: --min_shift
        alternatives: -m
        type: integer
        description: |
          Create a CSI index, with a minimum interval size of 2^INT.
      - name: --multiple
        alternatives: -M
        type: boolean_true
        description: Interpret all filename arguments as files to be indexed.
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - type: docker
    image: quay.io/biocontainers/samtools:1.19.2--h50ea8bc_1
    setup:
      - type: docker
        run: |
          samtools --version 2>&1 | grep -E '^(samtools|Using htslib)' | \
          sed 's#Using ##;s# \([0-9\.]*\)$#: \1#' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
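
The `setup` step in this config records tool versions by piping `samtools --version` through `grep` and `sed`. A minimal sketch of what that filter produces, run on a hard-coded sample rather than the real binary. The function names are hypothetical and the version strings in `sample_version_output` are illustrative assumptions, not captured output:

```shell
#!/bin/bash
# Stand-in for `samtools --version`; the version numbers are made up for
# illustration, only the line layout matters here.
sample_version_output() {
  printf '%s\n' \
    'samtools 1.19.2' \
    'Using htslib 1.19.1' \
    'Copyright (C) 2024 Genome Research Ltd.'
}

# Same grep/sed filter as in the config: keep the two version lines,
# drop the "Using " prefix, and turn "name version" into "name: version".
format_versions() {
  grep -E '^(samtools|Using htslib)' | \
    sed 's#Using ##;s# \([0-9\.]*\)$#: \1#'
}

sample_version_output | format_versions
```

On this sample the filter prints `samtools: 1.19.2` followed by `htslib: 1.19.1`; the copyright line is dropped by `grep`.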
13 changes: 13 additions & 0 deletions src/samtools/samtools_index/help.txt
@@ -0,0 +1,13 @@
```
samtools index
```

Usage: samtools index -M [-bc] [-m INT] <in1.bam> <in2.bam>...
   or: samtools index [-bc] [-m INT] <in.bam> [out.index]
Options:
  -b, --bai            Generate BAI-format index for BAM files [default]
  -c, --csi            Generate CSI-format index for BAM files
  -m, --min-shift INT  Set minimum interval size for CSI indices to 2^INT [14]
  -M                   Interpret all filename arguments as files to be indexed
  -o, --output FILE    Write index to FILE [alternative to <out.index> in args]
  -@, --threads INT    Sets the number of threads [none]
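
The help text lists BAI as the default format and CSI as the alternative. A minimal sketch of how a caller might choose between them, assuming the commonly cited BAI limit of 2^29 bases (512 Mbp) per reference sequence. `needs_csi` and `index_flag_for` are hypothetical helpers, not part of samtools or of this component, and the exact handling of the boundary value is an assumption:

```shell
#!/bin/bash
# Assumed BAI limit: 2^29 positions per reference sequence.
BAI_MAX_LEN=536870912

# Succeeds (exit status 0) when a contig of the given length is too long
# for a BAI index under the assumed limit.
needs_csi() {
  [ "$1" -gt "$BAI_MAX_LEN" ]
}

# Prints the `samtools index` flag suited to the given contig length.
index_flag_for() {
  if needs_csi "$1"; then
    echo "-c"
  else
    echo "-b"
  fi
}

index_flag_for 248956422   # length of human chr1 in GRCh38
index_flag_for 600000000   # hypothetical contig longer than 2^29
```

The first call prints `-b` and the second prints `-c`.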
18 changes: 18 additions & 0 deletions src/samtools/samtools_index/script.sh
@@ -0,0 +1,18 @@
#!/bin/bash

## VIASH START
## VIASH END

set -e

# boolean_true flags arrive as the string "false" when not set; unset them
# so that the ${var:+...} expansions below emit nothing for disabled options.
[[ "$par_multiple" == "false" ]] && unset par_multiple
[[ "$par_bai" == "false" ]] && unset par_bai
[[ "$par_csi" == "false" ]] && unset par_csi

samtools index \
  "$par_input" \
  ${par_csi:+-c} \
  ${par_bai:+-b} \
  ${par_min_shift:+-m "$par_min_shift"} \
  ${par_multiple:+-M} \
  -o "$par_output"
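
This script relies on unquoted `${var:+...}` expansions to drop disabled flags. An alternative sketch collects the arguments explicitly, so that values containing spaces stay intact. `build_index_args` is a hypothetical helper and the `par_*` values set here are sample inputs; it only prints the argument list instead of calling `samtools`:

```shell
#!/bin/bash
# Sample values as the wrapper might receive them (illustrative only).
par_input="a.sorted.bam"
par_output="a.sorted.bam.csi"
par_bai="false"
par_csi="true"
par_min_shift="14"
par_multiple="false"

# Build the `samtools index` argument list in the function's positional
# parameters, adding each flag only when its option is enabled.
build_index_args() {
  set --
  if [ "$par_bai" = "true" ]; then set -- "$@" -b; fi
  if [ "$par_csi" = "true" ]; then set -- "$@" -c; fi
  if [ -n "$par_min_shift" ]; then set -- "$@" -m "$par_min_shift"; fi
  if [ "$par_multiple" = "true" ]; then set -- "$@" -M; fi
  set -- "$@" -o "$par_output" "$par_input"
  # One argument per line, to make the word boundaries visible.
  printf '%s\n' "$@"
}

build_index_args
```

With the sample values this prints `-c`, `-m`, `14`, `-o`, `a.sorted.bam.csi` and `a.sorted.bam`, one per line. In a real wrapper the final `printf` would be replaced by `samtools index "$@"`.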
91 changes: 91 additions & 0 deletions src/samtools/samtools_index/test.sh
@@ -0,0 +1,91 @@
#!/bin/bash

test_dir="${meta_resources_dir}/test_data"

echo ">>> Testing $meta_functionality_name"

echo ">>> Generating BAM index"
"$meta_executable" \
--input "$test_dir/a.sorted.bam" \
--bai \
--output "$test_dir/a.sorted.bam.bai"

echo ">>> Check whether output exists"
[ ! -f "$test_dir/a.sorted.bam.bai" ] && echo "File 'a.sorted.bam.bai' does not exist!" && exit 1

echo ">>> Check whether output is empty"
[ ! -s "$test_dir/a.sorted.bam.bai" ] && echo "File 'a.sorted.bam.bai' is empty!" && exit 1

echo ">>> Check whether output is correct"
diff "$test_dir/a.sorted.bam.bai" "$test_dir/a_ref.sorted.bam.bai" || \
  { echo "File 'a.sorted.bam.bai' does not match expected output."; exit 1; }

rm "$test_dir/a.sorted.bam.bai"

#################################################################################################

echo ">>> Generating CSI index"
"$meta_executable" \
--input "$test_dir/a.sorted.bam" \
--csi \
--output "$test_dir/a.sorted.bam.csi"

echo ">>> Check whether output exists"
[ ! -f "$test_dir/a.sorted.bam.csi" ] && echo "File 'a.sorted.bam.csi' does not exist!" && exit 1

echo ">>> Check whether output is empty"
[ ! -s "$test_dir/a.sorted.bam.csi" ] && echo "File 'a.sorted.bam.csi' is empty!" && exit 1

echo ">>> Check whether output is correct"
diff "$test_dir/a.sorted.bam.csi" "$test_dir/a_ref.sorted.bam.csi" || \
  { echo "File 'a.sorted.bam.csi' does not match expected output."; exit 1; }

rm "$test_dir/a.sorted.bam.csi"

#################################################################################################

echo ">>> Generating bam index with -M option"
"$meta_executable" \
--input "$test_dir/a.sorted.bam" \
--bai \
--output "$test_dir/a_multiple.sorted.bam.bai" \
--multiple

echo ">>> Check whether output exists"
[ ! -f "$test_dir/a_multiple.sorted.bam.bai" ] && echo "File 'a_multiple.sorted.bam.bai' does not exist!" && exit 1

echo ">>> Check whether output is empty"
[ ! -s "$test_dir/a_multiple.sorted.bam.bai" ] && echo "File 'a_multiple.sorted.bam.bai' is empty!" && exit 1

echo ">>> Check whether output is correct"
diff "$test_dir/a_multiple.sorted.bam.bai" "$test_dir/a_multiple_ref.sorted.bam.bai" || \
  { echo "File 'a_multiple.sorted.bam.bai' does not match expected output."; exit 1; }

rm "$test_dir/a_multiple.sorted.bam.bai"


#################################################################################################

echo ">>> Generating BAM index with -m option"

"$meta_executable" \
--input "$test_dir/a.sorted.bam" \
--min_shift 4 \
--bai \
--output "$test_dir/a_4.sorted.bam.bai"

echo ">>> Check whether output exists"
[ ! -f "$test_dir/a_4.sorted.bam.bai" ] && echo "File 'a_4.sorted.bam.bai' does not exist!" && exit 1

echo ">>> Check whether output is empty"
[ ! -s "$test_dir/a_4.sorted.bam.bai" ] && echo "File 'a_4.sorted.bam.bai' is empty!" && exit 1

echo ">>> Check whether output is correct"
diff "$test_dir/a_4.sorted.bam.bai" "$test_dir/a_4_ref.sorted.bam.bai" || \
  { echo "File 'a_4.sorted.bam.bai' does not match expected output."; exit 1; }

rm "$test_dir/a_4.sorted.bam.bai"

#################################################################################################


echo "All tests succeeded!"
exit 0
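
The four test cases in this file repeat the same three checks. A sketch of how they could be factored into helpers; `assert_file_exists`, `assert_file_not_empty` and `assert_identical_content` are hypothetical names, and the temporary files below stand in for real index output. Each helper signals failure with `return 1` from the function body, so the caller sees the failure directly:

```shell
#!/bin/bash
# Fails when the path is not a regular file.
assert_file_exists() {
  [ -f "$1" ] || { echo "File '$1' does not exist!"; return 1; }
}

# Fails when the file is missing or has zero size.
assert_file_not_empty() {
  [ -s "$1" ] || { echo "File '$1' is empty!"; return 1; }
}

# Fails when the two files differ.
assert_identical_content() {
  diff "$1" "$2" > /dev/null || \
    { echo "File '$1' does not match expected output."; return 1; }
}

# Demonstration on placeholder files (not real BAM indices).
tmp_dir="$(mktemp -d)"
printf 'index-bytes\n' > "$tmp_dir/out.bai"
printf 'index-bytes\n' > "$tmp_dir/ref.bai"

assert_file_exists "$tmp_dir/out.bai" &&
  assert_file_not_empty "$tmp_dir/out.bai" &&
  assert_identical_content "$tmp_dir/out.bai" "$tmp_dir/ref.bai" &&
  echo "All checks passed"
```

In a test script each call would typically be followed by `|| exit 1`, which keeps the per-case code to three short lines.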
Binary file added src/samtools/samtools_index/test_data/a.sorted.bam
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
12 changes: 12 additions & 0 deletions src/samtools/samtools_index/test_data/script.sh
@@ -0,0 +1,12 @@
#!/bin/bash

# download test data from the snakemake-wrappers repository
if [ ! -d /tmp/idxstats_source ]; then
git clone --depth 1 --single-branch --branch master https://github.com/snakemake/snakemake-wrappers.git /tmp/idxstats_source
fi

cp -r /tmp/idxstats_source/bio/samtools/idxstats/test/mapped/* src/samtools/samtools_index/test_data
# samtools index a_ref.sorted.bam -o a_ref.sorted.bam.bai
# samtools index a_ref.sorted.bam -c a_ref.sorted.bam.csi


149 changes: 149 additions & 0 deletions src/samtools/samtools_sort/config.vsh.yaml
@@ -0,0 +1,149 @@
name: samtools_sort
namespace: samtools
description: Sort SAM/BAM/CRAM file.
keywords: [sort, bam, sam, cram]
links:
  homepage: https://www.htslib.org/
  documentation: https://www.htslib.org/doc/samtools-sort.html
  repository: https://github.com/samtools/samtools
references:
  doi: [10.1093/bioinformatics/btp352, 10.1093/gigascience/giab008]
license: MIT/Expat

argument_groups:
  - name: Inputs
    arguments:
      - name: --input
        type: file
        description: SAM/BAM/CRAM input file.
        required: true
        must_exist: true
  - name: Outputs
    arguments:
      - name: --output
        type: file
        description: |
          Write final output to file.
        required: true
        direction: output
        example: out.bam
      - name: --output_fmt
        alternatives: -O
        type: string
        description: |
          Specify output format (SAM, BAM, CRAM).
        example: BAM
      - name: --output_fmt_option
        type: string
        description: |
          Specify a single output file format option in the form
          of OPTION or OPTION=VALUE.
      - name: --reference
        type: file
        description: |
          Reference sequence FASTA file.
        example: ref.fa
      - name: --write_index
        type: boolean_true
        description: |
          Automatically index the output files.
      - name: --prefix
        alternatives: -T
        type: string
        description: |
          Write temporary files to PREFIX.nnnn.bam.
      - name: --no_PG
        type: boolean_true
        description: |
          Do not add a PG line.
      - name: --template_coordinate
        type: boolean_true
        description: |
          Sort by template-coordinate.
      - name: --input_fmt_option
        type: string
        description: |
          Specify a single input file format option in the form
          of OPTION or OPTION=VALUE.
  - name: Options
    arguments:
      - name: --compression
        alternatives: -l
        type: integer
        description: |
          Set compression level, from 0 (uncompressed) to 9 (best).
        default: 0
      - name: --uncompressed
        alternatives: -u
        type: boolean_true
        description: |
          Output uncompressed data (equivalent to --compression 0).
      - name: --minimiser
        alternatives: -M
        type: boolean_true
        description: |
          Use minimiser for clustering unaligned/unplaced reads.
      - name: --not_reverse
        alternatives: -R
        type: boolean_true
        description: |
          Do not use reverse strand (only compatible with --minimiser).
      - name: --kmer_size
        alternatives: -K
        type: integer
        description: |
          Kmer size to use for minimiser.
        example: 20
      - name: --order
        alternatives: -I
        type: file
        description: |
          Order minimisers by their position in the given FASTA file.
        example: ref.fa
      - name: --window
        alternatives: -w
        type: integer
        description: |
          Window size for minimiser indexing via --order ref.fa.
        example: 100
      - name: --homopolymers
        alternatives: -H
        type: boolean_true
        description: |
          Squash homopolymers when computing minimiser.
      - name: --natural_sort
        alternatives: -n
        type: boolean_true
        description: |
          Sort by read name (natural): cannot be used with samtools index.
      - name: --ascii_sort
        alternatives: -N
        type: boolean_true
        description: |
          Sort by read name (ASCII): cannot be used with samtools index.
      - name: --tag
        alternatives: -t
        type: string
        description: |
          Sort by value of TAG. Uses position as secondary index
          (or read name if --natural_sort is set).
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - type: docker
    image: quay.io/biocontainers/samtools:1.19.2--h50ea8bc_1
    setup:
      - type: docker
        run: |
          samtools --version 2>&1 | grep -E '^(samtools|Using htslib)' | \
          sed 's#Using ##;s# \([0-9\.]*\)$#: \1#' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
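
Some of the options in this config constrain each other: `--not_reverse` is described as only compatible with `--minimiser`, and `--uncompressed` as equivalent to `--compression 0`. A sketch of how a wrapper script might check these before calling `samtools sort`. `check_sort_options` is hypothetical, the `par_*` names assume the `par_<argument>` convention used by the index script in this commit, and treating `--uncompressed` together with a non-zero `--compression` as an error is a policy choice of this sketch, not documented samtools behaviour:

```shell
#!/bin/bash
# Returns 0 when the option combination is consistent, 1 otherwise.
check_sort_options() {
  # --not_reverse only makes sense when minimiser clustering is enabled.
  if [ "$par_not_reverse" = "true" ] && [ "$par_minimiser" != "true" ]; then
    echo "--not_reverse requires --minimiser" >&2
    return 1
  fi
  # Assumed policy: reject contradictory compression settings.
  if [ "$par_uncompressed" = "true" ] && [ -n "$par_compression" ] \
    && [ "$par_compression" != "0" ]; then
    echo "--uncompressed conflicts with --compression $par_compression" >&2
    return 1
  fi
  return 0
}

# Sample values (illustrative only).
par_minimiser="true"
par_not_reverse="true"
par_uncompressed="false"
par_compression="6"

check_sort_options && echo "options are consistent"
```

Running the check before building the command line gives the user a clear message instead of a failure inside the tool.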