Rseq bamstat #155
Changes from 15 commits
38f586b
271108c
2c26968
5ea8c78
897cd89
ea0383c
44b3fcc
6cc4f94
c9613d1
1679c59
3af66f8
6ddfd7d
a64fa69
bc4e6d1
cebbe88
e2b3458
10549b2
1b90eee
@@ -0,0 +1,59 @@
name: rseqc_bamstat
namespace: rseqc
keywords: [ rnaseq, genomics ]
description: Generate statistics from a bam file.
links:
  homepage: https://rseqc.sourceforge.net/
  documentation: https://rseqc.sourceforge.net/#bam-stat-py
  issue_tracker: https://github.com/MonashBioinformaticsPlatform/RSeQC/issues
  repository: https://github.com/MonashBioinformaticsPlatform/RSeQC
references:
  doi: 10.1093/bioinformatics/bts356
license: GPL-3.0
authors:
  - __merge__: /src/_authors/emma_rousseau.yaml
    roles: [ author, maintainer ]

argument_groups:
  - name: "Input"
    arguments:
      - name: "--input"
        type: file
        required: true
        description: Input alignment file in BAM or SAM format.
      - name: "--map_qual"
        type: integer
        example: 30
        description: |
          Minimum mapping quality (phred scaled) to determine uniquely mapped reads. Default: '30'.

  - name: "Output"
    arguments:
      - name: "--output"
        type: file
        direction: output
        description: Output file (txt) with mapping quality statistics.

resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data

engines:
  - type: docker
    image: ubuntu:22.04
    setup:
      - type: apt
        packages: [ python3-pip ]
      - type: python
        packages: [ RSeQC ]
      - type: docker
        run: |
          echo "RSeQC bam_stat.py: $(bam_stat.py --version | cut -d' ' -f2-)" > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
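The Docker setup step above records the tool version by piping `bam_stat.py --version` through `cut -d' ' -f2-`, which drops the first space-separated field (the program name) and keeps the rest of the line. A minimal sketch of that extraction follows. The sample string is hypothetical and stands in for the real command output, whose exact wording is an assumption here; `strip_program_name` is an illustrative helper, not part of the PR.

```shell
#!/bin/bash

# Keep everything after the first space-separated field.
# Mirrors the `cut -d' ' -f2-` call used in the setup step.
strip_program_name() {
  printf '%s\n' "$1" | cut -d' ' -f2-
}

# Hypothetical sample; the real output of `bam_stat.py --version` may differ.
sample_version_line="bam_stat.py 5.0.1"

echo "RSeQC bam_stat.py: $(strip_program_name "$sample_version_line")"
```

Without the `-s` flag, `cut` passes a line that contains no delimiter through unchanged, so a version string without a space would be recorded as-is.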
@@ -0,0 +1,18 @@
```
bam_stat.py -h
```

Usage: bam_stat.py [options]

Summarizing mapping statistics of a BAM or SAM file.



Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file=INPUT_FILE
                        Alignment file in BAM or SAM format.
  -q MAP_QUAL, --mapq=MAP_QUAL
                        Minimum mapping quality (phred scaled) to determine
                        "uniquely mapped" reads. default=30
@@ -0,0 +1,9 @@
#!/bin/bash


set -eo pipefail

bam_stat.py \
  --input "${par_input}" \
  ${par_map_qual:+--mapq "${par_map_qual}"} \
  > "$par_output"

Review comment on `--input "${par_input}"`: Although this works, I would use the name used in the help message.
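The wrapper above relies on the `${var:+word}` parameter expansion: `--mapq` and its value are emitted only when `par_map_qual` is set and non-empty, so an omitted argument falls back to the tool's own default. A sketch of the same idea with a bash array, which does not depend on word splitting of an unquoted expansion, might look like this. `build_bam_stat_args` is a hypothetical helper name, and the sketch uses the `--input-file` spelling from the help message rather than the abbreviated `--input`.

```shell
#!/bin/bash

# Print one argument per line so the result is easy to inspect.
build_bam_stat_args() {
  local input="$1" map_qual="${2:-}"
  local args=( --input-file "$input" )
  # Append --mapq only when a value was supplied.
  if [ -n "$map_qual" ]; then
    args+=( --mapq "$map_qual" )
  fi
  printf '%s\n' "${args[@]}"
}

# Without a cutoff: only the input flag is produced.
build_bam_stat_args "reads.bam"
# With a cutoff: --mapq and its value are appended.
build_bam_stat_args "reads.bam" 50
```

In an actual wrapper the array would be passed straight to the tool, for example `bam_stat.py "${args[@]}" > "$par_output"`, which keeps paths containing spaces intact.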
@@ -0,0 +1,49 @@
#!/bin/bash

# define input and output for script

input_bam="test.paired_end.sorted.bam"
output_summary="mapping_quality.txt"

# run executable and tests
echo "> Running $meta_functionality_name."

"$meta_executable" \
  --input "$meta_resources_dir/test_data/$input_bam" \
  --output "$output_summary"

exit_code=$?
[[ $exit_code != 0 ]] && echo "Non zero exit code: $exit_code" && exit 1

echo ">> Checking whether output is present"
[ ! -f "$output_summary" ] && echo "$output_summary file missing" && exit 1
[ ! -s "$output_summary" ] && echo "$output_summary file is empty" && exit 1

echo ">> Checking whether output is correct"
diff "$meta_resources_dir/test_data/ref_output.txt" "$meta_resources_dir/$output_summary" || { echo "Output is not correct"; exit 1; }

#############################################################################

echo ">>> Test 2: Test with non-default mapping quality threshold"

output_summary="mapping_quality_mapq_30.txt"

# run executable and tests
echo "> Running $meta_functionality_name."

"$meta_executable" \
  --input "$meta_resources_dir/test_data/$input_bam" \
  --output "$output_summary" \
  --map_qual 50

exit_code=$?
[[ $exit_code != 0 ]] && echo "Non zero exit code: $exit_code" && exit 1

echo ">> Checking whether output is present"
[ ! -f "$output_summary" ] && echo "$output_summary file missing" && exit 1
[ ! -s "$output_summary" ] && echo "$output_summary file is empty" && exit 1

echo ">> Checking whether output is correct"
diff "$meta_resources_dir/test_data/ref_output_mapq.txt" "$meta_resources_dir/$output_summary" || { echo "Output is not correct"; exit 1; }

exit 0
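Both test cases above repeat the same presence and non-emptiness checks. One possible way to factor them out is a small helper function. `assert_file_not_empty` is a hypothetical name, not something provided by Viash, and the temporary file below only stands in for the real output.

```shell
#!/bin/bash

# Return 0 if the file exists and has content; otherwise print why and return 1.
assert_file_not_empty() {
  local path="$1"
  if [ ! -f "$path" ]; then
    echo "$path file missing"
    return 1
  fi
  if [ ! -s "$path" ]; then
    echo "$path file is empty"
    return 1
  fi
}

# Usage sketch with a temporary file standing in for the real output.
tmp_file="$(mktemp)"
echo "Total records: 200" > "$tmp_file"
assert_file_not_empty "$tmp_file" && echo "output looks present"
rm -f "$tmp_file"
```

In the test script each pair of `[ ! -f ... ]` and `[ ! -s ... ]` lines could then become `assert_file_not_empty "$output_summary" || exit 1`.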
@@ -0,0 +1,22 @@

#==================================================
#All numbers are READ count
#==================================================

Total records: 200

QC failed: 0
Optical/PCR duplicate: 0
Non primary hits 0
Unmapped reads: 3
mapq < mapq_cut (non-unique): 1

mapq >= mapq_cut (unique): 196
Read-1: 99
Read-2: 97
Reads map to '+': 98
Reads map to '-': 98
Non-splice reads: 196
Splice reads: 0
Reads mapped in proper pairs: 192
Proper-paired reads map to different chrom:0
@@ -0,0 +1,22 @@

#==================================================
#All numbers are READ count
#==================================================

Total records: 200

QC failed: 0
Optical/PCR duplicate: 0
Non primary hits 0
Unmapped reads: 3
mapq < mapq_cut (non-unique): 20

mapq >= mapq_cut (unique): 177
Read-1: 88
Read-2: 89
Reads map to '+': 96
Reads map to '-': 81
Non-splice reads: 177
Splice reads: 0
Reads mapped in proper pairs: 175
Proper-paired reads map to different chrom:0
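In both reference outputs above, the top-level categories add up to the total record count: 0 + 0 + 0 + 3 + 1 + 196 = 200 with the default cutoff, and 0 + 0 + 0 + 3 + 20 + 177 = 200 with `--map_qual 50`. A rough sketch of checking that invariant is shown below. `sum_categories` is a hypothetical helper, and it assumes each count is the last whitespace-separated field on its line, as in the listings above.

```shell
#!/bin/bash

# Sum the six top-level categories of a bam_stat.py summary read from stdin.
sum_categories() {
  awk '
    /^QC failed:/ || /^Optical\/PCR duplicate:/ || /^Non primary hits/ ||
    /^Unmapped reads:/ || /^mapq < mapq_cut/ || /^mapq >= mapq_cut/ {
      total += $NF
    }
    END { print total + 0 }
  '
}

# Values copied from the reference output produced with --map_qual 50.
sum_categories <<'EOF'
Total records: 200
QC failed: 0
Optical/PCR duplicate: 0
Non primary hits 0
Unmapped reads: 3
mapq < mapq_cut (non-unique): 20
mapq >= mapq_cut (unique): 177
Read-1: 88
Read-2: 89
EOF
# prints 200
```

The `Read-1` and `Read-2` lines are deliberately not matched, since they subdivide the uniquely mapped reads rather than adding to the total.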
Review comment: Maybe we can subsample to half to get the file to about 10 kB.
Review comment: Although the `input` parameter works, I would use the parameter defined in the help page as the input parameter name.