Merge branch 'main' into add-agat_sp_add_start_and_stop
Showing 30 changed files with 2,839 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
name: agat_convert_sp_gff2tsv | ||
namespace: agat | ||
description: | | ||
This script converts a GTF/GFF file into a tabulated (TSV) file. Attribute | ||
tags from the 9th column become column titles. | ||
keywords: [gene annotations, GFF conversion] | ||
links: | ||
homepage: https://github.com/NBISweden/AGAT | ||
documentation: https://agat.readthedocs.io/en/latest/tools/agat_convert_sp_gff2tsv.html | ||
issue_tracker: https://github.com/NBISweden/AGAT/issues | ||
repository: https://github.com/NBISweden/AGAT | ||
references: | ||
doi: 10.5281/zenodo.3552717 | ||
license: GPL-3.0 | ||
authors: | ||
- __merge__: /src/_authors/leila_paquay.yaml | ||
roles: [ author, maintainer ] | ||
|
||
argument_groups: | ||
- name: Inputs | ||
arguments: | ||
- name: --gff | ||
alternatives: [-f] | ||
description: Input GTF/GFF file. | ||
type: file | ||
required: true | ||
direction: input | ||
example: input.gff | ||
- name: Outputs | ||
arguments: | ||
- name: --output | ||
alternatives: [-o, --out, --outfile] | ||
description: Output tabulated (TSV) file. | ||
type: file | ||
direction: output | ||
required: true | ||
example: output.tsv | ||
- name: Arguments | ||
arguments: | ||
- name: --config | ||
alternatives: [-c] | ||
description: | | ||
String - Input agat config file. By default AGAT takes as input | ||
agat_config.yaml file from the working directory if any, | ||
otherwise it takes the original agat_config.yaml shipped with | ||
AGAT. To get the agat_config.yaml locally type: "agat config | ||
--expose". The --config option gives you the possibility to use | ||
your own AGAT config file (located elsewhere or named | ||
differently). | ||
type: file | ||
required: false | ||
example: custom_agat_config.yaml | ||
resources: | ||
- type: bash_script | ||
path: script.sh | ||
test_resources: | ||
- type: bash_script | ||
path: test.sh | ||
- type: file | ||
path: test_data | ||
engines: | ||
- type: docker | ||
image: quay.io/biocontainers/agat:1.4.0--pl5321hdfd78af_0 | ||
setup: | ||
- type: docker | ||
run: | | ||
agat --version | sed 's/AGAT\s\(.*\)/agat: "\1"/' > /var/software_versions.txt | ||
runners: | ||
- type: executable | ||
- type: nextflow |
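Note on the `setup` step above: the `sed` expression rewrites the version banner into a `name: "version"` line for `/var/software_versions.txt`. The following is a minimal sketch of that transform, assuming `agat --version` prints a line of the form `AGAT v1.4.0` (not verified here) and using a hypothetical stand-in function instead of the real binary. The `\s` class is a GNU sed extension, which is what the container image is expected to provide.

```shell
#!/bin/bash
# Hypothetical stand-in for `agat --version`; the real banner may differ.
fake_agat_version() { echo "AGAT v1.4.0"; }

# Same sed expression as in the config: capture everything after "AGAT "
# and emit it as a quoted value.
version_line=$(fake_agat_version | sed 's/AGAT\s\(.*\)/agat: "\1"/')
echo "$version_line"
```

If the banner ever changes shape, the expression passes the line through unchanged, so the resulting file is worth checking after an image bump.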
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
```sh | ||
agat_convert_sp_gff2tsv.pl --help | ||
``` | ||
|
||
------------------------------------------------------------------------------ | ||
| Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0 | | ||
| https://github.com/NBISweden/AGAT | | ||
| National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se | | ||
------------------------------------------------------------------------------ | ||
|
||
|
||
Name: | ||
agat_convert_sp_gff2tsv.pl | ||
|
||
Description: | ||
The script aims to convert gtf/gff file into tabulated file. Attribute's | ||
tags from the 9th column become column titles. | ||
|
||
Usage: | ||
agat_convert_sp_gff2tsv.pl -gff file.gff [ -o outfile ] | ||
agat_convert_sp_gff2tsv.pl --help | ||
|
||
Options: | ||
--gff or -f | ||
Input GTF/GFF file. | ||
|
||
-o , --output , --out or --outfile | ||
Output GFF file. If no output file is specified, the output will | ||
be written to STDOUT. | ||
|
||
-c or --config | ||
String - Input agat config file. By default AGAT takes as input | ||
agat_config.yaml file from the working directory if any, | ||
otherwise it takes the orignal agat_config.yaml shipped with | ||
AGAT. To get the agat_config.yaml locally type: "agat config | ||
--expose". The --config option gives you the possibility to use | ||
your own AGAT config file (located elsewhere or named | ||
differently). | ||
|
||
-h or --help | ||
Display this helpful text. | ||
|
||
Feedback: | ||
Did you find a bug?: | ||
Do not hesitate to report bugs to help us keep track of the bugs and | ||
their resolution. Please use the GitHub issue tracking system available | ||
at this address: | ||
|
||
https://github.com/NBISweden/AGAT/issues | ||
|
||
Ensure that the bug was not already reported by searching under Issues. | ||
If you're unable to find an (open) issue addressing the problem, open a new one. | ||
Try as much as possible to include in the issue when relevant: | ||
- a clear description, | ||
- as much relevant information as possible, | ||
- the command used, | ||
- a data sample, | ||
- an explanation of the expected behaviour that is not occurring. | ||
|
||
Do you want to contribute?: | ||
You are very welcome, visit this address for the Contributing | ||
guidelines: | ||
https://github.com/NBISweden/AGAT/blob/master/CONTRIBUTING.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
agat_convert_sp_gff2tsv.pl \ | ||
-f "$par_gff" \ | ||
-o "$par_output" \ | ||
${par_config:+--config "${par_config}"} |
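Note on the last line of the script above: `${par_config:+--config "${par_config}"}` only adds the option when the optional parameter is set and non-empty. A small sketch of the behaviour, using a hypothetical `show_args` helper in place of the AGAT script and made-up file names:

```shell
#!/bin/bash
# Hypothetical helper: prints each received argument on its own line.
show_args() { printf '[%s]\n' "$@"; }

par_config=""                     # optional parameter not provided
unset_case=$(show_args -f in.gff ${par_config:+--config "${par_config}"})

par_config="my config.yaml"       # provided, and the path contains a space
set_case=$(show_args -f in.gff ${par_config:+--config "${par_config}"})

echo "$unset_case"
echo "$set_case"
```

The inner quotes keep a path with spaces as a single argument, while leaving the outer expansion unquoted lets it vanish entirely when the parameter is empty.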
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
test_dir="${meta_resources_dir}/test_data" | ||
out_dir="${meta_resources_dir}/out_data" | ||
|
||
echo "> Run $meta_name with test data" | ||
"$meta_executable" \ | ||
--gff "$test_dir/1.gff" \ | ||
--output "$out_dir/output.gff" | ||
|
||
echo ">> Checking output" | ||
[ ! -f "$out_dir/output.gff" ] && echo "Output file output.gff does not exist" && exit 1 | ||
|
||
echo ">> Check if output is empty" | ||
[ ! -s "$out_dir/output.gff" ] && echo "Output file output.gff is empty" && exit 1 | ||
|
||
echo ">> Check if output matches expected output" | ||
diff "$out_dir/output.gff" "$test_dir/agat_convert_sp_gff2tsv_1.tsv" | ||
if [ $? -ne 0 ]; then | ||
echo "Output file output.gff does not match expected output" | ||
exit 1 | ||
fi | ||
|
||
echo "> Test successful" |
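Note on the checks in the test above (exists, not empty, matches the expected file): they could be gathered in one function. This is only a sketch with hypothetical file names and a hypothetical `check_output` helper, not the structure the component uses.

```shell
#!/bin/bash
# Hypothetical scratch files standing in for the real test data.
work_dir=$(mktemp -d)
printf 'seq_id\tstart\n' > "$work_dir/expected.tsv"
printf 'seq_id\tstart\n' > "$work_dir/output.tsv"
: > "$work_dir/empty.tsv"

# Reports the first failing check and returns non-zero, or prints "ok".
check_output() {
  local produced="$1" expected="$2"
  [ -f "$produced" ] || { echo "missing"; return 1; }
  [ -s "$produced" ] || { echo "empty"; return 1; }
  diff -q "$produced" "$expected" > /dev/null || { echo "differs"; return 1; }
  echo "ok"
}

result_same=$(check_output "$work_dir/output.tsv" "$work_dir/expected.tsv")
result_empty=$(check_output "$work_dir/empty.tsv" "$work_dir/expected.tsv") || true
result_missing=$(check_output "$work_dir/absent.tsv" "$work_dir/expected.tsv") || true
rm -rf "$work_dir"
```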
881 changes: 881 additions & 0 deletions
src/agat/agat_convert_sp_gff2tsv/test_data/agat_convert_sp_gff2tsv_1.tsv
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
#!/bin/bash | ||
|
||
# clone repo | ||
if [ ! -d /tmp/agat_source ]; then | ||
git clone --depth 1 --single-branch --branch master https://github.com/NBISweden/AGAT /tmp/agat_source | ||
fi | ||
|
||
# copy test data | ||
cp -r /tmp/agat_source/t/scripts_output/out/agat_convert_sp_gff2tsv_1.tsv src/agat/agat_convert_sp_gff2tsv/test_data | ||
cp -r /tmp/agat_source/t/scripts_output/in/1.gff src/agat/agat_convert_sp_gff2tsv/test_data |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
name: agat_convert_sp_gxf2gxf | ||
namespace: agat | ||
description: | | ||
This script fixes and/or standardizes any GTF/GFF file into a full, sorted | ||
GTF/GFF file. The AGAT parser removes duplicate features, fixes | ||
duplicated IDs, adds missing ID and/or Parent attributes, deflates | ||
factorized attributes (attributes with several parents are duplicated | ||
with a unique ID), adds missing features when possible (e.g. adds an exon if only | ||
a CDS is described, adds a UTR if CDS and exon are described), fixes feature locations | ||
(e.g. checks that an exon is embedded in its parent features mRNA, gene), etc. | ||
All AGAT scripts with the _sp_ prefix use the AGAT parser before | ||
performing any supplementary task, so it is not necessary to run this | ||
script prior to using any other _sp_ script. | ||
keywords: [gene annotations, GFF conversion] | ||
links: | ||
homepage: https://github.com/NBISweden/AGAT | ||
documentation: https://agat.readthedocs.io/en/latest/tools/agat_convert_sp_gxf2gxf.html | ||
issue_tracker: https://github.com/NBISweden/AGAT/issues | ||
repository: https://github.com/NBISweden/AGAT | ||
references: | ||
doi: 10.5281/zenodo.3552717 | ||
license: GPL-3.0 | ||
authors: | ||
- __merge__: /src/_authors/leila_paquay.yaml | ||
roles: [ author, maintainer ] | ||
|
||
argument_groups: | ||
- name: Inputs | ||
arguments: | ||
- name: --gxf | ||
alternatives: [-g, --gtf, --gff] | ||
description: | | ||
String - Input GTF/GFF file. Compressed file with .gz extension is accepted. | ||
type: file | ||
required: true | ||
direction: input | ||
example: input.gff | ||
- name: Outputs | ||
arguments: | ||
- name: --output | ||
alternatives: [-o] | ||
description: | | ||
String - Output GFF file. If no output file is specified, the output will be written to STDOUT. | ||
type: file | ||
direction: output | ||
required: true | ||
example: output.gff | ||
- name: Arguments | ||
arguments: | ||
- name: --config | ||
alternatives: [-c] | ||
description: | | ||
String - Input agat config file. By default AGAT takes as input agat_config.yaml file from the working directory if any, otherwise it takes the original agat_config.yaml shipped with AGAT. To get the agat_config.yaml locally type: "agat config --expose". The --config option gives you the possibility to use your own AGAT config file (located elsewhere or named differently). | ||
type: file | ||
required: false | ||
example: custom_agat_config.yaml | ||
resources: | ||
- type: bash_script | ||
path: script.sh | ||
test_resources: | ||
- type: bash_script | ||
path: test.sh | ||
- type: file | ||
path: test_data | ||
engines: | ||
- type: docker | ||
image: quay.io/biocontainers/agat:1.4.0--pl5321hdfd78af_0 | ||
setup: | ||
- type: docker | ||
run: | | ||
agat --version | sed 's/AGAT\s\(.*\)/agat: "\1"/' > /var/software_versions.txt | ||
runners: | ||
- type: executable | ||
- type: nextflow |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
```sh | ||
agat_convert_sp_gxf2gxf.pl --help | ||
``` | ||
|
||
------------------------------------------------------------------------------ | ||
| Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0 | | ||
| https://github.com/NBISweden/AGAT | | ||
| National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se | | ||
------------------------------------------------------------------------------ | ||
|
||
|
||
Name: | ||
agat_convert_sp_gxf2gxf.pl | ||
|
||
Description: | ||
This script fixes and/or standardizes any GTF/GFF file into full sorted | ||
GTF/GFF file. It AGAT parser removes duplicate features, fixes | ||
duplicated IDs, adds missing ID and/or Parent attributes, deflates | ||
factorized attributes (attributes with several parents are duplicated | ||
with uniq ID), add missing features when possible (e.g. add exon if only | ||
CDS described, add UTR if CDS and exon described), fix feature locations | ||
(e.g. check exon is embedded in the parent features mRNA, gene), etc... | ||
|
||
All AGAT's scripts with the _sp_ prefix use the AGAT parser, before to | ||
perform any supplementary task. So, it is not necessary to run this | ||
script prior the use of any other _sp_ script. | ||
|
||
Usage: | ||
agat_convert_sp_gxf2gxf.pl -g infile.gff [ -o outfile ] | ||
agat_convert_sp_gxf2gxf.pl --help | ||
|
||
Options: | ||
-g, --gtf, --gff or --gxf | ||
String - Input GTF/GFF file. Compressed file with .gz extension | ||
is accepted. | ||
|
||
-o or --output | ||
String - Output GFF file. If no output file is specified, the | ||
output will be written to STDOUT. | ||
|
||
-c or --config | ||
String - Input agat config file. By default AGAT takes as input | ||
agat_config.yaml file from the working directory if any, | ||
otherwise it takes the orignal agat_config.yaml shipped with | ||
AGAT. To get the agat_config.yaml locally type: "agat config | ||
--expose". The --config option gives you the possibility to use | ||
your own AGAT config file (located elsewhere or named | ||
differently). | ||
|
||
-h or --help | ||
Boolean - Display this helpful text. | ||
|
||
Feedback: | ||
Did you find a bug?: | ||
Do not hesitate to report bugs to help us keep track of the bugs and | ||
their resolution. Please use the GitHub issue tracking system available | ||
at this address: | ||
|
||
https://github.com/NBISweden/AGAT/issues | ||
|
||
Ensure that the bug was not already reported by searching under Issues. | ||
If you're unable to find an (open) issue addressing the problem, open a new one. | ||
Try as much as possible to include in the issue when relevant: | ||
- a clear description, | ||
- as much relevant information as possible, | ||
- the command used, | ||
- a data sample, | ||
- an explanation of the expected behaviour that is not occurring. | ||
|
||
Do you want to contribute?: | ||
You are very welcome, visit this address for the Contributing | ||
guidelines: | ||
https://github.com/NBISweden/AGAT/blob/master/CONTRIBUTING.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
agat_convert_sp_gxf2gxf.pl \ | ||
-g "$par_gxf" \ | ||
-o "$par_output" \ | ||
${par_config:+--config "${par_config}"} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
test_dir="${meta_resources_dir}/test_data" | ||
out_dir="${meta_resources_dir}/out_data" | ||
|
||
echo "> Run $meta_name with test data" | ||
"$meta_executable" \ | ||
--gxf "$test_dir/0_test.gff" \ | ||
--output "$out_dir/output.gff" | ||
|
||
echo ">> Checking output" | ||
[ ! -f "$out_dir/output.gff" ] && echo "Output file output.gff does not exist" && exit 1 | ||
|
||
echo ">> Check if output is empty" | ||
[ ! -s "$out_dir/output.gff" ] && echo "Output file output.gff is empty" && exit 1 | ||
|
||
|
||
echo ">> Check if output matches expected output" | ||
diff "$out_dir/output.gff" "$test_dir/0_correct_output.gff" | ||
if [ $? -ne 0 ]; then | ||
echo "Output file output.gff does not match expected output" | ||
exit 1 | ||
fi | ||
|
||
echo "> Test successful" |
36 changes: 36 additions & 0 deletions
src/agat/agat_convert_sp_gxf2gxf/test_data/0_correct_output.gff
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
##gff-version 3 | ||
scaffold625 maker gene 337818 343277 . + . ID=CLUHARG00000005458;Name=TUBB3_2 | ||
scaffold625 maker mRNA 337818 343277 . + . ID=CLUHART00000008717;Parent=CLUHARG00000005458 | ||
scaffold625 maker exon 337818 337971 . + . ID=CLUHART00000008717:exon:1404;Parent=CLUHART00000008717 | ||
scaffold625 maker exon 340733 340841 . + . ID=CLUHART00000008717:exon:1405;Parent=CLUHART00000008717 | ||
scaffold625 maker exon 341518 341628 . + . ID=CLUHART00000008717:exon:1406;Parent=CLUHART00000008717 | ||
scaffold625 maker exon 341964 343277 . + . ID=CLUHART00000008717:exon:1407;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 337915 337971 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 340733 340841 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 341518 341628 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 341964 343033 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker five_prime_UTR 337818 337914 . + . ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717 | ||
scaffold625 maker three_prime_UTR 343034 343277 . + . ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717 | ||
scaffold789 maker gene 558184 564780 . + . ID=CLUHARG00000003852;Name=PF11_0240 | ||
scaffold789 maker mRNA 558184 564780 . + . ID=CLUHART00000006146;Parent=CLUHARG00000003852 | ||
scaffold789 maker exon 558184 560123 . + . ID=CLUHART00000006146:exon:995;Parent=CLUHART00000006146 | ||
scaffold789 maker exon 561401 561519 . + . ID=CLUHART00000006146:exon:996;Parent=CLUHART00000006146 | ||
scaffold789 maker exon 564171 564235 . + . ID=CLUHART00000006146:exon:997;Parent=CLUHART00000006146 | ||
scaffold789 maker exon 564372 564780 . + . ID=CLUHART00000006146:exon:998;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 558191 560123 . + 0 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 561401 561519 . + 2 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 564171 564235 . + 0 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 564372 564588 . + 1 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker five_prime_UTR 558184 558190 . + . ID=CLUHART00000006146:five_prime_utr;Parent=CLUHART00000006146 | ||
scaffold789 maker three_prime_UTR 564589 564780 . + . ID=CLUHART00000006146:three_prime_utr;Parent=CLUHART00000006146 | ||
scaffold789 maker mRNA 558184 564780 . + . ID=CLUHART00000006147;Parent=CLUHARG00000003852 | ||
scaffold789 maker exon 558184 560123 . + . ID=CLUHART00000006147:exon:997;Parent=CLUHART00000006147 | ||
scaffold789 maker exon 561401 561519 . + . ID=CLUHART00000006147:exon:998;Parent=CLUHART00000006147 | ||
scaffold789 maker exon 562057 562121 . + . ID=CLUHART00000006147:exon:999;Parent=CLUHART00000006147 | ||
scaffold789 maker exon 564372 564780 . + . ID=CLUHART00000006147:exon:1000;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 558191 560123 . + 0 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 561401 561519 . + 2 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 562057 562121 . + 0 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 564372 564588 . + 1 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker five_prime_UTR 558184 558190 . + . ID=CLUHART00000006147:five_prime_utr;Parent=CLUHART00000006147 | ||
scaffold789 maker three_prime_UTR 564589 564780 . + . ID=CLUHART00000006147:three_prime_utr;Parent=CLUHART00000006147 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
##gff-version 3 | ||
scaffold625 maker gene 337818 343277 . + . ID=CLUHARG00000005458;Name=TUBB3_2 | ||
scaffold625 maker mRNA 337818 343277 . + . ID=CLUHART00000008717;Parent=CLUHARG00000005458 | ||
scaffold625 maker exon 337818 337971 . + . ID=CLUHART00000008717:exon:1404;Parent=CLUHART00000008717 | ||
scaffold625 maker exon 340733 340841 . + . ID=CLUHART00000008717:exon:1405;Parent=CLUHART00000008717 | ||
scaffold625 maker exon 341518 341628 . + . ID=CLUHART00000008717:exon:1406;Parent=CLUHART00000008717 | ||
scaffold625 maker exon 341964 343277 . + . ID=CLUHART00000008717:exon:1407;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 337915 337971 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 340733 340841 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 341518 341628 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker CDS 341964 343033 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717 | ||
scaffold625 maker five_prime_UTR 337818 337914 . + . ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717 | ||
scaffold625 maker three_prime_UTR 343034 343277 . + . ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717 | ||
scaffold789 maker gene 558184 564780 . + . ID=CLUHARG00000003852;Name=PF11_0240 | ||
scaffold789 maker mRNA 558184 564780 . + . ID=CLUHART00000006146;Parent=CLUHARG00000003852 | ||
scaffold789 maker exon 558184 560123 . + . ID=CLUHART00000006146:exon:995;Parent=CLUHART00000006146 | ||
scaffold789 maker exon 561401 561519 . + . ID=CLUHART00000006146:exon:996;Parent=CLUHART00000006146 | ||
scaffold789 maker exon 564171 564235 . + . ID=CLUHART00000006146:exon:997;Parent=CLUHART00000006146 | ||
scaffold789 maker exon 564372 564780 . + . ID=CLUHART00000006146:exon:998;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 558191 560123 . + 0 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 561401 561519 . + 2 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 564171 564235 . + 0 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker CDS 564372 564588 . + 1 ID=CLUHART00000006146:cds;Parent=CLUHART00000006146 | ||
scaffold789 maker five_prime_UTR 558184 558190 . + . ID=CLUHART00000006146:five_prime_utr;Parent=CLUHART00000006146 | ||
scaffold789 maker three_prime_UTR 564589 564780 . + . ID=CLUHART00000006146:three_prime_utr;Parent=CLUHART00000006146 | ||
scaffold789 maker mRNA 558184 564780 . + . ID=CLUHART00000006147;Parent=CLUHARG00000003852 | ||
scaffold789 maker exon 558184 560123 . + . ID=CLUHART00000006147:exon:997;Parent=CLUHART00000006147 | ||
scaffold789 maker exon 561401 561519 . + . ID=CLUHART00000006147:exon:998;Parent=CLUHART00000006147 | ||
scaffold789 maker exon 562057 562121 . + . ID=CLUHART00000006147:exon:999;Parent=CLUHART00000006147 | ||
scaffold789 maker exon 564372 564780 . + . ID=CLUHART00000006147:exon:1000;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 558191 560123 . + 0 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 561401 561519 . + 2 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 562057 562121 . + 0 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker CDS 564372 564588 . + 1 ID=CLUHART00000006147:cds;Parent=CLUHART00000006147 | ||
scaffold789 maker five_prime_UTR 558184 558190 . + . ID=CLUHART00000006147:five_prime_utr;Parent=CLUHART00000006147 | ||
scaffold789 maker three_prime_UTR 564589 564780 . + . ID=CLUHART00000006147:three_prime_utr;Parent=CLUHART00000006147 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
#!/bin/bash | ||
|
||
# clone repo | ||
if [ ! -d /tmp/agat_source ]; then | ||
git clone --depth 1 --single-branch --branch master https://github.com/NBISweden/AGAT /tmp/agat_source | ||
fi | ||
|
||
# copy test data | ||
cp -r /tmp/agat_source/t/gff_syntax/in/0_test.gff src/agat/agat_convert_sp_gxf2gxf/test_data | ||
cp -r /tmp/agat_source/t/gff_syntax/out/0_correct_output.gff src/agat/agat_convert_sp_gxf2gxf/test_data |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
name: bedtools_bamtofastq | ||
namespace: bedtools | ||
description: | | ||
Conversion tool for extracting FASTQ records from sequence alignments in BAM format. | ||
keywords: [Conversion, BAM, FASTQ] | ||
links: | ||
documentation: https://bedtools.readthedocs.io/en/latest/content/tools/bamtofastq.html | ||
repository: https://github.com/arq5x/bedtools2 | ||
homepage: https://bedtools.readthedocs.io/en/latest/# | ||
issue_tracker: https://github.com/arq5x/bedtools2/issues | ||
references: | ||
doi: 10.1093/bioinformatics/btq033 | ||
license: MIT | ||
requirements: | ||
commands: [bedtools] | ||
authors: | ||
- __merge__: /src/_authors/theodoro_gasperin.yaml | ||
roles: [ author, maintainer ] | ||
|
||
argument_groups: | ||
- name: Inputs | ||
arguments: | ||
- name: --input | ||
alternatives: -i | ||
type: file | ||
description: Input BAM file to be converted to FASTQ. | ||
required: true | ||
|
||
- name: Outputs | ||
arguments: | ||
- name: --fastq | ||
alternatives: -fq | ||
direction: output | ||
type: file | ||
description: Output FASTQ file. | ||
required: true | ||
|
||
- name: --fastq2 | ||
alternatives: -fq2 | ||
type: file | ||
direction: output | ||
description: | | ||
FASTQ for second end. Used if BAM contains paired-end data. | ||
BAM should be sorted by query name if creating paired FASTQ. | ||
- name: Options | ||
arguments: | ||
- name: --tags | ||
type: boolean_true | ||
description: | | ||
Create FASTQ based on the mate info in the BAM R2 and Q2 tags. | ||
resources: | ||
- type: bash_script | ||
path: script.sh | ||
|
||
test_resources: | ||
- type: bash_script | ||
path: test.sh | ||
- path: test_data | ||
|
||
engines: | ||
- type: docker | ||
image: debian:stable-slim | ||
setup: | ||
- type: apt | ||
packages: [bedtools, procps] | ||
- type: docker | ||
run: | | ||
echo "bedtools: \"$(bedtools --version | sed -n 's/^bedtools //p')\"" > /var/software_versions.txt | ||
runners: | ||
- type: executable | ||
- type: nextflow |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
```bash | ||
bedtools bamtofastq | ||
``` | ||
|
||
Tool: bedtools bamtofastq (aka bamToFastq) | ||
Version: v2.30.0 | ||
Summary: Convert BAM alignments to FASTQ files. | ||
|
||
Usage: bamToFastq [OPTIONS] -i <BAM> -fq <FQ> | ||
|
||
Options: | ||
-fq2 FASTQ for second end. Used if BAM contains paired-end data. | ||
BAM should be sorted by query name is creating paired FASTQ. | ||
|
||
-tags Create FASTQ based on the mate info | ||
in the BAM R2 and Q2 tags. | ||
|
||
Tips: | ||
If you want to create a single, interleaved FASTQ file | ||
for paired-end data, you can just write both to /dev/stdout: | ||
|
||
bedtools bamtofastq -i x.bam -fq /dev/stdout -fq2 /dev/stdout > x.ilv.fq | ||
|
||
Also, the samtools fastq command has more fucntionality and is a useful alternative. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
# Exit on error | ||
set -eo pipefail | ||
|
||
# Unset parameters | ||
[[ "$par_tags" == "false" ]] && unset par_tags | ||
|
||
# Execute bedtools bamtofastq with the provided arguments | ||
bedtools bamtofastq \ | ||
${par_tags:+-tags} \ | ||
${par_fastq2:+-fq2 "$par_fastq2"} \ | ||
-i "$par_input" \ | ||
-fq "$par_fastq" | ||
|
||
|
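Note on the flag handling in the script above: the script treats a `boolean_true` argument as the string `true` or `false`, so the value is unset when it is `false` and `${par_tags:+-tags}` then expands to nothing. A minimal sketch of that pattern, where `render_command` is a hypothetical helper that only prints the command line instead of running bedtools:

```shell
#!/bin/bash
set -eo pipefail

# Hypothetical helper: prints the command that would be run for a given
# value of the boolean parameter.
render_command() {
  local par_tags="$1"
  # "false" must not count as "set", otherwise the flag would always be added.
  [[ "$par_tags" == "false" ]] && unset par_tags
  echo bedtools bamtofastq ${par_tags:+-tags} -i in.bam -fq out.fastq
}

with_tags=$(render_command true)
without_tags=$(render_command false)
echo "$with_tags"
echo "$without_tags"
```

The `[[ ... ]] && unset` line does not trip `set -e` when the test is false, because a failing command that is not the last element of an `&&` list is exempt from errexit.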
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
#!/bin/bash | ||
|
||
# exit on error | ||
set -eo pipefail | ||
|
||
test_data="$meta_resources_dir/test_data" | ||
|
||
############################################# | ||
# helper functions | ||
assert_file_exists() { | ||
[ -f "$1" ] || { echo "File '$1' does not exist" && exit 1; } | ||
} | ||
assert_file_not_empty() { | ||
[ -s "$1" ] || { echo "File '$1' is empty but shouldn't be" && exit 1; } | ||
} | ||
assert_file_contains() { | ||
grep -q "$2" "$1" || { echo "File '$1' does not contain '$2'" && exit 1; } | ||
} | ||
assert_identical_content() { | ||
diff -a "$2" "$1" \ | ||
|| { echo "Files are not identical!" && exit 1; } | ||
} | ||
############################################# | ||
|
||
# Test 1: normal conversion | ||
mkdir test1 | ||
cd test1 | ||
|
||
echo "> Run bedtools bamtofastq on BAM file" | ||
"$meta_executable" \ | ||
--input "$test_data/example.bam" \ | ||
--fastq "output.fastq" | ||
|
||
# checks | ||
assert_file_exists "output.fastq" | ||
assert_file_not_empty "output.fastq" | ||
assert_identical_content "output.fastq" "$test_data/expected.fastq" | ||
echo "- test1 succeeded -" | ||
|
||
cd .. | ||
|
||
# Test 2: with tags | ||
mkdir test2 | ||
cd test2 | ||
|
||
echo "> Run bedtools bamtofastq on BAM file with tags" | ||
"$meta_executable" \ | ||
--input "$test_data/example.bam" \ | ||
--fastq "output.fastq" \ | ||
--tags | ||
|
||
# checks | ||
assert_file_exists "output.fastq" | ||
assert_file_not_empty "output.fastq" | ||
assert_identical_content "output.fastq" "$test_data/expected.fastq" | ||
echo "- test2 succeeded -" | ||
|
||
cd .. | ||
|
||
# Test 3: with option fq2 | ||
mkdir test3 | ||
cd test3 | ||
|
||
echo "> Run bedtools bamtofastq on BAM file with output_fq2" | ||
"$meta_executable" \ | ||
--input "$test_data/example.bam" \ | ||
--fastq "output1.fastq" \ | ||
--fastq2 "output2.fastq" | ||
|
||
# checks | ||
assert_file_exists "output1.fastq" | ||
assert_file_not_empty "output1.fastq" | ||
assert_identical_content "output1.fastq" "$test_data/expected_1.fastq" | ||
assert_file_exists "output2.fastq" | ||
assert_file_not_empty "output2.fastq" | ||
assert_identical_content "output2.fastq" "$test_data/expected_2.fastq" | ||
echo "- test3 succeeded -" | ||
|
||
cd .. | ||
|
||
echo "All tests succeeded" | ||
exit 0 | ||
|
||
|
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
@SQ SN:chr2:172936693-172938111 LN:1418 | ||
my_read 99 chr2:172936693-172938111 129 60 100M = 429 400 CTAACTAGCCTGGGAAAAAAGGATAGTGTCTCTCTGTTCTTTCATAGGAAATGTTGAATCAGACCCCTACTGGGAAAAGAAATTTAATGCATATCTCACT * XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100 | ||
my_read 147 chr2:172936693-172938111 429 60 100M = 129 -400 TCGAGCTCTGCATTCATGGCTGTGTCTAAAGGGCATGTCAGCCTTTGATTCTCTCTGAGAGGTAATTATCCTTTTCCTGTCACGGAACAACAAATGATAG * XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
@my_read | ||
CTAACTAGCCTGGGAAAAAAGGATAGTGTCTCTCTGTTCTTTCATAGGAAATGTTGAATCAGACCCCTACTGGGAAAAGAAATTTAATGCATATCTCACT | ||
+ | ||
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | ||
@my_read | ||
CTAACTAGCCTGGGAAAAAAGGATAGTGTCTCTCTGTTCTTTCATAGGAAATGTTGAATCAGACCCCTACTGGGAAAAGAAATTTAATGCATATCTCACT | ||
+ | ||
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | ||
@my_read | ||
CTATCATTTGTTGTTCCGTGACAGGAAAAGGATAATTACCTCTCAGAGAGAATCAAAGGCTGACATGCCCTTTAGACACAGCCATGAATGCAGAGCTCGA | ||
+ | ||
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | ||
@my_read | ||
CTATCATTTGTTGTTCCGTGACAGGAAAAGGATAATTACCTCTCAGAGAGAATCAAAGGCTGACATGCCCTTTAGACACAGCCATGAATGCAGAGCTCGA | ||
+ | ||
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
@my_read/1 | ||
CTAACTAGCCTGGGAAAAAAGGATAGTGTCTCTCTGTTCTTTCATAGGAAATGTTGAATCAGACCCCTACTGGGAAAAGAAATTTAATGCATATCTCACT | ||
+ | ||
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
@my_read/2 | ||
CTATCATTTGTTGTTCCGTGACAGGAAAAGGATAATTACCTCTCAGAGAGAATCAAAGGCTGACATGCCCTTTAGACACAGCCATGAATGCAGAGCTCGA | ||
+ | ||
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
#!/bin/bash | ||
|
||
# create sam file | ||
printf "@SQ\tSN:chr2:172936693-172938111\tLN:1418\n" > example.sam | ||
printf "my_read\t99\tchr2:172936693-172938111\t129\t60\t100M\t=\t429\t400\tCTAACTAGCCTGGGAAAAAAGGATAGTGTCTCTCTGTTCTTTCATAGGAAATGTTGAATCAGACCCCTACTGGGAAAAGAAATTTAATGCATATCTCACT\t*\tXT:A:U\tNM:i:0\tSM:i:37\tAM:i:37\tX0:i:1\tX1:i:0\tXM:i:0\tXO:i:0\tXG:i:0\tMD:Z:100\n" >> example.sam | ||
printf "my_read\t147\tchr2:172936693-172938111\t429\t60\t100M\t=\t129\t-400\tTCGAGCTCTGCATTCATGGCTGTGTCTAAAGGGCATGTCAGCCTTTGATTCTCTCTGAGAGGTAATTATCCTTTTCCTGTCACGGAACAACAAATGATAG\t*\tXT:A:U\tNM:i:0\tSM:i:37\tAM:i:37\tX0:i:1\tX1:i:0\tXM:i:0\tXO:i:0\tXG:i:0\tMD:Z:100\n" >> example.sam | ||
|
||
# create bam file | ||
# samtools view -b example.sam > example.bam | ||
|
||
# create fastq files | ||
# bedtools bamtofastq -i example.bam -fq expected.fastq | ||
# bedtools bamtofastq -i example.bam -fq expected_1.fastq -fq2 expected_2.fastq |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
name: bedtools_bedtobam | ||
namespace: bedtools | ||
description: Converts feature records (bed/gff/vcf) to BAM format. | ||
keywords: [Converts, BED, GFF, VCF, BAM] | ||
links: | ||
documentation: https://bedtools.readthedocs.io/en/latest/content/tools/bedtobam.html | ||
repository: https://github.com/arq5x/bedtools2 | ||
homepage: https://bedtools.readthedocs.io/en/latest/# | ||
issue_tracker: https://github.com/arq5x/bedtools2/issues | ||
references: | ||
doi: 10.1093/bioinformatics/btq033 | ||
license: MIT | ||
requirements: | ||
commands: [bedtools] | ||
authors: | ||
- __merge__: /src/_authors/theodoro_gasperin.yaml | ||
roles: [ author, maintainer ] | ||
|
||
argument_groups: | ||
- name: Inputs | ||
arguments: | ||
- name: --input | ||
alternatives: -i | ||
type: file | ||
description: Input file (bed/gff/vcf). | ||
required: true | ||
|
||
- name: --genome | ||
alternatives: -g | ||
type: file | ||
description: | | ||
Input genome file. | ||
NOTE: This is not a fasta file. This is a two-column tab-delimited file | ||
where the first column is the chromosome name and the second is its size. | ||
required: true | ||
|
||
- name: Outputs | ||
arguments: | ||
- name: --output | ||
alternatives: -o | ||
type: file | ||
direction: output | ||
description: Output BAM file to be written. | ||
|
||
- name: Options | ||
arguments: | ||
- name: --map_quality | ||
alternatives: -mapq | ||
type: integer | ||
description: | | ||
Set the mapping quality for the BAM records. | ||
min: 0 | ||
max: 255 | ||
default: 255 | ||
|
||
- name: --bed12 | ||
type: boolean_true | ||
description: | | ||
The BED file is in BED12 format. The BAM CIGAR | ||
string will reflect BED "blocks". | ||
- name: --uncompress_bam | ||
alternatives: -ubam | ||
type: boolean_true | ||
description: | | ||
Write uncompressed BAM output. Default writes compressed BAM. | ||
resources: | ||
- type: bash_script | ||
path: script.sh | ||
|
||
test_resources: | ||
- type: bash_script | ||
path: test.sh | ||
|
||
engines: | ||
- type: docker | ||
image: debian:stable-slim | ||
setup: | ||
- type: apt | ||
packages: [bedtools, procps] | ||
- type: docker | ||
run: | | ||
echo "bedtools: \"$(bedtools --version | sed -n 's/^bedtools //p')\"" > /var/software_versions.txt | ||
test_setup: | ||
- type: apt | ||
packages: [samtools] | ||
|
||
runners: | ||
- type: executable | ||
- type: nextflow |
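
The `--genome` argument described in the config above expects a two-column, tab-delimited file of chromosome names and sizes, not a FASTA file. As a minimal sketch, assuming a samtools-style `.fai` index is available, such a file could be derived by keeping the first two columns of the index. The helper name `fai_to_genome` and the sample index rows are hypothetical and not part of this commit.

```bash
#!/bin/bash
set -e

# Hypothetical helper: read a .fai index on stdin and print only the
# name and length columns, which is the layout the genome file needs.
fai_to_genome() {
  cut -f1,2
}

# Made-up index rows (columns: name, length, offset, line bases, line width).
printf "chr1\t248956422\t6\t60\t61\nchr2\t242193529\t253105708\t60\t61\n" \
  | fai_to_genome
```

The result can be redirected to a file such as `genome.txt` and passed to `--genome`.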
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
```bash | ||
bedtools bedtobam | ||
``` | ||
|
||
Tool: bedtools bedtobam (aka bedToBam) | ||
Version: v2.30.0 | ||
Summary: Converts feature records to BAM format. | ||
|
||
Usage: bedtools bedtobam [OPTIONS] -i <bed/gff/vcf> -g <genome> | ||
|
||
Options: | ||
-mapq Set the mappinq quality for the BAM records. | ||
(INT) Default: 255 | ||
|
||
-bed12 The BED file is in BED12 format. The BAM CIGAR | ||
string will reflect BED "blocks". | ||
|
||
-ubam Write uncompressed BAM output. Default writes compressed BAM. | ||
|
||
Notes: | ||
(1) BED files must be at least BED4 to create BAM (needs name field). |
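
Note (1) above means a BED file with only three columns cannot be converted. A rough pre-flight check could look like the sketch below. The helper `has_name_column` is hypothetical, and it assumes that lines starting with `#` are comments and that the file has no `track` or `browser` lines.

```bash
#!/bin/bash
set -e

# Hypothetical helper: succeed only if every non-empty, non-comment line
# has at least 4 tab-separated fields (chrom, start, end, name).
has_name_column() {
  awk -F'\t' '!/^#/ && NF > 0 && NF < 4 { bad = 1 } END { exit bad }' "$1"
}

# Small demonstration on throwaway files.
tmp=$(mktemp -d)
printf "chr1\t128\t228\tmy_read/1\n" > "$tmp/bed4.bed"
printf "chr1\t128\t228\n" > "$tmp/bed3.bed"
has_name_column "$tmp/bed4.bed" && echo "bed4.bed: ok"
has_name_column "$tmp/bed3.bed" || echo "bed3.bed: fewer than 4 columns"
rm -r "$tmp"
```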
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/bin/bash | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
set -eo pipefail | ||
|
||
# Unset boolean flags that are "false" so the ${var:+...} expansions below omit them | ||
[[ "$par_bed12" == "false" ]] && unset par_bed12 | ||
[[ "$par_uncompress_bam" == "false" ]] && unset par_uncompress_bam | ||
|
||
# Execute bedtools bed to bam | ||
bedtools bedtobam \ | ||
${par_bed12:+-bed12} \ | ||
${par_uncompress_bam:+-ubam} \ | ||
${par_map_quality:+-mapq "$par_map_quality"} \ | ||
-i "$par_input" \ | ||
-g "$par_genome" \ | ||
> "$par_output" |
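
The script above relies on the `${var:+word}` parameter expansion, which expands to `word` only when `var` is set and non-empty. The sketch below isolates that idiom so the resulting flags can be inspected without running bedtools. The helper `build_flags` is hypothetical and only mirrors two of the script's parameters.

```bash
#!/bin/bash
set -e

# Hypothetical helper: print the optional flags that would be passed on.
build_flags() {
  local par_bed12="$1" par_map_quality="$2"

  # A "false" boolean is unset so that ${par_bed12:+...} expands to nothing.
  [ "$par_bed12" = "false" ] && unset par_bed12

  # Left unquoted on purpose: an empty expansion disappears entirely,
  # while -mapq and its value become two separate words.
  set -- ${par_bed12:+-bed12} ${par_map_quality:+-mapq "$par_map_quality"}
  printf '%s\n' "$*"
}

build_flags "true" "10"   # prints: -bed12 -mapq 10
build_flags "false" ""    # prints an empty line
```

Unsetting the variable first is needed because the string `false` is itself non-empty and would otherwise trigger the flag.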
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,188 @@ | ||
#!/bin/bash | ||
|
||
# exit on error | ||
set -eo pipefail | ||
|
||
############################################# | ||
# helper functions | ||
assert_file_exists() { | ||
[ -f "$1" ] || { echo "File '$1' does not exist" && exit 1; } | ||
} | ||
assert_file_not_empty() { | ||
[ -s "$1" ] || { echo "File '$1' is empty but shouldn't be" && exit 1; } | ||
} | ||
assert_file_contains() { | ||
grep -q "$2" "$1" || { echo "File '$1' does not contain '$2'" && exit 1; } | ||
} | ||
assert_identical_content() { | ||
diff -a "$2" "$1" \ | ||
|| { echo "Files are not identical!" && exit 1; } | ||
} | ||
############################################# | ||
|
||
# Create directories for tests | ||
echo "Creating Test Data..." | ||
TMPDIR=$(mktemp -d "$meta_temp_dir/XXXXXX") | ||
function clean_up { | ||
[[ -d "$TMPDIR" ]] && rm -r "$TMPDIR" | ||
} | ||
trap clean_up EXIT | ||
|
||
# Create and populate input files | ||
printf "chr1\t248956422\nchr3\t242193529\nchr2\t198295559\n" > "$TMPDIR/genome.txt" | ||
printf "chr2:172936693-172938111\t128\t228\tmy_read/1\t37\t+\nchr2:172936693-172938111\t428\t528\tmy_read/2\t37\t-\n" > "$TMPDIR/example.bed" | ||
printf "chr2:172936693-172938111\t128\t228\tmy_read/1\t60\t+\t128\t228\t255,0,0\t1\t100\t0\nchr2:172936693-172938111\t428\t528\tmy_read/2\t60\t-\t428\t528\t255,0,0\t1\t100\t0\n" > "$TMPDIR/example.bed12" | ||
# Create and populate example.gff file | ||
printf "##gff-version 3\n" > "$TMPDIR/example.gff" | ||
printf "chr1\t.\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=Gene1\n" >> "$TMPDIR/example.gff" | ||
printf "chr3\t.\tmRNA\t1000\t2000\t.\t+\t.\tID=transcript1;Parent=gene1\n" >> "$TMPDIR/example.gff" | ||
printf "chr1\t.\texon\t1000\t1200\t.\t+\t.\tID=exon1;Parent=transcript1\n" >> "$TMPDIR/example.gff" | ||
printf "chr2\t.\texon\t1500\t1700\t.\t+\t.\tID=exon2;Parent=transcript1\n" >> "$TMPDIR/example.gff" | ||
printf "chr1\t.\tCDS\t1000\t1200\t.\t+\t0\tID=cds1;Parent=transcript1\n" >> "$TMPDIR/example.gff" | ||
printf "chr1\t.\tCDS\t1500\t1700\t.\t+\t2\tID=cds2;Parent=transcript1\n" >> "$TMPDIR/example.gff" | ||
|
||
# Expected output sam files for each test | ||
cat <<EOF > "$TMPDIR/expected.sam" | ||
@HD VN:1.0 SO:unsorted | ||
@PG ID:BEDTools_bedToBam VN:Vv2.30.0 | ||
@PG ID:samtools PN:samtools PP:BEDTools_bedToBam VN:1.16.1 CL:samtools view -h output.bam | ||
@SQ SN:chr1 AS:../genome.txt LN:248956422 | ||
@SQ SN:chr3 AS:../genome.txt LN:242193529 | ||
@SQ SN:chr2 AS:../genome.txt LN:198295559 | ||
my_read/1 0 chr1 129 255 100M * 0 0 * * | ||
my_read/2 16 chr1 429 255 100M * 0 0 * * | ||
EOF | ||
cat <<EOF > "$TMPDIR/expected12.sam" | ||
@HD VN:1.0 SO:unsorted | ||
@PG ID:BEDTools_bedToBam VN:Vv2.30.0 | ||
@PG ID:samtools PN:samtools PP:BEDTools_bedToBam VN:1.16.1 CL:samtools view -h output.bam | ||
@SQ SN:chr1 AS:../genome.txt LN:248956422 | ||
@SQ SN:chr3 AS:../genome.txt LN:242193529 | ||
@SQ SN:chr2 AS:../genome.txt LN:198295559 | ||
my_read/1 0 chr1 129 255 100M * 0 0 * * | ||
my_read/2 16 chr1 429 255 100M * 0 0 * * | ||
EOF | ||
cat <<EOF > "$TMPDIR/expected_mapquality.sam" | ||
@HD VN:1.0 SO:unsorted | ||
@PG ID:BEDTools_bedToBam VN:Vv2.30.0 | ||
@PG ID:samtools PN:samtools PP:BEDTools_bedToBam VN:1.16.1 CL:samtools view -h output.bam | ||
@SQ SN:chr1 AS:../genome.txt LN:248956422 | ||
@SQ SN:chr3 AS:../genome.txt LN:242193529 | ||
@SQ SN:chr2 AS:../genome.txt LN:198295559 | ||
my_read/1 0 chr1 129 10 100M * 0 0 * * | ||
my_read/2 16 chr1 429 10 100M * 0 0 * * | ||
EOF | ||
cat <<EOF > "$TMPDIR/expected_gff.sam" | ||
@HD VN:1.0 SO:unsorted | ||
@PG ID:BEDTools_bedToBam VN:Vv2.30.0 | ||
@PG ID:samtools PN:samtools PP:BEDTools_bedToBam VN:1.16.1 CL:samtools view -h output.bam | ||
@SQ SN:chr1 AS:../genome.txt LN:248956422 | ||
@SQ SN:chr3 AS:../genome.txt LN:242193529 | ||
@SQ SN:chr2 AS:../genome.txt LN:198295559 | ||
gene 0 chr1 1000 255 1001M * 0 0 * * | ||
mRNA 0 chr3 1000 255 1001M * 0 0 * * | ||
exon 0 chr1 1000 255 201M * 0 0 * * | ||
exon 0 chr2 1500 255 201M * 0 0 * * | ||
CDS 0 chr1 1000 255 201M * 0 0 * * | ||
CDS 0 chr1 1500 255 201M * 0 0 * * | ||
EOF | ||
|
||
# Test 1: Default conversion BED to BAM | ||
mkdir "$TMPDIR/test1" && pushd "$TMPDIR/test1" > /dev/null | ||
|
||
echo "> Run bedtools_bedtobam on BED file" | ||
"$meta_executable" \ | ||
--input "../example.bed" \ | ||
--genome "../genome.txt" \ | ||
--output "output.bam" | ||
|
||
samtools view -h output.bam > output.sam | ||
|
||
# checks | ||
assert_file_exists "output.bam" | ||
assert_file_not_empty "output.bam" | ||
assert_identical_content "output.sam" "../expected.sam" | ||
echo "- test1 succeeded -" | ||
|
||
popd > /dev/null | ||
|
||
# Test 2: BED12 file | ||
mkdir "$TMPDIR/test2" && pushd "$TMPDIR/test2" > /dev/null | ||
|
||
echo "> Run bedtools_bedtobam on BED12 file" | ||
"$meta_executable" \ | ||
--input "../example.bed12" \ | ||
--genome "../genome.txt" \ | ||
--output "output.bam" \ | ||
--bed12 | ||
|
||
samtools view -h output.bam > output.sam | ||
|
||
# checks | ||
assert_file_exists "output.bam" | ||
assert_file_not_empty "output.bam" | ||
assert_identical_content "output.sam" "../expected12.sam" | ||
echo "- test2 succeeded -" | ||
|
||
popd > /dev/null | ||
|
||
# Test 3: Uncompressed BAM file | ||
mkdir "$TMPDIR/test3" && pushd "$TMPDIR/test3" > /dev/null | ||
|
||
echo "> Run bedtools_bedtobam on BED file with uncompressed BAM output" | ||
"$meta_executable" \ | ||
--input "../example.bed" \ | ||
--genome "../genome.txt" \ | ||
--output "output.bam" \ | ||
--uncompress_bam | ||
|
||
# checks | ||
assert_file_exists "output.bam" | ||
assert_file_not_empty "output.bam" | ||
# Cannot assert_identical_content because the uncompress option does not work in this version of bedtools. | ||
|
||
echo "- test3 succeeded -" | ||
|
||
popd > /dev/null | ||
|
||
# Test 4: Map quality | ||
mkdir "$TMPDIR/test4" && pushd "$TMPDIR/test4" > /dev/null | ||
|
||
echo "> Run bedtools_bedtobam on BED file with map quality" | ||
"$meta_executable" \ | ||
--input "../example.bed" \ | ||
--genome "../genome.txt" \ | ||
--output "output.bam" \ | ||
--map_quality 10 | ||
|
||
samtools view -h output.bam > output.sam | ||
|
||
# checks | ||
assert_file_exists "output.bam" | ||
assert_file_not_empty "output.bam" | ||
assert_identical_content "output.sam" "../expected_mapquality.sam" | ||
echo "- test4 succeeded -" | ||
|
||
popd > /dev/null | ||
|
||
# Test 5: gff to bam conversion | ||
mkdir "$TMPDIR/test5" && pushd "$TMPDIR/test5" > /dev/null | ||
|
||
echo "> Run bedtools_bedtobam on GFF file" | ||
"$meta_executable" \ | ||
--input "../example.gff" \ | ||
--genome "../genome.txt" \ | ||
--output "output.bam" | ||
|
||
samtools view -h output.bam > output.sam | ||
|
||
# checks | ||
assert_file_exists "output.bam" | ||
assert_file_not_empty "output.bam" | ||
assert_identical_content "output.sam" "../expected_gff.sam" | ||
echo "- test5 succeeded -" | ||
|
||
popd > /dev/null | ||
|
||
echo "---- All tests succeeded! ----" | ||
exit 0 |
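
The expected SAM files in this test embed `@PG` header lines that record tool versions (for example `VN:1.16.1` for samtools), so an exact comparison may start failing after a tool upgrade in the container. One possible alternative, sketched here with a hypothetical helper `assert_identical_sam_body` that is not part of this commit, drops the `@PG` lines before comparing:

```bash
#!/bin/bash
set -e

# Hypothetical helper: compare two SAM files while ignoring @PG header lines.
# Argument order follows assert_identical_content: actual first, expected second.
assert_identical_sam_body() {
  local expected_body actual_body
  expected_body=$(grep -v '^@PG' "$2" || true)
  actual_body=$(grep -v '^@PG' "$1" || true)
  [ "$expected_body" = "$actual_body" ] \
    || { echo "SAM files differ outside @PG lines!" && exit 1; }
}

# Small demonstration: the two files differ only in the samtools version.
tmp=$(mktemp -d)
printf "@HD\tVN:1.0\n@PG\tID:samtools\tVN:1.16.1\nread\t0\tchr1\n" > "$tmp/a.sam"
printf "@HD\tVN:1.0\n@PG\tID:samtools\tVN:1.19\nread\t0\tchr1\n" > "$tmp/b.sam"
assert_identical_sam_body "$tmp/a.sam" "$tmp/b.sam" && echo "SAM bodies match"
rm -r "$tmp"
```

The trade-off is that a genuine change in the `@PG` lines would go unnoticed, so this only fits tests that care about the records and the `@SQ` headers.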