diff --git a/CHANGELOG.md b/CHANGELOG.md
index 276195c2..9d31a2da 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,7 +8,9 @@
 
 * `fastp`: An ultra-fast all-in-one FASTQ preprocessor (PR #3).
 
-* `busco`: Assess genome assembly and annotation completeness with single copy orthologs (PR #6).
+* `busco`:
+  - `busco/busco_run`: Assess genome assembly and annotation completeness with single copy orthologs (PR #6).
+  - `busco/busco_list_datasets`: Lists available busco datasets (PR #18)
 
 * `featurecounts`: Assign sequence reads to genomic features (PR #11).
 
diff --git a/src/busco/busco_list_datasets/config.vsh.yaml b/src/busco/busco_list_datasets/config.vsh.yaml
new file mode 100644
index 00000000..444e2a6d
--- /dev/null
+++ b/src/busco/busco_list_datasets/config.vsh.yaml
@@ -0,0 +1,36 @@
+functionality:
+  name: busco
+  description: Lists the available busco datasets
+  info:
+    keywords: [lineage datasets]
+    homepage: https://busco.ezlab.org/
+    documentation: https://busco.ezlab.org/busco_userguide.html
+    repository: https://gitlab.com/ezlab/busco
+    reference: "10.1007/978-1-4939-9173-0_14"
+    licence: MIT
+  argument_groups:
+    - name: Outputs
+      arguments:
+        - name: --output
+          alternatives: ["-o"]
+          direction: output
+          type: file
+          description: |
+            Output file of the available busco datasets
+          required: false
+          default: busco_dataset_list.txt
+          example: file.txt
+  resources:
+    - type: bash_script
+      path: script.sh
+  test_resources:
+    - type: bash_script
+      path: test.sh
+platforms:
+  - type: docker
+    image: quay.io/biocontainers/busco:5.6.1--pyhdfd78af_0
+    setup:
+      - type: docker
+        run: |
+          busco --version | sed 's/BUSCO\s\(.*\)/busco: "\1"/' > /var/software_versions.txt
+  - type: nextflow
diff --git a/src/busco/busco_list_datasets/script.sh b/src/busco/busco_list_datasets/script.sh
new file mode 100644
index 00000000..6c80725c
--- /dev/null
+++ b/src/busco/busco_list_datasets/script.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+## VIASH START
+## VIASH END
+
+busco --list-datasets | awk '/^#{40}/{flag=1; next} flag{print}' > $par_output
\ No newline at end of file
diff --git a/src/busco/busco_list_datasets/test.sh b/src/busco/busco_list_datasets/test.sh
new file mode 100644
index 00000000..c303cd77
--- /dev/null
+++ b/src/busco/busco_list_datasets/test.sh
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+## VIASH START
+## VIASH END
+
+"$meta_executable" \
+  --output datasets.txt
+
+echo ">> Checking output"
+[ ! -f "datasets.txt" ] && echo "datasets.txt does not exist" && exit 1
+
+echo ">> Checking if output is empty"
+[ ! -s "datasets.txt" ] && echo "datasets.txt is empty" && exit 1
+
+rm datasets.txt
\ No newline at end of file
diff --git a/src/busco/config.vsh.yaml b/src/busco/busco_run/config.vsh.yaml
similarity index 99%
rename from src/busco/config.vsh.yaml
rename to src/busco/busco_run/config.vsh.yaml
index fba14892..2297fc2d 100644
--- a/src/busco/config.vsh.yaml
+++ b/src/busco/busco_run/config.vsh.yaml
@@ -1,5 +1,5 @@
 functionality:
-  name: busco
+  name: busco_run
   description: Assessment of genome assembly and annotation completeness with single copy orthologs
   info:
     keywords: [Genome assembly, quality control]
@@ -35,7 +35,7 @@ functionality:
           required: false
           description: |
             Specify a BUSCO lineage dataset that is most closely related to the assembly or gene set being assessed.
-            The full list of available datasets can be viewed [here](https://busco-data.ezlab.org/v5/data/lineages/) or by running `busco --list-datasets` (which requires installing the tool).
+            The full list of available datasets can be viewed [here](https://busco-data.ezlab.org/v5/data/lineages/) or by running the busco/busco_list_datasets component.
             When unsure, the "--auto_lineage" flag can be set to automatically find the optimal lineage path.
             Requested datasets will automatically be downloaded if not already present in the download folder.
           example: stramenopiles_odb10
diff --git a/src/busco/help.txt b/src/busco/busco_run/help.txt
similarity index 100%
rename from src/busco/help.txt
rename to src/busco/busco_run/help.txt
diff --git a/src/busco/script.sh b/src/busco/busco_run/script.sh
similarity index 100%
rename from src/busco/script.sh
rename to src/busco/busco_run/script.sh
diff --git a/src/busco/test.sh b/src/busco/busco_run/test.sh
similarity index 100%
rename from src/busco/test.sh
rename to src/busco/busco_run/test.sh
diff --git a/src/busco/test_data/genome.fna b/src/busco/busco_run/test_data/genome.fna
similarity index 100%
rename from src/busco/test_data/genome.fna
rename to src/busco/busco_run/test_data/genome.fna
diff --git a/src/busco/test_data/protein.fasta b/src/busco/busco_run/test_data/protein.fasta
similarity index 100%
rename from src/busco/test_data/protein.fasta
rename to src/busco/busco_run/test_data/protein.fasta
diff --git a/src/busco/test_data/script.sh b/src/busco/busco_run/test_data/script.sh
similarity index 100%
rename from src/busco/test_data/script.sh
rename to src/busco/busco_run/test_data/script.sh