diff --git a/README.md b/README.md index bc2bd93..107af96 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,9 @@ The development of this pipeline is part of the GPS Project ([Global Pneumococca # Table of contents - [Workflow](#workflow) - [Usage](#usage) - - [Requirement](#requirement) + - [Requirements](#requirements) + - [Software](#software) + - [Hardware](#hardware) - [Accepted Inputs](#accepted-inputs) - [Setup](#setup) - [Run](#run) @@ -47,12 +49,26 @@ The development of this pipeline is part of the GPS Project ([Global Pneumococca   # Usage -## Requirement -- A POSIX-compatible system (e.g. Linux, macOS, Windows with [WSL](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux)) with Bash 3.2 or later -- Java 11 or later (up to 21) ([OpenJDK](https://openjdk.org/)/[Oracle Java](https://www.oracle.com/java/)) -- [Docker](https://www.docker.com/) or [Singularity](https://sylabs.io/singularity/)/[Apptainer](https://apptainer.org/) - - For Linux, [Singularity](https://sylabs.io/singularity/)/[Apptainer](https://apptainer.org/) or [Docker Engine](https://docs.docker.com/engine/) is recommended over [Docker Desktop for Linux](https://docs.docker.com/desktop/). The latter is known to cause permission issues when running the pipeline on Linux. -- It is recommended to have at least 16GB of RAM and 50GB of free storage +## Requirements +### Software + - A POSIX-compatible operating system (e.g. Linux, macOS, Windows with [WSL](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux)) with Bash 3.2 or later + - [Installation guide for WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install) by Microsoft + - Java 11 or later (up to 21) ([OpenJDK](https://openjdk.org/)/[Oracle Java](https://www.oracle.com/java/)) + - [Installation guide for OpenJDK](https://www.freecodecamp.org/news/install-openjdk-free-java-multi-os-guide/) by freeCodeCamp + - [Docker](https://www.docker.com/) or [Singularity](https://sylabs.io/singularity/)/[Apptainer](https://apptainer.org/) + - Installation guides: + - For Linux + > ℹ️ For Linux, [Docker Engine](https://docs.docker.com/engine/) or [Singularity](https://sylabs.io/singularity/)/[Apptainer](https://apptainer.org/) is recommended over [Docker Desktop](https://docs.docker.com/desktop/). The latter is known to cause permission issues when running the pipeline on Linux. + - [Docker Engine on Linux](https://docs.docker.com/engine/install/) by Docker + - [Apptainer on Linux](https://apptainer.org/docs/admin/main/installation.html) by Apptainer + - For macOS + - [Docker Desktop on macOS](https://docs.docker.com/desktop/install/mac-install/) by Docker + > ℹ️ After installation, you might need to [allow Docker to access more system resources](https://docs.docker.com/desktop/settings/mac/), especially CPU and Memory, to match the hardware requirement of the pipeline + - For Windows with WSL + - [Docker Desktop on Windows with WSL](https://docs.docker.com/desktop/wsl/) by Docker + +### Hardware +It is recommended to have at least 16GB of RAM and 50GB of free storage > ℹ️ Details on storage > - The pipeline core files use ~5MB > - All default databases use ~8GB in total @@ -464,7 +480,7 @@ This project uses open-source components. You can find the homepage or source co [resistanceDatabase](https://github.com/kumarnaren/resistanceDatabase) - Narender Kumar ([@kumarnaren](https://github.com/kumarnaren)) - License (GPL-3.0): https://github.com/kumarnaren/resistanceDatabase/blob/main/LICENSE -- `sequences.fasta` is renamed to `ariba_ref_sequences.fasta` and used as-is +- `sequences.fasta` is renamed to `ariba_ref_sequences.fasta` and modified - `metadata.tsv` is renamed to `ariba_metadata.tsv` and modified - The files are used as the default inputs of `GET_ARIBA_DB` process of the `amr.nf` module diff --git a/data/ariba_metadata.tsv b/data/ariba_metadata.tsv index ecfed73..c0ed955 100644 --- a/data/ariba_metadata.tsv +++ b/data/ariba_metadata.tsv @@ -79,10 +79,6 @@ vanG_KF704242 1 0 . . VAN otrA_X53401 1 0 . . TET vanA_M97297 1 0 . . TET vanC_AF162694 1 0 . . TET -23S_NZ_CP018347 0 1 A2114G . ERY -23S_NZ_CP018347 0 1 A2115G . ERY -23S_NZ_CP018347 0 1 A2118G . ERY -23S_NZ_CP018347 0 1 C2630A . ERY -23S_NZ_CP018347 0 1 C2630G . ERY +23S_NR_076173 0 1 A2061G . ERY_CLI rrgA_EF560637 1 0 . . PILI1 pitB_GU256423 1 0 . . PILI2 diff --git a/data/ariba_ref_sequences.fasta b/data/ariba_ref_sequences.fasta index 44c605c..54f6c61 100644 --- a/data/ariba_ref_sequences.fasta +++ b/data/ariba_ref_sequences.fasta @@ -799,44 +799,49 @@ GGTTTTTTTGATTTTGAAGAGAAATACCAATTAATCAGCGCCACGATCACTGTCCCAGCACCATTGCCTCTCGCGCTTGA AAGGAGCAGGCACAGCTGCTTTATCGAAACTTGGGATTGACGGGTCTGGCTCGAATCGATTTTTTCGTCACCAATCAAGGAGCGATTTAT TTAAACGAAATCAACACCATGCCGGGATTTACTGGGCACTCCCGCTACCCAGCTATGATGGCGGAAGTCGGGTTATCCTACGAAATATTA GTAGAGCAATTGATTGCACTGGCAGAGGAGGACAAACGATGA ->23S_NZ_CP018347 -tttggataagtcctcgagctattagtattagtccgctacatgtgtcgccacacttccacttctaacctatctacctgatc -atctctcagggctcttactgatatataatcatgggaaatctcatcttgaggtgggtttcacacttagatgctttcagcgt -ttatcccttccctacatagctacccagcgatgcctttggcaagacaactggtacaccagcggtaagtccactctggtcct -ctcgtactaggagcagatcctctcaaatttcctacgcccgcgacggatagggaccgaactgtctcacgacgttctgaacc -cagctcgcgtgccgctttaatgggcgaacagcccaacccttgggaccgactacagccccaggatgcgacgagccgacatc -gaggtgccaaacctccccgtcgatgtgaactcttgggggagataagcctgttatccccagggtagcttttatccgttgag -cgatggcccttccatacggaaccaccggatcactaagcccgactttcgtccctgctcgagttgtagctctcgcagtcaag -ctcccttatacctttacactctgcgaatgatttccaaccattctgagggaacctttgggcgcctccgttaccttttagga -ggcgaccgccccagtcaaactgcccgtcagacactgtctccgatagggatcacctatctgggttagagtggccataacac -aagggtagtatcccaacagcgtctccttcgaaactggcgtcccgatctcttagactcctacctatcctgtacatgtggta -cagacactcaatatcaaactgcagtaaagctccatggggtctttccgtcctgtcgcgggtaacctgcatcttcacaggta -ctaaaatttcaccgagtctctcgttgagacagtgcccaaatcattacgcctttcgtgcgggtcggaacttacccgacaag -gaatttcgctaccttaggaccgttatagttacggccgccgtttactggggcttcaattcataccttcgcttacgctaagc -actcctcttaaccttccagcaccgggcaggcgtcaccccctatacatcatcttacgatttagcagagagctgtgtttttg -ataaacagttgcttgggcctattcactgcggctgacctaaagtcagcaccccttctcccgaagttacggggtcattttgc -cgagttccttaacgagagttctctcgctcacctgaggctactcgcctcgactacctgtgtcggtttgcggtacgggtaga -gtatgtttaaacgctagaagcttttcttggcagtgtgacgtcactaacttcgctactaaacttcgctccccatcacagct -caatgttatagaattaagcatttgactcaattcacacctcactgcttagacagactcttccaatcgtctgctttagttag -cctactgcgtccctccatcactacatactctagtacaggaatatcaacctgttgtccatcggatacacctttcggtctct -ccttaggtcccgactaacccagggcggacgagccttcccctggaaaccttagtcttacggtggacaggattctcacctgt -ctttcgctactcataccggcattctcacttctatgcgttccagcactcctcacggtataccttcatcacacatagaacgc -tctcctaccatacctataaaggtatccacagcttcggtaaattgttttagccccggtacattttcggcgcagggtcactc -gactagtgagctattacgcactctttgaatgaatagctgcttctaagctaacatcctagttgtctgtgcaaccccacatc -cttttccacttaacaattattttgggaccttagctggtggtctgggctgtttccctttcgactacggatcttagcactcg -cagtctgactgccgaccataattcattggcattcggagtttatctgagattggtaatccgggatggacccctcacccaaa -cagtgctctacctccaagaatctctaatgtcgacgctagccctaaagctatttcggagagaaccagctatctccaagttc -gtttggaatttctccgctacccacaagtcatccaagcacttttcaacgtgccctggttcggtcctccagtgcgtcttacc -gcaccttcaacctgctcatgggtaggtcacatggtttcgggtctacgtcatgatactaattcgccctgttcagactcggt -ttccctacggctccgtctcttcaacttaacctcgcatcataacgtaactcgccggttcattctacaaaaggcacgctctc -acccattaacgggctcgaacttgttgtaggcacacggtttcaggttctatttcactcccctcccggggtgcttttcacct -ttccctcacggtactggttcactatcggtcactagggagtatttagggttgggagatggtcctcccagattccgacggga -tttcacgtgtcccgccgtactcaggatactgctaggtacaaagactattttaaatacgaggctattactctctttggctg -atcttcccaaatcattcttctataatctttgagtccacattgcagtcctacaaccccgaagagtaaactcttcggtttgc -ccttctgccgtttcgctcgccgctactaaggcaatcgcttttgctttctcttcctgcagctacttagatgtttcagttca -ctgcgtcttcctcctcacatccttaacagatgtgggtaacaggtattacctgttgggttcccccattcggaaatccctgg -atcatcgcttacttacagctacccaaggtatatcgtcgtttgtcacgtccttcgtcggctcctagtgccaaggcatccac -cgtgcgcccttattaacttaacct +>23S_NR_076173 +GGTTAAGTTAATAAGGGCGCACGGTGGATGCCTTGGCACTAGGAGCCGACGAAGGACGTGACAAACGACG +ATATGCCTTGGGTAGCTGTAAGTAAGCGATGATCCAGGGATTTCCGAATGGGGGAACCCAACAGGTAATA +CCTGTTACCCACATCTGTTAAGGATGTGAGGAGGAAGACGCAGTGAACTGAAACATCTAAGTAGCTGCAG +GAAGAGAAAGCAAAAGCGATTGCCTTAGTAGCGGCGAGCGAAACGGCAGGAGGGCAAACCGAAGAGTTTA +CTCTTCGGGGTTGTAGGACTGCAATGTGGACTCAAAGATTATAGAAGAATGATTTGGGAAGATCAGCCAA +AGAGAGTAATAGCCTCGTATTTAAAATAGTCTTTGTACTTAGCAGTATCCTGAGTACGGCGGGACACGTG +AAATCCCGTCGGAATCTGGGAGGACCATCTCCCAACCCTAAATACTCCCTAGTGACCGATAGTGAACCAG +TACCGTGAGGGAAAGGTGAAAAGCACCCCGGGAGGGGAGTGAAATAGAACCTGAAACCGTGTGCCTACAA +CAAGTTCGAGCCCGTTAATGGGTGAGAGCGTGCCTTTTGTAGAATGAACCGGCGAGTTACGTTATGATGC +GAGGTTAAGTTGAAGAGACGGAGCCGTAGGGAAACCGAGTCTGAATAGGGCACCTTAGTATCATGACGTA +GACCCGAAACCATGTGACCTACCCATGAGCAGGTTGAAGGTGCGGTAAGACGCACTGGAGGACCGAACCA +GGGCACGTTGAAAAGTGCTTGGATGACTTGTGGGTAGCGGAGAAATTCCAAACGAACTTGGAGATAGCTG +GTTCTCTCCGAAATAGCTTTAGGGCTAGCGTCGACATTAGAGATTCTTGGAGGTAGAGCACTGTTTGGGT +GAGGGGTCCATCCCGGATTACCAATCTCAGATAAACTCCGAATGCCAATGAATTATGGTCGGCAGTCAGA +CTGCGAGTGCTAAGATCCGTAGTCGAAAGGGAAACAGCCCAGACCACCAGCTAAGGTCCCAAAATAATTG +TTAAGTGGAAAAGGATGTGGGGTTGCACAGACAACTAGGATGTTAGCTTAGAAGCAGCTATTCATTCAAA +GAGTGCGTAATAGCTCACTAGTCGAGTGACCCTGCGCCGAAAATGTACCGGGGCTAAAACAATTTACCGA +AGCTGTGGATACCTTTATAGGTATGGTAGGAGAGCGTTCTATGTGTGATGAAGGTATACCGTGAGGAGTG +CTGGAACGCATAGAAGTGAGAATGCCGGTATGAGTAGCGAAAGACAGGTGAGAATCCTGTCCACCGTAAG +ACTAAGGTTTCCAGGGGAAGGCTCGTCCGCCCTGGGTTAGTCGGGACCTAAGGAGAGACCGAAAGGTGTA +TCCGATGGACAACAGGTTGATATTCCTGTACTAGAGTATGTAGTGATGGAGGGACGCAGTAGGCTAACTA +AAGCAGACGATTGGAAGAGTCTGTCTAAGCAGTGAGGTGTGAATTGAGTCAAATGCTTAATTCTATAACA +TTGAGCTGTGATGGGGAGCGAAGTTTAGTAGCGAAGTTAGTGACGTCACACTGCCAAGAAAAGCTTCTAG +CGTTTAACCATACTCTACCCGTACCGCAAACCGACACAGGTAGTCGAGGCGAGTAGCCTCAGGTGAGCGA +GAGAACTCTCGTTAAGGAACTCGGCAAAATGACCCCGTAACTTCGGGAGAAGGGGTGCTGACTTTAAGTC +AGCCGCAGTGAATAGGCCCAAGCAACTGTTTATCAAAAACACAGCTCTCTGCTAAATCGTAAGATGATGT +ATAGGGGGTGACGCCTGCCCGGTGCTGGAAGGTTAAGAGGAGTGCTTAGCGTAAGCGAAGGTATGAATTG +AAGCCCCAGTAAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCC +GACCCGCACGAAAGGCGTAATGATTTGGGCACTGTCTCAACGAGAGACTCGGTGAAATTTTAGTACCTGT +GAAGATGCAGGTTACCCGCGACAGGACGGAAAGACCCCATGGAGCTTTACTGCAGTTTGATATTGAGTGT +CTGTACCACATGTACAGGATAGGTAGGAGTCTAAGAGATCGGGACGCCAGTTTCGAAGGAGACGCTGTTG +GGATACTACCCTTGTGTTATGGCCACTCTAACCCAGATAGGTGATCCCTATCGGAGACAGTGTCTGACGG +GCAGTTTGACTGGGGCGGTCGCCTCCTAAAAGGTAACGGAGGCGCCCAAAGGTTCCCTCAGAATGGTTGG +AAATCATTCGCAGAGTGTAAAGGTATAAGGGAGCTTGACTGCGAGAGCTACAACTCGAGCAGGGACGAAA +GTCGGGCTTAGTGATCCGGTGGTTCCGTATGGAAGGGCCATCGCTCAACGGATAAAAGCTACCCTGGGGA +TAACAGGCTTATCTCCCCCAAGAGTTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCGTCGCA +TCCTGGGGCTGTAGTCGGTCCCAAGGGTTGGGCTGTTCGCCCATTAAAGCGGCACGCGAGCTGGGTTCAG +AACGTCGTGAGACAGTTCGGTCCCTATCCGTCGCGGGCGTAGGAAATTTGAGAGGATCTGCTCCTAGTAC +GAGAGGACCAGAGTGGACTTACCGCTGGTGTACCAGTTGTCTTGCCAAAGGCATCGCTGGGTAGCTATGT +AGGGAAGGGATAAACGCTGAAAGCATCTAAGTGTGAAACCCACCTCAAGATGAGATTTCCCATGATTATA +TATCAGTAAGAGCCCTGAGAGATGATCAGGTAGATAGGTTAGAAGTGGAAGTGTGGCGACACATGTAGCG +GACTAATACTAATAGCTCGAGGACTTATCCAA >rrgA_EF560637 ATGAAAAAAGTAAGAAAGATATTTCAGAAGGCAGTTGCAGGACTGTGCTGTATATCTCAGTTGACAGCTT TTTCTTCGATAGTTGCTTTAGCAGAAACGCCTGAAACCAGTCCAGCGATAGGAAAAGTAGTGATTAAGGA diff --git a/doc/workflow.drawio.svg b/doc/workflow.drawio.svg index 87d8e97..2000bfa 100644 --- a/doc/workflow.drawio.svg +++ b/doc/workflow.drawio.svg @@ -1,4 +1,4 @@ - + @@ -32,9 +32,9 @@ - - - + + + @@ -64,7 +64,7 @@ -
+
FASTQ (Reads) @@ -72,31 +72,31 @@
- + FASTQ (Reads) - + - + S. Pneumo:     ≥ 60% - + Other Genus:  ≤ 2% - + Contigs:  ≤ 500 - + Length:   1.9 - 2.3 Mb - + Depth:     ≥ 20x @@ -104,7 +104,7 @@ -
+
FASTA (Assemblies) @@ -112,7 +112,7 @@
- + FASTA (Assemblies) @@ -121,7 +121,7 @@ -
+
SAM @@ -129,17 +129,17 @@
- + SAM - + - + Ref Coverage: ≥ 60% - + Het-SNP site:  ≤ 220 @@ -147,7 +147,7 @@ -
+
Results @@ -155,14 +155,14 @@
- + Results - + @@ -282,7 +282,7 @@ - + @@ -305,7 +305,6 @@ - @@ -416,27 +415,6 @@ - - - - -
-
-
- - Other AMR / Virulence - -
- ARIBA, custom script -
-
-
-
- - Othe... - -
-
@@ -464,12 +442,7 @@ Bases: ≥ 38 Mb - - - - Go / No-go - - + @@ -545,6 +518,33 @@ Go / No-go + + + + +
+
+
+ + Other AMR / Virulence + +
+ ARIBA, custom script +
+
+
+
+ + Othe... + +
+
+ + + + Go / No-go + + diff --git a/main.nf b/main.nf index 73b4767..9acd77a 100644 --- a/main.nf +++ b/main.nf @@ -1,7 +1,7 @@ #!/usr/bin/env nextflow // Version of this release -pipelineVersion = '1.0.0-rc5' +pipelineVersion = '1.0.0-rc6' // Import workflow modules include { PIPELINE } from "$projectDir/workflows/pipeline"