Home
The recommended method is to use a pre-built Docker or Singularity container.
Both the Docker and Singularity containers have the main script viridian_workflow installed.
Get a Docker image of the latest release:
docker pull ghcr.io/iqbal-lab-org/viridian_workflow:latest
All Docker images are listed in the packages page.
To build a docker container, clone this repository and then from its root run:
docker build --network=host .
(Without --network=host, pip install will likely time out and the build will fail.)
Releases include a Singularity image to download. Each release has a Singularity image file called viridian_workflow_vX.Y.Z.img, where X.Y.Z is the release version.
To build a singularity container, clone this repository and then from its root run:
singularity build viridian_workflow.img Singularity.def
The examples below will run the default pipeline, using the built-in SARS-CoV-2 amplicon schemes ampliseq, ARTIC V3-5, and Midnight-1200. The pipeline automatically detects the scheme that best matches the input reads. To use your own amplicon scheme and/or force the choice of scheme, please read the amplicon schemes page. For a more detailed description of the pipeline options, please read the workflow usage page.
To run on paired Illumina reads:
viridian_workflow run_one_sample \
--tech illumina \
--reads1 reads_1.fastq.gz \
--reads2 reads_2.fastq.gz \
--outdir OUT
To run on unpaired nanopore reads:
viridian_workflow run_one_sample \
--tech ont \
--reads reads.fastq.gz \
--outdir OUT
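When working from one of the containers rather than a local install, the commands above need to be wrapped in a container call. The following is a minimal sketch that only prints the wrapped commands for review. It assumes the current directory holds the reads and is mounted at /data (an illustrative mount point), that the Singularity image is named viridian_workflow.img as in the build step above, and that the Docker image accepts the command as trailing arguments; none of this is confirmed by this page, so adjust as needed.

```shell
# Sketch only: prints (does not run) container-wrapped commands.
# Assumptions: the /data mount point is illustrative, and the Docker image
# is assumed to accept "viridian_workflow ..." as its command.
IMAGE_DOCKER="ghcr.io/iqbal-lab-org/viridian_workflow:latest"
IMAGE_SINGULARITY="viridian_workflow.img"   # name used in the build step above

# Print a Docker invocation that mounts the current directory at /data.
docker_cmd() {
  echo docker run --rm -v "$PWD:/data" -w /data "$IMAGE_DOCKER" viridian_workflow "$@"
}

# Print a Singularity invocation of the same command.
singularity_cmd() {
  echo singularity exec "$IMAGE_SINGULARITY" viridian_workflow "$@"
}

docker_cmd run_one_sample --tech ont --reads reads.fastq.gz --outdir OUT
singularity_cmd run_one_sample --tech ont --reads reads.fastq.gz --outdir OUT
```

The printed lines are for inspection; paths containing spaces would need quoting before being pasted into a shell.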
The allowed values of --tech are illumina, iontorrent, and ont.
Nanopore reads must be unpaired (i.e. use the --reads option).
Illumina and iontorrent reads can be paired or unpaired.
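The rules above can be checked before launching a run. The helper below is a hypothetical pre-flight check, not part of viridian_workflow; the function name and the "paired"/"unpaired" labels are illustrative choices.

```shell
# Return 0 if the --tech value and read layout are an allowed combination.
# usage: check_tech_and_reads TECH LAYOUT   (LAYOUT is "paired" or "unpaired")
# Hypothetical helper: it encodes only the rules stated on this page.
check_tech_and_reads() {
  case "$2" in
    paired|unpaired) ;;
    *) echo "layout must be paired or unpaired, got: $2" >&2; return 2 ;;
  esac
  case "$1" in
    illumina|iontorrent)
      return 0 ;;                      # paired or unpaired are both allowed
    ont)
      if [ "$2" = "paired" ]; then
        echo "nanopore reads must be unpaired (use --reads)" >&2
        return 1
      fi
      return 0 ;;
    *)
      echo "unknown --tech value: $1" >&2
      return 2 ;;
  esac
}
```

For example, `check_tech_and_reads ont paired` returns a non-zero status, while `check_tech_and_reads iontorrent unpaired` succeeds.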
Other options:
- --sample_name MY_NAME: use this to change the sample name (default is "sample") that is put in the final FASTA file, BAM file, and VCF file.
- --keep_bam: use this option to keep the BAM file of original input reads mapped to the reference genome.
- --force: use with caution - it will overwrite the output directory if it already exists.
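As an illustration of --sample_name, the sketch below prints one run_one_sample command per paired Illumina sample found in a directory. The NAME_1.fastq.gz / NAME_2.fastq.gz naming pattern, the OUT.NAME output directory names, and the function name are assumptions made for this example; the commands are printed rather than executed so they can be reviewed first.

```shell
# Print one run_one_sample command per NAME_1.fastq.gz / NAME_2.fastq.gz pair.
# usage: print_batch_cmds READS_DIR
# Assumption: the file naming pattern and OUT.NAME directories are illustrative.
print_batch_cmds() {
  for r1 in "$1"/*_1.fastq.gz; do
    [ -e "$r1" ] || continue                      # no matching files at all
    name="$(basename "$r1" _1.fastq.gz)"          # sample name from file name
    r2="$1/${name}_2.fastq.gz"
    if [ ! -e "$r2" ]; then
      echo "skipping $name: missing $r2" >&2
      continue
    fi
    echo "viridian_workflow run_one_sample --tech illumina --reads1 $r1 --reads2 $r2 --sample_name $name --outdir OUT.$name"
  done
}
```

Giving each sample its own output directory avoids any need for --force.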
The default files in the output directory are:
- consensus.fa.gz: a gzipped FASTA file of the consensus sequence.
- variants.vcf: a VCF file of the identified variants between the consensus sequence and the reference genome.
- log.json.gz: a gzipped JSON file of logging information for the viridian workflow run. This is described in detail in the JSON output file page.
- scheme_id.depth_across_genome.pdf: a plot of the read depth across the genome, with amplicons coloured in the background.
- scheme_id.score_plot.pdf: a plot of the scoring for amplicon scheme identification.
If the option --keep_bam is used, then a sorted BAM file of the reads mapped to the reference will also be present, called reference_mapped.bam (and its index file reference_mapped.bam.bai).
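After a run, the presence of the files listed above can be checked with a small helper. This is a hypothetical convenience function, not something provided by viridian_workflow; it only tests that each expected file name exists in the output directory.

```shell
# Print the names of expected output files missing from an output directory.
# usage: missing_outputs OUTDIR [keep_bam]
# Pass "keep_bam" as the second argument if the run used --keep_bam.
missing_outputs() {
  expected="consensus.fa.gz variants.vcf log.json.gz scheme_id.depth_across_genome.pdf scheme_id.score_plot.pdf"
  if [ "${2:-}" = "keep_bam" ]; then
    expected="$expected reference_mapped.bam reference_mapped.bam.bai"
  fi
  for f in $expected; do
    [ -e "$1/$f" ] || echo "$f"       # report only the files that are absent
  done
}
```

For example, `missing_outputs OUT keep_bam` prints nothing when every expected file, including the BAM and its index, is present.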