The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
[2.5] - 2022-07-13
- Default Nextclade dataset shipped with the pipeline has been bumped from 2022-01-18T12:00:00Z -> 2022-06-14T12:00:00Z
- [#234] - Remove replacement of dashes in sample name with underscores
- [#292] - Filter empty FastQ files after adapter trimming
- [#303] - New pangolin dbs (4.0.x) not assigning lineages to Sars-CoV-2 samples in MultiQC report correctly
- [#304] - Re-factor code of ivar_variants_to_vcf script
- [#306] - Add contig field information in vcf header in ivar_variants_to_vcf and use bcftools sort
- [#311] - Invalid declaration val medaka_model_string
- [#316] - Variant calling isn't run when using --skip_asciigenome with metagenomic data
- [nf-core/rnaseq#764] - Test fails when using GCP due to missing tools in the basic biocontainer
- Updated pipeline template to nf-core/tools 2.4.1
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
artic | 1.2.1 | 1.2.2 |
bcftools | 1.14 | 1.15.1 |
multiqc | 1.11 | 1.13a |
nanoplot | 1.39.0 | 1.40.0 |
nextclade | 1.10.2 | 2.2.0 |
pangolin | 3.1.20 | 4.1.1 |
picard | 2.26.10 | 2.27.4 |
quast | 5.0.2 | 5.2.0 |
samtools | 1.14 | 1.15.1 |
spades | 3.15.3 | 3.15.4 |
vcflib | 1.0.2 | 1.0.3 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn't present.
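The three NB rules above amount to a small decision procedure over the old and new columns. The following Python sketch illustrates that reading; the function name and the two placeholder rows are hypothetical and are not part of the pipeline.

```python
from typing import Optional


def classify_change(old: Optional[str], new: Optional[str]) -> str:
    """Classify a table row by which version columns are filled in."""
    has_old = bool(old and old.strip())
    has_new = bool(new and new.strip())
    if has_old and has_new:
        return "updated"
    if has_new:
        return "added"
    if has_old:
        return "removed"
    raise ValueError("row has neither an old nor a new version")


rows = [
    ("artic", "1.2.1", "1.2.2"),            # taken from the table above
    ("example-added-tool", "", "0.1.0"),    # hypothetical row
    ("example-removed-tool", "0.1.0", ""),  # hypothetical row
]
for name, old, new in rows:
    print(f"{name}: {classify_change(old, new)}")
```

The same reading applies to the parameter tables, with parameter names in place of version numbers.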
[2.4.1] - 2022-03-01
- [#288] - --primer_set_version only accepts Integers (incompatible with "4.1" Artic primers set)
[2.4] - 2022-02-22
- nf-core/tools#1415 - Make --outdir a mandatory parameter
- [#281] - Nanopore medaka processing fails with error if model name, not model file, provided
- [#286] - IVAR_VARIANTS silently failing when FAI index is missing
Old parameter | New parameter |
---|---|
--publish_dir_mode |
NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
[2.3.1] - 2022-02-15
- [#277] - Misuse of rstrip in make_variants_long_table.py script
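The entry above does not say how rstrip was misused in the script. A common pitfall, shown here as an illustrative Python sketch rather than the actual fix, is that `str.rstrip` treats its argument as a set of characters to strip, not as a suffix. The file name and the `strip_suffix` helper are hypothetical.

```python
def strip_suffix(name: str, suffix: str) -> str:
    """Remove `suffix` once from the end of `name`, if it is present."""
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


filename = "sample_v.vcf"  # hypothetical file name

# rstrip removes any trailing run of the characters '.', 'v', 'c', 'f',
# so it also eats the 'v' that belongs to the sample name.
print(filename.rstrip(".vcf"))         # sample_

# Removing the suffix explicitly keeps the sample name intact.
print(strip_suffix(filename, ".vcf"))  # sample_v
```

On Python 3.9 and later, `str.removesuffix` gives the same result as the helper.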
Dependency | Old version | New version |
---|---|---|
mosdepth | 0.3.2 | 0.3.3 |
pangolin | 3.1.19 | 3.1.20 |
[2.3] - 2022-02-04
- Please see Major updates in v2.3 for a more detailed list of changes added in this version.
- When using --protocol amplicon, in the previous release, iVar was used for both the variant calling and consensus sequence generation. The pipeline will now perform the variant calling and consensus sequence generation with iVar and BCFTools/BEDTools, respectively.
- Bump minimum Nextflow version from 21.04.0 -> 21.10.3
- Port pipeline to the updated Nextflow DSL2 syntax adopted on nf-core/modules
- Updated pipeline template to nf-core/tools 2.2
- [#209] - Check that contig in primer BED and genome fasta match
- [#218] - Support for compressed FastQ files for Nanopore data
- [#232] - Remove duplicate variants called by ARTIC ONT pipeline
- [#235] - Nextclade version bump
- [#244] - Fix BCFtools consensus generation and masking
- [#245] - Mpileup file as output
- [#246] - Option to generate consensus with BCFTools / BEDTools using iVar variants
- [#247] - Add strand-bias filtering option and codon fix in consecutive positions in ivar tsv conversion to vcf
- [#248] - New variants reporting table
| Old parameter | New parameter |
|---|---|
| | --nextclade_dataset |
| | --nextclade_dataset_name |
| | --nextclade_dataset_reference |
| | --nextclade_dataset_tag |
| | --skip_consensus_plots |
| | --skip_variants_long_table |
| | --consensus_caller |
| --callers | --variant_caller |
NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
bcftools | 1.11 | 1.14 |
blast | 2.10.1 | 2.12.0 |
bowtie2 | 2.4.2 | 2.4.4 |
cutadapt | 3.2 | 3.5 |
fastp | 0.20.1 | 0.23.2 |
kraken2 | 2.1.1 | 2.1.2 |
minia | 3.2.4 | 3.2.6 |
mosdepth | 0.3.1 | 0.3.2 |
nanoplot | 1.36.1 | 1.39.0 |
nextclade | | 1.10.2 |
pangolin | 3.1.7 | 3.1.19 |
picard | 2.23.9 | 2.26.10 |
python | 3.8.3 | 3.9.5 |
samtools | 1.10 | 1.14 |
spades | 3.15.2 | 3.15.3 |
tabix | 0.2.6 | 1.11 |
vcflib | | 1.0.2 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn't present.
[2.2] - 2021-07-29
- Updated pipeline template to nf-core/tools 2.1
- Remove custom content to render Pangolin report in MultiQC as it was officially added as a module in v1.11
- [#212] - Access to PYCOQC.out is undefined
- [#229] - ARTIC Guppyplex settings for 1200bp ARTIC primers with Nanopore data
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
multiqc | 1.10.1 | 1.11 |
pangolin | 3.0.5 | 3.1.7 |
samtools | 1.10 | 1.12 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn't present.
[2.1] - 2021-06-15
- Removed workflow to download data from public databases in favour of using nf-core/fetchngs
- Added Pangolin results to MultiQC report
- Added warning to MultiQC report for samples that have no reads after adapter trimming
- Added docs about structure of data required for running Nanopore data
- Added docs about using other primer sets for Illumina data
- Added docs about overwriting default container definitions to use latest versions e.g. Pangolin
- Dashes and spaces in sample names will be converted to underscores to avoid issues when creating the summary metrics
- [#196] - Add mosdepth heatmap to MultiQC report
- [#197] - Output a .tsv comprising the Nextclade and Pangolin results for all samples processed
- [#198] - ASCIIGenome failing during analysis
- [#201] - Conditional include are not expected to work
- [#204] - Memory errors for SNP_EFF step
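The sample-name conversion described in this release (dashes and spaces become underscores) can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not the pipeline's own code, and the function name is hypothetical. Note that the 2.5 entry for [#234] later removed the replacement of dashes.

```python
import re


def sanitize_sample_name(name: str) -> str:
    """Replace every dash or space in a sample name with an underscore."""
    return re.sub(r"[- ]", "_", name)


print(sanitize_sample_name("sample-01 rep 2"))  # sample_01_rep_2
```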
Old parameter | New parameter |
---|---|
--public_data_ids | |
--skip_sra_fastq_download | |
NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
nextclade_js | 0.14.2 | 0.14.4 |
pangolin | 2.4.2 | 3.0.5 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn't present.
[2.0] - 2021-05-13
- Pipeline has been re-implemented in Nextflow DSL2
- All software containers are now exclusively obtained from Biocontainers
- Updated minimum Nextflow version to v21.04.0 (see nextflow#572)
- BCFtools and iVar will be run by default for Illumina metagenomics and amplicon data, respectively. However, this behaviour can be customised with the --callers parameter.
- Variant graph processes to call variants relative to the reference genome directly from de novo assemblies have been deprecated and removed
- Variant calling with Varscan 2 has been deprecated and removed due to licensing restrictions
- New tools:
  - Pangolin for lineage analysis
  - Nextclade for clade assignment, mutation calling and consensus sequence quality checks
  - ASCIIGenome for individual variant screenshots with annotation tracks
- Illumina and Nanopore runs containing the same 48 samples sequenced on both platforms have been uploaded to the nf-core AWS account for full-sized tests on release
- Initial implementation of a standardised samplesheet JSON schema to use with user interfaces and for validation
- Default human --kraken2_db link has been changed from Zenodo to an AWS S3 bucket for more reliable downloads
- Updated pipeline template to nf-core/tools 1.14
- Optimise MultiQC configuration and input files for faster run-time on huge sample numbers
- [#122] - Single SPAdes command to rule them all
- [#138] - Problem masking the consensus sequence
- [#142] - Unknown method invocation toBytes on String type
- [#169] - ggplot2 error when generating mosdepth amplicon plot with Swift v2 primers
- [#170] - ivar trimming of Swift libraries new offset feature
- [#175] - MultiQC report does not include all the metrics
- [#188] - Add and fix EditorConfig linting in entire pipeline
| Old parameter | New parameter |
|---|---|
| --amplicon_bed | --primer_bed |
| --amplicon_fasta | --primer_fasta |
| --amplicon_left_suffix | --primer_left_suffix |
| --amplicon_right_suffix | --primer_right_suffix |
| --filter_dups | --filter_duplicates |
| --skip_adapter_trimming | --skip_fastp |
| --skip_amplicon_trimming | --skip_cutadapt |
| | --artic_minion_aligner |
| | --artic_minion_caller |
| | --artic_minion_medaka_model |
| | --asciigenome_read_depth |
| | --asciigenome_window_size |
| | --blast_db |
| | --enable_conda |
| | --fast5_dir |
| | --fastq_dir |
| | --ivar_trim_offset |
| | --kraken2_assembly_host_filter |
| | --kraken2_variants_host_filter |
| | --min_barcode_reads |
| | --min_guppyplex_reads |
| | --multiqc_title |
| | --platform |
| | --primer_set |
| | --primer_set_version |
| | --public_data_ids |
| | --save_trimmed_fail |
| | --save_unaligned |
| | --sequencing_summary |
| | --singularity_pull_docker_container |
| | --skip_asciigenome |
| | --skip_bandage |
| | --skip_consensus |
| | --skip_ivar_trim |
| | --skip_nanoplot |
| | --skip_pangolin |
| | --skip_pycoqc |
| | --skip_nextclade |
| | --skip_sra_fastq_download |
| | --spades_hmm |
| | --spades_mode |
| --cut_mean_quality | |
| --filter_unmapped | |
| --ivar_trim_min_len | |
| --ivar_trim_min_qual | |
| --ivar_trim_window_width | |
| --kraken2_use_ftp | |
| --max_allele_freq | |
| --min_allele_freq | |
| --min_base_qual | |
| --min_coverage | |
| --min_trim_length | |
| --minia_kmer | |
| --mpileup_depth | |
| --name | |
| --qualified_quality_phred | |
| --save_align_intermeds | |
| --save_kraken2_fastq | |
| --save_sra_fastq | |
| --skip_sra | |
| --skip_vg | |
| --unqualified_percent_limit | |
| --varscan2_strand_filter | |
NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn't present.
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
artic | | 1.2.1 |
asciigenome | | 1.16.0 |
bc | 1.07.1 | |
bcftools | 1.9 | 1.11 |
bedtools | 2.29.2 | 2.30.0 |
bioconductor-biostrings | 2.54.0 | 2.58.0 |
bioconductor-complexheatmap | 2.2.0 | 2.6.2 |
blast | 2.9.0 | 2.10.1 |
bowtie2 | 2.4.1 | 2.4.2 |
cutadapt | 2.10 | 3.2 |
ivar | 1.2.2 | 1.3.1 |
kraken2 | 2.0.9beta | 2.1.1 |
markdown | 3.2.2 | |
minimap2 | 2.17 | |
mosdepth | 0.2.6 | 0.3.1 |
multiqc | 1.9 | 1.10.1 |
nanoplot | | 1.36.1 |
nextclade_js | | 0.14.2 |
pangolin | | 2.4.2 |
parallel-fastq-dump | 0.6.6 | |
picard | 2.23.0 | 2.23.9 |
pigz | 2.3.4 | |
plasmidid | 1.6.3 | 1.6.4 |
pycoqc | | 2.5.2 |
pygments | 2.6.1 | |
pymdown-extensions | 7.1 | |
python | 3.6.10 | 3.8.3 |
r-base | 3.6.2 | 4.0.3 |
r-ggplot2 | 3.3.1 | 3.3.3 |
r-tidyr | 1.1.0 | |
requests | 2.24.0 | |
samtools | 1.9 | 1.10 |
seqwish | 0.4.1 | |
snpeff | 4.5covid19 | 5.0 |
spades | 3.14.0 | 3.15.2 |
sra-tools | 2.10.7 | |
tabix | | 0.2.6 |
unicycler | 0.4.7 | 0.4.8 |
varscan | 2.4.4 | |
vg | 1.24.0 | |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn't present.
[1.1.0] - 2020-06-23
- #112 - Per-amplicon coverage plot
- #124 - Intersect variants across callers
- nf-core/tools#616 - Updated GitHub Actions to build Docker image and push to Docker Hub
- Parameters:
  - --min_mapped_reads to circumvent failures for samples with low number of mapped reads
  - --varscan2_strand_filter to toggle the default Varscan 2 strand filter
  - --skip_mosdepth - skip genome-wide and amplicon coverage plot generation from mosdepth output
  - --amplicon_left_suffix - to provide left primer suffix used in name field of --amplicon_bed
  - --amplicon_right_suffix - to provide right primer suffix used in name field of --amplicon_bed
- Unify parameter specification with COG-UK pipeline:
  - --min_allele_freq - minimum allele frequency threshold for calling variants
  - --mpileup_depth - SAMTools mpileup max per-file depth
  - --ivar_exclude_reads renamed to --ivar_trim_noprimer
  - --ivar_trim_min_len - minimum length of read to retain after primer trimming
  - --ivar_trim_min_qual - minimum quality threshold for sliding window to pass
  - --ivar_trim_window_width - width of sliding window
- [#118] Updated GitHub Actions AWS workflow for small and full size tests.
- --skip_qc parameter
- Add mosdepth 0.2.6
- Add bioconductor-complexheatmap 2.2.0
- Add bioconductor-biostrings 2.54.0
- Add r-optparse 1.6.6
- Add r-tidyr 1.1.0
- Add r-tidyverse 1.3.0
- Add r-ggplot2 3.3.1
- Add r-reshape2 1.4.4
- Add r-viridis 0.5.1
- Update sra-tools 2.10.3 -> 2.10.7
- Update bowtie2 2.3.5.1 -> 2.4.1
- Update picard 2.22.8 -> 2.23.0
- Update minia 3.2.3 -> 3.2.4
- Update plasmidid 1.5.2 -> 1.6.3
[1.0.0] - 2020-06-01
Initial release of nf-core/viralrecon, created with the nf-core template.
This pipeline is a re-implementation of the SARS_Cov2_consensus-nf and SARS_Cov2_assembly-nf pipelines initially developed by Sarai Varona and Sara Monzon from BU-ISCIII. Porting both of these pipelines to nf-core was an international collaboration between numerous contributors and developers, led by Harshil Patel from The Bioinformatics & Biostatistics Group at The Francis Crick Institute, London. We appreciated the need to have a portable, reproducible and scalable pipeline for the analysis of COVID-19 sequencing samples and so the Avengers Assembled!
- Download samples via SRA, ENA or GEO ids (ENA FTP, parallel-fastq-dump; if required)
- Merge re-sequenced FastQ files (cat; if required)
- Read QC (FastQC)
- Adapter trimming (fastp)
- Variant calling
  - Read alignment (Bowtie 2)
  - Sort and index alignments (SAMtools)
  - Primer sequence removal (iVar; amplicon data only)
  - Duplicate read marking (picard; removal optional)
  - Alignment-level QC (picard, SAMtools)
  - Choice of multiple variant calling and consensus sequence generation routes (VarScan 2, BCFTools, BEDTools || iVar variants and consensus || BCFTools, BEDTools)
- De novo assembly
- Present QC and visualisation for raw read, alignment, assembly and variant calling results (MultiQC)