- Minor version release for keeping Git History in sync
- No changes with respect to 1.4.1 on pipeline level
Major novel changes include:
- Update
igenomes.config
with NCBIGRCh38
and most recent UCSC genomes - Set
autoMounts = true
by default forsingularity
profile
Major novel changes include:
-
Support for Salmon as an alternative method to STAR and HISAT2
-
Several improvements in
featureCounts
handling of types other thanexon
. It is possible now to handle nuclearRNAseq data. Nuclear RNA has un-spliced RNA, and the whole transcript, including the introns, needs to be counted, e.g. by specifying--fc_count_type transcript
. -
Support for outputting unaligned data to results folders.
-
Added options to skip several steps
- Skip trimming using
--skipTrimming
- Skip BiotypeQC using
--skipBiotypeQC
- Skip Alignment using
--skipAlignment
to only use pseudo-alignment using Salmon
- Skip trimming using
- Adjust wording of skipped samples in pipeline output
- Fixed link to guidelines #203
- Add
Citation
andQuick Start
section toREADME.md
- Add in documentation of the
--gff
parameter
- Generate MultiQC plots in the results directory #200
- Get MultiQC to save plots as standalone files
- Get MultiQC to write out the software versions in a
.csv
file #185 - Use
file
instead ofnew File
to createpipeline_report.{html,txt}
files, and properly create subfolders
- Restore
SummarizedExperimment
object creation in the salmon_merge process avoiding increasing memory with sample size. - Fix sample names in feature counts and dupRadar to remove suffixes added in other processes
- Removed
genebody_coverage
process #195 - Implemented Pearsons correlation instead of Euclidean distance #146
- Add
--stringTieIgnoreGTF
parameter #206 - Removed unused
stringtie
channels forMultiQC
- Integrate changes in
nf-core/tools v1.6
template which resolved #90 - Moved process
convertGFFtoGTF
beforemakeSTARindex
#215 - Change all boolean parameters from
snake_case
tocamelCase
and vice versa for value parameters - Add SM ReadGroup info for QualiMap compatibility#238
- Obtain edgeR + dupRadar version information #198 and #112
- Add
--gencode
option for compatibility of Salmon and featureCounts biotypes with GENCODE gene annotations - Added functionality to accept compressed reference data in the pipeline
- Check that gtf features are on chromosomes that exist in the genome fasta file #274
- Maintain all gff features upon gtf conversion (keeps
gene_biotype
orgene_type
to makefeatureCounts
happy) - Add SortMeRNA as an optional step to allow rRNA removal #280
- Minimal adjustment of memory and CPU constraints for clusters with locked memory / CPU relation
- Cleaned up usage,
parameters.settings.json
and thenextflow.config
- Dependency list is now sorted appropriately
- Force matplotlib=3.0.3
- Picard 2.20.0 -> 2.21.1
- bioconductor-dupradar 1.12.1 -> 1.14.0
- bioconductor-edger 3.24.3 -> 3.26.5
- gffread 0.9.12 -> 0.11.4
- trim-galore 0.6.1 -> 0.6.4
- gffread 0.9.12 -> 0.11.4
- rseqc 3.0.0 -> 3.0.1
- R-Base 3.5 -> 3.6.1
- Dropped CSVtk in favor of Unix's simple
cut
andpaste
utilities - Added Salmon 0.14.2
- Added TXIMeta 1.2.2
- Added SummarizedExperiment 1.14.0
- Added SortMeRNA 2.1b
- Add tximport and summarizedexperiment dependency #171
- Add Qualimap dependency #202
Version 1.3 - 2019-03-26
- Added configurable options to specify group attributes for featureCounts #144
- Added support for RSeqC 3.0 #148
- Added a
parameters.settings.json
file for use with the newnf-core launch
helper tool. - Centralized all configuration profiles using nf-core/configs
- Fixed all centralized configs for offline usage
- Hide %dup in multiqc report
- Add option for Trimming NextSeq data properly (@jburos work)
- Fixing HISAT2 Index Building for large reference genomes #153
- Fixing HISAT2 BAM sorting using more memory than available on the system
- Fixing MarkDuplicates memory consumption issues following #179
- Use
file
instead ofnew File
to create thepipeline_report.{html,txt}
files to avoid creating local directories when outputting to AWS S3 folders
- RSeQC 2.6.4 -> 3.0.0
- Picard 2.18.15 -> 2.20.0
- r-data.table 1.11.4 -> 1.12.2
- bioconductor-edger 3.24.1 -> 3.24.3
- r-markdown 0.8 -> 0.9
- csvtk 0.15.0 -> 0.17.0
- stringtie 1.3.4 -> 1.3.6
- subread 1.6.2 -> 1.6.4
- gffread 0.9.9 -> 0.9.12
- multiqc 1.6 -> 1.7
- deeptools 3.2.0 -> 3.2.1
- trim-galore 0.5.0 -> 0.6.1
- qualimap 2.2.2b
- matplotlib 3.0.3
- r-base 3.5.1
Version 1.2 - 2018-12-12
- Removed some outdated documentation about non-existent features
- Config refactoring and code cleaning
- Added a
--fcExtraAttributes
option to specify more than ENSEMBL gene names infeatureCounts
- Remove legacy rseqc
strandRule
config code. #119 - Added STRINGTIE ballgown output to results folder #125
- HiSAT index build now requests
200GB
memory, enough to use the exons / splice junction option for building.- Added documentation about the
--hisatBuildMemory
option.
- Added documentation about the
- BAM indices are stored and re-used between processes #71
- Fixed conda bug which caused problems with environment resolution due to changes in bioconda #113
- Fixed wrong gffread command line #117
- Added
cpus = 1
toworkflow summary process
#130
Version 1.1 - 2018-10-05
- Wrote docs and made minor tweaks to the
--skip_qc
and associated options - Removed the depreciated
uppmax-modules
config profile - Updated the
hebbe
config profile to use the newwithName
syntax too - Use new
workflow.manifest
variables in the pipeline script - Updated minimum nextflow version to
0.32.0
- #77: Added back
executor = 'local'
for theworkflow_summary_mqc
- #95: Check if task.memory is false instead of null
- #97: Resolved edge-case where numeric sample IDs are parsed as numbers causing some samples to be incorrectly overwritten.
Version 1.0 - 2018-08-20
This release marks the point where the pipeline was moved from SciLifeLab/NGI-RNAseq over to the new nf-core community, at nf-core/rnaseq.
View the previous changelog at SciLifeLab/NGI-RNAseq/CHANGELOG.md
In addition to porting to the new nf-core community, the pipeline has had a number of major changes in this version. There have been 157 commits by 16 different contributors covering 70 different files in the pipeline: 7,357 additions and 8,236 deletions!
In summary, the main changes are:
- Rebranding and renaming throughout the pipeline to nf-core
- Updating many parts of the pipeline config and style to meet nf-core standards
- Support for GFF files in addition to GTF files
- Just use
--gff
instead of--gtf
when specifying a file path
- Just use
- New command line options to skip various quality control steps
- More safety checks when launching a pipeline
- Several new sanity checks - for example, that the specified reference genome exists
- Improved performance with memory usage (especially STAR and Picard)
- New BigWig file outputs for plotting coverage across the genome
- Refactored gene body coverage calculation, now much faster and using much less memory
- Bugfixes in the MultiQC process to avoid edge cases where it wouldn't run
- MultiQC report now automatically attached to the email sent when the pipeline completes
- New testing method, with data on GitHub
- Now run pipeline with
-profile test
instead of using bash scripts
- Now run pipeline with
- Rewritten continuous integration tests with Travis CI
- New explicit support for Singularity containers
- Improved MultiQC support for DupRadar and featureCounts
- Now works for all users instead of just NGI Stockholm
- New configuration for use on AWS batch
- Updated config syntax to support latest versions of Nextflow
- Built-in support for a number of new local HPC systems
- CCGA, GIS, UCT HEX, updates to UPPMAX, CFC, BINAC, Hebbe, c3se
- Slightly improved documentation (more updates to come)
- Updated software packages
...and many more minor tweaks.
Thanks to everyone who has worked on this release!