Upgrading modules to use dedicated environment yml file
marchoeppner committed Mar 12, 2024
1 parent df11c9b commit 9005a10
Showing 30 changed files with 194 additions and 40 deletions.
3 changes: 2 additions & 1 deletion conf/modules.config
@@ -13,4 +13,5 @@ process {
withName: MULTIQC {
ext.prefix = "${params.run_name}_"
}


}
47 changes: 28 additions & 19 deletions docs/installation.md
@@ -1,34 +1,43 @@
# Installation

## Site-specific config file
## Installing nextflow

This pipeline requires a site-specific configuration file to be able to talk to your local cluster or compute infrastructure. Nextflow supports a wide
range of such infrastructures, including Slurm, LSF and SGE - but also Kubernetes and AWS. For more information, see [here](https://www.nextflow.io/docs/latest/executor.html).
Nextflow is a highly portable pipeline engine. Please see the official [installation guide](https://www.nextflow.io/docs/latest/getstarted.html#installation) to learn how to set it up.

Please see conf/lsh.config for an example of how to configure this pipeline for a Slurm queue.
This pipeline expects Nextflow version 23.10.1, available [here](https://github.com/nextflow-io/nextflow/releases/tag/v23.10.1).
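
A minimal sketch of pinning that version, assuming a POSIX shell and that `nextflow` is already on your `PATH`. `NXF_VER` is the environment variable the Nextflow launcher reads to select a release; the guard around the version check is an addition here, included only so the snippet is harmless on machines without Nextflow:

```shell
# Ask the Nextflow launcher to use the release this pipeline expects.
export NXF_VER=23.10.1

# Report the active version if Nextflow is installed; otherwise say so.
if command -v nextflow >/dev/null 2>&1; then
    nextflow -version
else
    echo "nextflow not found on PATH - install it first"
fi
```

Setting the variable in a shell profile or job script should keep repeated runs on the same release.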

All software is provided through either Conda environments or Docker containers. Consider a Docker-compatible container engine if at all possible (Docker, Singularity, Podman). Conda environments are built on the fly during pipeline execution and only for a given pipeline run, which tends to slow things down quite a bit. Details on how to specify singularity as your container engine are provided in the config file for our lsh system (lsh.config).
## Software provisioning

With this information in place, you will next have to create an new site-specific profile for your local environment in `nextflow.config` using the following format:
This pipeline is set up to work with a range of software provisioning technologies - no need to manually install packages.

```
You can choose one of the following options:

[Docker](https://docs.docker.com/engine/install/)

[Singularity](https://docs.sylabs.io/guides/3.11/admin-guide/)

[Podman](https://podman.io/docs/installation)

[Conda](https://github.com/conda-forge/miniforge)

profiles {
your_profile {
includeConfig 'conf/base.config'
includeConfig 'conf/your_cluster.config'
includeConfig 'conf/resources.config'
}
}
The pipeline comes with simple pre-set profiles for all of these as described [here](usage.md); if you plan to use this pipeline regularly, consider adding your own custom profile to our [central repository](https://github.com/marchoeppner/configs) to better leverage your available resources.

## Installing the references

This pipeline requires locally stored genomes in fasta format. To build these, do:

```
nextflow run marchoeppner/eutaxpro -profile standard,singularity --build_references --run_name build_refs --outdir /path/to/references
```

This would add a new profile, called `your_profile` which uses (and expects) conda to provide all software.
where `/path/to/references` could be something like `/data/pipelines/references` or whatever is most appropriate on your system.

`base.config` Basic settings about resource usage for the individual pipeline stages.
If you do not have singularity on your system, you can also specify docker, podman or conda for software provisioning - see the [usage information](usage.md).

`resources.config` Gives information about the files that are to be used during analysis for the individual human genome assemblies.
The path specified with `--outdir` can then be given to the pipeline during normal execution as `--reference_base`. Please note that the build process will create a pipeline-specific subfolder that must not be given as part of the `--outdir` argument. This pipeline is part of a collection of pipelines that use a shared reference directory and it will choose the appropriate subfolder by itself.

## Site-specific config file

`your_cluster.config` Specifies which sort of resource manager to use and where to find e.g. local resources cluster file system (see below).
If you run on anything other than a local system, this pipeline requires a site-specific configuration file to be able to talk to your cluster or compute infrastructure. Nextflow supports a wide range of such infrastructures, including Slurm, LSF and SGE - but also Kubernetes and AWS. For more information, see [here](https://www.nextflow.io/docs/latest/executor.html).

Site-specific config-files for our pipeline ecosystem are stored centrally on [github](https://github.com/marchoeppner/nf-configs). Please talk to us if you want to add your system.
33 changes: 33 additions & 0 deletions docs/output.md
@@ -1 +1,34 @@
# Outputs

## Reports

<details markdown=1>
<summary>reports</summary>

Add information here

</details>

## Quality control

<details markdown=1>
<summary>MultiQC</summary>

- MultiQC/`name_of_pipeline_run`_multiqc_report.html: A graphical and interactive report of various QC steps and results

</details>

## Pipeline run metrics

<details markdown=1>
<summary>pipeline_info</summary>

This folder contains the pipeline run metrics

- pipeline_dag.svg - the workflow graph (only available if GraphViz is installed)
- pipeline_report.html - the (graphical) summary of all completed tasks and their resource usage
- pipeline_report.txt - a short summary of this analysis run in text format
- pipeline_timeline.html - chronological report of compute tasks and their duration
- pipeline_trace.txt - Detailed trace log of all processes and their various metrics

</details>
5 changes: 5 additions & 0 deletions docs/troubleshooting.md
@@ -1 +1,6 @@
# Common issues





2 changes: 1 addition & 1 deletion info.json
@@ -1 +1 @@
{ "name" : "nf-template", "version" : 1.2 }
{ "name" : "nf-template", "version" : 1.3 }
6 changes: 5 additions & 1 deletion lib/WorkflowPipeline.groovy
@@ -1,5 +1,5 @@
//
// This file holds several functions specific to the workflow/esga.nf in the nf-core/esga pipeline
// This file holds functions to validate user-supplied arguments
//

class WorkflowPipeline {
@@ -12,6 +12,10 @@ class WorkflowPipeline {
log.info 'Must provide a run_name (--run_name)'
System.exit(1)
}
if (!params.input && !params.build_references) {
log.info "Pipeline requires a sample sheet as input (--input)"
System.exit(1)
}
}

}
7 changes: 7 additions & 0 deletions modules/cat_fastq/environment.yml
@@ -0,0 +1,7 @@
name: cat_fastq
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- conda-forge::coreutils=8.30
4 changes: 2 additions & 2 deletions modules/cat_fastq/main.nf
@@ -2,10 +2,10 @@ process CAT_FASTQ {
tag "$meta.sample_id"
label 'process_single'

conda 'conda-forge::sed=4.7'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/ubuntu:20.04' :
'ubuntu:20.04' }"
'nf-core/ubuntu:20.04' }"

input:
tuple val(meta), path(reads, stageAs: 'input*/*')
7 changes: 7 additions & 0 deletions modules/custom/dumpsoftwareversions/environment.yml
@@ -0,0 +1,7 @@
name: custom_dumpsoftwareversions
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::multiqc=1.20
5 changes: 4 additions & 1 deletion modules/custom/dumpsoftwareversions/main.nf
@@ -1,7 +1,10 @@
process CUSTOM_DUMPSOFTWAREVERSIONS {
label 'short_serial'

container 'quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/multiqc:1.20--pyhdfd78af_0' :
'quay.io/biocontainers/multiqc:1.20--pyhdfd78af_0' }"

input:
path versions
7 changes: 7 additions & 0 deletions modules/fastp/environment.yml
@@ -0,0 +1,7 @@
name: fastp
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::fastp=0.23.4
2 changes: 1 addition & 1 deletion modules/fastp/main.nf
@@ -1,7 +1,7 @@
process FASTP {
label 'short_parallel'

conda 'bioconda::fastp=0.23.4'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/fastp:0.23.4--hadf994f_2' :
'quay.io/biocontainers/fastp:0.23.4--hadf994f_2' }"
7 changes: 7 additions & 0 deletions modules/gunzip/environment.yml
@@ -0,0 +1,7 @@
name: gunzip
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- conda-forge::sed=4.7
2 changes: 1 addition & 1 deletion modules/gunzip/main.nf
@@ -5,7 +5,7 @@ process GUNZIP {

publishDir "${params.outdir}/${meta.target}/${meta.tool}", mode: 'copy'

conda 'sed=4.7'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/ubuntu:20.04' :
'ubuntu:20.04' }"
7 changes: 7 additions & 0 deletions modules/multiqc/environment.yml
@@ -0,0 +1,7 @@
name: multiqc
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::multiqc=1.21
7 changes: 4 additions & 3 deletions modules/multiqc/main.nf
@@ -1,8 +1,9 @@
process MULTIQC {
conda 'bioconda::multiqc=1.19'

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/multiqc:1.19--pyhdfd78af_0' :
'quay.io/biocontainers/multiqc:1.19--pyhdfd78af_0' }"
'https://depot.galaxyproject.org/singularity/multiqc:1.21--pyhdfd78af_0' :
'quay.io/biocontainers/multiqc:1.21--pyhdfd78af_0' }"

input:
path('*')
8 changes: 8 additions & 0 deletions modules/samtools/ampliconclip/environment.yml
@@ -0,0 +1,8 @@
name: samtools_ampliconclip
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
3 changes: 2 additions & 1 deletion modules/samtools/ampliconclip/main.nf
@@ -1,5 +1,6 @@
process SAMTOOLS_AMPLICONCLIP {
conda 'bioconda::samtools=1.19.2'

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' :
'quay.io/biocontainers/samtools:1.19.2--h50ea8bc_0' }"
8 changes: 8 additions & 0 deletions modules/samtools/dict/environment.yml
@@ -0,0 +1,8 @@
name: samtools_dict
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
2 changes: 1 addition & 1 deletion modules/samtools/dict/main.nf
@@ -3,7 +3,7 @@ process SAMTOOLS_DICT {

tag "${fasta}"

conda 'bioconda::samtools=1.19.2'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' :
'quay.io/biocontainers/samtools:1.19.2--h50ea8bc_0' }"
8 changes: 8 additions & 0 deletions modules/samtools/environment.yml
@@ -0,0 +1,8 @@
name: samtools_ampliconclip
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
8 changes: 8 additions & 0 deletions modules/samtools/faidx/environment.yml
@@ -0,0 +1,8 @@
name: samtools_faidx
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
2 changes: 1 addition & 1 deletion modules/samtools/faidx/main.nf
@@ -3,7 +3,7 @@ process SAMTOOLS_FAIDX {

tag "${fasta}"

conda 'bioconda::samtools=1.19.2'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' :
'quay.io/biocontainers/samtools:1.19.2--h50ea8bc_0' }"
8 changes: 8 additions & 0 deletions modules/samtools/index/environment.yml
@@ -0,0 +1,8 @@
name: samtools_index
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
3 changes: 2 additions & 1 deletion modules/samtools/index/main.nf
@@ -1,5 +1,6 @@
process SAMTOOLS_INDEX {
conda 'bioconda::samtools=1.19.2'

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' :
'quay.io/biocontainers/samtools:1.19.2--h50ea8bc_0' }"
8 changes: 8 additions & 0 deletions modules/samtools/markdup/environment.yml
@@ -0,0 +1,8 @@
name: samtools_markdup
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
3 changes: 2 additions & 1 deletion modules/samtools/markdup/main.nf
@@ -1,5 +1,6 @@
process SAMTOOLS_MARKDUP {
conda 'bioconda::samtools=1.19.2'

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' :
'quay.io/biocontainers/samtools:1.19.2--h50ea8bc_0' }"
8 changes: 8 additions & 0 deletions modules/samtools/merge/environment.yml
@@ -0,0 +1,8 @@
name: samtools_merge
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- bioconda::samtools=1.19.2
- bioconda::htslib=1.19.1
2 changes: 1 addition & 1 deletion modules/samtools/merge/main.nf
@@ -1,7 +1,7 @@
process SAMTOOLS_MERGE {
label 'medium_parallel'

conda 'bioconda::samtools=1.19.2'
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' :
'quay.io/biocontainers/samtools:1.19.2--h50ea8bc_0' }"
12 changes: 8 additions & 4 deletions nextflow.config
@@ -16,14 +16,14 @@ params {
conda.enabled = false
singularity.enabled = false
docker.enabled = false

publish_dir_mode = 'copy'
podman.enabled = false

max_memory = 128.GB
max_cpus = 16
max_time = 240.h
maxMultiqcEmailFileSize = 25.MB

publish_dir_mode = 'copy'
custom_config_base = "https://raw.githubusercontent.com/marchoeppner/nf-configs/main"
}

@@ -87,10 +87,14 @@ profiles {
singularity.enabled = true
singularity.autoMounts = true
}
conda {
conda.enabled = true
}
podman {
podman.enabled = true
}
test {
includeConfig 'conf/test.config'
includeConfig 'conf/base.config'
includeConfig 'conf/resources.config'
}
}
