#!/usr/bin/env nextflow
nextflow.enable.dsl=2
/* ========================================================================================
OUTPUT DIRECTORY
======================================================================================== */
params.outdir = false
// Fall back to the current working directory when no --outdir is given
outdir = params.outdir ?: '.'
/* ========================================================================================
PARAMETERS
======================================================================================== */
params.genome = ''
params.hisat2_args = ''
params.verbose = false
params.single_end = false // default mode is auto-detect. NOTE: params are handed over automatically
params.list_genomes = false
params.help = false
/* ========================================================================================
MESSAGES
======================================================================================== */
// Show help message and exit
if (params.help){
helpMessage()
exit 0
}
if (params.list_genomes){
println ("List genomes selected [WORKFLOW]")
}
if (params.verbose){
println ("[WORKFLOW] HISAT2 ARGS ARE: " + params.hisat2_args)
}
/* ========================================================================================
GENOMES
======================================================================================== */
include { getGenome; listGenomes } from './nf_modules/genomes.mod.nf'
if (params.list_genomes){
listGenomes() // this lists all available genomes, and exits
}
genome = getGenome(params.genome)
/* ========================================================================================
FILES CHANNEL
======================================================================================== */
include { makeFilesChannel; getFileBaseNames } from './nf_modules/files.mod.nf'
// 'args' is Nextflow's implicit list of positional command-line arguments (here: the input files)
file_ch = makeFilesChannel(args)
/* ========================================================================================
WORKFLOW
======================================================================================== */
include { HISAT2 } from './nf_modules/hisat2.mod.nf' params(genome: genome)
workflow {
main:
HISAT2(file_ch, outdir, params.hisat2_args, params.verbose)
}
workflow.onComplete {
def msg = """\
Pipeline execution summary
---------------------------
Jobname : ${workflow.runName}
Completed at: ${workflow.complete}
Duration : ${workflow.duration}
Success : ${workflow.success}
workDir : ${workflow.workDir}
exit status : ${workflow.exitStatus}
"""
.stripIndent()
sendMail(to: "${workflow.userName}@ethz.ch", subject: 'Minimal pipeline execution report', body: msg)
}
/* ========================================================================================
HELP MESSAGE
======================================================================================== */
def helpMessage() {
log.info"""
>>
SYNOPSIS:
This workflow takes in a list of filenames (in FastQ format) and aligns them to a genome using HISAT2.
If you run HISAT2 in this stand-alone workflow it is assumed that you know what you are doing, e.g. raw FastQ files should
have been trimmed appropriately. If called as is, HISAT2 is run in end-to-end mode (we are adding the options '--no-unal
--no-softclip', and additionally for paired-end files '--no-mixed --no-discordant' to only allow concordant read pair
alignments). To add additional parameters, please consider tool-specific arguments that are compatible with HISAT2
(see '--hisat2_args' below).
==============================================================================================================
USAGE:
nf_hisat2 [options] --genome <genomeID> <input files>
Mandatory arguments:
====================
<input files> List of input files, e.g. '*fastq.gz' or '*fq.gz'. In theory, files will automatically be
processed as single-end or paired-end files (file pairs must share the same base name and
differ only by the read number, e.g. 'base_name_R1.fastq.gz' and 'base_name_R2.fastq.gz',
or R3, R4). To run paired-end files in single-end mode, please see '--single_end' below.
--genome [str] Genome build ID to be used for the alignment, e.g. GRCh38 (latest human genome) or GRCm38
(latest mouse genome build). To list all available genomes, see '--list_genomes' below.
Tool-specific options:
======================
--hisat2_args="[str]" This option can take any number of options that are compatible with HISAT2 to modify its
default mapping behaviour. For more detailed information on available options please refer
to the HISAT2 User Guide, or run 'hisat2 --help' on the command line. As an example, to disable
spliced alignments and tolerate a maximum fragment length of 1500 bp, use:
' --hisat2_args="--no-spliced-alignment --maxins 1500" '. Please note that the format
="your options" needs to be strictly adhered to in order to work correctly.
[Default: None]
Other options:
==============
--outdir [str] Path to the output directory. [Default: current working directory]
--list_genomes List all genome builds that are currently available to choose from. To see this list
of available genomes with more detailed information about paths and indexes, run
the command as '--list_genomes --verbose'
--single_end Force files of a read pair to be treated as single-end files. [Default: auto-detect]
--verbose More verbose status messages. [Default: OFF]
--help Displays this help message and exits.
Workflow options:
=================
Please note the single '-' hyphen for the following options!
-resume If a pipeline workflow has been interrupted or stopped (e.g. by accidentally closing a laptop),
this option will attempt to resume the workflow at the point it got interrupted by using
Nextflow's caching mechanism. This may save a lot of time.
-bg Sends the entire workflow into the background, thus disconnecting it from the terminal session.
This option launches a daemon process (which will keep running on the headnode) that watches over
your workflow, and submits new jobs to the SLURM queue as required. Use this option for big pipeline
jobs, or whenever you do not want to watch the status progress yourself. Upon completion, the
pipeline will send you an email with the job details. This option is HIGHLY RECOMMENDED!
-process.executor=local Temporarily changes where the workflow is executed to the 'local' machine. See also the nextflow.config
file for more details. [Default: slurm]
<<
""".stripIndent()
}