Structural variants (SVs) are an important class of genetic variation implicated in a wide array of genetic diseases. sv-callers is a Snakemake-based workflow that combines several state-of-the-art tools for detecting SVs in whole genome sequencing (WGS) data. The workflow is easy to use and deploy on any Linux-based machine. In particular, it supports automated software deployment and easy configuration, allows new analysis tools to be added, and scales from a single computer to different HPC clusters with minimal effort.
The workflow has the following dependencies:

- Python 3
- Conda - package/environment management system
- Snakemake - workflow management system
- Xenon CLI - command-line interface to compute and storage resources
- jq - command-line JSON processor (optional)
The workflow includes the following bioinformatics tools (SV callers):

- Manta
- DELLY
- LUMPY
- GRIDSS
1. Clone this repo.
```bash
git clone https://github.com/GooglingTheCancerGenome/sv-callers.git
cd sv-callers
```
2. Install dependencies.
```bash
# download the Miniconda3 installer
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
# install Conda (respond with 'yes')
bash miniconda.sh
# update Conda
conda update -y conda
# create a new environment with the required dependencies
conda env create -n wf -f environment.yaml
# activate the environment
conda activate wf
cd snakemake
```
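Before moving on, it may help to verify that the environment is active and the core tools are on `PATH` (a minimal sanity check; the exact versions are pinned in `environment.yaml`):

```bash
# confirm the workflow's core tools resolve inside the 'wf' environment
command -v snakemake xenon
snakemake --version
```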
3. Configure the workflow.
- config files:
  - `analysis.yaml` - analysis-specific settings (e.g., workflow mode, I/O files, SV callers, post-processing, resources used, etc.)
  - `environment.yaml` - software dependencies and versions
- input files:
  - example data in the `sv-callers/snakemake/data` directory
  - reference genome in `.fasta` (incl. index files)
  - excluded regions in `.bed` (optional)
  - WGS samples in `.bam` (incl. index files)
  - list of (paired) samples in `samples.csv` (see the check below)
- output files:
  - (filtered) SVs per caller and merged calls in `.vcf` (incl. index files)
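The quickest way to see the expected sample sheet layout is to inspect the bundled example (the path below assumes the sheet ships with the example data; adjust it if yours lives elsewhere):

```bash
# print the example sample sheet; its header row defines the expected columns
cat data/samples.csv
```

Each row lists one sample (or one tumor/normal pair when running in paired mode).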
4. Execute the workflow.
```bash
# a 'dry' run only checks the I/O files
snakemake -np
# a 'vanilla' run (default) mimics the execution of the SV callers by writing (dummy) VCF files
snakemake -C echo_run=1
```
Note: One sample or a tumor/normal pair generates eight SV calling jobs (i.e., 1 x Manta, 1 x LUMPY, 1 x GRIDSS and 5 x DELLY) and six post-processing jobs. See the workflow instance of single-sample (germline) or paired-sample (somatic) analysis.
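To see such a workflow instance for your own configuration, Snakemake can render the job graph itself (a minimal sketch; it assumes Graphviz's `dot` is available on the system):

```bash
# render the DAG of jobs for a single-sample (germline) echo run
snakemake --dag -C echo_run=1 mode=s | dot -Tsvg > dag.svg
```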
Submit jobs to Slurm or GridEngine cluster
```bash
SCH=slurm  # or gridengine
snakemake -C echo_run=1 mode=p enable_callers="['manta','delly','lumpy','gridss']" --use-conda --latency-wait 30 --jobs 14 \
--cluster "xenon scheduler $SCH --location local:// submit --name smk.{rule} --inherit-env --cores-per-task {threads} --max-run-time 1 --max-memory {resources.mem_mb} --working-directory . --stderr stderr-%j.log --stdout stdout-%j.log" &>smk.log &
```
To perform SV calling:
- overwrite the (default) parameters directly in `analysis.yaml` or via the Snakemake CLI (use the `-C` argument; see the example run after this list):
  - set `echo_run=0`
  - choose between two workflow `mode`s: single-sample (`s`) or paired-sample (`p` - default)
  - select one or more callers using `enable_callers` (default all: `"['manta','delly','lumpy','gridss']"`)
- use the `xenon` CLI to set:
  - `--max-run-time` of workflow jobs (in minutes)
  - `--temp-space` (optional, in MB)
- adjust the compute requirements per SV caller according to the system used:
  - the number of `threads`
  - the amount of `memory` (in MB)
  - the amount of temporary disk space, or `tmpspace` (path in the `TMPDIR` env variable), which can be used for intermediate files by LUMPY and GRIDSS only
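Putting these settings together, a real (non-echo) run might look as follows (a sketch only; the per-caller `threads`, `memory` and `tmpspace` values still come from `analysis.yaml`):

```bash
# paired-sample (somatic) SV calling with two callers on a local machine
snakemake -C echo_run=0 mode=p enable_callers="['manta','delly']" --use-conda --jobs 4
```

The same `-C` overrides apply unchanged to the cluster submission shown above.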
Query job accounting information
```bash
SCH=slurm  # or gridengine
xenon --json scheduler $SCH --location local:// list --identifier [jobID] | jq ...
```
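Here `...` stands for a `jq` filter of your choice; as a starting point, the identity filter pretty-prints the complete record so you can see which fields Xenon returns:

```bash
# pretty-print the full JSON job record before selecting specific fields
xenon --json scheduler $SCH --location local:// list --identifier [jobID] | jq .
```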