Germline CNV Detection Pipeline for Whole Exome data

Introduction

This pipeline detects germline copy number variants (CNVs) in cohorts of exome samples that may have been prepared with different exome capture kits. It includes a clustering step that attempts to separate the samples into smaller groups with similar coverage patterns. Provided that there are enough samples (more than 12) from each exome panel/kit, the clustering should reduce batch effects and increase sensitivity and specificity. The pipeline currently supports only the GRCh37 genome build.
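To illustrate what "similar coverage patterns" means, here is a minimal sketch that is not part of the pipeline. It scales each sample's per-exon read counts by the sample total and compares the resulting profiles with a Pearson correlation. The sample names and counts are invented toy values, and the pipeline's real clustering step (which relies on hdbscan, per the requirements below) is not reproduced here.

```python
from math import sqrt


def normalize(counts):
    """Scale per-exon read counts to fractions of the sample total,
    so that differences in sequencing depth do not dominate."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def pearson(x, y):
    """Pearson correlation between two equally long coverage profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy read counts over four exons (hypothetical values).
# kitA_1 and kitA_2 mimic the same capture kit at different depths;
# kitB_1 mimics a kit that captures a different set of targets well.
counts = {
    "kitA_1": [100, 200, 0, 50],
    "kitA_2": [210, 390, 2, 95],
    "kitB_1": [0, 150, 300, 40],
}
profiles = {name: normalize(c) for name, c in counts.items()}

same_kit = pearson(profiles["kitA_1"], profiles["kitA_2"])
cross_kit = pearson(profiles["kitA_1"], profiles["kitB_1"])
print(f"same kit: {same_kit:.3f}, different kit: {cross_kit:.3f}")
```

Samples from the same kit correlate strongly while samples from different kits do not, which is the signal a clustering step can use to group samples before CNV calling.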

Software Requirements

If you can't use Docker/Singularity:

Python >3.6
- hdbscan, scikit-learn, plotly, pandas, scipy
R >3.5
AnnotSV for annotation

How to run

Clone this repository.
Download a copy of Cromwell.
Run Rscript install_CNV_prerequisites.R to install the required R packages,
or
pull the container image with Docker or Singularity:
docker pull berguner/bsf_cnv_pipeline:latest
singularity pull docker://berguner/bsf_cnv_pipeline:latest
Collect the aligned, duplicate-marked .bam and .bai files in a single folder. Symbolic links also work.
Update the test/test.inputs.json file according to your setup.
- Create a sample annotation sheet (CSV) with a column named Sample Name containing the sample names.
Create a backend configuration file for Cromwell. You can modify one of the backends provided in this repository.
cd to the project_folder and run the pipeline:

java -Xmx4g -Dconfig.file=/path/to/backend.conf \
  -jar /path/to/cromwell-59.jar \
  run /path/to/cnv_pipeline/cnv_pipeline.wdl \
  --inputs /path/to/my.inputs.json
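The sample annotation sheet mentioned in the steps above can be generated from the folder of BAM files. The following is a minimal sketch under two assumptions that the README does not confirm: that each sample name is the BAM file name without the .bam extension, and that an index is named either sample.bam.bai or sample.bai. The function names are hypothetical helpers, not part of the pipeline.

```python
import csv
from pathlib import Path


def write_sample_sheet(bam_dir, sheet_path):
    """Write a CSV with a 'Sample Name' column, one row per .bam file.

    Assumption: the sample name is the file name without '.bam'.
    """
    names = sorted(p.stem for p in Path(bam_dir).glob("*.bam"))
    with open(sheet_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Sample Name"])
        for name in names:
            writer.writerow([name])
    return names


def bams_missing_index(bam_dir):
    """Return sample names whose BAM has no index file next to it.

    Assumption: the index is 'x.bam.bai' or 'x.bai'; both are accepted.
    """
    missing = []
    for bam in sorted(Path(bam_dir).glob("*.bam")):
        candidates = [bam.with_name(bam.name + ".bai"), bam.with_suffix(".bai")]
        if not any(c.exists() for c in candidates):
            missing.append(bam.stem)
    return missing


# Example usage (paths are placeholders):
# write_sample_sheet("/path/to/bam_folder", "/path/to/samples.csv")
# print(bams_missing_index("/path/to/bam_folder"))
```

Symbolic links to BAM files are picked up like regular files, so the sketch works with a folder of links as well.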

Pipeline steps

Count reads on exonic regions.
Cluster samples based on coverage (read count) patterns.
Run CNV calling on each cluster of samples.
Merge results from CODEX2 and ExomeDepth.
Annotate merged results using AnnotSV.
Aggregate deletions found in the cohort.
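The README does not describe how the CODEX2 and ExomeDepth results are merged. As an illustration only, the sketch below uses a common approach, reciprocal overlap: two calls of the same type on the same chromosome are combined when each covers at least a given fraction of the other. The Call type, the 0.5 threshold, the half-open coordinates, and the choice to report the union of the two intervals are all assumptions, not the pipeline's actual logic.

```python
from typing import NamedTuple


class Call(NamedTuple):
    chrom: str
    start: int  # assumed half-open interval [start, end)
    end: int
    kind: str   # "DEL" or "DUP"


def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions; 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def merge_calls(codex_calls, exomedepth_calls, min_overlap=0.5):
    """Pair each CODEX2 call with its best-overlapping unused ExomeDepth call.

    Returns sorted tuples (chrom, start, end, kind, callers).
    """
    merged, used = [], set()
    for c in codex_calls:
        best_idx, best = None, 0.0
        for i, e in enumerate(exomedepth_calls):
            if i in used:
                continue
            overlap = reciprocal_overlap(c, e)
            if overlap >= min_overlap and overlap > best:
                best_idx, best = i, overlap
        if best_idx is None:
            merged.append((c.chrom, c.start, c.end, c.kind, "CODEX2"))
        else:
            e = exomedepth_calls[best_idx]
            used.add(best_idx)
            # Report the union of the two supporting intervals.
            merged.append((c.chrom, min(c.start, e.start), max(c.end, e.end),
                           c.kind, "CODEX2,ExomeDepth"))
    # Keep ExomeDepth calls that found no CODEX2 partner.
    for i, e in enumerate(exomedepth_calls):
        if i not in used:
            merged.append((e.chrom, e.start, e.end, e.kind, "ExomeDepth"))
    return sorted(merged)


codex = [Call("1", 100, 200, "DEL"), Call("2", 500, 600, "DUP")]
exomedepth = [Call("1", 120, 220, "DEL"), Call("3", 10, 50, "DEL")]
for row in merge_calls(codex, exomedepth):
    print(row)
```

Recording which callers support each merged call makes it possible to rank calls found by both tools above calls found by only one.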

