hapdiff

This is a simple SV calling package for diploid assemblies. It uses a modified version of svim-asm. The package includes its own version of minimap2 to ensure reproducibility between runs, as the results may depend on the aligner version and parameters.

Version 0.9

Quick start

hapdiff takes as input a reference genome and a pair of haplotypes, and outputs structural variant calls in VCF format. The recommended way to run it is the Docker distribution.

The next steps assume that your ref.fasta, hap_1.fasta and hap_2.fasta are in the same directory, which will also be used for hapdiff output. If this is not the case, you might need to bind additional directories using Docker's -v / --volume argument. The number of threads (-t argument) should be adjusted according to the available resources.

cd directory_with_input
DD_DIR=`pwd`
docker run -v $DD_DIR:$DD_DIR -u `id -u`:`id -g` mkolmogo/hapdiff:0.9 \
  hapdiff.py --reference $DD_DIR/ref.fasta --pat $DD_DIR/hap_1.fasta --mat $DD_DIR/hap_2.fasta --out-dir $DD_DIR/hapdiff -t 20
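
When the inputs and the output do not share one directory, each directory has to be bound separately. The following is a minimal Python sketch of how such a command could be assembled; it is not part of hapdiff. The function name build_docker_command is hypothetical, it uses only the flags shown in the command above, and it assumes a POSIX system and that mounting the parent of the output directory is sufficient (as in the example above, where the output goes to a subdirectory of the bound directory).

```python
import os
from pathlib import Path


def build_docker_command(reference, pat, mat, out_dir, threads=20,
                         image="mkolmogo/hapdiff:0.9"):
    """Build a `docker run` argument list (hypothetical helper, not part of hapdiff).

    Every directory holding an input, plus the parent of the output
    directory, is bound into the container with -v.
    """
    inputs = [Path(p).resolve() for p in (reference, pat, mat)]
    out = Path(out_dir).resolve()

    # Collect the unique host directories, preserving order.
    mounts = []
    for directory in [p.parent for p in inputs] + [out.parent]:
        if directory not in mounts:
            mounts.append(directory)

    cmd = ["docker", "run"]
    for directory in mounts:
        # Mount at the same path inside the container, so host paths
        # can be passed to hapdiff.py unchanged.
        cmd += ["-v", f"{directory}:{directory}"]

    # Run as the calling user so that output files are not owned by root
    # (os.getuid / os.getgid are POSIX-only).
    cmd += ["-u", f"{os.getuid()}:{os.getgid()}", image, "hapdiff.py",
            "--reference", str(inputs[0]),
            "--pat", str(inputs[1]),
            "--mat", str(inputs[2]),
            "--out-dir", str(out),
            "-t", str(threads)]
    return cmd
```

The returned list can be inspected before use, or passed to subprocess.run(cmd, check=True) to launch the container.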

Output files

The output directory will contain hapdiff_unphased.vcf.gz and hapdiff_phased.vcf.gz, both holding structural variant calls. The two files represent the same SVs; one is an unphased VCF and the other a phased VCF.

The output also contains confident_regions.bed, which lists the regions of the reference where SV calls are comprehensive.
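
As a quick sanity check of these outputs, the following Python sketch counts SV records per type and sums the length of the confident regions. It is an illustration only, not part of hapdiff: the function names are hypothetical, and it assumes that each VCF record carries an SVTYPE key in its INFO column (records without one are counted as "unknown") and that the BED file is plain tab-separated text.

```python
import gzip
from collections import Counter


def count_sv_types(vcf_gz_path):
    """Count records per SVTYPE in a gzip-compressed VCF (hypothetical helper)."""
    counts = Counter()
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            info = line.rstrip("\n").split("\t")[7]  # INFO is the 8th column
            sv_type = "unknown"  # used when no SVTYPE key is present
            for field in info.split(";"):
                if field.startswith("SVTYPE="):
                    sv_type = field[len("SVTYPE="):]
                    break
            counts[sv_type] += 1
    return counts


def confident_bases(bed_path):
    """Sum interval lengths (end - start) over a BED file (hypothetical helper)."""
    total = 0
    with open(bed_path) as handle:
        for line in handle:
            fields = line.split("\t")
            # Skip blank, comment, and track/browser lines.
            if len(fields) < 3 or line.startswith(("#", "track", "browser")):
                continue
            total += int(fields[2]) - int(fields[1])
    return total
```

For example, count_sv_types("hapdiff/hapdiff_unphased.vcf.gz") returns a Counter keyed by SV type, and confident_bases("hapdiff/confident_regions.bed") returns the number of reference bases covered by the confident regions.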

Source Installation

Alternatively, you can run hapdiff locally as follows.

git clone https://github.com/KolmogorovLab/hapdiff
cd hapdiff
git submodule update --init
make
pip install -r requirements.txt

In addition, hapdiff requires samtools and bedtools to be installed on your system.
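
One way to confirm that both tools are visible before starting a run is a small check against PATH. This is a sketch under the assumption that hapdiff finds samtools and bedtools through PATH; the function name missing_tools is hypothetical and not part of hapdiff.

```python
import shutil

# External tools named in this README as requirements.
REQUIRED_TOOLS = ("samtools", "bedtools")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the given tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from missing_tools() means both executables were found; otherwise the list names the ones that still need to be installed.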

Afterwards, you can execute:

./hapdiff.py --reference ref.fasta --pat hap_1.fasta --mat hap_2.fasta --out-dir out_path -t 20

Acknowledgements

The major parts of the hapdiff pipeline are:

  • svim-asm
  • minimap2

Authors

The pipeline was originally developed in the Paten lab at UC Santa Cruz. The work continues in the Kolmogorov lab at NCI.

Main code contributors:

  • Mikhail Kolmogorov

License

hapdiff is distributed under a BSD license. See the LICENSE file for details. Other software included in this distribution is released under either MIT or BSD licenses.

How to get help

The preferred way to report problems or ask questions is the issue tracker.
