VCFPhylo

This script computes Euclidean genetic distance matrices between pairs of individuals in a VCF file; the matrices can then be used to build a phylogenetic tree. Transversions are given a weight of two.
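To make the distance definition concrete, here is a minimal Python sketch of a transversion-weighted Euclidean distance. It is not the Perl implementation shipped here. It assumes that genotypes are coded as alternate-allele counts (0, 1, 2), that each site's squared difference is multiplied by 2 for transversions and by 1 otherwise, and that missing genotypes (`None`) are skipped. The function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(ref, alt):
    """True when a SNP swaps a purine for a pyrimidine or vice versa."""
    ref, alt = ref.upper(), alt.upper()
    return (ref in PURINES and alt in PYRIMIDINES) or (
        ref in PYRIMIDINES and alt in PURINES
    )


def weighted_distance(geno_a, geno_b, sites):
    """Euclidean distance between two individuals.

    geno_a, geno_b: alternate-allele counts per site (None = missing).
    sites: (ref, alt) base pairs, one per site.
    Assumption: the weight multiplies the squared per-site difference.
    """
    total = 0.0
    for a, b, (ref, alt) in zip(geno_a, geno_b, sites):
        if a is None or b is None:
            continue  # assumed handling of missing data: skip the site
        weight = 2.0 if is_transversion(ref, alt) else 1.0
        total += weight * (a - b) ** 2
    return math.sqrt(total)


def distance_matrix(genotypes, sites):
    """Pairwise distances for a dict mapping sample name -> genotype list."""
    names = sorted(genotypes)
    return {
        x: {y: weighted_distance(genotypes[x], genotypes[y], sites) for y in names}
        for x in names
    }
```

For example, `distance_matrix({"s1": [0, 0], "s2": [2, 1]}, [("A", "G"), ("A", "C")])` combines one transition (weight 1) and one transversion (weight 2) into a single distance between `s1` and `s2`.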

INSTALLATION: Not required

INPUT FILE: gz-compressed vcf file

HOW TO USE:

<1> Converting the gz-compressed VCF file to a genotype file

$ perl reducedvcf.pl VCF_FILE GENOTYPE_FILE

where VCF_FILE is the name of your gz-compressed VCF file and GENOTYPE_FILE is the name of the output genotype file. When the script finishes, GENOTYPE_FILE.gz will have been created.
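The genotype file format written by reducedvcf.pl is not described here, so the following Python sketch only illustrates the general idea of reducing a gz-compressed VCF to per-sample genotype codes. It assumes that only biallelic SNPs are kept and that each genotype is coded as the number of alternate alleles, with `None` for missing calls. The function names are hypothetical and the output is not meant to match GENOTYPE_FILE.gz.

```python
import gzip

BASES = {"A", "C", "G", "T"}


def allele_count(sample_field):
    """Turn a VCF sample field such as '0/1:12' into an alt-allele count."""
    gt = sample_field.split(":")[0].replace("|", "/")
    alleles = gt.split("/")
    if any(a == "." for a in alleles):
        return None  # missing genotype
    return sum(1 for a in alleles if a != "0")


def read_vcf_genotypes(path):
    """Read a gz-compressed VCF and return (samples, sites, rows).

    sites holds (ref, alt) per kept variant; rows holds one list of
    alt-allele counts per kept variant, in sample order.
    """
    samples, sites, rows = [], [], []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if not line.strip() or line.startswith("##"):
                continue  # blank line or meta-information line
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                samples = fields[9:]  # sample names follow FORMAT
                continue
            ref, alt = fields[3].upper(), fields[4].upper()
            if ref not in BASES or alt not in BASES:
                continue  # assumption: keep biallelic SNPs only
            sites.append((ref, alt))
            rows.append([allele_count(f) for f in fields[9:]])
    return samples, sites, rows
```

Keeping REF and ALT alongside the genotype codes means a later step can still tell transitions from transversions.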

<2> Creating distance matrices from the gz-compressed genotype file

$ perl VCF_FILE GENOTYPE_FILE OUTPUT_PREFIX NUMBER_OF_BOOTSTRAPPING_REPLICATES

where VCF_FILE is the name of the gz-compressed VCF file, GENOTYPE_FILE is the gz-compressed genotype file, OUTPUT_PREFIX is the prefix for the output files, and NUMBER_OF_BOOTSTRAPPING_REPLICATES is the number of bootstrap replicates.

When the script finishes, the following two files will have been created:

(a) OUTPUT_PREFIX.bg.tbl => Distance matrix showing the Euclidean distance between each pair of individuals

(b) OUTPUT_PREFIX.boot.tbl => Bootstrap distance matrices generated by resampling
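The resampling behind the bootstrap matrices can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's actual procedure: it assumes that variant sites are drawn with replacement, that every replicate uses as many sites as the original data, that genotypes are alternate-allele counts with no missing values, and that per-site weights (for example 2 for transversions) are supplied by the caller. The function name is hypothetical.

```python
import math
import random


def bootstrap_matrices(genotypes, weights, n_replicates, seed=0):
    """Build one distance matrix per bootstrap replicate.

    genotypes: dict mapping sample name -> list of alt-allele counts.
    weights: per-site weights, same length as each genotype list.
    Returns a list of dicts keyed by (sample_x, sample_y).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    names = sorted(genotypes)
    n_sites = len(weights)
    matrices = []
    for _ in range(n_replicates):
        # Draw site indices with replacement; all pairs share the same draw.
        picks = [rng.randrange(n_sites) for _ in range(n_sites)]
        matrix = {}
        for x in names:
            for y in names:
                total = sum(
                    weights[i] * (genotypes[x][i] - genotypes[y][i]) ** 2
                    for i in picks
                )
                matrix[(x, y)] = math.sqrt(total)
        matrices.append(matrix)
    return matrices
```

Using one set of resampled sites for every pair within a replicate keeps each replicate matrix internally consistent, which is what a tree-building program needs when it processes the replicates one by one.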

<3> Generating a phylogenetic tree

You can use external software to generate a phylogenetic tree. For example, you can use FastME (http://www.atgc-montpellier.fr/fastme/).

<4> Generating a bootstrap consensus tree

You can use consense from the PHYLIP package for this (http://evolution.genetics.washington.edu/phylip/).

Citation

Please cite this paper if you use these scripts: https://link.springer.com/article/10.1186/s12862-020-01715-3