statgen/verifyBamID v1.1.3
This app runs verifyBamID to detect sample contamination from population allele frequencies.
verifyBamID models the sequence reads as a mixture of two unknown samples, based on the allele frequency information in a provided VCF file. Here, the VCF file used is the Omni 2.5M SNP array VCF, which contains allele frequency information across individuals from the 1000 Genomes Project (chr20 only). Markers are selected from the VCF where AF >= 0.01 and callRate >= 0.5, and contamination is inferred from higher-than-expected heterozygosity rates at these sites.
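The marker selection rule above can be illustrated with a minimal Python sketch. This is not verifyBamID's own code: the `Marker` class, the `select_markers` function and the example records are hypothetical, and only the two thresholds (AF >= 0.01 and callRate >= 0.5) come from the description above.

```python
from dataclasses import dataclass

# Thresholds taken from the description above.
MIN_AF = 0.01
MIN_CALL_RATE = 0.5


@dataclass
class Marker:
    """Hypothetical, simplified view of one VCF site."""
    chrom: str
    pos: int
    af: float         # population allele frequency
    call_rate: float  # fraction of individuals with a genotype call


def select_markers(markers, min_af=MIN_AF, min_call_rate=MIN_CALL_RATE):
    """Keep only markers that pass both the AF and call-rate thresholds."""
    return [m for m in markers
            if m.af >= min_af and m.call_rate >= min_call_rate]


# Made-up example records, for illustration only.
markers = [
    Marker("20", 61098, 0.25, 0.99),   # passes both thresholds
    Marker("20", 61795, 0.005, 0.99),  # AF too low
    Marker("20", 63231, 0.30, 0.40),   # call rate too low
    Marker("20", 65900, 0.01, 0.50),   # exactly on both thresholds, kept
]
selected = select_markers(markers)
print([m.pos for m in selected])
```

Both comparisons are inclusive, so a marker sitting exactly on a threshold is retained.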
The required VCF is packaged with the app under resources/home/dnanexus. See documentation for further details.
This app should be executed stand-alone or as part of a DNAnexus workflow for a single sample.
- Skip - Boolean (True/False). If skip is True, the tool is not run and no outputs are produced
- BAM file (*.bam)
- BAI file (*.bai)
The app outputs three files, where [outPrefix] is the BAM filename without its extension:
- [outPrefix].selfSM - Per-sample statistics describing how well the sample matches the annotated sample (tab-delimited file)
- [outPrefix].depthSM - The depth distribution of the sequence reads per sample (tab-delimited file)
- [outPrefix].log - verifyBamID log file
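As a rough illustration of the naming scheme above, the following Python sketch derives the three expected output names from a BAM filename. The function name `expected_outputs` is hypothetical, and the sketch assumes that only the final extension (`.bam`) is stripped to form [outPrefix]; the app's own code may differ.

```python
import os


def expected_outputs(bam_name):
    """Return the three output filenames expected for a given BAM file.

    Assumption: [outPrefix] is the BAM basename with its last extension removed.
    """
    out_prefix, _ext = os.path.splitext(os.path.basename(bam_name))
    return [out_prefix + suffix for suffix in (".selfSM", ".depthSM", ".log")]


print(expected_outputs("sample1.bam"))
```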
The freeMix column in the [outPrefix].selfSM file contains the predicted contamination for the sample, expressed as a fraction (for example, 0.1 = 10%).
The app runs verifyBamID on the input BAM file and uploads the outputs to DNAnexus.
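A minimal sketch of reading the contamination estimate from a .selfSM file in Python is shown below. It assumes the file is tab-delimited with a header row in which the sample column is named `#SEQ_ID` and the contamination column is named `FREEMIX`; check these names against your own output. Column names are matched case-insensitively. The function name `read_freemix` and the example content are hypothetical.

```python
import csv
import io


def read_freemix(handle):
    """Map each sample ID to its contamination fraction.

    Assumes a tab-delimited file with '#SEQ_ID' and 'FREEMIX' header columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    results = {}
    for row in reader:
        # Normalise header case so 'freeMix' and 'FREEMIX' are both accepted.
        by_upper = {key.upper(): value for key, value in row.items()}
        results[by_upper["#SEQ_ID"]] = float(by_upper["FREEMIX"])
    return results


# Made-up file content, for illustration only.
example = "#SEQ_ID\tRG\tFREEMIX\nsample1\tALL\t0.02\n"
freemix = read_freemix(io.StringIO(example))
for sample, fraction in freemix.items():
    # Convert the fraction to a percentage for reporting.
    print(f"{sample}: {fraction:.1%} estimated contamination")
```

For a real file, open it with `open(path, newline="")` and pass the handle to `read_freemix` in place of the `io.StringIO` object.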
This app has been verified to run on WES samples only. verifyBamID will only assess autosomal chromosomes in the input VCF.