Vladimir Potapov and Jennifer L. Ong
This repository provides a set of scripts for a PacBio single-molecule sequencing assay used to comprehensively catalog the different types of errors introduced during PCR. The full details of the method are available in the following publication:
Potapov V. & Ong JL. Examining Sources of Error in PCR by Single-Molecule Sequencing. PLOS ONE. 2016. doi:10.1371/journal.pone.0169774.
The analysis workflow consists of several consecutive steps, outlined below. The corresponding scripts can be found in the scripts/ directory.
- Generating strand-specific consensus sequences (ccs2-consensus.sh).
- Mapping consensus sequences to a reference sequence and extracting mutations (ccs2-mutation.sh).
- Filtering and summarizing mutations (ccs2-chimeric.sh, ccs2-summary.sh).
- Additional scripts to analyze template switching and PCR-mediated recombination can be found in the extra/ directory.
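Because the steps are consecutive, each script should finish successfully before the next one starts. The following is a minimal sketch of a driver that enforces that order; the helper name run_steps is hypothetical, and invoking the scripts with bash and without arguments is an assumption, so check each script for its actual usage before relying on it.

```shell
# Run the given scripts from a directory in order, stopping at the first failure.
# Usage: run_steps DIR SCRIPT...
run_steps() {
    dir=$1
    shift
    for script in "$@"; do
        # Refuse to continue if a step is missing.
        if [ ! -f "$dir/$script" ]; then
            echo "Not found: $dir/$script" >&2
            return 1
        fi
        echo "Running $script"
        # Assumption: scripts run under bash and take no arguments.
        bash "$dir/$script" || return 1
    done
}

# Assumed invocation from the repository root:
# run_steps scripts ccs2-consensus.sh ccs2-mutation.sh ccs2-chimeric.sh ccs2-summary.sh
```

Stopping at the first failure avoids mapping or summarizing incomplete output from an earlier step.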
The included workflow provides detailed instructions for data analysis conducted in the original publication (1).
- SMRT Link. Scripts rely on the bax2bam, ccs, and blasr command-line utilities, which are developed by Pacific Biosciences, Inc. and distributed as part of the SMRT Link software.
- SAMtools. Manipulating BAM files.
- BWA. Detecting chimeric reads.
- P7ZIP. Compressing output files to minimize disk usage.
- The R Project for Statistical Computing. Extracting and tabulating data.
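Before running the workflow, it can help to confirm that the tools listed above are available on PATH. The sketch below is one way to do that; the helper name missing_tools is hypothetical, and the executable names samtools, bwa, 7za (for P7ZIP) and Rscript (for R) are assumptions that may differ between installations.

```shell
# Print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names are assumptions; adjust them to the local installation.
missing=$(missing_tools bax2bam ccs blasr samtools bwa 7za Rscript)

if [ -n "$missing" ]; then
    # Collapse the newline-separated list onto one line for the message.
    printf 'Missing tools: %s\n' "$(echo $missing)" >&2
else
    echo "All required tools were found on PATH."
fi
```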
- Potapov V. & Ong JL. Examining Sources of Error in PCR by Single-Molecule Sequencing. PLOS ONE. 2016. doi:10.1371/journal.pone.0169774.