This script runs all the steps for the sequence processing and filtering of class 1 integron-16S epicPCR products sequenced with Oxford Nanopore Technologies. The expected epicPCR products are generated from a fusion PCR that links class 1 integron gene cassette arrays amplified from single cells, isolated in aqueous phase droplets, to the 16S rRNA marker gene from the same cells. See a schematic of the experimental workflow. For more detailed information, see our paper describing the method.
As input, this pipeline takes a raw fastq file (WARNING: DO NOT trim adapters during basecalling), and outputs a single fasta file containing a set of full-length, primer-oriented epicPCR amplicon consensus sequences that contain a complete attI1 sequence, epicPCR bridging primer, and the 16S rRNA marker gene.
R519-qacE: 5'-GWATTACCGCGGCKGCTGGCGGAAGGGGCAAGCTTAGT-3' # Bridging primer that fuses class 1 integron cassette array amplicons with 16S amplicons
intI1_nested: 5'-CGAACGCAGCGGTGGTAA-3' # Nested forward primer that anneals to the class 1 integron-integrase gene (intI1).
AP27_short: 5'-GCTCTTCCGATCTGGACTACHVGGGTWTCTAAT-3' # Nested reverse primer that anneals to the 16S rRNA gene (R806 annealing site).
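The primers above contain IUPAC degenerate bases (W = A/T, K = G/T, H = A/C/T, V = A/C/G). When checking by hand whether a primer occurs in a read, it can help to expand these codes into a regular expression. The helper below is a minimal sketch and is not part of Int1-epicPCR.sh; the function name `iupac_to_regex` is hypothetical, and the pattern only matches the primer as written (not its reverse complement).

```bash
# Expand IUPAC degenerate codes into bracket expressions usable with grep -E.
# Replacement text only ever contains A, C, G and T, so the substitutions
# cannot interfere with one another.
iupac_to_regex() {
    printf '%s\n' "$1" | sed \
        -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
        -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
        -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Bridging primer R519-qacE from the list above
iupac_to_regex "GWATTACCGCGGCKGCTGGCGGAAGGGGCAAGCTTAGT"
```

The resulting pattern could then be passed to `grep -E` against a fasta or fastq file for a quick exact-match check; Nanopore reads carry sequencing errors, so such a check will miss many true primer sites.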
N.B. If custom primers have been used, the script must be modified with their respective sequences on lines 132-143 (nested primer pair) and line 264 (bridging primer). The length of the bridging primer BLAST alignment should also be changed from '34' on line 289 to suit the custom bridging primer (recommended: ~90% of the primer length). The NanoFilt minimum read length parameter (-l) on line 114 might also need adjusting; it should be less than the minimum epicPCR product length expected from a cassette-less integron.
Once the dependencies below are installed, simply download the Int1-epicPCR.sh script and run it. See usage details below.
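As a worked example of the ~90% recommendation, the sketch below derives an alignment length from a bridging primer sequence. The variable names are hypothetical and rounding down is an assumption; for the default R519-qacE primer (38 nt) this gives 34, which agrees with the value used in the script.

```bash
# Bridging primer from this README; replace with a custom primer as needed.
bridging_primer="GWATTACCGCGGCKGCTGGCGGAAGGGGCAAGCTTAGT"

# Primer length in nucleotides
primer_len=${#bridging_primer}

# ~90% of the primer length, rounded down by integer arithmetic (assumption)
min_aln_len=$(( primer_len * 90 / 100 ))

echo "primer length: ${primer_len}; suggested alignment length: ${min_aln_len}"
```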
A system-wide installation (e.g., via conda) is required for the following programs:
N.B. The versions listed have been tested with the pipeline, but newer and/or older versions will most likely also work.
NanoFilt v2.8.0
Pychopper v2.7.0
isONclust v0.0.6.1
isONcorrect v0.0.8
Spoa v4.0.7
BBTools v35
BLAST v2.2.31
SAMtools v1.12
Metaxa2 v2.2.3
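Before running the pipeline, it may be worth confirming that the programs are on your PATH. The following is a minimal sketch, not part of Int1-epicPCR.sh: the function name `check_deps` is hypothetical, and the executable names passed to it are assumptions about how each tool is typically installed (they can differ between versions and installation methods). BBTools is left out because it ships as a collection of scripts and this README does not state which of them the pipeline calls.

```bash
# Report any command that cannot be found on PATH.
# Returns 0 if all commands are available, 1 otherwise.
check_deps() {
    _missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            _missing=1
        fi
    done
    return "$_missing"
}

# Executable names below are assumed; adjust them to match your installation.
check_deps NanoFilt pychopper isONclust isONcorrect spoa blastn samtools metaxa2 \
    || echo "Install the missing programs before running the pipeline." >&2
```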
usage: ./Int1-epicPCR.sh -r [optional arguments]
Mandatory arguments:
-r : Nanopore reads as a single fastq file (accepts gz compressed or uncompressed format)
Optional arguments:
-t : number of CPUs | default: 1
-o : output directory | default: current directory
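To make the option semantics above concrete, here is a minimal sketch of how flags like these are commonly handled with `getopts`. It is an illustration under assumptions, not the script's actual code: the function and variable names are hypothetical, and only the flags and defaults documented above are taken from this README.

```bash
# Parse -r (mandatory), -t (default 1) and -o (default current directory).
parse_args() {
    reads=""
    threads=1
    outdir="."
    OPTIND=1   # reset so the function can be called more than once
    while getopts "r:t:o:" opt; do
        case "$opt" in
            r) reads=$OPTARG ;;
            t) threads=$OPTARG ;;
            o) outdir=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    if [ -z "$reads" ]; then
        echo "error: -r is mandatory" >&2
        return 2
    fi
}

# Hypothetical file name, for illustration only
parse_args -r reads.fastq.gz -t 8
echo "reads=$reads threads=$threads outdir=$outdir"
```

A corresponding call to the pipeline might look like `./Int1-epicPCR.sh -r reads.fastq.gz -t 8 -o results`, where the read file and output directory names are placeholders.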
Qi, Q., Ghaly, T.M., Penesyan, A., Rajabal, V., Stacey, J.A., Tetu, S.G. and Gillings, M.R. 2023. Uncovering bacterial hosts of class 1 integrons in an urban coastal aquatic environment with a single-cell fusion-polymerase chain reaction technology. Environmental Science & Technology, 57(12): 4870-4879. https://doi.org/10.1021/acs.est.2c09739