Analysis of Amplicon sequences of the RbcL chloroplast gene Locus

Adam Rivers
October 17, 2023
USDA-ARS-GBRU

Project outline

This project processed 672 samples. The samples were multiplexed with combinatorial inline barcodes
and then second indexed with Illumina barcodes. The data were sequences in 2 x 151 mode on an Illumina Miseq

Data processing

The data were processed with bbtools, Ultraplex and Qiime2. the script qiime_runscript.sh provides a general overview of the workflow.

Classifier Training

The RbcL reference dataset used for Qiime2 classifier training is from https://doi.org/10.6084/m9.figshare.c.5504193.v1

Bell KL, Loeffler VM, Brosi BJ. An rbcL reference library to aid in the identification of plant species mixtures by DNA metabarcoding. Appl Plant Sci. 2017 Mar 10;5(3):apps.1600110. doi: 10.3732/apps.1600110.

This script splits the data in the Bell dataset into a two-column taxonomy.txt file and a fasta file.
Sequences are identified by an md5 hash rather than a taxonomy string with gaps that make the ids unreliable
Taxids are removed from the taxonomy string for use in Qiime2
Reformats some randomly interspersed RNA sequences to DNA
Removes exact sequence duplicates in from the reference data set, retaining the first entry encountered. An RNA sequence transcribed from the same DNA sequence is also considered identical and removed.

this was processed with the script rbcl_db_parser.py

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
data		data
testdata		testdata
manifest_filter.py		manifest_filter.py
qiime_runscript.sh		qiime_runscript.sh
rbcl_db_parser.py		rbcl_db_parser.py
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Analysis of Amplicon sequences of the RbcL chloroplast gene Locus

Project outline

Data processing

Classifier Training

About

Releases

Packages

Languages

USDA-ARS-GBRU/sewrl_stinkbug_diet

Folders and files

Latest commit

History

Repository files navigation

Analysis of Amplicon sequences of the RbcL chloroplast gene Locus

Project outline

Data processing

Classifier Training

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages