GSC2019-CompLab

Files for the NHGRI Genomics Short Course - Microbiome - Computational Lab

Now live: Microbiome Virtual Lab Exploration

Introduction

One of the most basic questions a microbiome researcher can ask is: “What is in my sample?” There are many ways to answer that question, and the method you choose determines how much biological detail you can resolve. While the microbiome refers to the collection of bacteria, fungi, viruses, protists and metazoans in a sample, researchers are often specifically interested in the bacterial component. Identification of bacteria is based on a taxonomic hierarchy. For instance, most people are familiar with the bacterium Escherichia coli. E. coli is in the family Enterobacteriaceae and the phylum Proteobacteria; here is the full taxonomic hierarchy for E. coli:

Bacteria (Kingdom); Proteobacteria (Phylum); Gammaproteobacteria (Class); Enterobacterales (Order); Enterobacteriaceae (Family); Escherichia (Genus); Escherichia coli (Species)
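The semicolon-separated lineage above maps naturally onto a rank-to-name lookup. As a minimal sketch (the function name `parse_lineage` is hypothetical, and the "Name (Rank)" layout is assumed from the line above rather than taken from any standard file format), it could be parsed in Python like this:

```python
def parse_lineage(lineage: str) -> dict[str, str]:
    """Parse 'Name (Rank); Name (Rank); ...' into a {rank: name} dict."""
    ranks = {}
    for entry in lineage.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split e.g. "Escherichia coli (Species)" at the last "(".
        name, _, rank = entry.rpartition("(")
        ranks[rank.rstrip(")").strip()] = name.strip()
    return ranks


ecoli = parse_lineage(
    "Bacteria (Kingdom); Proteobacteria (Phylum); Gammaproteobacteria (Class); "
    "Enterobacterales (Order); Enterobacteriaceae (Family); "
    "Escherichia (Genus); Escherichia coli (Species)"
)
print(ecoli["Phylum"])  # Proteobacteria
print(ecoli["Family"])  # Enterobacteriaceae
```

Keying by rank makes it easy to compare samples at a chosen level, for example at the phylum level used in the key further down.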

While taxonomic levels often stop at “species”, there are additional taxonomic levels that allow scientists to categorize bacteria in finer detail (e.g., strains). The level of detail you can get from a microbiome experiment depends on the experimental method; see Maiden et al., Nature Reviews 2013.

The files below are part of a larger lesson plan that introduces students to microbiome analysis using 16S rRNA sequences. Please visit NHGRI Genomics Short Course for the full lesson plan.

Files

FASTA sequence files: ba04826.sub100.fasta, st06686.sub100.fasta, to10842.sub100.fasta, vf03604.sub100.fasta, DOK03.fasta
RDP classifier results: ba_rdp_result.pdf, st_rdp_result.pdf, to_rdp_result.pdf, vf_rdp_result.pdf, DOK03_rdp_result.pdf
DOK03_piechart.xlsx - Example pie chart derived from the DOK03 data. Note that pie charts are easy to make but aren't great for representing data (see https://en.wikipedia.org/wiki/Pie_chart)
nihms424103.pdf - Review paper by Grice and Segre
key.txt - Key describing each of the sequence files (also reproduced below)

How were the fasta files generated

The FASTA files were generated by subsampling a larger sequence file produced by pyrosequencing on a Roche 454 instrument. Subsampling was done using seqkit:

    seqkit sample -n 100 input.fasta > output.sub100.fasta
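For readers without seqkit installed, the same idea can be sketched in plain Python. This is not how the files in this repository were made (they were made with the seqkit command above); the function names `read_fasta` and `subsample_fasta` and the fixed seed are illustrative assumptions. The sketch reads every record into memory, which is fine for small teaching files but not for large datasets.

```python
import random


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif line:
                chunks.append(line.strip())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def subsample_fasta(in_path, out_path, n=100, seed=0):
    """Write n randomly chosen records (without replacement) to out_path."""
    records = read_fasta(in_path)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    chosen = rng.sample(records, min(n, len(records)))
    with open(out_path, "w") as out:
        for header, seq in chosen:
            out.write(f">{header}\n{seq}\n")
    return len(chosen)
```

A call such as `subsample_fasta("input.fasta", "output.sub100.fasta", n=100)` would write 100 records, or every record if the input holds fewer than 100.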

Key

ba04826.sub100.fasta

These sequences are from the skin, specifically the back. Students should be able to tell that this is a skin site because it is 97% Actinobacteria. The back, in particular, is considered an oily site and is dominated by the bacterial genus Propionibacterium (also called Cutibacterium).

st06686.sub100.fasta

These sequences are from stool. You can tell because the sample is dominated by Firmicutes (74%) and Bacteroidetes (22%).

to10842.sub100.fasta

These sequences are from the mouth (tongue). Streptococcus (29%) is a common oral bacterial genus.

vf03604.sub100.fasta

These sequences are from the skin, specifically the forearm. You can tell it is skin because Actinobacteria dominate (63%). Unlike the back, the forearm is considered more of a dry site and shows a greater diversity of bacterial genera.

DOK03.fasta

A published agricultural soil microbiome, included here as an example to work through in class. It has 1904 sequences and was downloaded from the mothur website.
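A quick sanity check after downloading is to count the records in a FASTA file, since each record starts with a ">" header line. A minimal sketch (the function name `count_fasta_records` is hypothetical, and the path assumes DOK03.fasta sits in the current directory):

```python
import os


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


if os.path.exists("DOK03.fasta"):
    # This README states that the file holds 1904 sequences.
    print(count_fasta_records("DOK03.fasta"))
```

The same function can be pointed at any of the .sub100.fasta files to check how many sequences each one contains.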

sconlan/GSC2019-CompLab
