Analysis pipeline for Johnson et al. 2021

This repository contains the analysis pipeline for Johnson et al. 2021, "Genomic and Chemical Diversity of Commercially Available Industrial Hemp Accessions." It includes all analysis scripts, starting from the step that turns aligned BAM files into genotypes. It also includes key intermediate files so that users can pick up partway through instead of rerunning everything from scratch.

Setting up the Conda environment

This pipeline uses a Conda environment for reproducibility. To set up an identical environment:

Install Anaconda (I recommend the Miniconda version)
Follow the instructions in 0_CreateCondaEnvironment.sh to initialize an identical environment

Running the analysis

After setting up your Conda environment (above), you can run 0_RerunHempGenotyping.sh. The script assumes you have already aligned the FASTQ files to a genome and produced BAM files. If you don't want to rerun the entire pipeline from scratch, you can start from any of the key intermediate files included in this repo.

FASTQ files from the paper (adaptor-trimmed and restriction-fragment-verified): NCBI BioProject PRJNA707556
Hemp genome: CBDRx version GCF_900626175.1

Key files in this repository

Genotypes: 1g_hemp_filtered.sorted.vcf.gz
Keyfile linking sample number to accession name: 0_HDGS_keyfile.txt
HPLC results from each sample: 0_HDGS_HPLC_Results.csv
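
If you start from the filtered genotypes rather than rerunning the pipeline, a quick sanity check is to list the sample IDs and count the variant records in the VCF. The snippet below is a minimal sketch and not part of the pipeline itself: the function names `read_vcf_samples` and `count_vcf_records` are hypothetical helpers, and the only assumption about the file is that it follows the standard VCF layout (a `#CHROM` header line with nine fixed columns followed by one column per sample).

```python
import gzip
import os


def read_vcf_samples(vcf_path):
    """Return the sample IDs from the #CHROM header line of a gzipped VCF."""
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Columns 1-9 are the fixed VCF fields (CHROM through FORMAT);
                # every column after that is a sample.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached data lines without seeing a column header
    raise ValueError(f"No #CHROM header line found in {vcf_path}")


def count_vcf_records(vcf_path):
    """Count the non-header, non-empty lines (variant records) in a gzipped VCF."""
    with gzip.open(vcf_path, "rt") as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


if __name__ == "__main__":
    # File name taken from the "Key files" list above; run from the repo root.
    vcf = "1g_hemp_filtered.sorted.vcf.gz"
    if os.path.exists(vcf):
        samples = read_vcf_samples(vcf)
        print(f"{len(samples)} samples, {count_vcf_records(vcf)} variant records")
    else:
        print(f"{vcf} not found; run this from the repository root")
```

The sample IDs returned here should correspond to the sample numbers in 0_HDGS_keyfile.txt, but check the keyfile's actual column layout before joining the two, since it is not described here.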
