This is a GATK pipeline customized for SNP calling from GBS/RAD/SLAF-seq data on an HPC system

RimGubaev/GATK_pipeline_customized

GATK_pipeline_customized

This collection of scripts adapts the GATK pipeline to SNP calling from reduced-representation sequencing data such as GBS or RAD-seq.

The scripts are numbered in the order of application and perform the following functions:

1_GATK_pipeline_run_demultiplexing.sh - uses barcodes table to demultiplex reads into sample-specific fastq.gz bins

2_GATK_pipeline_run_trimmomatic2.sh - standard read filtering with trimmomatic; inspecting the reads with FastQC is highly recommended before applying the filters

3_GATK_generate_indexes.sh - creates indexes for alignment

4_GATK_pipeline_run_alignment.sh - alignment over multiple samples with bwa + samtools using "for" loop

5_GATK_pipeline_add_read_groups_run_sorting.sh - adds read groups to the bam files and sorts them. This step is OBLIGATORY for successful calling

6_GATK_pipeline_generate_individual_vcfs.sh - performs SNP calling for individual bam files

7_GATK_pipeline_combine_individual_vcfs.sh - combines individual vcfs into one joint vcf. NOTE! the output vcf might be large!

8_GATK_pipeline_genotype_combined_vcf.sh - performs genotyping on combined vcf

9_GATK_pipeline_variant_filtration.sh - basic variant filtration based on the INFO field of the vcf file. A useful manual for variant filtration is available here
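As a rough orientation, steps 3 to 9 above can be sketched as a set of command generators. This is not the content of the repository scripts: it assumes single-end reads, the GVCF workflow (`-ERC GVCF`), and hypothetical file names (`reference.fasta`, `picard.jar`, `<sample>.fastq.gz`), placeholder read-group values, and a placeholder `QD < 2.0` filter. The functions only print commands, so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Dry-run sketch: each function only PRINTS commands; nothing is executed here.
# All file names, read-group values and the filter threshold are assumptions.

REF="${REF:-reference.fasta}"            # hypothetical reference genome path
PICARD_JAR="${PICARD_JAR:-picard.jar}"   # hypothetical location of the Picard jar

# Step 3: indexes needed by bwa, samtools and GATK.
index_commands() {
    echo "bwa index ${REF}"
    echo "samtools faidx ${REF}"
    echo "gatk CreateSequenceDictionary -R ${REF}"
}

# Steps 4-6 for one sample: align, add read groups + sort, call variants.
per_sample_commands() {
    local sample="$1"
    # Step 4: single-end alignment piped straight into a BAM file.
    echo "bwa mem ${REF} ${sample}.fastq.gz | samtools view -b -o ${sample}.bam -"
    # Step 5: read groups and coordinate sorting in one Picard call.
    echo "java -jar ${PICARD_JAR} AddOrReplaceReadGroups I=${sample}.bam O=${sample}.rg.sorted.bam RGID=${sample} RGLB=lib1 RGPL=ILLUMINA RGPU=unit1 RGSM=${sample} SORT_ORDER=coordinate CREATE_INDEX=true"
    # Step 6: per-sample calling in GVCF mode (assumed here).
    echo "gatk HaplotypeCaller -R ${REF} -I ${sample}.rg.sorted.bam -O ${sample}.g.vcf.gz -ERC GVCF"
}

# Steps 7-9 for all samples: combine, genotype, filter.
joint_commands() {
    local variants=""
    local sample
    for sample in "$@"; do
        variants="${variants} --variant ${sample}.g.vcf.gz"
    done
    echo "gatk CombineGVCFs -R ${REF}${variants} -O combined.g.vcf.gz"
    echo "gatk GenotypeGVCFs -R ${REF} -V combined.g.vcf.gz -O genotyped.vcf.gz"
    # Placeholder hard filter; real thresholds depend on the data set.
    echo "gatk VariantFiltration -R ${REF} -V genotyped.vcf.gz --filter-expression \"QD < 2.0\" --filter-name \"QD2\" -O filtered.vcf.gz"
}
```

For example, `for s in sample1 sample2; do per_sample_commands "$s"; done` prints the per-sample commands, and `joint_commands sample1 sample2` prints the joint steps. Once the printed commands look right for the data at hand, the output can be piped to `bash` or pasted into an HPC job script.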

Software used: axe-demux v.0.3.3-2-ge23af27, trimmomatic v.0.38, samtools v.1.9, bwa v.0.7.17, picard v.2.18.22, GATK v.4.1.0.0
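Before submitting jobs it can help to confirm that the tools are visible on the cluster's `PATH`. The helper below is a minimal sketch; the function name `missing_tools` is hypothetical, and the executable names passed to it are assumptions, since trimmomatic and picard are often installed as `.jar` files rather than as commands.

```shell
#!/usr/bin/env bash
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Calling `missing_tools axe-demux samtools bwa gatk java` prints nothing when everything is found, and otherwise lists the missing names one per line. It does not check versions.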

Disclaimer: the presented scripts are NOT a ready-to-go solution, but they may serve as an example of how to adapt the GATK pipeline for GBS and RAD-seq data. For more details and help, don't hesitate to get in touch.

Email: [email protected]

This customized pipeline was used in the following publication:

Gubaev, R.; Gorlova, L.; Boldyrev, S.; Goryunova, S.; Goryunov, D.; Mazin, P.; Chernova, A.; Martynova, E.; Demurin, Y.; Khaitovich, P. Genetic Characterization of Russian Rapeseed Collection and Association Mapping of Novel Loci Affecting Glucosinolate Content. Genes 2020, 11, 926. https://doi.org/10.3390/genes11080926
