Comparing two RNAseq analysis tools

This repository is the final deliverable of our tutored project, carried out as a group of Master's students in bioinformatics.
The goal was to build and benchmark 3 RNAseq analysis pipelines: STAR + HTSeq-count, STAR quantmode, and Kallisto. All three were built with Snakemake.
The Snakefile contains all 3 pipelines. A config file (config.json) makes the pipelines easy to customize.

Kallisto pipeline

It runs the 2 main Kallisto commands, followed by an extra conversion step:

The first command indexes the cDNA FASTA file.
The second runs the quantification tool.
We added an extra transcript-to-gene step that converts the transcript counts to gene counts.
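
The steps above can be sketched in Python as follows. This is a minimal illustration, not the code used in the Snakefile: the file names, the function names and the `tx2gene` dictionary are assumptions made for the example, and summing `est_counts` per gene is one common way to do the conversion, not necessarily the one used here.

```python
import csv
import io
from collections import defaultdict


def kallisto_commands(cdna_fasta, index_path, out_dir, read1, read2, threads=4):
    """Return the kallisto index and quant command lines as argument lists."""
    index_cmd = ["kallisto", "index", "-i", index_path, cdna_fasta]
    quant_cmd = ["kallisto", "quant", "-i", index_path, "-o", out_dir,
                 "-t", str(threads), read1, read2]
    return index_cmd, quant_cmd


def sum_to_genes(abundance_tsv, tx2gene):
    """Sum the est_counts column of a kallisto abundance.tsv per gene.

    abundance_tsv: full text of an abundance.tsv file
    tx2gene: dict mapping transcript ID -> gene ID (hypothetical input)
    """
    gene_counts = defaultdict(float)
    reader = csv.DictReader(io.StringIO(abundance_tsv), delimiter="\t")
    for row in reader:
        gene = tx2gene.get(row["target_id"])
        if gene is None:
            continue  # transcript absent from the mapping: skipped in this sketch
        gene_counts[gene] += float(row["est_counts"])
    return dict(gene_counts)


if __name__ == "__main__":
    demo = (
        "target_id\tlength\teff_length\test_counts\ttpm\n"
        "T1\t1000\t800\t10\t5\n"
        "T2\t500\t300\t2.5\t1\n"
        "T3\t700\t500\t4\t2\n"
    )
    print(sum_to_genes(demo, {"T1": "G1", "T2": "G1", "T3": "G2"}))
```

The command lists can be passed to `subprocess.run` once Kallisto is installed. The column names (`target_id`, `est_counts`) are those of Kallisto's `abundance.tsv` output.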

STAR-HTseq pipeline

Here we have to use 3 different pieces of software:

STAR itself, to index the genome and map the reads to it.
Samtools, to convert the .sam output of STAR into a .bam file, and then to sort and index this file.
HTSeq-count, the quantification tool, which counts the reads mapped with STAR.
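
As a rough sketch of how these three tools chain together, the function below builds the command lines in order. It is an illustration under assumptions: the function name, the output file names other than STAR's default `Aligned.out.sam`, and the exact options chosen are ours for the example, and the Snakefile may use different ones.

```python
import shlex


def star_htseq_commands(genome_dir, genome_fasta, gtf, read1, read2,
                        prefix, threads=4, extra_star_args=""):
    """Build the STAR -> samtools -> htseq-count command lines, in order.

    extra_star_args plays the role of SUP_STAR_COMPR in the config,
    e.g. "--readFilesCommand zcat" for compressed reads.
    """
    sam = prefix + "Aligned.out.sam"         # STAR's default SAM output name
    bam = prefix + "Aligned.out.bam"         # hypothetical file name
    sorted_bam = prefix + "Aligned.sorted.bam"  # hypothetical file name

    index_cmd = ["STAR", "--runMode", "genomeGenerate",
                 "--genomeDir", genome_dir,
                 "--genomeFastaFiles", genome_fasta,
                 "--sjdbGTFfile", gtf,
                 "--runThreadN", str(threads)]
    align_cmd = ["STAR", "--genomeDir", genome_dir,
                 "--readFilesIn", read1, read2,
                 "--outFileNamePrefix", prefix,
                 "--runThreadN", str(threads)]
    align_cmd += shlex.split(extra_star_args)

    return [
        index_cmd,
        align_cmd,
        ["samtools", "view", "-b", "-o", bam, sam],       # SAM -> BAM
        ["samtools", "sort", "-o", sorted_bam, bam],      # sort by position
        ["samtools", "index", sorted_bam],                # write the .bai index
        # htseq-count prints its counts to standard output
        ["htseq-count", "-f", "bam", "-r", "pos", sorted_bam, gtf],
    ]


if __name__ == "__main__":
    for cmd in star_htseq_commands("star_index", "genome.fa", "genes.gtf",
                                   "input/C1reads_1.fastq",
                                   "input/C1reads_2.fastq", "out/C1reads_"):
        print(" ".join(cmd))
```

Each list can be run with `subprocess.run`; for the last command, redirect standard output to a file to keep the counts.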

STAR quantmode pipeline

This pipeline uses the same index as the previous pipeline, then counts the reads with STAR's built-in quantmode option (--quantMode).

Index the reference genome with STAR
Count the reads by feature with STAR quantmode
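
To show what the counting step produces, here is a small sketch that reads a STAR `ReadsPerGene.out.tab` file. It assumes STAR was run with `--quantMode GeneCounts`, which is our reading of the step above; the function name and the sample data are made up for the example.

```python
# The first four rows of ReadsPerGene.out.tab are summary rows, not genes.
SUMMARY_ROWS = {"N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous"}


def parse_reads_per_gene(text, column=1):
    """Split a ReadsPerGene.out.tab file into summary rows and gene counts.

    column selects which count column to keep (0-based field index):
      1 = unstranded counts
      2 = read 1 on the same strand as the RNA (like htseq-count -s yes)
      3 = read 2 on the same strand as the RNA (like htseq-count -s reverse)
    """
    if column not in (1, 2, 3):
        raise ValueError("column must be 1, 2 or 3")
    summary, counts = {}, {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        name, value = fields[0], int(fields[column])
        if name in SUMMARY_ROWS:
            summary[name] = value
        else:
            counts[name] = value
    return summary, counts


if __name__ == "__main__":
    demo = (
        "N_unmapped\t100\t100\t100\n"
        "N_multimapping\t20\t20\t20\n"
        "N_noFeature\t5\t50\t7\n"
        "N_ambiguous\t3\t1\t2\n"
        "GENE_A\t40\t2\t38\n"
        "GENE_B\t12\t0\t12\n"
    )
    print(parse_reads_per_gene(demo, column=1))
```

Which column to use depends on the strandedness of the library, so the default of 1 (unstranded) is only a starting point.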

Setting it up

Place the Snakefile in a folder together with config.json and the /input/ folder.
Reads must be paired-end (2 files per condition) and placed in the /input/ folder.
You must set the read extension in the config file, e.g. "READ_EXTENSION" : ".fastq"
You must set the paths for the DNA, cDNA and GTF files in config.json, each ending with a /
The genome files must have the .fa extension.
If the reads are compressed, set the SUP_STAR_COMPR parameter in the config file to "--readFilesCommand zcat"; otherwise leave it empty: "".
You must list the experimental conditions in the config file, e.g. "SAMPLE": ["C1reads", "C2reads"]
You can also change the separator between the read name and the read number, e.g. "C1reads_1" "C1reads_2" => "SEPARATOR":"_"
To launch: move to your directory and type snakemake in the terminal. Add -j [number of threads] to set the total number of threads to use.
You can change the number of threads per rule in the config file. Up to 3 rules can run at the same time, so the total needs to be three times the per-rule thread number. Example:
Console command: snakemake -j 12
In the config file: "THREADS" : "4"
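
The settings above can be checked with a short Python sketch. It only uses the config keys named in this README; the path entries for the DNA, cDNA and GTF files are left out because their key names are not shown here. The helper names and the way file names are assembled are assumptions for illustration, not the Snakefile's actual code.

```python
import json

# Example config fragment using only the keys mentioned in this README.
CONFIG_TEXT = """
{
  "READ_EXTENSION": ".fastq",
  "SUP_STAR_COMPR": "",
  "SAMPLE": ["C1reads", "C2reads"],
  "SEPARATOR": "_",
  "THREADS": "4"
}
"""


def read_pairs(config, input_dir="input"):
    """Return {sample: (read 1 path, read 2 path)} built from the config."""
    sep = config["SEPARATOR"]
    ext = config["READ_EXTENSION"]
    return {
        sample: tuple(f"{input_dir}/{sample}{sep}{n}{ext}" for n in (1, 2))
        for sample in config["SAMPLE"]
    }


def total_threads(config, concurrent_rules=3):
    """Value to pass to snakemake -j: concurrent rules x threads per rule."""
    return concurrent_rules * int(config["THREADS"])


if __name__ == "__main__":
    config = json.loads(CONFIG_TEXT)
    for sample, (r1, r2) in read_pairs(config).items():
        print(sample, r1, r2)
    print("snakemake -j", total_threads(config))
```

With the example values, the reads for C1reads are expected at input/C1reads_1.fastq and input/C1reads_2.fastq, and the matching launch command is snakemake -j 12.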

Exenter/Comparing-two-RNAseq-analysis-tools-STAR-vs-Kallisto
