de novo transcriptome assembly pipeline

This pipeline combines the output of multiple assemblers, each run with multiple parameters, following the combined de novo transcriptome assembly approach. For more information, go to https://ncgas.org/WelcomeBasket_Pipeline.php

If you have any other questions, contact the NCGAS team ([email protected])

The pipeline is currently set up ONLY for the TORQUE job handler (SLURM support is in progress and will be added soon), and the current configuration targets the IU Carbonate cluster. If you want to run this on your own hardware, we can help you convert the scripts to run on your machine. READ THE READMEs. There are READMEs in every folder.

Step 0:

You will need to run the startup script to set your email and project location, and to designate whether you are using single or double strands. This can be done with the following command:

Setup.ba -e -s -r -i

[...]

You can get citations for all software by using:

NOTE: You can change your email and strand setting several times, but the current working directory will only be set the first time. As long as you don't move the directory, this will not cause any issues. Also, this step can take a bit of time, and you can ignore any sed-related errors - they are expected at this time!

Step 1:

Put all your reads into input_files - DO THIS AFTER STEP 0! Otherwise it will take forever! Read the README in input_files for instructions on combining reads properly into input files. You can do this with symlinks (see "man ln" if you are unfamiliar with the command).
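As a minimal sketch of the symlink approach, assuming your raw reads sit in a separate directory: the function name link_reads and any paths you pass to it are hypothetical, and the README in input_files remains the authority on how reads must be combined and named.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): symlink every regular file
# from a reads directory into input_files instead of copying it.
link_reads() {
    local src_dir="$1"                  # directory holding the raw read files
    local dest_dir="${2:-input_files}"  # pipeline input directory
    local f abs_dir
    mkdir -p "$dest_dir"
    for f in "$src_dir"/*; do
        [ -f "$f" ] || continue         # skip subdirectories and unmatched globs
        # Link to an absolute path so the link resolves from inside dest_dir.
        abs_dir=$(cd "$(dirname "$f")" && pwd)
        ln -sf "$abs_dir/$(basename "$f")" "$dest_dir/"
    done
}
```

Called as `link_reads /path/to/reads input_files` from the project directory, this leaves the original files untouched and only creates links.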

Then run the normalization command - this will normalize your data and make it take less time/resources without loss of information.
Command: qsub RunTrinity.normalize.sh

Step 2: SOAP

Run RunSOAP1.sh and RunSOAP1b.sh at the same time.
Command: qsub RunSOAP1*

When they finish, run ./Combine.sh
Command: ./Combine.sh
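The submission step above can also be sketched as a loop that hands each matching script to qsub in its own call, in case your qsub accepts only one script per invocation. The names submit_all and QSUB are hypothetical and not part of the pipeline; setting QSUB=echo gives a dry run that only prints what would be submitted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: submit every script matching a prefix, one qsub call each.
# Override QSUB (e.g. QSUB=echo) to preview the submissions without running them.
submit_all() {
    local prefix="$1"
    local script
    for script in "$prefix"*.sh; do
        # An unmatched glob is left as a literal string, so check the file exists.
        [ -e "$script" ] || { echo "no scripts match ${prefix}*.sh" >&2; return 1; }
        ${QSUB:-qsub} "$script"
    done
}
```

From the SOAP folder, `submit_all RunSOAP1` would submit RunSOAP1.sh and RunSOAP1b.sh; run ./Combine.sh once both jobs have finished.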

Step 2b: Velvet

Run RunVelvet1.sh and RunVelvet1b.sh at the same time.
Command: qsub RunVelvet1*

When BOTH above are complete, run RunVelvet2.sh and RunVelvet2b.sh at the same time.

[...]

Command: qsub RunVelvet3*

When they finish, run ./Combine.sh
Command: ./Combine.sh
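Waiting for one Velvet stage to finish before starting the next can be left to the scheduler rather than done by hand. The sketch below assumes that your TORQUE installation supports job dependencies through `qsub -W depend=afterok:...` and that qsub prints the new job ID on standard output. The names submit_stage and QSUB are hypothetical and not part of the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: submit all scripts of one stage and print their job IDs
# joined with ':' so they can be passed on as the next stage's dependency list.
# Usage: submit_stage <script-prefix> [colon-separated job IDs to wait for]
submit_stage() {
    local prefix="$1" after="${2:-}"
    local script id ids=""
    for script in "$prefix"*.sh; do
        [ -e "$script" ] || { echo "no scripts match ${prefix}*.sh" >&2; return 1; }
        if [ -n "$after" ]; then
            # afterok: start only once every listed job has finished successfully.
            id=$(${QSUB:-qsub} -W "depend=afterok:$after" "$script") || return 1
        else
            id=$(${QSUB:-qsub} "$script") || return 1
        fi
        ids="${ids:+$ids:}$id"          # append the ID, ':'-separated
    done
    echo "$ids"
}
```

For example, `stage1=$(submit_stage RunVelvet1)` followed by `stage2=$(submit_stage RunVelvet2 "$stage1")` queues the second pair of scripts so that they start only after both first-stage jobs succeed. Combine.sh is still run by hand once the last stage is done.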

Step 2c: TransAbyss

Run RunTransAb1.sh and RunTransAb1b.sh at the same time.
Command: qsub RunTransAbyss1*

When they finish, run ./Combine.sh
Command: ./Combine.sh

Step 2d: Trinity

Run RunTrinity.sh. There is no combine script for this assembler.
Command: qsub RunTrinity.sh

Step 3: Combine all outputs

The outputs for each combined set will be placed automatically in final_assembly.

Run ./Combine.sh FIRST to get one input for EviGene.
Run RunEviGene.sh

[...]

Within okayset, you will see two sets of files:

SEE http://arthropods.eugenes.org/EvidentialGene/trassembly.html for documentation!
ALSO SEE https://ncgas.org/Blog_Posts/EviGene.php for an explanation of EviGene's methods!
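Before and after the EviGene run, it can be worth confirming that a FASTA file actually contains transcripts. A minimal sketch, assuming standard FASTA where every record starts with a '>' header line; the function name count_fasta_records is hypothetical, and the file you pass in is whatever Combine.sh or EviGene wrote on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of records in a FASTA file by
# counting header lines (lines that start with '>').
count_fasta_records() {
    local fasta="$1"
    [ -r "$fasta" ] || { echo "cannot read $fasta" >&2; return 1; }
    # grep -c exits non-zero when the count is 0, so mask that case.
    grep -c '^>' "$fasta" || true
}
```

A count of 0 for the combined input suggests that one of the earlier Combine.sh steps did not produce output.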

Step 4:

You can continue downstream analysis with the READMEs in final_assemblies:

annotation - using Trinotate
differential expression (DE) - using Kallisto
quality control (QC) - runs BUSCO and QUAST to calculate assembly statistics

This step can be done on data assembled with version 4 or previous versions (2 or 3) of the pipeline!

Indiana University Machine (Torque job handler)

Project Carbonate v1 - original scripts as described at PAG 2018
Project Carbonate v2 - added updated transabyss (2.0) and defaulted everything to run on normalized files, added normalization script to input_files
Project Carbonate v4 - original scripts as described at PAG 2019

Pittsburgh Supercomputing Center's Bridges system (SLURM job handler)

Project Bridges v1 - original scripts as described at PAG 2018
Project Bridges v2 - added updated transabyss (2.0) and defaulted everything to run on normalized files, added normalization script to input_files
Project Bridges v4 - work in progress
