Snakemake workflow to generate RNA-Seq assemblies and junctions
The graph below shows the basic workflow.
snakemake-5.4.0 - https://snakemake.readthedocs.io/en/stable/
hisat-2.1.0 - https://ccb.jhu.edu/software/hisat2/index.shtml
samtools-1.7 - http://www.htslib.org/download/
stringtie-1.3.3 - https://ccb.jhu.edu/software/stringtie/#install
scallop-0.10.2 - https://github.com/Kingsford-Group/scallop/releases
portcullis-1.1.2 - https://portcullis.readthedocs.io/en/latest/
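Before running the workflow, it can help to confirm that each dependency is reachable on your PATH. The snippet below is a minimal sketch, not part of ANext: the helper name missing_tools is hypothetical, and the executable names are assumptions based on the tools listed above, so adjust them to your installation. It does not check versions.

```shell
#!/usr/bin/env bash
# missing_tools (hypothetical helper): print each command named in the
# arguments that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Executable names are assumptions; edit the list to match your setup.
missing_tools snakemake hisat2 hisat2-build samtools stringtie scallop portcullis
```

If nothing is printed, every listed command was found.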
git clone https://github.com/gemygk/ANext.git path/to/workdir
cd path/to/workdir
Once you have a working Snakemake installation, test the workflow with a dry run:
cd path/to/workdir
snakemake --snakefile ANext.smk --configfile config.yaml -np
The dry run should print output similar to the following:
Building DAG of jobs...
...
...
...
Job counts:
	count	jobs
	1	all
	3	hisat2
	1	hisat2_build
	1	portcullis
	3	scallop
	3	stringtie
	12
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
config file: config.yaml
An example command for the SLURM scheduler is provided below:
cd path/to/workdir
sbatch -c 1 --mem 5G -p ei-long -o out_ANext.%j.log -J ANext --wrap "source snakemake-5.4.0 && snakemake --latency-wait 120 --cluster-config cluster.json --configfile config.yaml --snakefile ANext.smk -p --jobs 100 --cluster \"sbatch -p {cluster.partition} -c {cluster.c} --mem {cluster.mem} -J {cluster.J} -o {cluster.o}\""
To run on a different HPC scheduler (such as PBS Pro or LSF), modify the cluster configuration file cluster.json and the submission command passed to --cluster as needed.
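For reference, a cluster configuration consistent with the placeholders in the SLURM command above ({cluster.partition}, {cluster.c}, {cluster.mem}, {cluster.J}, {cluster.o}) might look like the sketch below. All values are assumptions chosen for illustration, not the settings shipped with ANext, and the file name cluster.example.json is hypothetical (used here so the repository's own cluster.json is not overwritten). __default__ is the key Snakemake applies to every rule; the hisat2 entry shows a per-rule override.

```shell
#!/usr/bin/env bash
# Write an illustrative cluster configuration to a separate file.
# Keys mirror the {cluster.*} placeholders in the sbatch command;
# the values (CPUs, memory, job name, log name) are assumptions.
cat > cluster.example.json <<'EOF'
{
    "__default__": {
        "partition": "ei-long",
        "c": 1,
        "mem": "4G",
        "J": "ANext_job",
        "o": "ANext_job.%j.log"
    },
    "hisat2": {
        "c": 4,
        "mem": "16G"
    }
}
EOF
```

Keys omitted from a rule-specific entry fall back to the values under __default__, so only the settings that differ need to be listed per rule.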