title

description

tags

How to run intron retention analysis

This code helps to run IRFinder in the cluster.

research

rnaseq

hpc	intro_retention

To run any of these commands, need to activate the bioconda IRFinder environment prior to running script.

First script creates reference build required for IRFinder

#SBATCH -t 24:00:00                    # Runtime in minutes
#SBATCH -n 4
#SBATCH -p medium                # Partition (queue) to submit to
#SBATCH --mem=128G        # 128 GB memory needed (memory PER CORE)
#SBATCH -o %j.out               # Standard out goes to this file
#SBATCH -e %j.err               # Standard err goes to this file
#SBATCH --mail-type=END         # Mail when the job ends

IRFinder -m BuildRefProcess -r reference_data/

NOTE: The files in the reference_data folder are sym links to the bcbio ref files and need to be named specifically genome.fa and transcripts.gtf:

genome.fa -> /n/app/bcbio/biodata/genomes/Hsapiens/hg19/seq/hg19.fa

transcripts.gtf -> /n/app/bcbio/biodata/genomes/Hsapiens/hg19/rnaseq/ref-transcripts.gtf

Second script (.sh) runs IRFinder and STAR on input file

#!/bin/bash

module load star/2.5.4a

IRFinder -r /path/to/irfinder/reference_data \
-t 4 -d results \
$1

Third script (.sh) runs a batch job for each input file in directory

#!/bin/bash

for fq in /path/to/*fastq
do

sbatch -p medium -t 0-48:00 -n 4 --job-name irfinder --mem=128G -o %j.out -e %j.err --wrap="sh /path/to/irfinder/irfinder_input_file.sh $fq"
sleep 1 # wait 1 second between each job submission

done

Fourth script takes output (IRFinder-IR-dir.txt) and uses the replicates to determine differential expression using the Audic and Claverie test (# replicates < 4). analysisWithLowReplicates.pl script comes with the IRFinder github repo clone, so I cloned the repo at https://github.com/williamritchie/IRFinder/. Notes on the Audic and Claverie test can be found at: https://github.com/williamritchie/IRFinder/wiki/Small-Amounts-of-Replicates-via-Audic-and-Claverie-Test.

#!/bin/bash

#SBATCH -t 24:00:00                    # Runtime in minutes
#SBATCH -n 4
#SBATCH -p medium                # Partition (queue) to submit to
#SBATCH --mem=128G        # 8 GB memory needed (memory PER CORE)
#SBATCH -o %j.out               # Standard out goes to this file
#SBATCH -e %j.err               # Standard err goes to this file
#SBATCH --mail-type=END         # Mail when the job ends

analysisWithLowReplicates.pl \
  -A A_ctrl/Pooled/IRFinder-IR-dir.txt A_ctrl/AJ_1/IRFinder-IR-dir.txt A_ctrl/AJ_2/IRFinder-IR-dir.txt A_ctrl/AJ_3/IRFinder-IR-dir.txt \
  -B B_nrde2/Pooled/IRFinder-IR-dir.txt B_nrde2/AJ_4/IRFinder-IR-dir.txt B_nrde2/AJ_5/IRFinder-IR-dir.txt B_nrde2/AJ_6/IRFinder-IR-dir.txt \
  > KD_ctrl-v-nrde2.tab

Output KD_ctrl-v-nrde2.tab file can be read directly into R for filtering and results exploration.
Rmarkdown workflow (included in report): IRFinder_report.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

running_IRFinder.md

running_IRFinder.md

Files

running_IRFinder.md

Latest commit

History

running_IRFinder.md

File metadata and controls