Trypanosomatid Regulatory Element Prediction Pipeline

elsayed-lab/tryp-reg-predict

Prediction of Trypanosomatid Regulatory Elements

Overview

Features

  • Sequence motifs
    • 5' UTR
    • 3' UTR
    • Upstream gene's 3' UTR
    • Downstream gene's 5' UTR
    • Upstream intergenic region
    • Downstream intergenic region
    • CDS
  • Sequence composition
    • 5' UTR GC/CT composition
    • 3' UTR GC/CT composition
    • CDS GC/CT composition
    • Polypyrimidine tract GC/CT composition
    • Kmer counts
  • Sequence lengths
    • 5' UTR length
    • 3' UTR length
    • Polypyrimidine tract length
    • Intergenic region / inter-CDS length
  • Other
    • CDS codon adaptation index (CAI)
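
As a rough illustration of the sequence composition features listed above, the following Python sketch computes GC and CT fractions and overlapping k-mer counts for a single sequence. It is not the pipeline's own implementation: the function names `base_fraction` and `kmer_counts`, the handling of ambiguous bases, and the example sequence are all assumptions made for this example.

```python
from collections import Counter


def base_fraction(seq: str, bases: str) -> float:
    """Fraction of positions in seq whose base is one of `bases` (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0  # avoid division by zero for empty regions
    return sum(seq.count(b) for b in bases.upper()) / len(seq)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers, skipping any that contain non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


# Hypothetical 3' UTR fragment, made up for illustration only
utr = "ACGTTTCCCTTTTCTCTAG"
gc = base_fraction(utr, "GC")   # GC composition
ct = base_fraction(utr, "CT")   # CT (pyrimidine) composition
dimers = kmer_counts(utr, 2)    # dinucleotide counts
print(f"GC={gc:.2f} CT={ct:.2f} top dimers={dimers.most_common(3)}")
```

The same two helpers could be applied to each region (5' UTR, 3' UTR, CDS, polypyrimidine tract) to build one feature column per region.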

Installation

The Trypanosomatid Regulatory Elements prediction pipeline makes use of a number of different R and Python packages, as well as several standalone tools.

Below is a list of all of the requirements needed to run this pipeline.

Requirements

Software requirements

Python requirements

R requirements

conda create -n reg-predict --file requirements.txt \
    --channel bioconda \
    --channel conda-forge \
    --channel pytorch

Note: To avoid running out of memory during execution, you may need to edit the hierarchical clustering portion of the EXTREME script run_consensus_clusering_using_wm.pl and increase the maximum heap size set by the -Xmx option, e.g. -Xmx10000m.

Usage

TODO: describe software for predicting UTR boundaries, etc.

snakemake --configfile settings/config.yml combine_motifs
