STAR-Fusion container 0.0.1
jpfeil committed Nov 30, 2016
0 parents commit 0c3e634
Showing 5 changed files with 542 additions and 0 deletions.
34 changes: 34 additions & 0 deletions star-fusion/Dockerfile
@@ -0,0 +1,34 @@
# STAR-Fusion Dockerfile
# https://github.com/STAR-Fusion/STAR-Fusion/wiki

FROM ubuntu:16.04

MAINTAINER Jacob Pfeil, [email protected]

RUN apt-get update --fix-missing && \
    apt-get install -y python zlib1g-dev gzip perl libdb-dev \
                       build-essential wget make git

# Install perl libraries
RUN cpan App::cpanminus && cpanm Set::IntervalTree && cpanm DB_File && cpanm URI

WORKDIR /home

# Need STAR binary
RUN wget https://github.com/alexdobin/STAR/archive/2.5.2b.tar.gz && \
    tar -xzf 2.5.2b.tar.gz && \
    git clone --recursive https://github.com/STAR-Fusion/STAR-Fusion.git

# Add STAR and STAR-Fusion to path
ENV PATH "/home/STAR-2.5.2b/bin/Linux_x86_64/:/home/STAR-Fusion/:$PATH"

# Add wrapper scripts
ADD star_fusion_wrapper.sh /home/star_fusion_wrapper.sh
ADD star_fusion_pipeline.py /home/star_fusion_pipeline.py
ADD genelist.txt /home/genelist.txt

# Data processing occurs at /data
WORKDIR /data

ENTRYPOINT ["sh", "/home/star_fusion_wrapper.sh"]
CMD ["-h"]
66 changes: 66 additions & 0 deletions star-fusion/README.md
@@ -0,0 +1,66 @@

STAR-Fusion for Treehouse RNA-seq analysis
====================


### Overview

Gene fusions play a major role in tumorigenesis, so it is crucial that Treehouse has a pipeline for detecting them. We have built a Docker container that runs [STAR-Fusion](https://github.com/STAR-Fusion/STAR-Fusion/wiki) and filters its output against a list of known cancer fusion genes.

### Docker and usage

##### Image located on hub.docker.com

REPOSITORY: jpfeil/star-fusion

TAG: 0.0.1

IMAGE ID: 520e7a15847b


##### Input files

The pipeline requires paired-end FASTQ files, an output directory, and a genome library directory. A gene list is already baked into the Docker container, but the `--genelist` option lets you supply a different one. Please refer to the STAR-Fusion documentation for instructions on creating a genome library. You can also find a prebuilt genome library here: `http://ceph-gw-01.pod/references/STARFusion-GRCh38gencode23.tar.gz`

```
usage: star_fusion_pipeline.py [-h] --left_fq R1 --right_fq R2 --output_dir
                               OUTPUT_DIR --genome_lib_dir GENOME_LIB_DIR
                               [--CPU CPU] [--genelist GENELIST]
                               [--skip_filter] [--test]

Wraps STAR-Fusion program and filters output.

optional arguments:
  -h, --help            show this help message and exit
  --left_fq R1          Fastq 1
  --right_fq R2         Fastq 2
  --output_dir OUTPUT_DIR
                        Output directory
  --genome_lib_dir GENOME_LIB_DIR
                        Genome library directory
  --CPU CPU             Number of cores
  --genelist GENELIST
  --skip_filter
  --test
```
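
For readers who want to see how a command-line interface like the one above could be declared, here is a minimal `argparse` sketch. It is not the contents of `star_fusion_pipeline.py`, which this README does not show. The function name `build_parser`, the integer type and default of `1` for `--CPU`, and the default gene list path (taken from the Dockerfile's `ADD` line) are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage text; not the real script."""
    parser = argparse.ArgumentParser(
        prog="star_fusion_pipeline.py",
        description="Wraps STAR-Fusion program and filters output.",
    )
    # Required inputs, as listed in the usage text.
    parser.add_argument("--left_fq", metavar="R1", required=True, help="Fastq 1")
    parser.add_argument("--right_fq", metavar="R2", required=True, help="Fastq 2")
    parser.add_argument("--output_dir", required=True, help="Output directory")
    parser.add_argument("--genome_lib_dir", required=True,
                        help="Genome library directory")
    # Optional settings. The type and defaults below are assumptions.
    parser.add_argument("--CPU", type=int, default=1, help="Number of cores")
    parser.add_argument("--genelist", default="/home/genelist.txt")
    parser.add_argument("--skip_filter", action="store_true")
    parser.add_argument("--test", action="store_true")
    return parser
```

Calling `build_parser().parse_args([...])` with the flags from the run command below would return a namespace with `left_fq`, `right_fq`, `output_dir`, `genome_lib_dir`, and `CPU` attributes.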


##### Run command
```
docker run -it --rm -v `pwd`:/data jpfeil/star-fusion:0.0.1 \
--left_fq 1.fq.gz \
--right_fq 2.fq.gz \
--output_dir fusion_output \
--CPU `nproc` \
--genome_lib_dir STARFusion-GRCh38gencode23
```

### Output

There will be many files in the output directory, but you can find the fusion calls in these files:

- `star-fusion.fusion_candidates.final.abridged`
- `star-fusion.fusion_candidates.final.in_genelist.abridged`

The second file contains only the fusion calls where both fusion partners are in the genelist.
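
The gene list filter can be pictured with the following sketch. It is an illustration under stated assumptions, not the code in `star_fusion_pipeline.py`: it assumes the gene list holds one gene symbol per line, that the abridged file is tab-delimited with header lines starting with `#`, and that the first column names the fusion as `GENE1--GENE2`. The function names `load_genelist` and `filter_fusions` are hypothetical.

```python
def load_genelist(handle):
    """Return the set of gene symbols, assuming one symbol per line."""
    return {line.strip() for line in handle if line.strip()}


def filter_fusions(lines, genes):
    """Yield header lines plus calls whose two partners are both in `genes`.

    Assumes the first tab-separated column looks like GENE1--GENE2.
    """
    for line in lines:
        if line.startswith("#"):
            # Keep header lines so the filtered file stays self-describing.
            yield line
            continue
        fusion_name = line.split("\t", 1)[0]
        left, sep, right = fusion_name.partition("--")
        # Rows without the "--" separator are dropped rather than guessed at.
        if sep and left in genes and right in genes:
            yield line
```

To apply it, open the gene list and `star-fusion.fusion_candidates.final.abridged`, pass the file handles to these functions, and write the yielded lines to a new file.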