Merge pull request Reed-CompBio#103 from livj4711/domino
Add DOMINO pathway reconstruction algorithm
agitter authored Aug 18, 2023
2 parents ca6bbea + 8fe055c commit 20160ca
Showing 18 changed files with 2,068 additions and 732 deletions.
10 changes: 10 additions & 0 deletions .github/workflows/test-spras.yml
@@ -81,6 +81,7 @@ jobs:
docker pull reedcompbio/pathlinker:latest
docker pull reedcompbio/meo:latest
docker pull reedcompbio/mincostflow:latest
docker pull reedcompbio/domino:latest
- name: Build Omics Integrator 1 Docker image
uses: docker/build-push-action@v1
with:
@@ -126,6 +127,15 @@ jobs:
tags: latest
cache_froms: reedcompbio/mincostflow:latest
push: false
- name: Build DOMINO Docker image
uses: docker/build-push-action@v1
with:
path: docker-wrappers/DOMINO/.
dockerfile: docker-wrappers/DOMINO/Dockerfile
repository: reedcompbio/domino
tags: latest
cache_froms: reedcompbio/domino:latest
push: false

# Run pre-commit checks on source files
pre-commit:
10 changes: 9 additions & 1 deletion config/config.yaml
@@ -54,6 +54,7 @@
run2:
b: [2]
g: [3]

- name: "meo"
params:
include: true
@@ -71,6 +72,13 @@
flow: [1] # The flow must be an int
capacity: [1]

- name: "domino"
params:
include: true
directed: false
run1:
slice_threshold: [0.3]
module_threshold: [0.05]

# Here we specify which pathways to run and other file location information.
# DataLoader.py can currently only load a single dataset
@@ -88,7 +96,7 @@
-
label: data1
# Reuse some of the same sources file as 'data0' but different network and targets
-      node_files: ["sources.txt", "alternative-targets.txt"]
+      node_files: ["node-prizes.txt", "sources.txt", "alternative-targets.txt"]
edge_files: ["alternative-network.txt"]
other_files: []
# Relative path from the spras directory
10 changes: 10 additions & 0 deletions config/egfr.yaml
@@ -62,6 +62,16 @@ algorithms:
- 3
rand_restarts:
- 10
-
name: domino
params:
directed: false
include: true
run1:
slice_threshold:
- 0.3
module_threshold:
- 0.05
datasets:
-
data_dir: input
11 changes: 11 additions & 0 deletions docker-wrappers/DOMINO/Dockerfile
@@ -0,0 +1,11 @@
# DOMINO wrapper
# https://github.com/Shamir-Lab/DOMINO
FROM python:3.7

RUN pip install domino-python==0.1.1

# DOMINO requires data files in hard-coded locations
RUN cd /usr/local/lib/python3.7/site-packages/src/data && \
wget https://raw.githubusercontent.com/Shamir-Lab/DOMINO/master/src/data/ensg2gene_symbol.txt && \
wget https://raw.githubusercontent.com/Shamir-Lab/DOMINO/master/src/data/ensmusg2gene_symbol.txt && \
wget https://raw.githubusercontent.com/Shamir-Lab/DOMINO/master/src/data/graph.html.format
22 changes: 22 additions & 0 deletions docker-wrappers/DOMINO/README.md
@@ -0,0 +1,22 @@
# DOMINO Docker image

A Docker image for [DOMINO](https://github.com/Shamir-Lab/DOMINO) that is available on [DockerHub](https://hub.docker.com/repository/docker/reedcompbio/domino).

DOMINO outputs multiple active modules, which SPRAS combines into a single pathway.
It is [non-deterministic](https://github.com/Shamir-Lab/DOMINO/issues/5) and cannot be made deterministic with a seed.
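
As a rough illustration of what combining modules can mean, the sketch below takes the union of the edges from every module and removes duplicates. This is an assumption made for illustration, not the SPRAS implementation; `combine_modules` and the example edges are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def combine_modules(modules: Iterable[Iterable[Edge]]) -> List[Edge]:
    """Merge undirected module edge lists into one sorted, duplicate-free list."""
    combined: Set[Edge] = set()
    for module in modules:
        for u, v in module:
            # Store each undirected edge in a canonical order so (A, B) == (B, A)
            combined.add((u, v) if u <= v else (v, u))
    return sorted(combined)


# Two hypothetical modules that share the edge A-B
module_1 = [("A", "B"), ("B", "C")]
module_2 = [("B", "A"), ("D", "E")]
pathway = combine_modules([module_1, module_2])
```

Sorting the merged edges makes the combined output stable for a given set of modules, although it does not make DOMINO itself deterministic.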

To create the Docker image run:
```
docker build -t reedcompbio/domino -f Dockerfile .
```
from this directory.

To inspect the installed Python packages:
```
winpty docker run reedcompbio/domino pip list
```
The `winpty` prefix is only needed on Windows.

## TODO
- Resolve upstream ValueError with small inputs https://github.com/Shamir-Lab/DOMINO/issues/11
- Use cache or reuse slices files from previous runs on the same network
11 changes: 6 additions & 5 deletions input/README.md
@@ -11,11 +11,11 @@ All other columns specify additional node attributes such as prizes.
Any nodes that are listed in a node file but are not present in one or more edges in the edge file will be removed.
For example:
```
-NODEID prize sources targets
-A 1.0 True
-B 3.3 True
-C 2.5 True
-D 1.9 True True
+NODEID prize sources targets active
+A 1.0 True True
+B 3.3 True True
+C 2.5 True True
+D 1.9 True True True
```
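
A node table of this kind could be read as in the sketch below, assuming the columns are tab-separated and that an empty cell means the attribute is unset. The `read_node_table` helper and the sample rows are hypothetical and are not part of SPRAS.

```python
import csv
import io

# Hypothetical tab-separated node table; empty cells mean the attribute is unset
NODE_TABLE = (
    "NODEID\tprize\tsources\ttargets\tactive\n"
    "A\t1.0\tTrue\t\tTrue\n"
    "B\t3.3\t\tTrue\tTrue\n"
)


def read_node_table(handle):
    """Return {node_id: {attribute: value}} from a tab-separated node table."""
    nodes = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        node_id = row.pop("NODEID")
        attrs = {}
        for name, value in row.items():
            if not value:
                continue  # attribute not set for this node
            if name == "prize":
                attrs[name] = float(value)
            else:
                # Accept both "True" and "true" for boolean columns
                attrs[name] = value.lower() == "true"
        nodes[node_id] = attrs
    return nodes


nodes = read_node_table(io.StringIO(NODE_TABLE))
active_nodes = sorted(n for n, a in nodes.items() if a.get("active"))
```

The boolean comparison is case-insensitive because the example files use both `True` and `true`.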

A secondary format provides only a list of node identifiers and uses the filename as the node attribute, as in the example `sources.txt`.
@@ -49,6 +49,7 @@ The files are originally from the [Temporal Pathway Synthesizer (TPS)](https://g
They have been lightly modified for SPRAS by lowering one edge weight that was greater than 1, removing a PSEUDONODE prize, adding a prize of 10.0 to EGF_HUMAN, and converting all edges to undirected edges.
The only source is EGF_HUMAN.
All proteins with phosphorylation-based prizes are also labeled as targets.
All nodes are considered active.

If you use any of the input files `tps-egfr-prizes.txt` or `phosphosite-irefindex13.0-uniprot.txt`, reference the publication

6 changes: 3 additions & 3 deletions input/node-prizes.txt
@@ -1,3 +1,3 @@
-NODEID prize
-A 2
-C 5.7
+NODEID prize active
+A 2 true
+C 5.7 true