NetMix is an algorithm for identifying altered subnetworks with node scores that are distributed differently from other nodes in the network. NetMix improves upon current methods by using a Gaussian Mixture Model to find a less biased estimate of the size of an altered subnetwork. This README is under construction.
The setup process for NetMix requires the following steps:
Download NetMix. The following command clones the current NetMix repository from GitHub:
git clone https://github.com/raphael-group/netmix.git
NetMix depends on the following software:
- Linux/Unix
- Python (2.7 or 3.6)
- NumPy (1.17)
- SciPy (1.3)
- h5py (2.10)
- heinz
NetMix will most likely work with other versions of the above software. We recommend using a Python virtual environment, which allows Python packages to be installed or updated independently of system packages. The heinz package must be installed before using NetMix, and the location of heinz must be specified by editing the line heinz_directory="" in the NetMix script.
To test NetMix on example data, please run the following command:
./netmix.sh network.tsv scores.tsv output.txt
where network.tsv is a tab-delimited edge list and scores.tsv is a tab-delimited list of p-values on the nodes of network.tsv. Please see below for more details.
NetMix has three main steps:
- Compute node scores.
- Find the maximum-weight connected subgraph using the node scores.
- (Optional) Compute statistical significance.
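The introduction mentions a Gaussian mixture model used to estimate the size of the altered subnetwork. As a rough illustration of that idea only, the sketch below fits a two-component mixture by expectation-maximization. It is not the NetMix code: it assumes that the inputs are z-scores, that background nodes follow N(0, 1), and that altered nodes follow N(mu, 1). The function names and the synthetic data are hypothetical; see the manuscript for the actual model.

```python
import math
import random


def normal_pdf(x, mean=0.0):
    """Density of a unit-variance normal distribution with the given mean."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def fit_two_component_mixture(z, alpha=0.5, mu=1.0, iterations=200):
    """Run EM for the mixture alpha * N(mu, 1) + (1 - alpha) * N(0, 1).

    Returns the estimated alpha and mu, plus the responsibilities
    computed in the final E-step.
    """
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability that each score came from N(mu, 1).
        resp = []
        for x in z:
            altered = alpha * normal_pdf(x, mu)
            background = (1.0 - alpha) * normal_pdf(x)
            resp.append(altered / (altered + background))
        # M-step: update the mixing weight and the altered-component mean.
        total = sum(resp)
        alpha = total / len(z)
        mu = sum(r * x for r, x in zip(resp, z)) / total
    return alpha, mu, resp


# Synthetic z-scores (an assumption for illustration): 900 background
# nodes and 100 altered nodes with mean 3.
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(900)]
z += [rng.gauss(3.0, 1.0) for _ in range(100)]

alpha_hat, mu_hat, resp = fit_two_component_mixture(z)
estimated_size = round(alpha_hat * len(z))
print(alpha_hat, mu_hat, estimated_size)
```

Under these assumptions, the estimated mixing weight times the number of nodes gives an estimate of how many nodes are altered.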
The NetMix manuscript defines these steps, and the NetMix script combines them. To compute permuted networks and scores for evaluating statistical significance, see the Hierarchical HotNet paper and repository, which describe and implement multiple network and score permutation schemes.
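To make the second step concrete, here is a minimal sketch of the maximum-weight connected subgraph problem, solved by exhaustive search on a three-node graph. NetMix itself uses heinz for this step. The brute-force search, the function names, and the node weights below are assumptions for illustration and are only practical for very small graphs.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if the given non-empty node set induces a connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def brute_force_mwcs(edges, weights):
    """Try every non-empty node subset and keep the heaviest connected one."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    best_nodes, best_weight = set(), float("-inf")
    nodes = sorted(weights)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            if is_connected(subset, adjacency):
                total = sum(weights[v] for v in subset)
                if total > best_weight:
                    best_nodes, best_weight = set(subset), total
    return best_nodes, best_weight


# Network with edges A-C and B-C; the weights are hypothetical node
# scores (positive values favor inclusion in the subgraph).
edges = [("A", "C"), ("B", "C")]
weights = {"A": -1.0, "B": 0.5, "C": 0.8}

mwcs_nodes, mwcs_weight = brute_force_mwcs(edges, weights)
print(sorted(mwcs_nodes))  # ['B', 'C']
```

Exhaustive search takes time exponential in the number of nodes, which is why a dedicated solver such as heinz is needed for real networks.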
These steps take the following inputs and return the following output.
NetMix has two input files that together define a network with scores on the nodes of the network. For example, the following files define a network with nodes A, B, and C, where C is directly connected to both A and B, and where A, B, and C have p-values of 0.1, 0.5, and 0.9, respectively.
Each line in the edge list file corresponds to an edge in the network:
A C
B C
Each line in the node score file associates a node with a score:
A 0.1
B 0.5
C 0.9
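The two input formats can be read with a few lines of Python. The following is a minimal sketch, not part of NetMix: the helper names read_edge_list and read_scores are hypothetical, and the example data are held in memory rather than in files so that the snippet is self-contained.

```python
import io


def read_edge_list(handle):
    """Parse a tab-delimited edge list into a list of (node, node) pairs."""
    edges = []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:  # skip blank or malformed lines
            edges.append((fields[0], fields[1]))
    return edges


def read_scores(handle):
    """Parse a tab-delimited node score file into a {node: score} dictionary."""
    scores = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:  # skip blank or malformed lines
            scores[fields[0]] = float(fields[1])
    return scores


# In-memory copies of the example files above.
network_text = "A\tC\nB\tC\n"
scores_text = "A\t0.1\nB\t0.5\nC\t0.9\n"

edges = read_edge_list(io.StringIO(network_text))
scores = read_scores(io.StringIO(scores_text))

# Nodes that appear in the network but have no score.
unscored = {node for edge in edges for node in edge} - set(scores)
print(edges, scores, unscored)
```

To read real files, pass open file handles, for example open("network.tsv"), instead of the io.StringIO objects.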
NetMix reports a set of nodes corresponding to the maximum-weight connected subgraph (MWCS) for the node scores. For example, if the MWCS includes nodes B and C but not node A, then each line in the output file is a node:
B
C
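As a sanity check on an output file, one can confirm that the reported nodes form a connected subgraph of the input network. The sketch below is an illustration rather than a NetMix feature; the function name is hypothetical, and the example data are held in memory instead of being read from output.txt.

```python
def induces_connected_subgraph(nodes, edges):
    """Return True if the node set is non-empty and connected using the given edges."""
    nodes = set(nodes)
    if not nodes:
        return False
    # Keep only edges with both endpoints in the reported node set.
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Depth-first search from an arbitrary reported node.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


# Example network and output from above.
edges = [("A", "C"), ("B", "C")]
output_text = "B\nC\n"
reported = [line.strip() for line in output_text.splitlines() if line.strip()]

print(induces_connected_subgraph(reported, edges))  # True
```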
See the examples directory for an example that should complete in a few minutes on most machines.
If you are unable to run the example in the examples directory, then please post an issue on GitHub.
Below are instructions for installing the heinz package on macOS Catalina (64-bit), courtesy of Javed Aman.
- Download and install cplex_studio12100-osx from IBM. Use the academic version. Note that this is not the latest version.
- Download and extract Lemon: wget http://lemon.cs.elte.hu/pub/sources/lemon-1.3.1.tar.gz
- Inside the lemon directory, run: cmake -DCMAKE_INSTALL_PREFIX=~/lemon
- Then run: make install
- Leave the lemon directory.
- Download OGDF: wget https://ogdf.uos.de/wp-content/uploads/2019/04/ogdf-snapshot-2015-05-30.zip. Newer versions of OGDF will not work, including the one mentioned in the heinz GitHub repository.
- Extract the zip file.
- Enter the ogdf folder.
- Run: cmake . followed by make -j16
- Create a lib directory inside the ogdf folder and move the static libraries libCoin.a and libOGDF.a into the lib folder.
- Leave the ogdf directory and move to the heinz directory.
- Create a build folder, change into it, and run: cmake -DLIBOGDF_ROOT=/path/to/OGDF-snapshot ..
- Go one directory up, back into the root of the heinz directory, and run: make
See LICENSE for license information.
If you use NetMix in your work, then please cite the following manuscript.
M.A. Reyna*, U. Chitra*, R. Elyanow, B.J. Raphael. NetMix: A network-structured mixture model for reducing bias in the identification of altered subnetworks. RECOMB 2020.