Graph-DOM is a tool to analyze comprehensive fragmentation data for dissolved organic matter (DOM) using graph algorithms.
To set up and use Graph-DOM, follow the instructions below:
- A system running Ubuntu 16.04 or later with at least 8 CPU cores and 120 GB of memory is recommended.
- Install Anaconda using these instructions.
- Download the source code.
- Extract the tar.gz file:
tar -xzf Graph-DOM-1.0.tar.gz
- Change the current working directory to the Graph-DOM directory extracted in the previous step.
- Create a virtual environment with the project dependencies using the following command:
conda create --name graph-dom --file requirements.txt python=3.9
- Type "y" and press Enter when prompted with "Proceed ([y]/n)?".
- Activate the virtual environment:
conda activate graph-dom
- Set parameters in the config.ini file. See the Config section for details about setting parameters.
- Run Graph-DOM using the following command:
python3 main.py
- The program generates pathways for each precursor and then analyzes the pathways to generate families. Once it completes, the output file and plots can be found in the output directory. For details on the output files and plots, see the publication and cite:
Tariq, Muhammad Usman, Dennys Leyvay, Francisco Alberto Fernandez Limaz, and Fahad Saeed. "Graph Theoretic Approach for the Analysis of Comprehensive Mass-Spectrometry (MS/MS) Data of Dissolved Organic Matter." In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 3742-3746. IEEE, 2021.
Link: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9669289
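Because pathway generation is meant to run on a fairly large machine, it can help to compare the current system against the recommendation in the setup steps before starting a long run. The following is a minimal sketch and not part of Graph-DOM: the function names are hypothetical, the memory query relies on os.sysconf (available on Linux), and 120 GB is interpreted as 120 * 10^9 bytes.

```python
import os

RECOMMENDED_CORES = 8        # taken from the setup recommendation
RECOMMENDED_MEMORY_GB = 120  # taken from the setup recommendation


def total_memory_gb():
    """Return physical memory in GB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 1e9


def check_system(cores, memory_gb):
    """Return a list of warnings for a machine below the recommended specification."""
    warnings = []
    if cores is None or cores < RECOMMENDED_CORES:
        warnings.append(
            f"{cores} CPU cores detected; {RECOMMENDED_CORES} recommended"
        )
    if memory_gb is None:
        warnings.append("could not determine total memory")
    elif memory_gb < RECOMMENDED_MEMORY_GB:
        warnings.append(
            f"{memory_gb:.0f} GB of memory detected; "
            f"{RECOMMENDED_MEMORY_GB} GB recommended"
        )
    return warnings


if __name__ == "__main__":
    for message in check_system(os.cpu_count(), total_memory_gb()):
        print("Warning:", message)
```

An empty list means the machine meets the recommendation. A smaller machine may still work for small inputs, so the sketch only reports warnings instead of exiting.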
The config.ini file contains user-adjustable parameters for running Graph-DOM:
- num_cores: Number of cores to use for generating pathways in parallel. Should be <= the number of CPU cores of the system being used.
- use_NS: Whether to use nitrogen and sulphur as elements for creating pathways.
- input_file_path: Relative path to the input file. Input files are provided inside the Input directory.
- multiple: Multiple of each neutral loss to consider to find the next peak.
- tolerance: Fragment tolerance for generating pathways.
- nominal_tolerance: Fragments within ±nominal_tolerance Da will be considered precursors.
- overlap_len: Overlap length threshold of pathways for creating families. Should be at least 2. To enforce complete overlap, set it to a large number, e.g. 100.
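To make the parameter list concrete, the sketch below parses a hypothetical config.ini with Python's standard configparser module and checks the two constraints stated above (num_cores no larger than the available cores, overlap_len at least 2). Only the parameter names come from this README. The [params] section name, the input file name, the value types, and every value shown are assumptions for illustration, and load_params is a hypothetical helper, not Graph-DOM's own loader.

```python
import configparser
import os

# Hypothetical config.ini contents. The [params] section name, the file name
# under Input/, and all values are assumptions made for illustration; only the
# parameter names come from the list above.
EXAMPLE_CONFIG = """
[params]
num_cores = 8
use_NS = false
input_file_path = Input/example_input.csv
multiple = 2
tolerance = 0.001
nominal_tolerance = 0.5
overlap_len = 2
"""


def load_params(text, available_cores=None):
    """Parse config text and check the two constraints stated in the list above."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["params"]

    # The value types chosen here (int, bool, float) are assumptions.
    params = {
        "num_cores": section.getint("num_cores"),
        "use_NS": section.getboolean("use_NS"),
        "input_file_path": section.get("input_file_path"),
        "multiple": section.getint("multiple"),
        "tolerance": section.getfloat("tolerance"),
        "nominal_tolerance": section.getfloat("nominal_tolerance"),
        "overlap_len": section.getint("overlap_len"),
    }

    if available_cores is None:
        available_cores = os.cpu_count() or 1
    if params["num_cores"] > available_cores:
        raise ValueError(
            f"num_cores={params['num_cores']} exceeds the "
            f"{available_cores} cores available"
        )
    if params["overlap_len"] < 2:
        raise ValueError("overlap_len should be at least 2")
    return params
```

Calling load_params(EXAMPLE_CONFIG, available_cores=8) returns a dictionary of typed values. Lowering overlap_len to 1, or requesting more cores than are available, raises a ValueError before any long-running work begins.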