Code and data for the paper
Pascal Welke, Florian Seiffarth, Michael Kamp and Stefan Wrobel: HOPS: Probabilistic Subtree Mining for Small and Large Graphs. KDD 2020
If you use this work, please cite our paper.
You can find the code for experiments and plot generation of Section 6.1: Approximate Counting in Large Graphs in the /largegraph/ folder. The code for experiments and plot generation of Section 6.2: Probabilistic Frequent Subtree Mining is located in the /smallgraphs/ folder.
The code has been tested on recent Ubuntu Linux distributions (18.04, 19.10).
Set up the experiments and evaluation:
- (Clone the project)
Set up a Python 3 conda environment for hops:
- conda create -n hops python=3.7 joblib matplotlib
- conda activate hops
- pip install tikzplotlib
Set up Python 2 for the Ravkic algorithms:
- Install Python 2.7 and pip: sudo apt install python2.7 python-pip
- Install the Tk bindings: sudo apt-get install python-tk
Set up experiments:
- In run_exp.py, set main_path=".../largegraph"
- Run run_exp.py with your chosen graph, pattern size, and time limit
Set up evaluation:
- In evaluate.py, set path=".../largegraph/"
- Run evaluate.py for the evaluation
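Before running run_exp.py or evaluate.py, it can help to confirm that the packages named in the environment setup (joblib, matplotlib, tikzplotlib) are importable from the active environment. The following is a minimal sketch of such a check; the function name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Packages named in the conda/pip setup steps of this README.
REQUIRED = ["joblib", "matplotlib", "tikzplotlib"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If the script reports missing packages, the most likely cause is that the hops environment is not active in the current shell.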
Set up the experiments and evaluation:
- (Clone the project)
- Download and unzip the graphs "com-amazon.ungraph", "com-orkut.ungraph", "com-lj.ungraph" from https://snap.stanford.edu/data/index.html into the folder snap_big_graphs
- Adjust paths in main_snap.py
- Install the required packages
- Run main_snap.py
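Since main_snap.py depends on the three downloaded graphs, a quick check that they are present in snap_big_graphs can save a failed run. The sketch below is an illustration only: the helper `missing_graphs` is hypothetical, and it matches files by name prefix because the exact file extension after unzipping is an assumption.

```python
from pathlib import Path

# Graph names as listed in this README; the extension on disk is not assumed.
GRAPH_NAMES = ["com-amazon.ungraph", "com-orkut.ungraph", "com-lj.ungraph"]


def missing_graphs(filenames, names=GRAPH_NAMES):
    """Return the graph names for which no file name starts with that name."""
    return [n for n in names if not any(f.startswith(n) for f in filenames)]


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains snap_big_graphs.
    folder = Path("snap_big_graphs")
    filenames = [p.name for p in folder.iterdir()] if folder.is_dir() else []
    missing = missing_graphs(filenames)
    if missing:
        print("Missing graphs:", ", ".join(missing))
    else:
        print("All three graphs found.")
```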
- (Clone the project)
- Install GNU parallel: sudo apt install parallel
- Run smallgraphs/runExperiments.sh
- Inspect results in the subfolders
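Because smallgraphs/runExperiments.sh relies on GNU parallel, it may be worth verifying that the `parallel` executable is on the PATH before starting a long run. This is a minimal sketch using only the Python standard library; the helper name `tool_available` is hypothetical.

```python
import shutil


def tool_available(name):
    """Return True if an executable with this name is found on the PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_available("parallel"):
        print("GNU parallel found.")
    else:
        print("GNU parallel not found; install it with: sudo apt install parallel")
```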