GraSH: Successive Halving for Knowledge Graphs

This repository contains the code and configuration accompanying the paper "Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings", presented at ECML-PKDD 2022. The code extends Dist-KGE, a knowledge graph embedding library for distributed training. For documentation on Dist-KGE, refer to the Dist-KGE repository. We provide the hyperparameter settings for the searches and for the finally selected trials in /examples/experiments/.

UPDATE: GraSH was recently merged into our main library LibKGE. All configs from this repository, except the ones for Freebase that require distributed training, can be executed in LibKGE. Please use LibKGE for your own experiments with GraSH.

Table of contents

  1. Quick start
  2. Configuration of GraSH Search
  3. Run a GraSH hyperparameter search
  4. Results and Configurations
  5. How to cite

Quick start

Setup

# retrieve and install project in development mode
git clone https://github.com/uma-pi1/grash.git
cd grash
pip install -e .

# download and preprocess datasets
cd data
sh download_all.sh
cd ..

Training

# train an example model on a toy dataset (you can omit '--job.device cpu' if you have a GPU)
python -m kge start examples/toy-complex-train.yaml --job.device cpu

This example trains a model on a toy dataset in a sequential setup on CPU.

GraSH Hyperparameter Search

# perform a search with GraSH on a toy dataset (you can omit '--job.device cpu' if you have a GPU)
python -m kge start examples/toy-complex-search-grash.yaml --job.device cpu

This example performs a small GraSH search with 16 trials on a toy dataset in a sequential setup on CPU.

Configuration of GraSH Search

The most important configuration options for a hyperparameter search with GraSH are:

dataset:
  name: yago3-10
grash_search:
  eta: 4
  num_trials: 64
  search_budget: 3
  variant: combined
  parameters: # define your search space here
job:
  type: search
model: complex
train:
  max_epochs: 400

  • eta defines the reduction factor used during the search. In each round, the number of remaining trials is reduced to 1/eta of its previous value.
  • search_budget is measured in full training runs. The default choice search_budget=3, for example, corresponds to an overall search cost equal to three full training runs.
  • variant controls which reduction technique to use (epoch reduction only, graph reduction only, or both combined).
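
To make the interplay of eta, num_trials, and search_budget concrete, the following minimal sketch computes a successive-halving schedule. It assumes that the search budget is split evenly across rounds and, within a round, evenly across the remaining trials. The function name grash_schedule is hypothetical and not part of the library, and the actual implementation may round or distribute budgets differently.

```python
import math


def grash_schedule(num_trials, eta, search_budget):
    """Return (trials, budget_per_trial) for each round of a successive-halving search.

    Assumption: the budget (in full training runs) is split evenly across
    rounds and evenly across the trials still alive in a round.
    """
    if num_trials < 2 or eta < 2:
        raise ValueError("num_trials and eta must both be at least 2")

    # Collect the number of trials per round until a single trial remains.
    trials_per_round = []
    remaining = num_trials
    while remaining > 1:
        trials_per_round.append(remaining)
        remaining = math.ceil(remaining / eta)

    budget_per_round = search_budget / len(trials_per_round)
    return [(n, budget_per_round / n) for n in trials_per_round]


for round_no, (n, budget) in enumerate(grash_schedule(64, 4, 3), start=1):
    print(f"round {round_no}: {n} trials, {budget:.4f} full runs per trial")
```

Under these assumptions, the default settings (num_trials=64, eta=4, search_budget=3) give three rounds with 64, 16, and 4 trials, in which each trial receives 1/64, 1/16, and 1/4 of a full training run, respectively. How that per-trial budget is spent (fewer epochs, a smaller subgraph, or both) is what variant selects.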

Run a GraSH hyperparameter search

Run the default search on yago3-10 with the following command:

python -m kge start examples/experiments/search_configs/yago3-10/search-complex-yago-combined.yaml

The k-core subgraphs will automatically be generated and saved to data/yago3-10/subsets/k-core/. By default, each experiment will create a new folder in local/experiments/<timestamp>-<config-name> where the results can be found.
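
For intuition about what a k-core subgraph is, here is a minimal sketch of k-core peeling on a list of (subject, predicate, object) triples. It is not the code used by GraSH: the function name k_core_triples is hypothetical, and the degree definition (number of triples an entity takes part in, ignoring direction) is an assumption made for illustration.

```python
from collections import defaultdict


def k_core_triples(triples, k):
    """Repeatedly drop triples that touch an entity of degree < k.

    Assumption: the degree of an entity is the number of triples in which
    it appears as subject or object (a self-loop counts twice).
    """
    triples = list(triples)
    while True:
        degree = defaultdict(int)
        for s, _, o in triples:
            degree[s] += 1
            degree[o] += 1
        kept = [(s, p, o) for s, p, o in triples if degree[s] >= k and degree[o] >= k]
        if len(kept) == len(triples):
            # Nothing was removed in this pass, so every remaining entity has degree >= k.
            return kept
        triples = kept


toy = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "a"), ("a", "r", "d")]
print(k_core_triples(toy, 2))
```

In the toy graph, entity d appears in only one triple, so the 2-core keeps just the triangle formed by a, b, and c. Larger values of k yield smaller, denser subgraphs, which is what makes them cheap proxies for the full graph.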

Results and Configurations

All results were obtained with the GraSH default settings (num_trials=64, eta=4, search_budget=3, variant=combined).
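
As a reminder of how the reported metrics are defined, the following minimal sketch computes MRR and Hits@k from a list of ranks, where each rank is the position of the correct entity among all candidates for one test query. The helper names mrr and hits_at_k and the example ranks are hypothetical; the sketch does not reproduce the evaluation protocol (for example, filtering or tie handling) used to obtain the numbers below.

```python
def mrr(ranks):
    """Mean reciprocal rank: average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


example_ranks = [1, 2, 4, 100]  # made-up ranks for illustration only
print(f"MRR      = {mrr(example_ranks):.3f}")
for k in (1, 10, 100):
    print(f"Hits@{k:<3} = {hits_at_k(example_ranks, k):.3f}")
```

Both metrics lie between 0 and 1, and higher is better. By definition, Hits@k cannot decrease as k grows.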

Yago3-10

Model    Variant   MRR    Hits@1  Hits@10  Hits@100  config
ComplEx  Epoch     0.536  0.460   0.672    0.601     config
ComplEx  Graph     0.463  0.375   0.634    0.800     config
ComplEx  Combined  0.528  0.455   0.660    0.772     config
RotatE   Epoch     0.432  0.337   0.619    0.768     config
RotatE   Graph     0.432  0.337   0.619    0.768     config
RotatE   Combined  0.434  0.342   0.607    0.742     config
TransE   Epoch     0.499  0.406   0.661    0.794     config
TransE   Graph     0.422  0.311   0.628    0.802     config
TransE   Combined  0.499  0.406   0.661    0.794     config

Wikidata5M

Model    Variant   MRR    Hits@1  Hits@10  Hits@100  config
ComplEx  Epoch     0.300  0.247   0.390    0.506     config
ComplEx  Graph     0.300  0.247   0.390    0.506     config
ComplEx  Combined  0.300  0.247   0.390    0.506     config
RotatE   Epoch     0.241  0.187   0.331    0.438     config
RotatE   Graph     0.232  0.169   0.326    0.432     config
RotatE   Combined  0.241  0.187   0.331    0.438     config
TransE   Epoch     0.263  0.210   0.358    0.483     config
TransE   Graph     0.263  0.210   0.358    0.483     config
TransE   Combined  0.268  0.213   0.363    0.480     config

Freebase

Model    Variant   MRR    Hits@1  Hits@10  Hits@100  config
ComplEx  Epoch     0.572  0.486   0.714    0.762     config
ComplEx  Graph     0.594  0.511   0.726    0.767     config
ComplEx  Combined  0.594  0.511   0.726    0.767     config
RotatE   Epoch     0.561  0.522   0.625    0.679     config
RotatE   Graph     0.613  0.578   0.669    0.719     config
RotatE   Combined  0.613  0.578   0.669    0.719     config
TransE   Epoch     0.261  0.078   0.518    0.636     config
TransE   Graph     0.553  0.520   0.614    0.682     config
TransE   Combined  0.553  0.520   0.614    0.682     config

How to cite

@inproceedings{kochsiek2022start,
  title={Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings},
  author={Kochsiek, Adrian and Niesel, Fritz and Gemulla, Rainer},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  year={2022}
}