Glue Factory is designed for simple and tight integration between training and evaluation.
All benchmarks follow one principle: evaluate only on cached results. This ensures reproducible baselines.
Therefore, we first export the model predictions for each dataset (`export`), and then evaluate the cached results in a second pass (`evaluation`).
We currently provide evaluation scripts for MegaDepth-1500, HPatches, and ETH3D. You can run them with:
python -m gluefactory.eval.<benchmark_name> --conf "a name in gluefactory/configs/ or path" --checkpoint "and/or a checkpoint name"
Each evaluation run is assigned a `tag`, which can optionally be customized from the command line with `--tag <your_tag>`.
To overwrite an experiment, add `--overwrite`. To overwrite only the results of the evaluation loop, add `--overwrite_eval`. We perform config checks to warn the user about non-conforming configurations between runs.
The following files are written to `outputs/results/<benchmark_name>/<tag>`:
conf.yaml # the config which was used
predictions.h5 # cached predictions
results.h5 # results for each data point of the eval loop, in the format <metric_name>: List[float]
summaries.json # aggregated results over the entire dataset, in the format <agg_metric_name>: float
<plots> # some benchmarks add plots as PNG files here
Some benchmarks additionally output plots (add `--plot` to the command line).
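The cached results can also be inspected directly with standard tools. Below is a minimal sketch, assuming the file layout above; the metric names depend on the benchmark, and `my_tag` is a hypothetical tag:

```python
import json

import h5py
import numpy as np

tag_dir = "outputs/results/megadepth1500/my_tag"  # hypothetical tag directory

# Aggregated metrics over the whole dataset: {<agg_metric_name>: float}.
with open(f"{tag_dir}/summaries.json") as f:
    print(json.load(f))

# Per-data-point metrics: {<metric_name>: List[float]}.
with h5py.File(f"{tag_dir}/results.h5", "r") as f:
    for name, ds in f.items():
        if isinstance(ds, h5py.Dataset) and np.issubdtype(ds.dtype, np.number):
            print(name, float(np.nanmean(ds[()])))
```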
[Configuration]
Each evaluation has 3 main configurations:
data:
... # How to load the data. The user can overwrite this only during "export". The defaults are used in "evaluation".
model:
... # model configuration: this is only required for "export".
eval:
... # configuration for the "evaluation" loop, e.g. pose estimators and ransac thresholds.
The default configurations can be found in the respective evaluation scripts, e.g. MegaDepth1500.
To run an evaluation with a custom config, we expect it to follow this format (example):
model:
... # <your model configs>
benchmarks:
<benchmark_name1>:
data:
... # <your data configs for "export">
model:
... # <your benchmark-specific model configs>
eval:
... # <your evaluation configs, e.g. pose estimators>
<benchmark_name2>:
... # <same structure as above>
The configs are then merged in the following order (taking megadepth1500 as an example):
data:
default < custom.benchmarks.megadepth1500.data
model:
default < custom.model < custom.benchmarks.megadepth1500.model
eval:
default < custom.benchmarks.megadepth1500.eval
You can then use the command line to further customize this configuration.
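To make the precedence explicit, here is a minimal sketch of such a merge using OmegaConf (which Glue Factory's configs build on); the keys and values are made up for illustration:

```python
from omegaconf import OmegaConf

# Made-up default and custom configs, for illustration only.
default = OmegaConf.create({"eval": {"estimator": "opencv", "ransac_th": 1.0}})
custom = OmegaConf.create(
    {"benchmarks": {"megadepth1500": {"eval": {"ransac_th": 0.5}}}}
)

# Later arguments take precedence over earlier ones.
merged_eval = OmegaConf.merge(
    default["eval"], custom["benchmarks"]["megadepth1500"]["eval"]
)
print(merged_eval)  # {'estimator': 'opencv', 'ransac_th': 0.5}
```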
Glue Factory offers a flexible interface to state-of-the-art robust estimators for points and lines. You can configure the estimator in the benchmarks with the following config structure:
eval:
estimator: <estimator_name> # poselib, opencv, pycolmap, ...
ransac_th: 0.5 # run evaluation on fixed threshold
# or
ransac_th: [0.5, 1.0, 1.5] # test on multiple thresholds, autoselect best
<extra configs for the estimator, e.g. max iters, ...>
For convenience, most benchmarks convert `eval.ransac_th=-1` to a default range of thresholds.
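When a list of thresholds is given, the evaluation is repeated for each threshold and the best-scoring one is selected. The sketch below only illustrates the idea; `score_fn` stands in for the benchmark's metric (e.g. pose AUC) and is a hypothetical placeholder, not the actual Glue Factory implementation:

```python
# Illustration of "test on multiple thresholds, autoselect best".
def select_best_threshold(score_fn, thresholds=(0.5, 1.0, 1.5)):
    scores = {th: score_fn(th) for th in thresholds}  # evaluate every candidate threshold
    best = max(scores, key=scores.get)  # keep the one with the highest score
    return best, scores[best]

# Toy usage with a made-up score function that peaks at 1.0 px.
best_th, best_score = select_best_threshold(lambda th: 1.0 - abs(th - 1.0))
print(best_th, best_score)  # 1.0 1.0
```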
Note: Glue Factory follows the corner convention of COLMAP, i.e. the top-left corner of the top-left pixel is (0, 0).
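As a concrete illustration (a generic sketch, not a Glue Factory API): keypoints stored in the convention where the *center* of the top-left pixel is (0, 0) can be converted to the COLMAP corner convention by adding half a pixel:

```python
import numpy as np

# Keypoints in the "pixel center = integer coordinate" convention.
kpts_center = np.array([[0.0, 0.0], [10.0, 20.0]])

# In the COLMAP corner convention the center of the top-left pixel is (0.5, 0.5),
# so converting from the center convention shifts every coordinate by +0.5.
kpts_corner = kpts_center + 0.5
```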
We provide a powerful, interactive visualization tool for our benchmarks, based on matplotlib. You can run the visualization (after running the evaluations) with:
python -m gluefactory.eval.inspect <benchmark_name> <experiment_name1> <experiment_name2> ...
This prints the summaries of each experiment on the respective benchmark and visualizes the data as a scatter plot, where each point is the result of an experiment on a specific data point in the dataset.
- Clicking on one of the data points opens a new frame showing the prediction on this specific data point for all experiments listed.
- You can customize the x / y axis from the navigation bar or by clicking `x` or `y`.
- Hitting `diff_only` computes the difference between `<experiment_name1>` and all other experiments.
- Hovering over a point shows lines to the results of other experiments on the same data.
- You can switch the visualization (matches, keypoints, ...) from the navigation bar or by clicking `shift+r`.
- Clicking `t` prints a summary of the eval on this data point.
- Hitting the `left` or `right` arrows cycles between data points. `shift+left` opens an extra window.
When working on a remote machine (e.g. over SSH), the plots can be forwarded to the browser with the option `--backend webagg`. Note that you need to refresh the page every time you load a new figure (e.g. when clicking on a scatter point). This part requires some more work, and we would highly appreciate any contributions!
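This forwarding is plain matplotlib functionality; a minimal, Glue Factory-independent sketch of the same backend:

```python
import matplotlib

# Select the browser-based backend before importing pyplot.
matplotlib.use("webagg")
import matplotlib.pyplot as plt

plt.scatter([0.1, 0.5, 0.9], [1.0, 0.3, 0.7])
plt.show()  # serves the figure over HTTP and prints the URL to open in a browser
```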