diff --git a/src/documentation/how_tos/hot_to_implement_elo_evaluations.ipynb b/src/documentation/how_tos/hot_to_implement_elo_evaluations.ipynb
new file mode 100644
index 000000000..56c0b7ccc
--- /dev/null
+++ b/src/documentation/how_tos/hot_to_implement_elo_evaluations.ipynb
@@ -0,0 +1,23 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# How to implement Elo evaluation runs\n",
+    "0. Run your tasks on the datasets you want to evaluate (see [here](./how_to_run_a_task_on_a_dataset.ipynb))\n",
+    "   - When evaluating multiple runs, all of them need the same data types\n",
+    "1. Initialize all necessary repositories for the `IncrementalEvaluator`, and an `IncrementalEvaluationLogic`\n",
+    "2. Run the evaluator to evaluate all examples and create a single `EvaluationOverview`\n",
+    "3. (Optional) Save the evaluation id for later use"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
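
The notebook lists the steps but does not show how an Elo score comes out of pairwise comparisons between runs. The following is a minimal, library-independent sketch of the standard Elo update, added only as an illustration. It does not use the `IncrementalEvaluator` or `IncrementalEvaluationLogic` API, and the names `expected_score`, `update_elo`, the run ids, the K-factor of 20, and the start rating of 1500 are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    ratings: Dict[str, float],
    run_a: str,
    run_b: str,
    score_a: float,
    k: float = 20.0,       # assumed K-factor, not taken from the notebook
    start: float = 1500.0,  # assumed start rating, not taken from the notebook
) -> None:
    """Update both ratings in place after one comparison.

    score_a is 1.0 if run_a won, 0.5 for a draw, 0.0 if run_b won.
    """
    rating_a = ratings.get(run_a, start)
    rating_b = ratings.get(run_b, start)
    expected_a = expected_score(rating_a, rating_b)
    # B's actual and expected scores are the complements of A's,
    # so the two deltas cancel and the rating sum is conserved.
    delta = k * (score_a - expected_a)
    ratings[run_a] = rating_a + delta
    ratings[run_b] = rating_b - delta


def rate(matches: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """Fold a sequence of (run_a, run_b, score_a) comparisons into ratings."""
    ratings: Dict[str, float] = {}
    for run_a, run_b, score_a in matches:
        update_elo(ratings, run_a, run_b, score_a)
    return ratings


if __name__ == "__main__":
    # Hypothetical comparison outcomes between two runs.
    print(rate([("run_a", "run_b", 1.0), ("run_a", "run_b", 0.5)]))
```

Because every update depends on the current ratings, the order of comparisons affects the result, which is one reason an incremental evaluator needs to know which comparisons it has already processed.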