Commit

fix: Start adding review comments
TASK: IL-394
SebastianNiehusAA committed May 17, 2024
1 parent 9e4187a commit 9511782
Showing 2 changed files with 6 additions and 6 deletions.
6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -3,18 +3,18 @@
 ## Unreleased

 ### Breaking Changes
-- Changed the behavior of `IncrementalEvaluator::do_evaluate` such that it now promotes all output to `do_incremental_evaluate`instead of only the new outputs.
+- Changed the behavior of `IncrementalEvaluator::do_evaluate` such that it now sends all `SuccessfulExampleOutput`s to `do_incremental_evaluate` instead of only the new `SuccessfulExampleOutput`s.
 -
 ### New Features
-- Add generic `EloEvaluator` class and `EloEvaluationLogic`for implementation of Elo evaluation use cases.
+- Add generic `EloEvaluator` class and `EloEvaluationLogic` for implementation of Elo evaluation use cases.
 - Add `EloQaEvaluator` and `EloQaEvaluationLogic` for Elo evaluation of QA runs.
 - Add `IncrementalEloQaEvaluator` and `IncrementalEloQaEvaluationLogic` for Elo evaluation of QA runs with later addition of more runs to an existing evaluation.
 - Add `EloAggregationAdapter` class to simplify using the `ComparisonEvaluationAggregationLogic` for different Elo use cases.
 - Add `elo_qa_eval` tutorial notebook describing the use of an (incremental) Elo evaluation use case for QA models.
 ### Fixes
 ...
 ### Deprecations
-...lint
+...

 ## 1.2.0
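
The breaking-change entry in the CHANGELOG hunk above can be illustrated with a small sketch. It is not the library's implementation: `ExampleOutput`, `outputs_to_forward`, and the `forward_all` flag are hypothetical names invented here, and the real `do_evaluate` / `do_incremental_evaluate` signatures are not shown in this commit. The sketch only contrasts forwarding outputs from new runs (old wording) with forwarding every output (new wording).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleOutput:
    """Hypothetical stand-in for `SuccessfulExampleOutput`; not the library class."""

    run_id: str
    completion: str


def outputs_to_forward(outputs, evaluated_run_ids, forward_all):
    """Select the outputs handed on to the incremental evaluation step.

    forward_all=False models the old wording (only outputs from new runs);
    forward_all=True models the new wording (every output is forwarded).
    """
    if forward_all:
        return list(outputs)
    return [o for o in outputs if o.run_id not in evaluated_run_ids]


outputs = [
    ExampleOutput("run-1", "answer A"),
    ExampleOutput("run-2", "answer B"),
    ExampleOutput("run-3", "answer C"),
]
# Runs that an earlier evaluation already covered (assumed for illustration).
evaluated_run_ids = {"run-1", "run-2"}

old_behavior = outputs_to_forward(outputs, evaluated_run_ids, forward_all=False)
new_behavior = outputs_to_forward(outputs, evaluated_run_ids, forward_all=True)
```

Under these assumptions, code that relied on receiving only the outputs of new runs would now need to filter out previously evaluated runs itself.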

6 changes: 3 additions & 3 deletions tests/evaluation/test_elo_evaluator.py
@@ -8,10 +8,11 @@
     ControlModel,
     Language,
     LuminousControlModel,
+    NoOpTracer,
     TextChunk,
+    Tracer,
     utc_now,
 )
-from intelligence_layer.core.tracer.tracer import NoOpTracer, Tracer
 from intelligence_layer.evaluation import (
     ComparisonEvaluation,
     EloEvaluationLogic,
@@ -29,8 +30,6 @@
 )
 from intelligence_layer.examples import SingleChunkQaInput, SingleChunkQaOutput

-load_dotenv()
-

 class DummyEloQaEvalLogic(
     EloEvaluationLogic[SingleChunkQaInput, SingleChunkQaOutput, SingleChunkQaOutput]
@@ -40,6 +39,7 @@ def __init__(
         model: ControlModel,
         tracer: Tracer = NoOpTracer(),
     ):
+        load_dotenv()
         super().__init__()
         self._model = model
         self.tracer = tracer
