Set KV-cache to FP16 in LLM evaluation tests (#27956)
Co-authored-by: Alina Kladieva <[email protected]>
AlexKoff88 and akladiev authored Dec 11, 2024
1 parent 06de644 commit 516f2a3
Showing 1 changed file with 1 addition and 1 deletion.
tests/llm/accuracy_conformance.py (1 addition, 1 deletion)
@@ -98,7 +98,7 @@ def teardown_module():
     test_scope,
 )
 def test_accuracy_conformance(model_path, model_type, precision, gt_data, device):
-    target_model = OVModelForCausalLM.from_pretrained(model_path, device=device)
+    target_model = OVModelForCausalLM.from_pretrained(model_path, device=device, ov_config={"KV_CACHE_PRECISION": "f16"})
     tokenizer = AutoTokenizer.from_pretrained(model_path)

     evaluator = wwb.Evaluator(
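For readers unfamiliar with the optimum-intel API in the changed line: passing ov_config to OVModelForCausalLM.from_pretrained forwards OpenVINO runtime properties to the compiled model, and pinning KV_CACHE_PRECISION to "f16" keeps the key/value cache in FP16 rather than letting the runtime pick a lower (compressed) precision, which makes accuracy measurements more stable. Below is a minimal, self-contained sketch of the same pattern; the model path, device choice, prompt, and generation settings are illustrative assumptions, not part of this commit.

    # Sketch of the pattern used in the test. Assumptions: "ov_model_dir" is a
    # hypothetical local directory with an exported OpenVINO model; device and
    # prompt are illustrative.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    model_path = "ov_model_dir"  # hypothetical path, for illustration only

    # Pinning KV_CACHE_PRECISION to f16 avoids runtime-selected KV-cache
    # compression, trading some memory for more reproducible accuracy.
    target_model = OVModelForCausalLM.from_pretrained(
        model_path,
        device="CPU",
        ov_config={"KV_CACHE_PRECISION": "f16"},
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    # Quick smoke test: generate a few tokens with the configured model.
    inputs = tokenizer("The quick brown fox", return_tensors="pt")
    outputs = target_model.generate(**inputs, max_new_tokens=8)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))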
