Update on "add eval for attention sink"
This PR adds a function to evaluate the model's perplexity when AttentionSink is enabled.

It is mostly adapted from https://github.com/mit-han-lab/streaming-llm/blob/main/examples/eval_long_ppl.py, the script the AttentionSink paper uses for the same perplexity evaluation.
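The referenced script scores the model token by token while the KV cache keeps only the first few "sink" tokens plus a sliding window of recent tokens, then reports perplexity as the exponential of the mean per-token negative log-likelihood. A minimal pure-Python sketch of those two pieces (the function names, `num_sink`, and `window` defaults here are illustrative, not the ExecuTorch API):

```python
import math


def sink_window_indices(seq_len: int, num_sink: int = 4, window: int = 252) -> list[int]:
    """Cache positions kept under an AttentionSink eviction policy:
    the first `num_sink` tokens plus the most recent `window` tokens."""
    if seq_len <= num_sink + window:
        return list(range(seq_len))
    return list(range(num_sink)) + list(range(seq_len - window, seq_len))


def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))
```

For example, with `num_sink=2` and `window=3`, a 10-token sequence keeps positions `[0, 1, 7, 8, 9]`, so the cache size stays bounded no matter how long the evaluated text grows.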

Differential Revision: [D66474732](https://our.internmc.facebook.com/intern/diff/D66474732/)

Perplexity measured for the Llama 3.2 1B and 1B_Instruct models on up to 40k tokens with AttentionSink enabled:

<img width="966" alt="Screenshot 2024-11-25 at 2 46 04 PM" src="https://github.com/user-attachments/assets/ba7118f9-b5d7-4de8-b1fa-7d2ba0646515">


[ghstack-poisoned]
helunwencser committed Dec 2, 2024
2 parents 38d9e1c + 493607e commit 2f4641f
Showing 1 changed file with 1 addition and 3 deletions.
4 changes: 1 addition & 3 deletions examples/models/llama/eval_llama_lib.py
@@ -318,9 +318,7 @@ def eval_llama(
     print(f"{task}: {res}")


-def eval_llama_with_attention_sink(
-    model_name: str, args: argparse.ArgumentParser
-):
+def eval_llama_with_attention_sink(model_name: str, args: argparse.ArgumentParser):
     """
     Evaluate the model's perplexity when AttentionSink is enabled.
