
add eval for attention sink #7070

Merged

Commits on Nov 25, 2024

  1. add eval for attention sink

    This PR adds a function to evaluate the model's perplexity when AttentionSink is enabled.

    It is mostly adapted from https://github.com/mit-han-lab/streaming-llm/blob/main/examples/eval_long_ppl.py, the script the AttentionSink paper uses for the same evaluation (a minimal sketch of that loop follows this commit).
    
    Differential Revision: [D66474732](https://our.internmc.facebook.com/intern/diff/D66474732/)
    
    [ghstack-poisoned]
    helunwencser committed Nov 25, 2024 · a2cc4aa
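    For context, eval_long_ppl.py measures long-context perplexity token by token while keeping the KV cache bounded. Below is a minimal sketch of that loop; the Hugging Face-style model interface and the optional `kv_cache` eviction callable are assumptions for illustration, not the exact API added by this PR.

    ```python
    # Sketch of token-by-token long-context perplexity evaluation in the
    # style of streaming-llm's examples/eval_long_ppl.py. The model interface
    # and the `kv_cache` callable are assumptions, not this PR's exact API.
    import math
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def eval_long_ppl(model, input_ids, kv_cache=None, max_tokens=40_000):
        """Perplexity over one long sequence, one token at a time."""
        nll_sum, n_tokens = 0.0, 0
        past_key_values = None
        seq_len = min(input_ids.size(1), max_tokens)
        for i in range(seq_len - 1):
            # Feed one token; predict the next from the cached context.
            out = model(input_ids[:, i:i + 1],
                        past_key_values=past_key_values, use_cache=True)
            past_key_values = out.past_key_values
            if kv_cache is not None:
                # Sink-aware eviction keeps the cache (and memory) bounded.
                past_key_values = kv_cache(past_key_values)
            nll_sum += F.cross_entropy(out.logits[:, -1, :],
                                       input_ids[:, i + 1]).item()
            n_tokens += 1
        # Perplexity is the exponentiated mean negative log-likelihood.
        return math.exp(nll_sum / n_tokens)
    ```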

Commits on Nov 26, 2024

  1. Update on "add eval for attention sink"

    Perplexity measured for the Llama 3.2 1B and 1B Instruct models on up to 40k tokens with AttentionSink enabled (the cache-eviction policy that makes runs this long feasible is sketched after this entry):

    ![Perplexity for Llama 3.2 1B and 1B Instruct up to 40k tokens with AttentionSink](https://github.com/user-attachments/assets/ba7118f9-b5d7-4de8-b1fa-7d2ba0646515)
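    For reference, the perplexity plotted above is the standard exponentiated mean negative log-likelihood over the $N$ evaluated tokens:

    $$\mathrm{PPL} = \exp\Bigl(-\frac{1}{N}\sum_{i=1}^{N}\log p_\theta(x_i \mid x_{<i})\Bigr)$$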
    
    
    helunwencser committed Nov 26, 2024 · 8c745db
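    As referenced above, AttentionSink keeps long-context evaluation tractable by retaining a few initial "sink" tokens plus a sliding window of the most recent tokens in the KV cache. A minimal sketch of that eviction policy follows; the tensor layout and the `n_sink`/`window` defaults are illustrative assumptions, not this PR's exact implementation.

    ```python
    # Illustrative attention-sink KV-cache eviction (StreamingLLM-style):
    # keep the first `n_sink` tokens plus the most recent `window` tokens,
    # dropping everything in between. Shapes and defaults are assumptions.
    import torch

    def evict_with_sinks(k, v, n_sink=4, window=1020):
        """k, v: [batch, heads, seq_len, head_dim] cached keys/values."""
        seq_len = k.size(2)
        if seq_len <= n_sink + window:
            return k, v  # cache still fits; nothing to evict yet
        keep = torch.cat([
            torch.arange(n_sink, device=k.device),                     # sinks
            torch.arange(seq_len - window, seq_len, device=k.device),  # recent
        ])
        return k.index_select(2, keep), v.index_select(2, keep)
    ```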

Commits on Nov 27, 2024

  1. Update on "add eval for attention sink"

    helunwencser committed Nov 27, 2024 · a3b8d91

Commits on Dec 2, 2024

  1. Update on "add eval for attention sink"

    helunwencser committed Dec 2, 2024 · 38d9e1c
  2. Update on "add eval for attention sink"

    helunwencser committed Dec 2, 2024 · 2f4641f
  3. Update on "add eval for attention sink"

    helunwencser committed Dec 2, 2024 · 42f2282
  4. Update on "add eval for attention sink"

    helunwencser committed Dec 2, 2024 · a5420ae