
Releases: TransformerLensOrg/TransformerLens

v2.11.0

31 Dec 02:04
f103deb

LLaMA 3.3 support! This release also includes a handful of usability improvements.


Full Changelog: v2.10.0...v2.11.0

v2.10.0

14 Dec 00:56
30c90f4

Huge update! This is likely going to be the last big 2.x update. This update greatly improves model implementation accuracy, and adds some of the newer Qwen models.


Full Changelog: v2.9.1...v2.10.0

v2.9.1

19 Nov 14:34
3267a43

A minor dependency update to address a change in an external dependency.


Full Changelog: v2.9.0...v2.9.1

v2.9.0

16 Nov 00:28
dc19c08

Lots of accuracy improvements! A number of models now behave closer to how they behave in Transformers, and a new internal configuration option has been added for greater ease of use!

What's Changed

  • Fix the bug that attention_mask and past_kv_cache cannot work together by @yzhhr in #772
  • Set prepend_bos to false by default for Bloom model family by @degenfabian in #775
  • Fix that if use_past_kv_cache is set to True models from the Bloom family produce weird outputs. by @degenfabian in #777


Full Changelog: v2.8.1...v2.9.0

v2.8.1

26 Oct 21:12
8f482fc

A new notebook for comparing models, and a bug fix for handling newer LLaMA models!

What's Changed

  • Logit comparator tool by @curt-tigges in #765
  • Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series by @Hzfinfdu in #764


Full Changelog: v2.8.0...v2.8.1

v2.8.0

22 Oct 00:32
b6e19d6

What's Changed

  • add transformer diagram by @akozlo in #749
  • Demo colab compatibility by @bryce13950 in #752
  • Add support for Mistral-Nemo-Base-2407 model by @ryanhoangt in #751
  • Fix the bug that the tokenize_and_concatenate function was not working for small datasets by @xy-z-code in #725
  • added new block for recent diagram, and colab compatibility notebook by @bryce13950 in #758
  • Add warning and halt execution for incorrect T5 model usage by @vatsalrathod16 in #757
  • New issue template for reporting model compatibility by @bryce13950 in #759
  • Add configurations for Llama 3.1 models (Llama-3.1-8B and Llama-3.1-70B) by @vatsalrathod16 in #761


Full Changelog: v2.7.1...v2.8.0

v2.7.1

04 Oct 23:12
1d8b1d8


Full Changelog: v2.7.0...v2.7.1

v2.7.0

26 Sep 23:56
395b237

Llama 3.2 support! The function test_prompt can now accept multiple prompts, and a minor typo has been fixed.


Full Changelog: v2.6.0...v2.7.0

v2.6.0

13 Sep 13:29
e64888d

Another nice little feature update! You can now ungroup the grouped-query attention component via a new config parameter, ungroup_grouped_query_attention!
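A minimal sketch of the new flag (the model name below is illustrative — any grouped-query attention model applies — and we assume the flag is forwarded into the model config by from_pretrained):

```python
from transformer_lens import HookedTransformer

# A sketch, not a definitive recipe: "Qwen/Qwen2-0.5B" is an illustrative
# GQA model, and we assume from_pretrained accepts the flag directly.
model = HookedTransformer.from_pretrained(
    "Qwen/Qwen2-0.5B",
    ungroup_grouped_query_attention=True,
)

# With the flag set, the shared key/value heads are expanded ("ungrouped")
# so hooked activations expose one K/V head per query head, as in standard
# multi-head attention.
```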


Full Changelog: v2.5.0...v2.6.0

v2.5.0

10 Sep 17:04
be334fb

Nice little release! It adds a new parameter, first_n_layers, that lets you specify how many layers of a model to load.


Full Changelog: v2.4.1...v2.5.0