Releases: TransformerLensOrg/TransformerLens
v2.11.0
LLaMA 3.3 support! This release also includes a handful of usability improvements.
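A minimal sketch of loading one of the newly supported models. The exact checkpoint name is an assumption, and Llama weights are gated, so this also assumes you have accepted the license and authenticated with Hugging Face:

```python
from transformer_lens import HookedTransformer

# Checkpoint name is an assumption; adjust to the Llama 3.3 variant you have access to.
model = HookedTransformer.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")

# Strings are tokenized automatically; the forward pass returns logits.
logits = model("The capital of France is")
```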
What's Changed
- Set prepend_bos to false by default for Qwen models by @degenfabian in #815
- Throw error when using attn_in with grouped query attention by @degenfabian in #810
- Feature llama 33 by @bryce13950 in #826
Full Changelog: v2.10.0...v2.11.0
v2.10.0
Huge update! This is likely going to be the last big 2.x update. This update greatly improves model implementation accuracy, and adds some of the newer Qwen models.
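As a usage sketch, the newly added Qwen2.5 models load like any other supported architecture (the checkpoint name below is an assumption), and the `default_prepend_bos` setting changed in this release can still be overridden per call when tokenizing:

```python
from transformer_lens import HookedTransformer

# Checkpoint name is an assumption; any supported Qwen2.5 size should load the same way.
model = HookedTransformer.from_pretrained("Qwen/Qwen2.5-7B")

# default_prepend_bos comes from the model's config, but it can be overridden per call.
tokens = model.to_tokens("Hello world", prepend_bos=True)
logits = model(tokens)
```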
What's Changed
- Remove einsum in forward pass in AbstractAttention by @degenfabian in #783
- Colab compatibility bug fixes by @degenfabian in #794
- Remove einsum usage from create_alibi_bias function by @degenfabian in #781
- Actions token access by @bryce13950 in #797
- Remove einsum in apply_causal_mask in abstract_attention.py by @degenfabian in #782
- clarified arguments a bit for hook_points by @bryce13950 in #799
- Remove einsum in logit_attrs in ActivationCache by @degenfabian in #788
- Remove einsum in compute_head_results in ActivationCache by @degenfabian in #789
- Remove einsum usage in refactor_factored_attn_matrices in HookedTransformer by @degenfabian in #791
- Remove einsum usage in _get_w_in_matrix in SVDInterpreter by @degenfabian in #792
- Remove einsum usage in forward function of BertMLMHead by @degenfabian in #793
- Set default_prepend_bos to False in Bloom model configuration by @degenfabian in #806
- Remove einsum in complex_attn_linear by @degenfabian in #790
- Add a demo of collecting activations from a single location in the model (see the sketch after this list) by @adamkarvonen in #807
- Add support for Qwen_with_Questions by @degenfabian in #811
- Added support for Qwen2.5 by @israel-adewuyi in #809
- Updated devcontainers to use python3.11 by @jonasrohw in #812
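The activation-collection demo added in #807 comes down to hooking one named activation; a minimal sketch (the model and hook point below are chosen arbitrarily, not taken from the demo itself):

```python
import transformer_lens.utils as utils
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")   # small model chosen for illustration
hook_name = utils.get_act_name("resid_post", 5)     # residual stream after block 5

activations = []

def store_activation(tensor, hook):
    # For this hook point the tensor has shape [batch, position, d_model].
    activations.append(tensor.detach().cpu())

# Run the model while attaching a hook at just that one location.
model.run_with_hooks(
    "TransformerLens makes single-location capture easy.",
    fwd_hooks=[(hook_name, store_activation)],
)
print(activations[0].shape)
```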
New Contributors
- @israel-adewuyi made their first contribution in #809
- @jonasrohw made their first contribution in #812
Full Changelog: v2.9.1...v2.10.0
v2.9.1
Minor dependency change to address a change in an external dependency.
What's Changed
- added typeguard dependency by @bryce13950 in #786
Full Changelog: v2.9.0...v2.9.1
v2.9.0
Lots of accuracy improvements! A number of models now behave closer to how they behave in Transformers, and a new internal configuration has been added for ease of use!
What's Changed
- fix the bug that attention_mask and past_kv_cache cannot work together by @yzhhr in #772
- Set prepend_bos to false by default for Bloom model family by @degenfabian in #775
- Fix Bloom-family models producing incorrect outputs when use_past_kv_cache is set to True (see the sketch after this list) by @degenfabian in #777
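The two cache-related fixes above mainly affect cached generation; a minimal sketch of the code path they touch, assuming the bloom-560m checkpoint (any supported Bloom-family model would do):

```python
from transformer_lens import HookedTransformer

# Checkpoint name is an assumption; any Bloom-family model supported by
# TransformerLens exercises the fixed key/value-cache path.
model = HookedTransformer.from_pretrained("bigscience/bloom-560m")

# Generation with the KV cache enabled, which previously produced garbled
# output for Bloom models.
text = model.generate("The quick brown fox", max_new_tokens=20, use_past_kv_cache=True)
print(text)
```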
New Contributors
- @yzhhr made their first contribution in #772
- @degenfabian made their first contribution in #775
Full Changelog: v2.8.1...v2.9.0
v2.8.1
New notebook for comparing models, and a bug fix for handling newer LLaMA models!
What's Changed
- Logit comparator tool by @curt-tigges in #765
- Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series by @Hzfinfdu in #764
Full Changelog: v2.8.0...v2.8.1
v2.8.0
What's Changed
- add transformer diagram by @akozlo in #749
- Demo colab compatibility by @bryce13950 in #752
- Add support for `Mistral-Nemo-Base-2407` model by @ryanhoangt in #751
- Fix the bug that the `tokenize_and_concatenate` function does not work for small datasets by @xy-z-code in #725
- added new block for recent diagram, and colab compatibility notebook by @bryce13950 in #758
- Add warning and halt execution for incorrect T5 model usage by @vatsalrathod16 in #757
- New issue template for reporting model compatibility by @bryce13950 in #759
- Add configurations for Llama 3.1 models(Llama-3.1-8B and Llama-3.1-70B) by @vatsalrathod16 in #761
New Contributors
- @akozlo made their first contribution in #749
- @ryanhoangt made their first contribution in #751
- @xy-z-code made their first contribution in #725
- @vatsalrathod16 made their first contribution in #757
Full Changelog: v2.7.1...v2.8.0
v2.7.1
What's Changed
- Updated broken Slack link by @neelnanda-io in #742
- `from_pretrained` has correct return type (i.e. `HookedSAETransformer.from_pretrained` returns `HookedSAETransformer`) by @callummcdougall in #743
- Avoid warning in `utils.download_file_from_hf` by @albertsgarde in #739
New Contributors
- @albertsgarde made their first contribution in #739
Full Changelog: v2.7.0...v2.7.1
v2.7.0
LLaMA 3.2 support! This release also adds the ability for `utils.test_prompt` to compare multiple prompts, as well as a minor typo fix.
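A minimal sketch of the helper; the single-prompt call is the long-standing signature, while passing lists for the multi-prompt comparison is an assumption about the updated signature (check the `utils.test_prompt` docstring):

```python
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")  # model chosen only for illustration

# Long-standing single-prompt usage: prints the rank and probability of the answer token(s).
utils.test_prompt("The Eiffel Tower is in the city of", " Paris", model)

# Assumption: the v2.7.0 change lets prompts and answers be passed as lists.
utils.test_prompt(
    ["The Eiffel Tower is in the city of", "The Colosseum is in the city of"],
    [" Paris", " Rome"],
    model,
)
```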
What's Changed
- Typo hooked encoder by @bryce13950 in #732
- `utils.test_prompt` compares multiple prompts by @callummcdougall in #733
- Model llama 3.2 by @bryce13950 in #734
Full Changelog: v2.6.0...v2.7.0
v2.6.0
Another nice little feature update! You now have the ability to ungroup the grouped query attention head component through a new config parameter, `ungroup_grouped_query_attention`!
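A minimal sketch of enabling the new flag. Whether it can be passed straight through `from_pretrained` is an assumption (check the current docs), and the model name is only an example of a grouped-query-attention architecture:

```python
from transformer_lens import HookedTransformer

# Assumptions: the config flag is forwarded by from_pretrained, and Mistral-7B is
# just one example of a model that uses grouped query attention.
model = HookedTransformer.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    ungroup_grouped_query_attention=True,
)

# With ungrouping enabled, the key/value projections should expose one slice per
# attention head rather than per KV group, which simplifies per-head analysis.
print(model.W_K.shape)
```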
What's Changed
- Ungrouping GQA by @hannamw & @FlyingPumba in #713
Full Changelog: v2.5.0...v2.6.0
v2.5.0
Nice little release! This release adds a new parameter named `first_n_layers` that allows you to specify how many layers of a model you want to load.
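A minimal sketch, assuming `first_n_layers` can be passed to `from_pretrained` (the model name is just an example):

```python
from transformer_lens import HookedTransformer

# Load only the first two transformer blocks of GPT-2 small.
model = HookedTransformer.from_pretrained("gpt2", first_n_layers=2)
print(model.cfg.n_layers)  # expected to report 2 if only two layers were loaded
```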
What's Changed
- Fix typo in bug issue template by @JasonGross in #715
- HookedTransformerConfig docs string: `weight_init_mode` => `init_mode` by @JasonGross in #716
- Allow loading only first n layers. by @joelburget in #717
Full Changelog: v2.4.1...v2.5.0