Issues: TransformerLensOrg/TransformerLens
[Bug Report] hook_normalized is inconsistent between RMSNorm and LayerNorm
#747
opened Oct 6, 2024 by
neelnanda-io
[Proposal] Add example of collecting activations from a single layer.
demo
Creating a demo or tutorial
#746
opened Oct 5, 2024 by
adamkarvonen
1 task done
[Bug Report] Q cannot be reshaped correctly when model is loaded in 4bit
bug
Something isn't working
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#737
opened Sep 28, 2024 by
po13on
Fine tune model and using this framework
needs-information
More information is needed from the issue creator before moving forward.
question
Further information is requested
#730
opened Sep 26, 2024 by
nitay16
[Proposal] Guide to adding new models
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
documentation
Improvements or additions to documentation
#729
opened Sep 26, 2024 by
deven367
1 task done
[Proposal] Warn people when trying to load t5 into HookedTransformer
complexity-simple
Simple issues, which may be good for beginners
good first issue
Good for newcomers
#726
opened Sep 23, 2024 by
bryce13950
1 task done
[Bug Report] Review current matmul function usages
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
#720
opened Sep 10, 2024 by
bryce13950
1 task done
[Proposal] Add frequency-based RoPE support for Llama 3.1 models
#719
opened Sep 9, 2024 by
frances720
1 task done
[Proposal] Add MVP Support For 1-2 Models Per-Modality
complexity-high
Very complicated changes for people to address who are quite familiar with the code
discussion
No action needed yet
#710
opened Aug 31, 2024 by
4gatepylon
1 task done
[Bug Report] tokenize_and_concatenate doesn't work with small datasets.
#707
opened Aug 23, 2024 by
yash-srivastava19
1 task done
[Proposal] Add support for TracrBench
complexity-high
Very complicated changes for people to address who are quite familiar with the code
new-architecture
This card involves adding a new architecture.
#704
opened Aug 14, 2024 by
HannesThurnherr
How to get the Activation cache while the LLM is generating new tokens?
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
#697
opened Aug 7, 2024 by
Meehaohao
[Bug Report] Gemma-2-2b-it output logit doesn't match with huggingface
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
#693
opened Aug 2, 2024 by
yeutong
1 task done
[Proposal] Add Llama 3.1 support
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
new-architecture
This card involves adding a new architecture.
#691
opened Jul 31, 2024 by
ssuukk
1 task done
[Bug Report] Different results from HuggingFace when using the GPT2 small example
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#685
opened Jul 27, 2024 by
nreHieW
1 task done
[Question] Why does Transformer Lens only support quantized LLaMA models?
#684
opened Jul 26, 2024 by
miguel-kjh
[Bug Report] Qwen model implementation is too inaccurate
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#683
opened Jul 23, 2024 by
bryce13950
1 task done
[Proposal] Allow tied embeddings
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
enhancement
New feature or request
#671
opened Jul 12, 2024 by
neelnanda-io
Does the run_with_cache method support data parallelism? How can I do it?
#669
opened Jul 12, 2024 by
Yang-bug-star
[Proposal] Allow recent versions of beartype
complexity-simple
Simple issues, which may be good for beginners
tooling
Anything pertaining to outside tools used within the codebase
#665
opened Jul 10, 2024 by
jettjaniak
1 task done
[Bug Report] Pythia output inconsistent across batch sizes when use_split_qkv_input=True
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
#661
opened Jul 8, 2024 by
oliveradk
1 task done
[Bug Report] RMSNormPre in Transformer_lens is maybe different from Llama source code?
#657
opened Jul 6, 2024 by
wangyifei0047