[P1] For left_padding in compute_metrics.py #110

Open
mrsempress opened this issue Jun 18, 2024 · 2 comments
Assignees
Labels
question Further information is requested

Comments


mrsempress commented Jun 18, 2024

When I train using llama-7b on the math task, I found that the sizes of left_padding and intervention_locations did not match. This is because llama-7b's tokenizer has bos_token_id = 0, and inputs["input_ids"] contains 0 at multiple positions.
If we use the formula from the project, left_padding = (inputs["input_ids"] == tokenizer.bos_token_id).nonzero(as_tuple=True)[1], then left_padding has size (N,), where N is the number of entries in inputs["input_ids"] that are 0, rather than the desired size (batch_size,).
Therefore, I have changed it to the following code:

Mask=(inputs ["input_ids"]==tokenizer. bos_token_id)
Indications=torch. top (mask. int()), k=1, dim=-1).indices
Left_pdding=torch. where (mask. any (dim=-1), indices. reshape (mask. shape [: -1]), -1)
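For example, the size mismatch can be reproduced on a small toy batch (the values below are hypothetical; the only assumption is that the padding id in the batch is also 0, i.e. the same as tokenizer.bos_token_id):

import torch

bos_token_id = 0
# Two left-padded sequences; pad id == bos id == 0, so they are indistinguishable by value.
input_ids = torch.tensor([
    [0, 0, 0, 5, 6, 7],   # two pad tokens followed by BOS
    [0, 9, 8, 7, 6, 5],   # no padding, BOS at position 0
])

# Current formula: one index per matching token -> shape (4,) here, not (batch_size,).
left_padding_old = (input_ids == bos_token_id).nonzero(as_tuple=True)[1]
print(left_padding_old)        # tensor([0, 1, 2, 0])

# Proposed change: one index per row -> shape (batch_size,) = (2,).
mask = (input_ids == bos_token_id)
indices = torch.topk(mask.int(), k=1, dim=-1).indices
left_padding_new = torch.where(mask.any(dim=-1), indices.reshape(mask.shape[:-1]), -1)
print(left_padding_new.shape)  # torch.Size([2])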

I hope the author can verify whether my error was caused by something else or whether I have understood the cause correctly, and whether the revised code above is correct.


mrsempress commented Jun 18, 2024

The command I use is

CUDA_VISIBLE_DEVICES=6 python examples/loreft/train.py -task gsm8k -model models/Llama/Llama/llama-7b-hf/ -seed 42 -l all -r 4 -p f7+l7 -e 12 -lr 9e-4 -type NodireftIntervention -gradient_accumulation_steps 4 -batch_size 8 -eval_batch_size 4 --dropout 0.05 --test_split validation --use_normalized_template --greedy_decoding --warmup_ratio 0.00 --weight_decay 0.06 --save_model

@frankaging frankaging changed the title For left_padding in compute_metrics.py [P1] For left_padding in compute_metrics.py Jun 24, 2024
@frankaging frankaging self-assigned this Jun 24, 2024
@frankaging frankaging added the question Further information is requested label Jun 24, 2024
frankaging (Collaborator) commented

@mrsempress Thanks for your question. Could you elaborate on this?

When I train using llama-7b on the math task, I found that the sizes of left_padding and intervention_locations did not match.

intervention_locations is determined by -p f7+l7 (first 7 and last 7 prompt tokens), which does not need to match the size of left_padding IIUC.
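For intuition, here is a rough sketch of the f7+l7 idea (a hypothetical helper, not the exact code in this repo): each example contributes a fixed set of prompt positions, so its size does not depend on left_padding, which should only provide one offset per example.

def f7_l7_locations(prompt_len, left_padding=0):
    # first 7 and last 7 prompt token positions for one example
    first = list(range(7))
    last = list(range(prompt_len - 7, prompt_len))
    # on a left-padded batch, every position is shifted by that example's
    # number of left pad tokens
    return [p + left_padding for p in first + last]

# e.g. a 20-token prompt padded with 3 tokens on the left -> 14 positions
print(f7_l7_locations(20, left_padding=3))
# [3, 4, 5, 6, 7, 8, 9, 16, 17, 18, 19, 20, 21, 22]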
