Different outputs when upgrading from adapter-transformers with LoRA #760

jblamare opened this issue Nov 19, 2024 · 1 comment

Environment info

  • adapters version: 1.0.1
  • transformers version: 4.45.2
  • Platform: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.26.2
  • Safetensors version: 0.4.5
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA RTX A6000

Information

Model I am using (Bert, XLNet ...): google/flan-t5-small

Language I am using the model on (English, Chinese ...): English

Adapter setup I am using (if any): LoRA

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

I have two environments:

  • env1
adapter-transformers==3.1.0
torch==1.13.1
  • env2
adapters==1.0.1
torch==2.5.1
transformers==4.45.2

I have some input_ids:

input_ids = [262, 4, 4815, 10, 8668, 3, 5359, 27415, 5332, 3430, 276, 3577, 20186, 11951, 8472, 11359, 4209, 205, 20931, 23936, 3388, 27447, 8015]

I have a model checkpoint checkpoint.pth which has the T5 weights plus a LoRA adapter and was saved in env1:

with open("checkpoint.pth"), "wb") as f:
    torch.save(model.state_dict(), f)

From there I want to make sure I can load the model, run inference, and get the same outputs in env2. But the outputs are different. I run the following experiments (a sketch of how I compare the outputs follows the list):

  1. Create a T5 model, add the empty LoRA adapter, run inference - env1 and env2 produce the same output.
  2. Create a T5 model, load the non-LoRA weights, run inference - env1 and env2 produce the same output.
  3. Create a T5 model, add the LoRA adapter, load all the weights, run inference - env1 produces the expected output but env2 differs.
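For concreteness, here is a minimal sketch of how the encoder outputs can be compared across the two environments (the file names are illustrative, not from my actual scripts):

import torch

# In each environment, after running the model:
#   torch.save(outputs.last_hidden_state, "encoder_output_env1.pt")  # or env2
out1 = torch.load("encoder_output_env1.pt", map_location="cpu")
out2 = torch.load("encoder_output_env2.pt", map_location="cpu")
print(torch.allclose(out1, out2, atol=1e-5), (out1 - out2).abs().max())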

Here is the code I use (in env1 I just remove import adapters and adapters.init(model), and use adapter_config = transformers.adapters.LoRAConfig(r=8, alpha=16)):

import adapters
import torch
import transformers

input_ids = [262, 4, 4815, 10, 8668, 3, 5359, 27415, 5332, 3430, 276, 3577, 20186, 11951, 8472, 11359, 4209, 205, 20931, 23936, 3388, 27447, 8015]

# Build the base model and attach the LoRA adapter.
model = transformers.AutoModel.from_pretrained("google/flan-t5-small")
adapters.init(model)
adapter_config = adapters.LoRAConfig(r=8, alpha=16)
model.add_adapter("ct", config=adapter_config)
model.set_active_adapters("ct")

# Keep only the encoder and load the env1 checkpoint into it.
model = model.encoder
checkpoint = torch.load("checkpoint.pth", map_location=torch.device("cpu"))
model.load_state_dict(checkpoint, strict=False)

# Run inference.
model = model.eval()
outputs = model(input_ids=torch.IntTensor([input_ids]))
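One additional sanity check worth running (a sketch, not part of my original script): since strict=False silently ignores mismatched keys, the return value of load_state_dict shows whether the LoRA weights were actually matched.

# Inspect which keys were (not) matched; a silently skipped LoRA key would
# leave the adapter at its random initialization.
result = model.load_state_dict(checkpoint, strict=False)
print("missing LoRA keys:", [k for k in result.missing_keys if "lora" in k.lower()])
print("unexpected LoRA keys:", [k for k in result.unexpected_keys if "lora" in k.lower()])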

Unfortunately I can't share the model weights. Any thoughts on why I get different outputs only when I use LoRA and load my weights?

Expected behavior

Getting the same output in env1 and env2.

jblamare added the bug label on Nov 19, 2024
jblamare (Author) commented:

After diving into the codebase, I think I understand the difference. This looks like a bug in the LoRALinear class, but I might be missing something.

  • In adapter-transformers, the linear output and the delta are combined with result = lora.com(result, delta_w, scaling=gate). In particular, if lora.use_gating == False then gate is None, which means the scaling used is alpha / r = 16 / 8 = 2 in my case.
  • In adapters, they are combined with scaling=1.0. There is a comment saying "scaling already applied in compose", but I don't think it is: compose calls compose_stack, which calls LoRALinear's compose_single, which calls LoRA's forward, and that forward does not apply any scaling.

Am I missing something?
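To make the difference concrete, here is a toy numerical sketch of the two composition rules (made-up dimensions and weights; this is not the actual library code path):

import torch

torch.manual_seed(0)
r, alpha = 8, 16
x = torch.randn(1, 512)            # input to the linear layer
base = torch.randn(1, 512)         # output of the frozen linear layer
lora_a = torch.randn(r, 512) * 0.01
lora_b = torch.randn(512, r) * 0.01

delta = x @ lora_a.T @ lora_b.T    # LoRA delta before scaling

old = base + (alpha / r) * delta   # adapter-transformers: scaling = alpha/r = 2
new = base + 1.0 * delta           # adapters: scaling = 1.0
print((old - new).abs().max())     # non-zero, so the outputs diverge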

jblamare changed the title from "Different outputs when upgrading from adapter-transformers with T5, LoRA, and loaded weights" to "Different outputs when upgrading from adapter-transformers with LoRA" on Nov 19, 2024