
Issue with implementation of dualprompt #56

Closed
AmeliaBartxn opened this issue Sep 27, 2024 · 1 comment

@AmeliaBartxn

Thank you for building Mammoth; it has helped me a lot. However, while using Mammoth to compare baseline methods, I discovered an error in dualprompt, specifically in lines 117-122 of ./mammoth/models/dualprompt_utils/prompt.py:

            if self.use_prefix_tune_for_e_prompt:
                batched_prompt_raw = self.prompt[:, :, idx]  # num_layers, B, top_k, length, C
                num_layers, dual, batch_size, top_k, length, num_heads, heads_embed_dim = batched_prompt_raw.shape
                batched_prompt = batched_prompt_raw.reshape(
                    num_layers, batch_size, dual, top_k * length, num_heads, heads_embed_dim
                )

When rearranging a tensor's dimensions, reshape should be used with caution, especially when the goal, as here, is to swap two dimensions: reshape only reinterprets the flat memory layout, whereas permute actually reorders the data.
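The difference is easy to demonstrate on a small array (a NumPy sketch; `transpose` here plays the role of torch's `permute`):

```python
import numpy as np

# Toy tensor standing in for the prompt: shape (num_layers, dual, batch).
x = np.arange(12).reshape(2, 2, 3)

# Goal: swap the `dual` and `batch` axes -> shape (2, 3, 2).
swapped = x.transpose(0, 2, 1)   # actually reorders the data along the axes
relabeled = x.reshape(2, 3, 2)   # only reinterprets the flat memory layout

# Both have the target shape, but the elements are arranged differently.
print(np.array_equal(swapped, relabeled))  # False
```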

Furthermore, I checked the code that Mammoth references and found that this issue has already been raised by someone else: JH-LEE-KR/dualprompt-pytorch#8 (comment)

To keep the methods reproducible, I suggest making the necessary corrections.
Many thanks.

@loribonna
Collaborator

Hi @AmeliaBartxn, yes, I agree that in general permute should be used instead of reshape to swap dimensions. However, since the operation is performed directly on an nn.Parameter, it should not make a difference (and indeed I verified this by replacing the operation with batched_prompt_raw = batched_prompt_raw.permute(0, 2, 1, 3, 4, 5, 6) and got the same results).
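For reference, the permute-based variant of the selection can be sketched as a self-contained function (a NumPy sketch mirroring the torch code; shapes and names follow prompt.py, and NumPy's `transpose` stands in for torch's `permute`):

```python
import numpy as np

def select_e_prompt(prompt, idx):
    """Permute-based variant of the e-prompt selection (a sketch).

    prompt: (num_layers, dual, pool_size, length, num_heads, heads_embed_dim)
    idx:    (batch_size, top_k) indices into the prompt pool
    """
    # Fancy indexing on axis 2 yields
    # (num_layers, dual, batch_size, top_k, length, num_heads, heads_embed_dim).
    batched = prompt[:, :, idx]
    # Swap `dual` and `batch_size` explicitly (torch: .permute(0, 2, 1, 3, 4, 5, 6)).
    batched = batched.transpose(0, 2, 1, 3, 4, 5, 6)
    num_layers, batch_size, dual, top_k, length, num_heads, heads_embed_dim = batched.shape
    # Merging two now-adjacent axes with reshape is safe.
    return batched.reshape(num_layers, batch_size, dual, top_k * length, num_heads, heads_embed_dim)

# Small example: L=2, dual=2, pool=4, length=3, heads=2, head_dim=5; B=2, top_k=2.
prompt = np.random.rand(2, 2, 4, 3, 2, 5)
idx = np.array([[0, 2], [1, 3]])
out = select_e_prompt(prompt, idx)
print(out.shape)  # (2, 2, 2, 6, 2, 5)
```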

I think that this also addresses JH-LEE-KR/dualprompt-pytorch#8 (and also sun-hailong/LAMDA-PILOT#29 and CVIR/ConvPrompt#4).

For the moment I will leave the code with the reshape, and maybe add a hyperparameter to conditionally select the permute operation, since the original implementation of dualprompt also uses reshape instead of permute.
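Such a toggle could look roughly like this (a NumPy sketch; the `use_permute` flag is hypothetical, not an actual Mammoth hyperparameter):

```python
import numpy as np

def merge_prompt(batched_prompt_raw, use_permute=False):
    """Collapse (top_k, length) into one axis, optionally swapping dual/batch first.

    batched_prompt_raw: (num_layers, dual, B, top_k, length, num_heads, heads_embed_dim)
    use_permute: hypothetical flag mirroring the proposed hyperparameter.
    """
    if use_permute:
        # Explicit axis swap (torch: .permute(0, 2, 1, 3, 4, 5, 6)).
        batched_prompt_raw = batched_prompt_raw.transpose(0, 2, 1, 3, 4, 5, 6)
        L, B, dual, top_k, length, H, D = batched_prompt_raw.shape
    else:
        # Original behavior: relabel the dual/batch axes via reshape alone.
        L, dual, B, top_k, length, H, D = batched_prompt_raw.shape
    return batched_prompt_raw.reshape(L, B, dual, top_k * length, H, D)

# With dual != B, the two paths produce the same shape but different contents.
x = np.random.rand(2, 2, 3, 2, 4, 2, 5)
a = merge_prompt(x, use_permute=True)
b = merge_prompt(x, use_permute=False)
print(a.shape, b.shape)  # both (2, 3, 2, 8, 2, 5); values generally differ
```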
