Hi, thanks for the great work. I have a question regarding the training sets used for the different types of models (fully fine-tuned, Lora+, and the models used for the extra experiments in the paper).
In the ReadMe, it states, "There is no need to make supervised fine-tuning upon the fine-tuned context extended models. It is all right to directly use the base model as Llama2-chat models, as the amount of long instruction following data is enough for SFT." However, in the paper, Figure 5's caption suggests that Lora+ is trained with RedPajama.
I'm seeking clarification on the following points:
1. Do the released models refer to those that have undergone unsupervised fine-tuning on RedPajama and were then tested on PG19?
2. Is Table 9, which evaluates on the LongBench benchmark, the only experiment involving supervised fine-tuning with LongAlpaca-12k on top of models already fine-tuned with RedPajama?
3. Where can I find the performance of using only LongAlpaca-12k to derive the LoRA adapter, embedding, and norm layers?
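To make question 3 concrete, here is a minimal sketch of the trainable-component setup I am referring to: LoRA adapters on the attention projections plus fully trainable embedding and normalization layers. It assumes the HuggingFace PEFT API and Llama-2 module names (`q_proj`, `embed_tokens`, `norm`); it is only my own illustration, not the repository's actual training script.

```python
# Minimal sketch (my assumption, not the repository's exact training code):
# LoRA adapters on attention, plus embedding and norm layers kept fully trainable.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder base model
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # LoRA adapter
    modules_to_save=["embed_tokens", "norm"],  # embeddings + normalization layers (matched by name suffix) stay fully trainable
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```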
I've drafted a table to summarize my understanding of the training configurations mentioned in both the ReadMe and the paper. Could you please confirm if this representation is correct?

| Model | RedPajama (unsupervised) | LongAlpaca-12k (supervised) |
|---|---|---|
| Fully fine-tuned (readme) | √ | |
| Lora+ (readme) | √ | |
| Models for LongBench benchmark (paper) | √ | √ |