Describe the bug
Hey @sleepinyourhat @zphang @jeswan @HaokunLiu! I noticed that you worked on adding new models to `JiantTransformersModel`, so I tagged you here :) I was trying to fine-tune a RAG model on the MrQA-NQ dataset using jiant, but this does not seem to be supported. It throws a `KeyError: rag` when I run the following command:
where `run_train_task.sh` is a shell script I wrote that runs exactly the same commands as `run_train_task.sbatch`, without using `sbatch`.
To Reproduce
- Tell us which version of jiant you're using: I've cloned the repo as-is, but since I'm running an IRT experiment, I'm using the `irt_scripts/` directory in tandem with the `jiant/` directory on the `IRT_experiments` branch.
- Describe the environment where you're using jiant, e.g., "2 P40 GPUs": I'm using jiant in Google Colab along with Google Drive.
Expected behavior
I expected the RAG model to start fine-tuning and generate the cache files in the `experiments/cache/` directory, as outlined in the README.
Screenshots
Additional context
On investigating further, I realized that `IRT_experiments` was still using `transformers==3.1.0`, which does not support RAG architectures. I uninstalled that version of transformers and tried upgrading to `transformers>=3.5.0` to see if that would fix the issue, but that produced a new error: `ModuleNotFoundError: No module named 'transformers.tokenization_bert'`. It looks like transformers refactored its module layout in later versions while incorporating different models. If the `IRT_experiments` branch were up to date with the changes on the `master` branch, would that fix things? I noticed that jiant on `master` uses `transformers==4.5.0`. Is there any other way to use a RAG model along with the scripts in `irt_scripts/` for my IRT research? Specifically, I need the fine-tuning, predicting, and post-processing scripts available in the `irt_scripts/` directory on the `IRT_experiments` branch. Please respond at your earliest convenience! Thanks!
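For what it's worth, the `ModuleNotFoundError` comes from transformers 4.x moving per-model modules under `transformers.models.*` (e.g. `transformers.tokenization_bert` became `transformers.models.bert.tokenization_bert`). If code has to run against both layouts, a generic fallback import helper is one way to bridge it. This is a self-contained sketch, not anything jiant ships; the helper name and the commented BERT example are illustrative:

```python
import importlib


def import_with_fallback(new_module, old_module, name):
    """Try the post-refactor module path first, then fall back to the old one.

    Useful when code must run against both transformers 3.x and 4.x, where
    per-model modules moved under transformers.models.* but class names
    stayed the same (an assumption that holds for the tokenizer modules).
    """
    for module_path in (new_module, old_module):
        try:
            return getattr(importlib.import_module(module_path), name)
        except (ImportError, AttributeError):
            continue
    raise ImportError(f"{name} not found in {new_module!r} or {old_module!r}")


# Example for the BERT tokenizer (assuming transformers is installed):
# BertTokenizer = import_with_fallback(
#     "transformers.models.bert.tokenization_bert",  # transformers >= 4
#     "transformers.tokenization_bert",              # transformers 3.x
#     "BertTokenizer",
# )
```

That said, pinning a single transformers version (as each jiant branch does) is usually simpler than shimming imports.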
Even further investigation has led me to understand that, in order to carry out my research with the existing `irt_scripts/` framework and the latest version of jiant, I would have to port that directory to the `master` branch on my local machine and add the RAG model to `ModelArchitectures` and `TOKENIZER_DICT` in `jiant/proj/main/modeling/primary.py`, as outlined here: https://github.com/nyu-mll/jiant/blob/51e9be2a8ed8589e884ea927e348df8342c40fcf/guides/models/adding_models.md
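For readers unfamiliar with that guide, the mechanism is a key-to-class registry: fine-tuning looks up the model architecture string in a dict, and an unregistered key is exactly what produces `KeyError: rag`. The following is a self-contained sketch of that pattern; `MODEL_REGISTRY`, `register_model`, `JiantRagModel`, and `resolve` are illustrative names, not jiant's actual identifiers:

```python
# Minimal sketch of a model-architecture registry (hypothetical names).
MODEL_REGISTRY = {}
TOKENIZER_DICT = {}


def register_model(architecture, tokenizer_name):
    """Class decorator mapping an architecture key to its wrapper class
    and tokenizer, so later lookups by key succeed."""
    def decorator(cls):
        MODEL_REGISTRY[architecture] = cls
        TOKENIZER_DICT[architecture] = tokenizer_name
        return cls
    return decorator


@register_model("rag", tokenizer_name="RagTokenizer")
class JiantRagModel:
    """Placeholder wrapper for the RAG architecture."""


def resolve(architecture):
    # A lookup like this is what raises KeyError: rag when the
    # architecture was never registered.
    return MODEL_REGISTRY[architecture]
```

Registering the new key in both dicts is the first step; the subclass methods discussed below are the rest of the work.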
I'm having trouble understanding how to implement the `normalize_tokenizations()`, `get_mlm_weights_dict()`, and `get_feat_spec()` functions in the subclass created for the RAG model. Any suggestions or advice on how to move forward, @sleepinyourhat @zphang @jeswan @HaokunLiu? Thanks a lot!
pk1130 changed the title from "Unable to run RAG model for fine-tuning and predicting with jiant" to "Unable to run RAG model for tokenizing, fine-tuning, and predicting with jiant on branch IRT_experiments" on Aug 2, 2021.
`normalize_tokenizations` has to do with aligning token spans between the raw text and the model tokenizer's tokens. Depending on which tokenizer you're using, you might be able to piggyback off an existing implementation.
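To illustrate the kind of normalization involved (not jiant's actual code): aligning two tokenizations usually means stripping each tokenizer's subword markers and case so the token streams can be matched character-by-character. A toy sketch, assuming a BERT-style `##` continuation prefix:

```python
def strip_subword_markers(tokens, subword_prefix="##"):
    """Lowercase tokens and strip the subword continuation marker so two
    tokenizations of the same text can be aligned by their characters.
    "##" is BERT's convention (an assumption here); SentencePiece-style
    tokenizers mark word boundaries differently."""
    normalized = []
    for tok in tokens:
        tok = tok.lower()
        if tok.startswith(subword_prefix):
            tok = tok[len(subword_prefix):]
        normalized.append(tok)
    return normalized
```

After normalization, joining the subword tokens reproduces the characters of the original word, which is what makes span alignment possible.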
`get_mlm_weights_dict` gets the weights for the MLM head from the pretrained model. In contrast to standard NLU tasks, which use a new classifier head, an MLM task ought to reuse the MLM head from pretraining. If you are not using an MLM task, this should not affect you.
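The general shape of that operation, sketched outside jiant: filter the pretrained checkpoint's state dict down to the head's parameters and re-key them so a standalone head can load them. The function name and the `"cls.predictions."` prefix below are assumptions (that prefix is BERT's naming; other architectures differ):

```python
def get_mlm_head_weights(state_dict, head_prefix="cls.predictions."):
    """Extract the pretrained MLM head's parameters from a full state
    dict, re-keyed without the prefix so they can be loaded into a
    standalone head. head_prefix is an assumption modeled on BERT's
    checkpoint layout."""
    return {
        key[len(head_prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(head_prefix)
    }
```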
`get_feat_spec` is a somewhat older abstraction for describing different tokenizer setups, e.g. padding IDs. As with `normalize_tokenizations`, you might be able to piggyback off an existing implementation if you are using a similar tokenizer.
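Conceptually, a feature spec is just a bundle of per-tokenizer constants consumed when token IDs are turned into fixed-length model inputs. A minimal illustrative stand-in (field and function names are assumptions, not jiant's real spec):

```python
from dataclasses import dataclass


@dataclass
class FeatSpec:
    """Illustrative stand-in for a tokenizer feature spec: the constants
    needed to build fixed-length inputs. Field names are assumptions
    modeled on common tokenizer attributes."""
    max_seq_length: int
    pad_token_id: int
    pad_on_left: bool = False  # some tokenizer setups pad on the left


def pad_token_ids(token_ids, spec):
    """Pad a token ID list out to spec.max_seq_length."""
    n_pad = spec.max_seq_length - len(token_ids)
    if n_pad < 0:
        raise ValueError("sequence longer than max_seq_length")
    padding = [spec.pad_token_id] * n_pad
    return padding + token_ids if spec.pad_on_left else token_ids + padding
```

If the RAG tokenizer behaves like an existing one (e.g. the BART-style tokenizer it wraps), reusing that tokenizer's spec values may be enough.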