Update from pytorch-transformers to transformers library #61

andr-ec · 2020-02-21T18:46:23Z

updated dependencies and imports to use transformers.
updated ignore_index and ignored values in tensors to be -100, same as default in transformers and pytorch.
updated checkpoint save to use correct path #60

sshleifer · 2020-02-21T19:29:08Z

Thanks! Did you do anything to verify that this doesn't effect metrics or otherwise break?

andr-ec · 2020-02-21T19:54:30Z

I ran it on a smaller subset of the dataset and ran the interact script, it seemed to be fine. I'm currently training it on the full dataset. It'll take a while on my vm, but I'll report back once that finishes and I run the ConvAI2 evaluation scripts!

andr-ec · 2020-02-22T23:04:04Z

@sshleifer I trained with the parameters:
--model="openai-gpt" --device="cuda" --n_epochs=1
the evaluation script finished with

[ Finished evaluating tasks ['convai2:self'] using datatype valid ]
{'exs': 7801, 'hits@1': 0.75, 'hits@5': 0.961, 'hits@10': 0.992, 'hits@100': 1.0}
============================
FINAL Hits@1: 0.75

andr-ec · 2020-02-24T21:27:09Z

Do we need anything else to verify metrics?

sshleifer · 2020-02-25T21:28:30Z

Are those metrics ~equivalent to those in the preview here?

#30

Sorry I'm being so lazy, mostly focused on main repo :)

…o list

andr-ec · 2020-02-27T01:52:27Z

No worries, the tensorboard metrics are pretty close:
results from your PR

{'accuracy': 0.7466991411357519,
'average_accuracy': 0.74669914113575a19,
'average_nll': 2.6821035040007972,
'average_ppl': 14.615805388160778,
'nll': 2.6821035040007972}

results from my PR

{'accuracy': 0.7496474981307983,
'average_accuracy': 0.7496474981307983,
'average_nll': 2.6389272212982178,
'average_ppl': 13.99817943572998,
'nll': 2.6389272212982178}

julien-c · 2020-02-27T04:22:59Z

👍

g-karthik · 2020-02-27T07:36:57Z

@acarrera94 were you able to test these changes with gpt2? Specifically, did you try testing interact.py with gpt2?

g-karthik · 2020-02-27T07:48:32Z

When I try to interact with a model trained using gpt2, I get the following error:

    logits = logits[0, -1, :] / args.temperature
IndexError: too many indices for tensor of dimension 2

I think the logits = logits[0] fix is not applicable with the latest transformers and is just an artifact of pytorch_pretrained_bert, and hence can be removed.

@acarrera94 @julien-c @thomwolf

andr-ec · 2020-02-27T16:19:26Z

Good question, I didn’t have enough memory to train gpt2 on my vm, but if you have a checkpoint I can download id be happy to take a look at it

g-karthik · 2020-02-27T20:16:00Z

@acarrera94 you don't need to be able to train GPT-2 to figure out that this is a bug.

Do something like this in your Python interpreter:

>>> import pytorch_pretrained_bert
>>> from pytorch_pretrained_bert import GPT2LMHeadModel
>>> print(pytorch_pretrained_bert.__version__)
0.6.2
>>> model = GPT2LMHeadModel.from_pretrained("gpt2")
>>> tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
>>> input = "hello how are you"
>>> input_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(input))
>>> import torch
>>> input_ids = torch.tensor(input_ids)
>>> input_ids = input_ids.view(1,-1)
>>> outputs = model(input_ids)
>>> outputs[0].shape
torch.Size([1, 4, 50257])
>>> import transformers
>>> from transformers import GPT2LMHeadModel
>>> print(transformers.__version__)
2.3.0
>>> model = GPT2LMHeadModel.from_pretrained("gpt2")
>>> outputs = model(input_ids)
>>> outputs[0].shape
torch.Size([1, 4, 50257])

Clearly, the logits is a 3-dimensional tensor for gpt2. So the logits = logits[0] fix won't work for gpt2, even with pytorch_pretrained_bert 0.6.2.

cc @sshleifer since I think I remember you mentioning in one of your past pull requests that you were unable to interact with GPT-2 in the code as it currently stands -- UPDATE: I found it, this was the one where you mentioned this: #29

also cc @KasparPeterson since you were the one who introduced this fix in the first place in this pull request: #6

I think some changes were made to pytorch_pretrained_bert 0.6.2 after @KasparPeterson introduced this fix, rendering it unnecessary. Basically the logits used to be a 4-dimensional tensor for gpt2 before, hence making this fix necessary. @thomwolf can confirm since he approved Kaspar's pull request here.

) * updated dependencies, updated ignore_index and ignored values in tensors * removed idea project files * set transformers library version, updated additional special tokens to list

andr-ec added 2 commits February 21, 2020 11:36

updated dependencies, updated ignore_index and ignored values in tensors

371edeb

removed idea project files

1f8a0ce

andr-ec requested a review from sshleifer February 21, 2020 18:57

set transformers library version, updated additional special tokens t…

ebe2275

…o list

sshleifer merged commit 16074b2 into huggingface:master Feb 27, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update from pytorch-transformers to transformers library #61

Update from pytorch-transformers to transformers library #61

andr-ec commented Feb 21, 2020

sshleifer commented Feb 21, 2020

andr-ec commented Feb 21, 2020

andr-ec commented Feb 22, 2020

andr-ec commented Feb 24, 2020

sshleifer commented Feb 25, 2020 •

edited

Loading

andr-ec commented Feb 27, 2020

julien-c commented Feb 27, 2020

g-karthik commented Feb 27, 2020

g-karthik commented Feb 27, 2020 •

edited

Loading

andr-ec commented Feb 27, 2020

g-karthik commented Feb 27, 2020 •

edited

Loading

Update from pytorch-transformers to transformers library #61

Update from pytorch-transformers to transformers library #61

Conversation

andr-ec commented Feb 21, 2020

sshleifer commented Feb 21, 2020

andr-ec commented Feb 21, 2020

andr-ec commented Feb 22, 2020

andr-ec commented Feb 24, 2020

sshleifer commented Feb 25, 2020 • edited Loading

andr-ec commented Feb 27, 2020

julien-c commented Feb 27, 2020

g-karthik commented Feb 27, 2020

g-karthik commented Feb 27, 2020 • edited Loading

andr-ec commented Feb 27, 2020

g-karthik commented Feb 27, 2020 • edited Loading

sshleifer commented Feb 25, 2020 •

edited

Loading

g-karthik commented Feb 27, 2020 •

edited

Loading

g-karthik commented Feb 27, 2020 •

edited

Loading