Plan to publish the paper? #3

Open
monkdou0 opened this issue Dec 9, 2020 · 15 comments

Comments

@monkdou0

monkdou0 commented Dec 9, 2020

This is great work on applying a pre-trained model to the paraphrasing task!
Do you plan to publish a paper?

@Vamsi995
Owner

Vamsi995 commented Dec 9, 2020

Umm, I wouldn't mind a paper, but I'm wondering whether this is paper-worthy. Also, if we do write one, how should we go about it?

@monkdou0
Author

Maybe it lacks a bit of novelty.
By the way, I want to ask about the parameters used in training:

train_batch_size=6,
eval_batch_size=6,
num_train_epochs=1

Are these the parameters you actually used?

@Vamsi995
Owner

Yes, except that num_train_epochs is 2.
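
For reference, a minimal sketch of the training configuration discussed above, written argparse-style as is common in T5 fine-tuning scripts. Only the batch sizes and epoch count come from this thread; the other attribute names and values are illustrative assumptions, not the repo's exact code:

import argparse

# Sketch of the hyperparameters discussed above: batch size 6 for train/eval,
# 2 training epochs. model_name_or_path and max_seq_length are illustrative
# assumptions, not values confirmed in this thread.
args = argparse.Namespace(
    model_name_or_path="t5-base",
    max_seq_length=256,
    train_batch_size=6,
    eval_batch_size=6,
    num_train_epochs=2,
)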

@Vamsi995
Owner

I've been meaning to come back and improve this further, but I got a bit lazy.

@monkdou0
Author

tokenizer = AutoTokenizer.from_pretrained("Vamsi/T5_Paraphrase_Paws")

I don't see any code that saves the tokenizer. Is it the same as the t5-base tokenizer?

@Vamsi995
Owner

Yeah, it's the same as the t5-base tokenizer.
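
For anyone who wants AutoTokenizer.from_pretrained to work against a local copy of the fine-tuned model, a minimal sketch of saving the stock t5-base tokenizer files alongside the weights; the output directory name here is a hypothetical example:

from transformers import T5Tokenizer

# The model was trained with the unmodified t5-base tokenizer, so saving that
# tokenizer next to the fine-tuned weights makes the checkpoint self-contained.
# "t5_paraphrase_model/" is a hypothetical local output directory.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
tokenizer.save_pretrained("t5_paraphrase_model/")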

@monkdou0
Author

Thank you very much!
When I run the code, I hit a bug: the training loss is NaN and the run fails with the error below. What's wrong?
[screenshot]

@Vamsi995
Owner

If you are using a different dataset, you have to change the path to it in the T5FineTuner class, in the train_dataloader and val_dataloader methods. A rough sketch is below.
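
Roughly, the two methods to edit look like this (PyTorch Lightning style; ParaphraseDataset, the file paths, and the keyword names are illustrative stand-ins for whatever the repo actually uses, not its exact code):

from torch.utils.data import DataLoader

# Inside the T5FineTuner LightningModule -- point these at your own dataset files.
def train_dataloader(self):
    # "data/paws/train.tsv" is a placeholder path; ParaphraseDataset stands in
    # for the repo's dataset class.
    train_dataset = ParaphraseDataset(self.tokenizer, data_path="data/paws/train.tsv", max_len=256)
    return DataLoader(train_dataset, batch_size=self.hparams.train_batch_size, shuffle=True)

def val_dataloader(self):
    val_dataset = ParaphraseDataset(self.tokenizer, data_path="data/paws/dev.tsv", max_len=256)
    return DataLoader(val_dataset, batch_size=self.hparams.eval_batch_size)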

@monkdou0
Author

I already changed it, so maybe that isn't the problem.
Looking at your code:
[screenshot]
On line 63, is this right? Should it be self.forward()?

@Vamsi995
Owner

No, I don't think that's an issue. It should be just self.
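
Background on why plain self(...) is fine here: in PyTorch, calling a module instance goes through nn.Module.__call__, which runs forward() plus any registered hooks, so self(...) and self.forward(...) invoke the same code. A tiny self-contained illustration:

import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

m = Tiny()
x = torch.randn(1, 4)
# Both calls execute forward(); m(x) is the idiomatic form because __call__
# also triggers module hooks.
print(torch.equal(m(x), m.forward(x)))  # True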

@monkdou0
Author

Yeah, you're right, I see.
But I ran your code on your dataset with nothing changed:
[screenshot]
I printed the outputs there to debug and found it was an empty list. Why?

@Vamsi995
Owner

Let me figure this out. Right now I'm working on another project, so it will take me some time to get back to this.

@chris1899

> Yeah, it's the same as the t5-base tokenizer.

Are you sure it is base? When I use the t5-base tokenizer I get an error:

Exception: expected value at line 1 column 1

When I use the t5-small tokenizer it works fine.

@Vamsi995
Owner

Vamsi995 commented Jan 13, 2021

# pip install sentencepiece
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the stock t5-base tokenizer and the fine-tuned paraphrase model
tokenizer = T5Tokenizer.from_pretrained("t5-base")  # t5-base works
model = T5ForConditionalGeneration.from_pretrained("Vamsi/T5_Paraphrase_Paws")

sentence = "This is something which i cannot understand at all"
text = "paraphrase: " + sentence
encoding = tokenizer(text, padding=True, return_tensors="pt")
input_ids, attention_masks = encoding["input_ids"], encoding["attention_mask"]

# Sample five paraphrases with top-k / top-p (nucleus) sampling
outputs = model.generate(
    input_ids=input_ids,
    attention_mask=attention_masks,
    max_length=256,
    do_sample=True,
    top_k=200,
    top_p=0.95,
    early_stopping=True,
    num_return_sequences=5
)

for output in outputs:
    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(line)

@monkdou0
Author

I used your code to train, ignoring the NaN training-loss issue: same parameters, same dataset, same code, but I get poor results compared with your model.
This is my result:
[screenshot]
This is your model:
[screenshot]

I think this repo is useful for others, so please check the code when you have some spare time.
Thank you very much! Good luck!
