Plan to publish the paper? #3

Open
monkdou0 opened this issue Dec 9, 2020 · 15 comments

Comments

@monkdou0

monkdou0 commented Dec 9, 2020

This is great work on applying a pre-trained model to the paraphrasing task!
Do you plan to publish a paper?

@Vamsi995
Owner

Vamsi995 commented Dec 9, 2020

Umm, I wouldn't mind a paper, but I'm wondering whether this is paper-worthy. Also, if we do write one, how should we go about it?

@monkdou0
Author

Maybe it lacks a bit of novelty.
By the way, I want to ask about the parameters used in training:

train_batch_size=6,
eval_batch_size=6,
num_train_epochs=1

Are these the parameters you actually used?

@Vamsi995
Owner

Yes, except that num_train_epochs is 2.
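
For reference, a minimal sketch of the training configuration discussed above, written argparse-style as is common in T5 fine-tuning scripts. Only the batch sizes and epoch count come from this thread; the other attribute names and values are illustrative assumptions, not the repo's exact code:

import argparse

# Sketch of the hyperparameters discussed above: batch size 6 for train/eval,
# 2 training epochs. model_name_or_path and max_seq_length are illustrative
# assumptions, not values confirmed in this thread.
args = argparse.Namespace(
    model_name_or_path="t5-base",
    max_seq_length=256,
    train_batch_size=6,
    eval_batch_size=6,
    num_train_epochs=2,
)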

@Vamsi995
Owner

I've been meaning to come back and improve this further, but I got a bit lazy.

@monkdou0
Author

tokenizer = AutoTokenizer.from_pretrained("Vamsi/T5_Paraphrase_Paws")

I don't see any code that saves the tokenizer. Is it the same as the t5-base tokenizer?

@Vamsi995
Owner

Yeah, it's the same as the t5-base tokenizer.
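
For anyone who wants AutoTokenizer.from_pretrained to work against a local copy of the fine-tuned model, a minimal sketch of saving the stock t5-base tokenizer files alongside the weights; the output directory name here is a hypothetical example:

from transformers import T5Tokenizer

# The model was trained with the unmodified t5-base tokenizer, so saving that
# tokenizer next to the fine-tuned weights makes the checkpoint self-contained.
# "t5_paraphrase_model/" is a hypothetical local output directory.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
tokenizer.save_pretrained("t5_paraphrase_model/")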

@monkdou0
Author

Thank you very much!
When I run the code, I hit a bug: the training loss is NaN and the run fails with the error below. What's wrong?
[screenshot]

@Vamsi995
Owner

If you are using a different dataset, you have to change the path to it in the T5FineTuner class, in the train_dataloader and val_dataloader methods. A rough sketch is below.
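
Roughly, the two methods to edit look like this (PyTorch Lightning style; ParaphraseDataset, the file paths, and the keyword names are illustrative stand-ins for whatever the repo actually uses, not its exact code):

from torch.utils.data import DataLoader

# Inside the T5FineTuner LightningModule -- point these at your own dataset files.
def train_dataloader(self):
    # "data/paws/train.tsv" is a placeholder path; ParaphraseDataset stands in
    # for the repo's dataset class.
    train_dataset = ParaphraseDataset(self.tokenizer, data_path="data/paws/train.tsv", max_len=256)
    return DataLoader(train_dataset, batch_size=self.hparams.train_batch_size, shuffle=True)

def val_dataloader(self):
    val_dataset = ParaphraseDataset(self.tokenizer, data_path="data/paws/dev.tsv", max_len=256)
    return DataLoader(val_dataset, batch_size=self.hparams.eval_batch_size)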

@monkdou0
Author

I already changed it, so maybe that isn't the problem.
Looking at your code:
[screenshot]
On line 63, is this right? Should it be self.forward()?

@Vamsi995
Owner

No, I don't think that's an issue. It should be just self.
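
Background on why plain self(...) is fine here: in PyTorch, calling a module instance goes through nn.Module.__call__, which runs forward() plus any registered hooks, so self(...) and self.forward(...) invoke the same code. A tiny self-contained illustration:

import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

m = Tiny()
x = torch.randn(1, 4)
# Both calls execute forward(); m(x) is the idiomatic form because __call__
# also triggers module hooks.
print(torch.equal(m(x), m.forward(x)))  # True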

@monkdou0
Author

Yeah, you're right, I see.
But I ran your code on your dataset with nothing changed:
[screenshot]
I printed the outputs there to debug and found it was an empty list. Why?

@Vamsi995
Owner

Let me figure this out. Right now I'm working on another project, so it will take me some time to get back to this.

@chris1899

> Yeah, it's the same as the t5-base tokenizer.

Are you sure it is base? When I use the t5-base tokenizer I get an error:

Exception: expected value at line 1 column 1

When I use the t5-small tokenizer it works fine.

@Vamsi995
Owner

Vamsi995 commented Jan 13, 2021

# pip install sentencepiece
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the stock t5-base tokenizer and the fine-tuned paraphrase model
tokenizer = T5Tokenizer.from_pretrained("t5-base")  # t5-base works
model = T5ForConditionalGeneration.from_pretrained("Vamsi/T5_Paraphrase_Paws")

sentence = "This is something which i cannot understand at all"
text = "paraphrase: " + sentence
encoding = tokenizer(text, padding=True, return_tensors="pt")
input_ids, attention_masks = encoding["input_ids"], encoding["attention_mask"]

# Sample five paraphrases with top-k / top-p (nucleus) sampling
outputs = model.generate(
    input_ids=input_ids,
    attention_mask=attention_masks,
    max_length=256,
    do_sample=True,
    top_k=200,
    top_p=0.95,
    early_stopping=True,
    num_return_sequences=5
)

for output in outputs:
    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(line)

@monkdou0
Author

I used your code to train, ignoring the NaN training-loss issue: same parameters, same dataset, same code, but I get poor results compared with your model.
This is my result:
[screenshot]
This is your model:
[screenshot]

I think this repo is useful for others, so please check the code when you have some spare time.
Thank you very much! Good luck!
