The warmup model for the doc dataset #18
Comments
You can use the provided STAR checkpoint trained on the passage dataset as a start.
Hi @jingtaozhan, Thank you for sharing the code!
How can I obtain the BM25 Neg model used in the experiments on the MS MARCO Doc dataset? Thank you in advance!
Thank you for your great question @ikuyamada
Thank you very much for your prompt answer! I found the train data to the left of the STAR row in the Doc Retrieval table. Is it the data used to generate negatives in the STAR experiment?
The train data is the retrieval results of STAR on training queries. It is not the one I used to generate negatives.
Thanks so much! Looking forward to the release of the model!
@ikuyamada
@jingtaozhan Thank you for making the model available! I will evaluate the model and get back to you soon!
Hi @jingtaozhan, thanks for the code and the checkpoints. I hope to apply them in my future work, but I ran into trouble when replicating the BM25 Neg model on the doc dataset. I fine-tuned the provided passage-star checkpoint with the official BM25 top100 dataset, but I only got 0.33 MRR@100 after 50K training steps. Could you share the doc BM25 negatives and the script to replicate the BM25 Neg model? Many thanks!
Hi Jingtao,
Can you share the warmup model for the doc ranking data?