MRR@10=0.35 not achieved on fine-tuning monoBERT task #200

Open

d1shs0ap opened this issue Dec 30, 2021 · 13 comments
@d1shs0ap

d1shs0ap commented Dec 30, 2021

This is the task I replicated: https://github.com/capreolus-ir/capreolus/blob/feature/msmarco_psg/docs/reproduction/MS_MARCO.md, by following docs/reproduction/sample_slurm_script.sh.

Findings

"Mini" version

  • The task did not finish within the recommended time limit using the recommended compute settings, i.e., the following configs:

[Screenshot: config settings (Screen Shot 2022-01-02 at 9 16 01 PM)]

  • After trying these configs (entire node), MRR@10=0.283 was achieved, slightly below the 0.295 given in the docs (finished in 21h).

[Screenshot: config settings (Screen Shot 2022-01-02 at 9 16 37 PM)]

"Full" version

  • MRR@10=0.346 was achieved, as opposed to the expected MRR@10=0.35+, with the following configs (entire node, finished in 42h):

[Screenshot: config settings (Screen Shot 2022-01-02 at 9 17 19 PM)]

@crystina-z
Collaborator

Hi @d1shs0ap, thanks for helping to replicate. The links to the two config files seem to be broken; would you mind pasting them into the issue? Also, which commit are you using? Thanks!

@d1shs0ap
Author

d1shs0ap commented Jan 3, 2022

Hey @crystina-z, I've updated the config screenshots. The commit that I ran the experiments on is e10928f. Thank you!

@d1shs0ap
Author

d1shs0ap commented Jan 8, 2022

Hi @crystina-z, any updates on this issue?

@crystina-z
Collaborator

Hi @d1shs0ap, sorry for the wait. It took me a while to realize that the config file is missing one line that specifies the decay rate: appending reranker.trainer.decay=0.1 to the end of the config should give MRR@10 of 0.35+. I'll update it in the next PR. Let me know if the issue is still there after adding this.
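A minimal sketch of the same change as a command-line override, in case that's easier than editing the config file (this mirrors the rerank.train invocation from docs/reproduction/sample_slurm_script.sh, trimmed to the relevant part; passing the option after file= should override the value from the config file):

python -m capreolus.run rerank.train with \
	file=docs/reproduction/config_msmarco.txt \
	reranker.trainer.decay=0.1 \
	fold=s1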

Thanks again for pointing this issue out!

crystina-z added a commit to nimasadri11/capreolus that referenced this issue Jan 9, 2022
@d1shs0ap
Author

Ok great, thanks! I'll test it out now.

@d1shs0ap
Author

d1shs0ap commented Jan 16, 2022

Hey @crystina-z, the experiment just finished; here are the results:

  • Mini version: MRR@10=0.293
  • Full version: MRR@10=0.347
  • Commit: e9cf9a6

Should I run the experiment again, with the latest commits?

@crystina-z
Collaborator

crystina-z commented Jan 16, 2022

Hi @d1shs0ap, that would be nice. Before that, though, could you share the config file and the command you used to run the scripts, just in case I missed anything there?

@d1shs0ap
Author

d1shs0ap commented Jan 17, 2022

@crystina-z Here's the config file:

optimize=MRR@10
threshold=100
testthreshold=1

benchmark.name=msmarcopsg
rank.searcher.name=msmarcopsgbm25

reranker.name=TFBERTMaxP
reranker.pretrained=bert-base-uncased

reranker.extractor.usecache=True
reranker.extractor.numpassages=1
reranker.extractor.maxseqlen=512
reranker.extractor.maxqlen=50
reranker.extractor.tokenizer.pretrained=bert-base-uncased

reranker.trainer.usecache=True
reranker.trainer.niters=1
reranker.trainer.batch=4
reranker.trainer.evalbatch=256
reranker.trainer.itersize=48000
reranker.trainer.warmupiters=1
reranker.trainer.decay=0.1
reranker.trainer.decayiters=1
reranker.trainer.decaytype=linear

reranker.trainer.loss=pairwise_hinge_loss

I first ran

ENVDIR=$HOME/venv/capreolus-env
source $ENVDIR/bin/activate
module load java/11
module load python/3.7
module load scipy-stack

in the terminal, then ran sbatch docs/reproduction/sample_slurm_script.sh, which contains the following:

#!/bin/bash
#SBATCH --job-name=msmarcopsg
#SBATCH --nodes=1
#SBATCH --gres=gpu:v100l:4
#SBATCH --ntasks-per-node=1
#SBATCH --mem=0
#SBATCH --time=48:00:00
#SBATCH --account=$SLURM_ACCOUNT
#SBATCH --cpus-per-task=32

#SBATCH -o ./msmarco-psg-output.log

niters=10
batch_size=16
validatefreq=$niters # to ensure the validation is run only at the end of training
decayiters=$niters   # either same with $itersize or 0
threshold=1000       # the top-k documents to rerank

python -m capreolus.run rerank.train with \
	file=docs/reproduction/config_msmarco.txt  \
	threshold=$threshold \
	reranker.trainer.niters=$niters \
	reranker.trainer.batch=$batch_size \
	reranker.trainer.decayiters=$decayiters \
	reranker.trainer.validatefreq=$validatefreq \
	fold=s1

I should also mention that this was run on the forked repository nimasadri11/capreolus. Thanks!

@d1shs0ap
Author

d1shs0ap commented Jan 25, 2022

@crystina-z Retrained with the latest changes and got MRR@10=0.351. However, I ran this experiment on the nimasadri11 fork. Should I open a pull request on that fork? (Currently waiting for the experiment results on the original repo.)

@crystina-z
Collaborator

@d1shs0ap thanks for the update! Yeah, for this issue let's wait for the result on the original repo for now. Feel free to add another PR to Nima's fork as well. Thanks!

@d1shs0ap
Author

@crystina-z The latest MRR I got after running on the original repo is 0.3496; is that good enough? Here's the output:

2022-01-26 13:41:35,370 - INFO - capreolus.trainer.tensorflow.train - dev metrics: MRR@10=0.350 P_1=0.230 P_10=0.064 P_20=0.036 P_5=0.105 judged_10=0.064 judged_20=0.036 judged_200=0.004 map=0.354 ndcg_cut_10=0.410 ndcg_cut_20=0.431 ndcg_cut_5=0.375 recall_100=0.814 recall_1000=0.853 recip_rank=0.359
2022-01-26 13:41:35,399 - INFO - capreolus.trainer.tensorflow.train - new best dev metric: 0.3496

@crystina-z
Collaborator

@d1shs0ap the score still looks a bit low to me, though. Maybe let's PR the record to Nima's branch and I'll check the score here.

Could you please share your versions of transformers and all TensorFlow-related packages? Thanks so much!

@d1shs0ap
Author

d1shs0ap commented Jan 27, 2022

@crystina-z Hey, I made the PR to Nima's branch; below are my package versions:

tensorboard==2.7.0
tensorboard-data-server==0.6.1+computecanada
tensorboard-plugin-wit==1.8.0+computecanada
tensorflow==2.4.1+computecanada
tensorflow-addons==0.13.0+computecanada
tensorflow-datasets==4.4.0
tensorflow-estimator==2.4.0+computecanada
tensorflow-hub==0.12.0+computecanada
tensorflow-io-gcs-filesystem==0.22.0+computecanada
tensorflow-metadata==1.5.0
tensorflow-model-optimization==0.7.0
tensorflow-ranking==0.4.2
tensorflow-serving-api==2.7.0
tf-models-official==2.5.0
tf-slim==1.1.0

and

transformers==4.6.0
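In case it helps with comparing environments, something like the following (run inside the same virtualenv; the grep pattern is only an illustration) should list the same packages:

pip freeze | grep -iE "tensorflow|tensorboard|tf-|transformers"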

Thanks!
