1603TF

tf-idf Experiments

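Word overlap works much better than cosine distance. BM25 weighting works very well (while treating s0 as the query, i.e. weighting based only on s1 occurrences). In the tables below, the default settings score highest on every dataset, and the `score_mode='cos'` runs score lowest wherever they were run.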
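To make the two scoring modes concrete, here is a minimal sketch, not the actual termfreq model code. It assumes the textbook Okapi BM25 formula with the common `k1=1.2`, `b=0.75` constants; the function names (`idf_table`, `bm25_overlap`, `cosine`) and the toy sentences are hypothetical. s0 acts as the query, so only term counts in s1 are weighted.

```python
import math
from collections import Counter

def idf_table(docs):
    """Smoothed BM25-style IDF over a list of tokenized sentences."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {t: math.log(1 + (n - d + 0.5) / (d + 0.5)) for t, d in df.items()}

def bm25_overlap(s0, s1, idf, avg_len, k1=1.2, b=0.75):
    """Word overlap with BM25 weights: s0 is the query, s1 the scored sentence."""
    tf = Counter(s1)
    norm = k1 * (1 - b + b * len(s1) / avg_len)
    return sum(idf.get(t, 0.0) * tf[t] * (k1 + 1) / (tf[t] + norm)
               for t in set(s0) if t in tf)

def cosine(s0, s1, idf):
    """Cosine similarity between tf-idf vectors of s0 and s1."""
    v0 = {t: c * idf.get(t, 0.0) for t, c in Counter(s0).items()}
    v1 = {t: c * idf.get(t, 0.0) for t, c in Counter(s1).items()}
    dot = sum(w * v1.get(t, 0.0) for t, w in v0.items())
    n0 = math.sqrt(sum(w * w for w in v0.values()))
    n1 = math.sqrt(sum(w * w for w in v1.values()))
    return dot / (n0 * n1) if n0 and n1 else 0.0

# Toy data (hypothetical): one question s0 and three candidate sentences s1.
s0 = "who wrote hamlet".split()
candidates = [c.split() for c in (
    "shakespeare wrote hamlet",
    "the weather is nice today",
    "hamlet is a play",
)]
idf = idf_table(candidates)
avg_len = sum(len(c) for c in candidates) / len(candidates)
bm25_scores = [bm25_overlap(s0, c, idf, avg_len) for c in candidates]
cos_scores = [cosine(s0, c, idf) for c in candidates]
```

Ranking the candidates by `bm25_scores` or by `cos_scores` gives the per-question orderings that the two score modes would be compared on.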

wang:

Model	trainAllMRR	devMRR	testMAP	testMRR	settings
termfreq	0.813992	0.829004	0.630100	0.765363	(defaults) termfreq-5e150127bfa12fab-00
termfreq	0.714169	0.725217	0.578200	0.708957	`freq_mode="tf"` termfreq-2d3b759c31ae7a0c-00
termfreq	0.602093	0.684234	0.545400	0.641078	`score_mode='cos'` termfreq-11d9aad0ee302e88-00
termfreq	0.601831	0.696384	0.549600	0.634582	`freq_mode="tf"` `score_mode='cos'` termfreq-5121bb88a5922f9-00

curatedv2:

Model	trainAllMRR	devMRR	testMAP	testMRR	settings
termfreq	0.483538	0.452647	0.294300	0.484530	(defaults) termfreq-7c2a88efab16d07d-00
termfreq	0.339544	0.324693	0.242700	0.337893	`freq_mode="tf"` termfreq-26a946355b7ba20d-00
termfreq	0.254189	0.214607	0.201000	0.275696	`score_mode='cos'` termfreq--4326af5eba873e89-00
termfreq	0.251412	0.238331	0.204800	0.278305	`freq_mode="tf"` `score_mode='cos'` termfreq--4e5be392f5f78798-00

large2470:

Model	trainAllMRR	devMRR	testMAP	testMRR	settings
termfreq	0.441573	0.432115	0.313900	0.490822	(defaults) termfreq-1c1547925afa2a69-00
termfreq	0.325390	0.328255	0.266800	0.362613	`freq_mode="tf"` termfreq--1146821b4b0960cf-00
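
For reference, the MRR and MAP columns in the tables above can be computed along the following lines. This is a sketch using the usual definitions, not the evaluation code that produced these numbers. The helper names and the toy data are hypothetical, and skipping questions with no correct candidate (rather than scoring them as 0) is an assumption.

```python
def reciprocal_rank(labels):
    """1/rank of the first relevant item in a ranked label list, 0.0 if none."""
    for rank, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def average_precision(labels):
    """Mean of precision@k taken at each relevant position k."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def evaluate(questions):
    """questions: one list of (score, label) pairs per s0; returns (MRR, MAP)."""
    rrs, aps = [], []
    for pairs in questions:
        # Rank candidates by descending score, keep only the labels.
        labels = [lab for _, lab in sorted(pairs, key=lambda p: -p[0])]
        if not any(labels):
            continue  # assumption: questions without a correct answer are skipped
        rrs.append(reciprocal_rank(labels))
        aps.append(average_precision(labels))
    if not rrs:
        return 0.0, 0.0
    return sum(rrs) / len(rrs), sum(aps) / len(aps)

# Toy data (hypothetical): two questions with scored, labelled candidates.
questions = [
    [(0.9, 0), (0.8, 1), (0.1, 1)],
    [(0.5, 1), (0.4, 0)],
]
mrr, mean_ap = evaluate(questions)
```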