Hi everyone, thank you for this amazing library.

I am running experiments on MS MARCO v1. I indexed the collection, searched with BM25, and got an MRR@10 of 0.184, as expected. On my computer, search took around 2 minutes.

Then I switched to an impact index. The only change was adding the "-impact" flag to my indexing and search scripts; everything else is the same. Search now takes slightly more than 3 minutes.

In other impact experiments, when I increase the TF of certain terms in a document according to the weights output by some neural model (e.g., SPLADE), the difference between impact scoring and BM25 scoring becomes very large. Note that I'm not doing query/document expansion with new terms, just reweighting the existing terms, i.e., changing their query/document term frequency.

What is the intuition behind search taking longer with an impact index? Am I missing something that's being done under the hood? The posting lists do not get larger; only the stored TFs are different.

Thanks!
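For concreteness, here is a minimal sketch of the two scoring modes as described above. It assumes a textbook BM25 formula and assumes that impact scoring is a plain product of the query weight and the stored document weight. The function names, the parameter defaults, and the collection statistics are hypothetical and are not taken from the library, so what actually happens under the hood may differ.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, df, num_docs, k1=0.9, b=0.4):
    """Textbook BM25 contribution of one term (assumed variant, illustrative only)."""
    idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    # The tf component saturates: it can never exceed (k1 + 1).
    return idf * tf * (k1 + 1.0) / (tf + length_norm)


def impact_term_score(query_weight, doc_weight):
    """Assumed impact contribution: product of query weight and stored weight."""
    return query_weight * doc_weight


if __name__ == "__main__":
    # Hypothetical statistics, chosen only to show the shape of each function.
    stats = dict(doc_len=60, avg_doc_len=60, df=1000, num_docs=1_000_000)
    for tf in (1, 2, 5, 20, 100):
        bm25 = bm25_term_score(tf, **stats)
        impact = impact_term_score(query_weight=1, doc_weight=tf)
        print(f"tf={tf:>3}  bm25={bm25:.3f}  impact={impact}")
```

Under these assumptions the posting lists are identical in both modes, but the BM25 score flattens out as the TF grows while the impact score keeps growing linearly with the stored weight.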
Replies: 1 comment
I think this will answer your question: https://dl.acm.org/doi/10.1145/3576922

Please read it first, and then ask if you have follow-up questions. @JMMackenzie @andrewtrotman