Commit
BM25: Boost file name matches at root
This change prioritizes file name matches at the root of the repository, based on the intuition that more important files tend to be closer to the root.

It also lowers the parameter b in the BM25 scoring function from 0.75 to 0.3 to reduce the impact of document length on the final score. Experiments showed that the current scoring overly penalizes long but important documents. For example, we consider documents such as README.md or CHANGELOG at the root of the repository to be of high quality, yet they tend to be relatively long and are therefore penalized.

Test plan: updated unit test
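To illustrate why lowering b helps long documents, here is a minimal sketch of the standard BM25 term-frequency component with length normalization. It is not the code from this commit: the function names, the value k1 = 1.2, the document lengths, and the root boost factor are all assumptions chosen for illustration, and the commit message does not say how the root boost is actually applied.

```python
def bm25_tf(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Standard BM25 term-frequency component (IDF factor omitted).

    b controls how strongly document length is normalized:
    b = 0 ignores length entirely, b = 1 normalizes fully.
    k1 = 1.2 is a common default, assumed here for illustration.
    """
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


def is_at_root(path):
    """A file is at the repository root if its path has no directory part."""
    return "/" not in path


def boost_root_match(score, path, boost=2.0):
    """Hypothetical boost: multiply the score of file name matches at the root.

    The factor 2.0 is made up; the real commit may combine scores differently.
    """
    return score * boost if is_at_root(path) else score


# A long document (4x the average length) with 3 term occurrences.
long_old = bm25_tf(tf=3, doc_len=4000, avg_doc_len=1000, b=0.75)
long_new = bm25_tf(tf=3, doc_len=4000, avg_doc_len=1000, b=0.3)

# An average-length document is unaffected by b, since the length ratio is 1.
avg_old = bm25_tf(tf=3, doc_len=1000, avg_doc_len=1000, b=0.75)
avg_new = bm25_tf(tf=3, doc_len=1000, avg_doc_len=1000, b=0.3)

print(f"long doc:    b=0.75 -> {long_old:.3f}, b=0.3 -> {long_new:.3f}")
print(f"average doc: b=0.75 -> {avg_old:.3f}, b=0.3 -> {avg_new:.3f}")
print("README.md boosted:", boost_root_match(long_new, "README.md"))
print("docs/README.md unchanged:", boost_root_match(long_new, "docs/README.md"))
```

With the smaller b, the long document scores closer to an average-length document with the same term frequency, which is the effect the commit message describes for files such as README.md or CHANGELOG.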
1 parent c21df41, commit 9c75cfa
Showing 4 changed files with 32 additions and 8 deletions.