feat: lowering references threshold to 0.5! #104
Original file line number | Diff line number | Diff line change |
---|---|---|
| | | @@ -4,7 +4,7 @@ |
| | | class PrepareAnswerSources: |
| | | - def __init__(self, threshold: float = 0.7) -> None: |
| | | + def __init__(self, threshold: float = 0.5) -> None: |
💡 Codebase verification

Lowering the threshold requires test updates and monitoring

The implementation shows that the threshold is used to filter URLs based on node scores, and the test suite still uses the old threshold (0.7). This change needs attention in two areas:

🔗 Analysis chain

Consider the implications of lowering the similarity threshold

Lowering the threshold from 0.7 to 0.5 will result in more reference nodes being included in the output. While this increases recall, it might:

Let's analyze the impact:

Let me gather more information about how the threshold is actually used in the code.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for test files and test cases related to threshold validation
fd -e py -e yaml -e json test_ | xargs rg -A 5 "threshold.*0.[57]|PrepareAnswerSources"
# Look for any performance benchmarks or memory profiling
fd -e py benchmark | xargs rg -A 5 "PrepareAnswerSources"
Length of output: 3609

Script:
#!/bin/bash
# Check how the threshold is used in the implementation
ast-grep --pattern 'def prepare_answer_sources($$$) { $$$ }' utils/query_engine/prepare_answer_sources.py
# Look for any filtering logic using the threshold
rg -A 10 "threshold" utils/query_engine/prepare_answer_sources.py
Length of output: 1331
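
To make the effect of the lower threshold concrete, here is a minimal sketch of score-based filtering. It is not the project's implementation: `ScoredNode` and `filter_urls` are hypothetical names, `ScoredNode` only stands in for `NodeWithScore`, and the `>=` comparison is an assumption about how the threshold is applied.

```python
from dataclasses import dataclass


@dataclass
class ScoredNode:
    """Hypothetical stand-in for NodeWithScore: a URL plus a similarity score."""

    url: str
    score: float


def filter_urls(nodes: list[ScoredNode], threshold: float) -> list[str]:
    """Return the URLs of nodes whose score meets the threshold.

    The >= comparison is an assumption, not taken from the real code.
    """
    return [node.url for node in nodes if node.score >= threshold]


# Illustrative scores only; they are not measurements from the project.
nodes = [
    ScoredNode("https://example.com/a", 0.82),
    ScoredNode("https://example.com/b", 0.61),
    ScoredNode("https://example.com/c", 0.43),
]

old_refs = filter_urls(nodes, threshold=0.7)  # only the 0.82 node passes
new_refs = filter_urls(nodes, threshold=0.5)  # the 0.61 node passes as well

print(old_refs)
print(new_refs)
```

With these made-up scores, the 0.61 node is excluded at 0.7 and included at 0.5, which is the kind of borderline reference worth monitoring after the change.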
| | | self.threshold = threshold |
| | | |
| | | def prepare_answer_sources(self, nodes: list[NodeWithScore]) -> str: |
💡 Codebase verification

Update test case to use new default threshold value

The verification found that `test_prepare_answer_sources.py` still uses the old threshold value of 0.7. This test case needs to be updated to match the new default value of 0.5.

`tests/unit/test_prepare_answer_sources.py`: Update `PrepareAnswerSources(threshold=0.7)` to either remove the explicit threshold parameter or set it to 0.5.

🔗 Analysis chain

LGTM: Consistent with the new default threshold

The removal of the explicit threshold parameter is aligned with the updated default value in `PrepareAnswerSources`.

Let's verify all instances have been updated:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
Length of output: 436
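
As a rough illustration of the suggested test update, the sketch below checks the default and an explicit override. `PrepareAnswerSourcesSketch` is a hypothetical class that copies only the constructor signature shown in the diff; the real tests in `tests/unit/test_prepare_answer_sources.py` would import the actual class and may be structured differently.

```python
class PrepareAnswerSourcesSketch:
    """Hypothetical mirror of the constructor signature shown in the diff."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold


def test_default_threshold_matches_new_value() -> None:
    # Omitting the argument exercises the new default of 0.5.
    assert PrepareAnswerSourcesSketch().threshold == 0.5


def test_explicit_threshold_is_respected() -> None:
    # An explicit value still overrides the default.
    assert PrepareAnswerSourcesSketch(threshold=0.7).threshold == 0.7


# Run the checks directly so the sketch works without a test runner.
test_default_threshold_matches_new_value()
test_explicit_threshold_is_respected()
```

Dropping the explicit `threshold=0.7` argument means the test follows the default, while a separate explicit-value test keeps coverage of the override path.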