Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: lowering references threshold to 0.5! #104

Merged
merged 2 commits into from
Dec 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion routers/amqp.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ async def ask(payload: Payload, logger: Logger):
response, references = query_data_sources(
community_id=community_id, query=question
)
prepare_answer = PrepareAnswerSources(threshold=0.7)
prepare_answer = PrepareAnswerSources()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Update test case to use new default threshold value

The verification found that test_prepare_answer_sources.py still uses the old threshold value of 0.7. This test case needs to be updated to match the new default value of 0.5.

  • tests/unit/test_prepare_answer_sources.py: Update PrepareAnswerSources(threshold=0.7) to either remove the explicit threshold parameter or set it to 0.5
🔗 Analysis chain

LGTM: Consistent with the new default threshold

The removal of the explicit threshold parameter is aligned with the updated default value in PrepareAnswerSources.

Let's verify all instances have been updated:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining instances of PrepareAnswerSources with explicit threshold
rg -A 2 "PrepareAnswerSources.*threshold.*0.7"

# Look for any configuration files that might override the threshold
fd -e yaml -e json -e ini -e conf | xargs rg "threshold.*0.7"

Length of output: 436

answer_reference = prepare_answer.prepare_answer_sources(nodes=references)

logger.info(f"COMMUNITY_ID: {community_id} Job finished")
Expand Down
2 changes: 1 addition & 1 deletion utils/query_engine/prepare_answer_sources.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@


class PrepareAnswerSources:
def __init__(self, threshold: float = 0.7) -> None:
def __init__(self, threshold: float = 0.5) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Lowering the threshold requires test updates and monitoring

The implementation shows that the threshold is used to filter URLs based on node scores, and the test suite still uses the old threshold (0.7). This change needs attention in two areas:

  • The test suite in tests/unit/test_prepare_answer_sources.py needs to be updated to use the new default threshold of 0.5 or explicitly test both thresholds
  • The logging statement indicates this is a critical filter - lowering it could lead to more URLs being included, requiring monitoring in production
🔗 Analysis chain

Consider the implications of lowering the similarity threshold

Lowering the threshold from 0.7 to 0.5 will result in more reference nodes being included in the output. While this increases recall, it might:

  1. Include less relevant references (more false positives)
  2. Generate longer responses with more references
  3. Potentially impact memory usage when processing more nodes

Let's analyze the impact:


Let me gather more information about how the threshold is actually used in the code.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for test files and test cases related to threshold validation
fd -e py -e yaml -e json test_ | xargs rg -A 5 "threshold.*0.[57]|PrepareAnswerSources"

# Look for any performance benchmarks or memory profiling
fd -e py benchmark | xargs rg -A 5 "PrepareAnswerSources"

Length of output: 3609


Script:

#!/bin/bash
# Check how the threshold is used in the implementation
ast-grep --pattern 'def prepare_answer_sources($$$) { $$$ }' utils/query_engine/prepare_answer_sources.py

# Look for any filtering logic using the threshold
rg -A 10 "threshold" utils/query_engine/prepare_answer_sources.py

Length of output: 1331

self.threshold = threshold

def prepare_answer_sources(self, nodes: list[NodeWithScore]) -> str:
Expand Down
4 changes: 1 addition & 3 deletions worker/tasks.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,7 @@ def ask_question_auto_search(
response, references = query_data_sources(
community_id=community_id, query=query
)
answer_sources = PrepareAnswerSources(threshold=0.7).prepare_answer_sources(
nodes=references
)
answer_sources = PrepareAnswerSources().prepare_answer_sources(nodes=references)
except Exception:
response = "Sorry, We cannot process your question at the moment."
answer_sources = None
Expand Down
Loading