Feat/automatic enable disable datasource #52
Walkthrough
The recent updates bring a comprehensive framework for data retrieval, enhancing adaptability and efficiency. Key additions include a GitHub query engine, a custom vector store retriever for error handling, and tests ensuring system reliability and robustness in handling diverse data queries.
Actionable comments posted: 1
Out of diff range and nitpick comments (1)
tests/unit/test_github_query_engine.py (1)
14-14: Consider removing the
celery_app/tasks.py
Outdated
@@ -17,6 +17,7 @@
from tc_messageBroker.rabbit_mq.payload.payload import Payload
from tc_messageBroker.rabbit_mq.queue import Queue
from traceloop.sdk import Traceloop
from utils.data_souce_selector import DataSourceSelector
Fix typo in the import statement.
- from utils.data_souce_selector import DataSourceSelector
+ from utils.data_source_selector import DataSourceSelector
Actionable comments posted: 2
Out of diff range and nitpick comments (1)
utils/data_source_selector.py (1)
26-51: Consider adding error handling for database operations to ensure robustness.
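One way to act on the error-handling nitpick above is sketched below. It is only an illustration under stated assumptions: `select_data_sources` and the `fetch_platforms` callable are hypothetical names standing in for whatever database call `utils/data_source_selector.py` actually makes, and the broad `except Exception` should be narrowed to the driver's own error type in real code.

```python
import logging
from typing import Any, Callable, Iterable

logger = logging.getLogger(__name__)


def select_data_sources(
    fetch_platforms: Callable[[], Iterable[dict[str, Any]]],
) -> dict[str, bool]:
    """Return {platform_name: True}, or an empty dict if the query fails.

    `fetch_platforms` is a hypothetical stand-in for the real database query.
    """
    try:
        # Materialize the results so cursor errors surface inside the try block.
        db_results = list(fetch_platforms())
    except Exception:  # assumption: narrow this to the driver's error class
        logger.exception("Failed to load platforms from the database")
        return {}

    data_sources: dict[str, bool] = {}
    for data in db_results:
        try:
            data_sources[data["platform"]["name"]] = True
        except (KeyError, TypeError):
            # Skip records that do not have the expected nested shape.
            logger.warning("Skipping malformed platform record: %r", data)
    return data_sources
```

Returning an empty dictionary on failure is a design choice for the sketch; re-raising a domain-specific exception may fit better if callers need to distinguish "no data sources" from "query failed".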
Returns
----------
data_sources : dict[str, bool]
    a dictionary representing what data sources is selcted
Correct the typo in the word "selected".
platforms = list(map(lambda data: data["platform"]["name"], db_results))
data_sources = dict.fromkeys(platforms, True)
Consider using a dictionary comprehension for clarity and Pythonic style.
- platforms = list(map(lambda data: data["platform"]["name"], db_results))
- data_sources = dict.fromkeys(platforms, True)
+ data_sources = {data["platform"]["name"]: True for data in db_results}
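As a quick check that the suggested dictionary comprehension behaves the same as the original two lines, the sketch below runs both on sample data. The records in `db_results` are made up for illustration; only the nested `["platform"]["name"]` shape comes from the diff.

```python
# Hypothetical sample rows shaped like the records in the diff.
db_results = [
    {"platform": {"name": "discord"}},
    {"platform": {"name": "github"}},
    {"platform": {"name": "discord"}},  # duplicates collapse into one key
]

# Original form: build a list of names, then a dict with every value True.
platforms = list(map(lambda data: data["platform"]["name"], db_results))
original = dict.fromkeys(platforms, True)

# Suggested form: a single dictionary comprehension.
suggested = {data["platform"]["name"]: True for data in db_results}

assert original == suggested
```

Both forms deduplicate platform names and keep first-seen key order, so the change is purely stylistic.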
github_query_engine = GitHubQueryEngine(community_id=community_id).prepare()
tool_metadata = ToolMetadata(
    name="GitHub",
    description="Hosts code repositories and project materials from the GitHub platform.",
Still the old description
It might be nice to save all descriptions and other prompts in a single document, then import them from there whenever we need them, like an env var file. That could make updating prompts easier in the future.
Thanks, that was because I started it from the previous branch yesterday. Will consider updating it.
It will be resolved when merging with main.
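The idea raised in this thread, keeping all tool descriptions and prompts in one place, could be sketched roughly as follows. Everything here is hypothetical: the `ToolDescription` class, the `TOOL_DESCRIPTIONS` mapping, and `get_tool_description` are illustrative names rather than part of this PR, and the description string is simply the one quoted in the diff above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDescription:
    """Name and description text for one query-engine tool."""

    name: str
    description: str


# Single source of truth for tool descriptions (hypothetical module contents).
TOOL_DESCRIPTIONS = {
    "github": ToolDescription(
        name="GitHub",
        description=(
            "Hosts code repositories and project materials "
            "from the GitHub platform."
        ),
    ),
}


def get_tool_description(key: str) -> ToolDescription:
    """Look up a description, failing loudly on unknown keys."""
    try:
        return TOOL_DESCRIPTIONS[key]
    except KeyError:
        raise KeyError(f"No tool description registered for {key!r}") from None
```

Call sites would then read `get_tool_description("github").description` instead of embedding the string, so a prompt change touches one file only.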