Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implemented aget_routes async method for pinecone index #397

Merged

Conversation

Vits-99
Copy link
Collaborator

@Vits-99 Vits-99 commented Aug 23, 2024

User description

Fixes issue #396


PR Type

enhancement


Description

  • Introduced an abstract aget_routes method in the base index class to be implemented by subclasses.
  • Implemented the aget_routes method in the Pinecone index to asynchronously retrieve route and utterance objects.
  • Added error logging for the aget_routes method in LocalIndex, PostgresIndex, and QdrantIndex to indicate it is not implemented.
  • Added helper methods in PineconeIndex for asynchronous data retrieval and metadata fetching.

Changes walkthrough 📝

Relevant files
Enhancement
base.py
Introduce abstract `aget_routes` method in base index class

semantic_router/index/base.py

  • Added aget_routes method as an abstract method to be implemented by
    subclasses.
  • Described the method's purpose and expected return type.
  • +11/-0   
    local.py
    Add `aget_routes` method with error logging in LocalIndex

    semantic_router/index/local.py

  • Implemented aget_routes method with a logger error message indicating
    it's not implemented.
  • +3/-0     
    pinecone.py
    Implement asynchronous `aget_routes` method in PineconeIndex

    semantic_router/index/pinecone.py

  • Implemented aget_routes method to asynchronously retrieve routes and
    utterances.
  • Added helper methods _async_get_all and _async_fetch_metadata for
    fetching data.
  • +113/-0 
    postgres.py
    Add `aget_routes` method with error logging in PostgresIndex

    semantic_router/index/postgres.py

  • Added aget_routes method with a logger error message indicating it's
    not implemented.
  • +4/-0     
    qdrant.py
    Add `aget_routes` method with error logging in QdrantIndex

    semantic_router/index/qdrant.py

  • Added aget_routes method with a logger error message indicating it's
    not implemented.
  • +3/-0     

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    @Vits-99 Vits-99 requested a review from jamescalam August 23, 2024 13:10
    @github-actions github-actions bot added enhancement Enhancement to existing features Review effort [1-5]: 3 labels Aug 23, 2024
    Copy link

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Incorrect Error Message
    The error message "Sync remove is not implemented for LocalIndex." seems incorrect. It should likely state that "Async get routes is not implemented for LocalIndex."

    Error Handling
    The method _async_get_routes prints metadata directly which might expose sensitive data. Consider handling this differently or removing the print statements.

    Exception Handling
    In the _async_fetch_metadata method, exceptions are caught generically. It's recommended to specify exception types to avoid catching unintended exceptions.

    Copy link

    github-actions bot commented Aug 23, 2024

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Enhancement
    Convert the method aget_routes to an asynchronous method

    The method aget_routes should be defined as an asynchronous method using async def
    instead of def to match its intended asynchronous behavior and the naming convention
    (aget_ prefix suggests asynchronous operation). This change will ensure that the
    method can perform asynchronous operations and be awaited in the calling code.

    semantic_router/index/base.py [93-101]

    -def aget_routes(self):
    +async def aget_routes(self):
         """
         Asynchronously get a list of route and utterance objects currently stored in the index.
         This method should be implemented by subclasses.
         
         :returns: A list of tuples, each containing a route name and an associated utterance.
         :rtype: list[tuple]
         :raises NotImplementedError: If the method is not implemented by the subclass.
         """
         raise NotImplementedError("This method should be implemented by subclasses.")
     
    Suggestion importance[1-10]: 9

    Why: The suggestion correctly identifies that the method aget_routes is intended to be asynchronous, as indicated by its name, and should be defined using async def to allow for asynchronous operations. This change is crucial for consistency and functionality.

    9
    Robustness
    Add error handling to the aget_routes method

    Add error handling for the aget_routes method to manage potential exceptions during
    the asynchronous operations, such as network issues or API errors. This will improve
    the robustness of the method and provide better error feedback to the calling
    functions.

    semantic_router/index/pinecone.py [531-541]

     async def aget_routes(self) -> list[tuple]:
         """
         Asynchronously get a list of route and utterance objects currently stored in the index.
         
         Returns:
             List[Tuple]: A list of (route_name, utterance) objects.
         """
         if self.async_client is None or self.host is None:
             raise ValueError("Async client or host are not initialized.")
         
    -    return await self._async_get_routes()
    +    try:
    +        return await self._async_get_routes()
    +    except Exception as e:
    +        logger.error(f"Failed to get routes: {e}")
    +        return []
     
    Suggestion importance[1-10]: 8

    Why: The suggestion enhances the robustness of the aget_routes method by adding error handling, which is important for managing exceptions and providing informative error messages, especially in asynchronous operations.

    8
    Best practice
    Replace print statements with logger calls for better logging management

    Replace the print statements with logger calls in the _async_get_all and
    _async_fetch_metadata methods to maintain consistency with logging practices and to
    better manage log levels and outputs in production environments.

    semantic_router/index/pinecone.py [631-680]

    -print(f"Error listing vectors: {response.status} - {error_text}")
    +logger.error(f"Error listing vectors: {response.status} - {error_text}")
     ...
    -print(f"RESPONSE: {response_data}")
    +logger.debug(f"RESPONSE: {response_data}")
     ...
    -print(f"Failed to decode JSON for vector {vector_id}: {e}")
    +logger.error(f"Failed to decode JSON for vector {vector_id}: {e}")
     
    Suggestion importance[1-10]: 7

    Why: Replacing print statements with logger calls is a best practice for managing logs, especially in production environments, as it allows for better control over log levels and outputs. This suggestion improves code quality and maintainability.

    7
    Maintainability
    Refactor pagination logic into a separate method for clarity and reusability

    Refactor the _async_get_all method to encapsulate the pagination logic into a
    separate method. This will improve the readability and maintainability of the code
    by separating concerns and making the pagination mechanism more reusable.

    semantic_router/index/pinecone.py [622-648]

    -while True:
    -    if next_page_token:
    -        params["paginationToken"] = next_page_token
    -    ...
    -    if not next_page_token:
    -        break
    +async def _paginate_vector_ids(self, list_url, params):
    +    all_vector_ids = []
    +    metadata = []
    +    while True:
    +        response_data = await self._fetch_page(list_url, params)
    +        if not response_data:
    +            break
    +        vector_ids = [vec["id"] for vec in response_data.get("vectors", [])]
    +        all_vector_ids.extend(vector_ids)
    +        if include_metadata:
    +            metadata.extend(await self._fetch_metadata_for_ids(vector_ids))
    +        next_page_token = response_data.get("pagination", {}).get("next")
    +        if not next_page_token:
    +            break
    +    return all_vector_ids, metadata
     
    Suggestion importance[1-10]: 6

    Why: The suggestion to refactor the pagination logic into a separate method enhances code readability and maintainability by separating concerns. While beneficial, it is not as critical as other suggestions, hence a slightly lower score.

    6

    @jamescalam jamescalam changed the title enhancement: Implemented aget_routes async method for pinecone index feat: Implemented aget_routes async method for pinecone index Aug 23, 2024
    @@ -584,5 +596,101 @@ async def _async_describe_index(self, name: str):
    async with self.async_client.get(f"{self.base_url}/indexes/{name}") as response:
    return await response.json(content_type=None)

    async def _async_get_all(
    self, prefix: str | None = None, include_metadata: bool = False
    Copy link
    Member

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    can't use these type annotations as we need to remain backwards compatible with python 3.9, should instead use Optional[str] = None in this case

    Copy link

    codecov bot commented Aug 23, 2024

    Codecov Report

    Attention: Patch coverage is 11.94030% with 59 lines in your changes missing coverage. Please review.

    Project coverage is 63.60%. Comparing base (b01cb98) to head (6a4bd38).
    Report is 1 commits behind head on main.

    Files Patch % Lines
    semantic_router/index/pinecone.py 7.01% 53 Missing ⚠️
    semantic_router/index/postgres.py 0.00% 3 Missing ⚠️
    semantic_router/index/base.py 50.00% 1 Missing ⚠️
    semantic_router/index/local.py 50.00% 1 Missing ⚠️
    semantic_router/index/qdrant.py 50.00% 1 Missing ⚠️
    Additional details and impacted files
    @@            Coverage Diff             @@
    ##             main     #397      +/-   ##
    ==========================================
    - Coverage   64.45%   63.60%   -0.85%     
    ==========================================
      Files          46       46              
      Lines        3376     3432      +56     
    ==========================================
    + Hits         2176     2183       +7     
    - Misses       1200     1249      +49     

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    @jamescalam jamescalam merged commit 567e4c9 into main Aug 23, 2024
    3 of 6 checks passed
    @jamescalam jamescalam deleted the vittorio/add-async-get-routes-method-to-pinecone-index branch August 23, 2024 15:33
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    enhancement Enhancement to existing features Review effort [1-5]: 3
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants