feat: add caching if prompt request fails #148

Open
wants to merge 18 commits into main
Conversation

Matthieu-OD (Collaborator) commented Nov 14, 2024

Add Prompt Caching Functionality

Changes

  • Implemented a prompt caching system using a separate class SharedCachePrompt

Implementation Details

  • Cache keys are generated in three formats:
    • id
    • name
    • tuple(name, version)
  • Caching logic is implemented in both the sync and async get_prompt methods
  • On each successful get_prompt call, the prompt is cached under all three key formats: id, name, and tuple(name, version)
  • Added a fallback to cached prompts when API calls fail, with warning logs
  • Keep in mind that this cache is in-memory, so it is cleared each time the application restarts (see the sketch below)
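A minimal sketch of what this could look like (the SharedCachePrompt class name and the (id, name, version) lookup come from this PR; the put/get method names and internals here are illustrative assumptions, not the merged code):

import logging
from threading import Lock
from typing import Any, Dict, Optional, Tuple, Union

logger = logging.getLogger(__name__)

# A cache key is either an id, a name, or a (name, version) pair
Key = Union[str, Tuple[str, int]]

class SharedCachePrompt:
    """In-memory prompt cache, keyed by id, name, and (name, version)."""

    def __init__(self) -> None:
        self._cache: Dict[Key, Any] = {}
        self._lock = Lock()  # shared by the sync and async get_prompt paths

    def put(self, prompt: Any) -> None:
        # Store the prompt under all three key formats so a later
        # lookup by id, by name, or by (name, version) can all hit
        with self._lock:
            self._cache[prompt.id] = prompt
            self._cache[prompt.name] = prompt
            self._cache[(prompt.name, prompt.version)] = prompt

    def get(
        self,
        id: Optional[str] = None,
        name: Optional[str] = None,
        version: Optional[int] = None,
    ) -> Optional[Any]:
        # Most specific key first: id, then (name, version), then name
        with self._lock:
            if id is not None:
                return self._cache.get(id)
            if name is not None and version is not None:
                return self._cache.get((name, version))
            if name is not None:
                return self._cache.get(name)
            return None

The fallback described above would then look roughly like this inside get_prompt (fetch_prompt_from_api is a hypothetical stand-in for the actual API call):

try:
    prompt = fetch_prompt_from_api(id=id, name=name, version=version)
    cache.put(prompt)  # refresh all three keys on every successful fetch
    return prompt
except Exception:
    cached = cache.get(id, name, version)
    if cached is not None:
        logger.warning("Prompt API call failed; falling back to cached prompt.")
        return cached
    raise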

Test Commands

Test Code
from fastapi import FastAPI, HTTPException
from typing import Optional
from literalai import LiteralClient

app = FastAPI()
client = LiteralClient(
    url="http://localhost:3000", api_key="my-initial-api-key"
)

@app.get("/prompt")
async def get_prompt(
    id: Optional[str] = None,
    name: Optional[str] = None,
    version: Optional[int] = None,
    should_fail: bool = False  # For testing failure scenarios
):
    # Inspect the current cache keys (internal attribute, for testing only)
    print(client.api._prompt_cache.keys())
    try:
        if should_fail:
            # Temporarily break the API key to force the API call to fail,
            # so get_prompt has to fall back to the cache
            original_key = client.api.api_key
            client.api.api_key = "invalid_key"
            try:
                prompt_from_cache = client.api.get_prompt(
                    id=id, name=name, version=version
                )
                return {
                    "success": True,
                    "prompt_id": prompt_from_cache.id,
                    "cache_hit": True
                }
            finally:
                # Always restore the valid key, even if the lookup raised
                client.api.api_key = original_key
        else:
            prompt = client.api.get_prompt(id=id, name=name, version=version)
            return {
                "success": True,
                "prompt_id": prompt.id,
                "cache_hit": False
            }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

Install dependencies

pip install fastapi uvicorn literalai

Run test server

uvicorn test_cache:app --reload

Test normal flow

curl "http://localhost:8000/prompt?name=example_prompt"

Test cache with failure

First call to populate cache

curl "http://localhost:8000/prompt?name=example_prompt"

Second call with simulated failure

curl "http://localhost:8000/prompt?name=example_prompt&should_fail=true"

Test with ID

curl "http://localhost:8000/prompt?id=prompt_123"

Test with name and version

curl "http://localhost:8000/prompt?name=example_prompt&version=1"

@Matthieu-OD Matthieu-OD self-assigned this Nov 14, 2024
@Matthieu-OD Matthieu-OD force-pushed the matt/eng-2115-add-client-caching-for-prompts branch from 6136af7 to 8774a00 on November 14, 2024 10:07
@Matthieu-OD Matthieu-OD marked this pull request as ready for review November 14, 2024 10:24
@Matthieu-OD Matthieu-OD marked this pull request as draft November 14, 2024 13:23
@Matthieu-OD Matthieu-OD marked this pull request as ready for review November 14, 2024 14:15
@Matthieu-OD Matthieu-OD force-pushed the matt/eng-2115-add-client-caching-for-prompts branch from ae555e3 to 5bcdce4 on November 14, 2024 14:44
raise ValueError("Either the `id` or the `name` must be provided.")

sync_api = LiteralAPI(self.api_key, self.url)
cached_prompt = self.prompt_cache.get(id, name, version)
timeout = 1 if cached_prompt else None
Contributor:
you could move the cache logic in the get_prompt_helper to avoid duplicating it for the sync/async versions.
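
For illustration, the shared helper could resolve the validation, cache lookup, and timeout once for both code paths. A rough sketch (get_prompt_helper is named in the review; this signature and body are assumptions based on the diff above):

def get_prompt_helper(cache, id=None, name=None, version=None):
    # Validation and cache lookup shared by sync and async get_prompt
    if id is None and name is None:
        raise ValueError("Either the `id` or the `name` must be provided.")
    cached_prompt = cache.get(id, name, version)
    # Use a short timeout when a cached fallback is available,
    # and no explicit timeout otherwise
    timeout = 1 if cached_prompt else None
    return cached_prompt, timeout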

@@ -212,7 +265,7 @@ def raise_error(error):
self.graphql_endpoint,
json={"query": query, "variables": variables},
headers=self.headers,
timeout=10,
timeout=timeout,
Contributor:
I don't see the modification for the async version of make_gql_call

Matthieu-OD (Collaborator, Author) commented Nov 15, 2024:
It is on lines 1519 and 1532 of the same file.

willydouhard (Contributor) left a comment:
Looks promising! I would love to see a test for it.

@Matthieu-OD Matthieu-OD force-pushed the matt/eng-2115-add-client-caching-for-prompts branch 7 times, most recently from 2f9628c to 3433ee1 on November 18, 2024 13:44
@Matthieu-OD Matthieu-OD force-pushed the matt/eng-2115-add-client-caching-for-prompts branch 2 times, most recently from 7ffb0b3 to 896e303 on November 18, 2024 13:54
@Matthieu-OD Matthieu-OD force-pushed the matt/eng-2115-add-client-caching-for-prompts branch from 896e303 to 3730581 on November 18, 2024 13:58
Matthieu-OD (Collaborator, Author) commented:
After discussing with @clementsirieix, I'm changing the implementation to avoid an overly specific or complex solution. I'm also increasing the timeout from 1 to 2 seconds.

@Matthieu-OD Matthieu-OD force-pushed the matt/eng-2115-add-client-caching-for-prompts branch from 0bd9044 to 85c72d1 on November 20, 2024 09:52