[GDrive] Add caching of downloaded files #492

Open
wants to merge 7 commits into
base: main

scottmx81 (Contributor) commented Sep 12, 2024

What's being changed:

This PR adds an optional caching feature to the Google Drive connector. Downloaded documents can be stored either in Redis or in the Python process itself using cachetools, which requires no additional service.

By default, downloaded documents are cached for 1 hour.
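
As a rough illustration of the behaviour described above (not the code in this PR), the sketch below shows an in-process cache with a 1-hour default expiry, selected by the GDRIVE_CACHE_TYPE env var mentioned in the testing notes. The names `InMemoryTTLCache`, `make_cache`, and `DEFAULT_TTL_SECONDS` are hypothetical, and a hand-rolled dictionary stands in for cachetools so the example has no third-party dependencies. The Redis option is deliberately left out.

```python
import os
import time
from typing import Callable, Dict, Optional, Tuple

# Assumption: 1 hour, matching the default stated in the PR description.
DEFAULT_TTL_SECONDS = 3600


class InMemoryTTLCache:
    """Tiny TTL cache; a stand-in for cachetools, not the PR's implementation."""

    def __init__(self, ttl: float = DEFAULT_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        # Maps a file ID to (expiry timestamp, file content).
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, file_id: str) -> Optional[bytes]:
        entry = self._entries.get(file_id)
        if entry is None:
            return None
        expires_at, content = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[file_id]
            return None
        return content

    def set(self, file_id: str, content: bytes) -> None:
        self._entries[file_id] = (self._clock() + self._ttl, content)


def make_cache() -> Optional[InMemoryTTLCache]:
    """Pick a cache from GDRIVE_CACHE_TYPE; unset or blank disables caching."""
    cache_type = os.environ.get("GDRIVE_CACHE_TYPE", "").strip().lower()
    if not cache_type:
        return None
    if cache_type == "memory":
        return InMemoryTTLCache()
    # A "redis" branch would go here; omitted to keep the sketch dependency-free.
    raise ValueError(f"unsupported cache type in this sketch: {cache_type!r}")
```

Passing the clock in as a parameter is only a convenience for testing expiry without sleeping; the actual connector may structure this differently.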

The goal is to reduce the connector's response time when a user asks a series of questions whose search queries keep returning the same documents. In my testing, downloading the documents was the most time-consuming part of responding to a search request; Google responds relatively quickly to the Google Drive search query itself.

How did you test this change (include any code snippets, API requests, screenshots, or gifs):

I made requests from Coral to the connector running in my local environment, exposed via ngrok. Locally, I ran the connector with the new GDRIVE_CACHE_TYPE env var unset, set to a blank value, and set to the memory and redis options. I added temporary log statements in provider/async_download.py to show which files were being downloaded and how long each download took; with that debug logging I could see whether requests were hitting the cache. I have not committed the debug logging.
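
The uncommitted debug logging is not shown in this PR, so the following is only a guess at its general shape: a wrapper that logs cache hits and times each download on a miss. The function name `fetch_with_logging`, the logger name, and the plain dictionary used as the cache are all hypothetical.

```python
import logging
import time
from typing import Callable, Dict

# Hypothetical logger name; the connector's real logger may differ.
logger = logging.getLogger("gdrive.cache_debug")


def fetch_with_logging(file_id: str, cache: Dict[str, bytes],
                       download: Callable[[str], bytes]) -> bytes:
    """Return file content, logging cache hits and download duration."""
    if file_id in cache:
        logger.debug("cache hit for %s", file_id)
        return cache[file_id]
    # Cache miss: time the download so slow files stand out in the logs.
    started = time.perf_counter()
    content = download(file_id)
    elapsed = time.perf_counter() - started
    logger.debug("cache miss for %s; downloaded in %.3fs", file_id, elapsed)
    cache[file_id] = content
    return content
```

Repeating the same search with debug logging enabled should then show misses on the first request and hits on later ones, which matches the check described above.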

scottmx81 requested a review from a team as a code owner on Sep 12, 2024, 19:44