Disable embedding creation #161

Open
m-i-l opened this issue Oct 18, 2024 · 1 comment

Comments

Contributor

m-i-l commented Oct 18, 2024

Since the implementation of #99 in Aug 2023, the server has been chugging away producing vector embeddings for all indexed content. However, as per https://blog.searchmysite.net/posts/four-year-retrospective/, the results of the vector search (used by the Retrieval Augmented Generation) haven't been good enough to make the links to the new search visible on the main site. So basically it has been burning a lot of CPU power for well over a year with nothing user-visible to show for it.

This change is to disable the embedding creation, so that indexing can complete more quickly. It would be sensible to delete all existing embeddings, so that they don't become stale.
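As a rough illustration of the clean-up step, deleting the stored embeddings could be done with a Solr delete-by-query. The sketch below only builds the request rather than sending it. The core URL and the query that selects the embedding documents are assumptions for illustration, not taken from the actual schema.

```python
import json
from typing import Tuple

# Hypothetical values: the real core name and the field that identifies
# embedding documents may differ from these.
SOLR_UPDATE_URL = "http://localhost:8983/solr/content/update"
EMBEDDING_DOC_QUERY = "relationship:child"


def build_delete_request(query: str = EMBEDDING_DOC_QUERY) -> Tuple[str, bytes]:
    """Return the URL and JSON body for a Solr delete-by-query request."""
    # Solr's JSON update format accepts {"delete": {"query": "..."}}.
    body = json.dumps({"delete": {"query": query}}).encode("utf-8")
    # commit=true asks Solr to commit the deletion as part of this request.
    return f"{SOLR_UPDATE_URL}?commit=true", body
```

The returned URL and body would then be POSTed with a `Content-Type: application/json` header. It would be worth checking the query against a small sample first, since a delete-by-query cannot be undone without a reindex.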

All the vector search code should be left in so that it can easily be re-enabled at a later date if required.
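One way to keep the code in place but switched off would be a simple flag around the embedding step. This is only a sketch: the `ENABLE_EMBEDDINGS` variable, the `index_page` function and the `embed` callable are hypothetical names for illustration, not the project's actual code.

```python
import os
from typing import Callable, Dict, List, Mapping, Optional


def embeddings_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if the (hypothetical) ENABLE_EMBEDDINGS flag is set."""
    env = os.environ if env is None else env
    return env.get("ENABLE_EMBEDDINGS", "false").strip().lower() in {"1", "true", "yes"}


def index_page(
    text: str,
    embed: Callable[[str], List[float]],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, object]:
    """Build an index document, adding content_chunks only when enabled."""
    doc: Dict[str, object] = {"content": text}
    if embeddings_enabled(env):
        # Assumed chunking: split on blank lines and skip empty chunks.
        doc["content_chunks"] = [
            {"text": chunk, "vector": embed(chunk)}
            for chunk in text.split("\n\n")
            if chunk.strip()
        ]
    return doc
```

With the flag defaulting to off, the expensive `embed` call is never reached during indexing, and re-enabling later would just be a config change rather than a code change.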

Contributor Author

m-i-l commented Nov 3, 2024

I've removed the models container, and stopped the content_chunks creation, so that should save a little on space and a lot on indexing time. I haven't deleted existing embeddings just yet, and there is still a lot of the vector search code and config around, so leaving this open for now.
