update, delete, search by metadata methods #22

Open
Nixellion opened this issue Jul 17, 2024 · 1 comment

Comments

@Nixellion

Awesome little library, so much more robust and easy to work with compared to alternatives.

If I may suggest, it would be great to add update and delete functions, and to extend search to allow searching by metadata.

The use case I have in mind is for it to periodically (or before processing each request to the AI) check the files in a documents folder and update itself. The logic would be to use metadata to store file paths and hashsums, then walk the directory and first check whether the filepath is already in Memory. If it's not, add it. If it is, check the hashsum. If the hashsum does not match, update it. Something like:

for filepath in filepaths:
    # text = read file; meta = {'filepath': filepath, 'hash': <hashsum of text>}
    # Proposed: pass a dict and match entries where key == value for every pair in it (AND operator).
    existing_entry = memory.search_by_metadata({'filepath': filepath})
    if existing_entry is None:
        memory.save([text], [meta])
    else:
        # Proposed: uses the same matching logic to find the entry to update.
        # If multiple entries are found, maybe error out; the filter is assumed to be a unique identifier.
        # Only needed when the stored hashsum differs from the current one.
        memory.update({'filepath': filepath}, [text], [meta])

Something like this would be sufficient and should be easy to implement. Alternatively, update could take entry IDs, which would also work, but in that case search should also return the entry ID.
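
To make the idea above concrete, here is a minimal runnable sketch of the sync loop. It does not use the library itself: `ToyMemory` is a hypothetical in-memory stand-in, and `search_by_metadata`, `update`, and `sync_folder` are assumed names for the proposed behaviour, not existing API. Entries are assumed to be dicts with `text` and `metadata` keys, and SHA-256 is just one reasonable choice of hashsum.

```python
import hashlib
from pathlib import Path


class ToyMemory:
    """Hypothetical stand-in for the library's Memory; not its real API."""

    def __init__(self):
        self.entries = []  # each entry: {"text": str, "metadata": dict}

    def save(self, texts, metadata):
        for text, meta in zip(texts, metadata):
            self.entries.append({"text": text, "metadata": dict(meta)})

    def search_by_metadata(self, filters):
        # AND semantics: every key/value pair in `filters` must match.
        return [
            e for e in self.entries
            if all(e["metadata"].get(k) == v for k, v in filters.items())
        ]

    def update(self, filters, texts, metadata):
        # Error out unless the filter identifies exactly one entry.
        matches = self.search_by_metadata(filters)
        if len(matches) != 1:
            raise ValueError(f"expected exactly one match, found {len(matches)}")
        matches[0]["text"] = texts[0]
        matches[0]["metadata"] = dict(metadata[0])


def sync_folder(memory, folder):
    """Add new files and refresh changed ones; returns (added, updated) counts."""
    added = updated = 0
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        meta = {"filepath": str(path), "hash": hashlib.sha256(data).hexdigest()}
        text = data.decode("utf-8", errors="replace")
        existing = memory.search_by_metadata({"filepath": meta["filepath"]})
        if not existing:
            memory.save([text], [meta])
            added += 1
        elif existing[0]["metadata"]["hash"] != meta["hash"]:
            memory.update({"filepath": meta["filepath"]}, [text], [meta])
            updated += 1
    return added, updated
```

Calling `sync_folder(memory, "documents")` before each request would then touch only new or changed files. Handling files that were removed from the folder would need the delete function as well, which this sketch leaves out.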

Also, I use search_by_metadata here, but the search function could instead be extended to accept a metadata dict. It could then also be used to filter results by metadata, for example to create categories. It would first perform the metadata search and then the embeddings search, which I suppose would also speed up the process. Or the other way around, if the embeddings search is faster than the metadata search.

And if the query is empty, search could perform only the metadata search. For example:

memory.search(None, metadata={"key": value})
memory.search('', metadata={"key": value})
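
A rough sketch of how such an extended search could behave, filtering by metadata first and ranking by similarity second, with a metadata-only fallback for an empty query. Everything here is an assumption for illustration: the standalone `search` function, the `entries` list of dicts, and the toy bag-of-words `embed` that stands in for a real embedding model. None of it is the library's actual implementation.

```python
import math
from collections import Counter


def embed(text):
    # Toy bag-of-words vector; a real store would use a learned embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(entries, query, metadata=None, top_n=3):
    """Filter by metadata (AND over key/value pairs), then rank by similarity."""
    candidates = [
        e for e in entries
        if all(e["metadata"].get(k) == v for k, v in (metadata or {}).items())
    ]
    if not query:  # None or '' means metadata-only search, original order kept
        return candidates
    q = embed(query)
    ranked = sorted(candidates, key=lambda e: cosine(q, embed(e["text"])), reverse=True)
    return ranked[:top_n]


entries = [
    {"text": "how to bake sourdough bread", "metadata": {"category": "cooking"}},
    {"text": "bread and butter pudding", "metadata": {"category": "cooking"}},
    {"text": "bread crumbs navigation in web design", "metadata": {"category": "web"}},
]

cooking_only = search(entries, None, metadata={"category": "cooking"})
best_cooking = search(entries, "sourdough", metadata={"category": "cooking"}, top_n=1)
```

Filtering first means the similarity step only scores the entries that survive the filter, which is where the speed-up suggested above would come from when the filter is selective.
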
@vprelovac
Contributor

Feel free to submit a PR :)
