TLSH_PG: Implement an index to speed up searches #1

Open
rjzak opened this issue Sep 26, 2023 · 2 comments
Labels
performance Runtime Performance Improvements

Comments

@rjzak
Member

rjzak commented Sep 26, 2023

After a few million records, searching for similar files becomes slow.
See: https://tlsh.org/papers.html

Ref: malwaredb/malwaredb-rs#165

@rjzak rjzak changed the title Implement an index to speed up searches TLSH: Implement an index to speed up searches Sep 26, 2023
@rjzak rjzak added the performance Runtime Performance Improvements label Sep 26, 2023
@rjzak rjzak added this to MalwareDB Sep 26, 2023
@rjzak rjzak moved this to Backlog in MalwareDB Sep 26, 2023
@rjzak rjzak changed the title TLSH: Implement an index to speed up searches TLSH_PG: Implement an index to speed up searches Sep 26, 2023
@zmt-Eason

Is this in progress? Do we have a milestone for this performance work?

@rjzak
Member Author

rjzak commented Oct 10, 2023

It's not in progress yet, and I don't have any milestones. I've been reading up on Postgres internals but haven't started work on it.

I still need to understand the TLSH indexes properly and see how the Postgres indexing API works. I'm hoping to get some inspiration from the pgvector project, which implements custom index types for Postgres.

Projects
Status: Backlog
Development

No branches or pull requests

2 participants