Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add on-disk node resolution for HNSW #347

Open
7 of 13 tasks
kevmo314 opened this issue Jun 18, 2024 · 0 comments
Open
7 of 13 tasks

Add on-disk node resolution for HNSW #347

kevmo314 opened this issue Jun 18, 2024 · 0 comments
Assignees

Comments

@kevmo314
Copy link
Owner

kevmo314 commented Jun 18, 2024

The goal of this feature is to tie the HNSW in-memory map of nodes to adjacencies to disk. This will need a few steps:

  • Create a HNSWAdjacencyPage struct that is of type:
type HNSWAdjacencyPage [16][8]uint32

Note that this is exactly 4kB by design.

  • Create an adjacency manager that has the following ops:
    • GetAdjacencies(x) [8]uint32
    • AddAdjacency(x, y)
    • RemoveAdjacency(x, y)
    • AddNode(x) <-- this function is new, it will do nothing in the map case but we need to wire it up.

Observe that this essentially wraps a map for an in-memory structure so it will replace the current friends lookups directly. For this item, just implement it as a thin wrapper around the existing map.

  • Update the adjacency manager to create a constructor that takes in a B+ tree and update the operations:

    • For GetAdjacencies, look up the memory pointer in the B+ tree and read the 8 * 4 bytes following to get the 8 adjacencies
    • For AddAdjacencies, look up the memory pointer and set the next zero element, writing it back to disk.
    • For RemoveAdjacencies, ...
  • Update AddNode(x) to:

    • Check if it's in the B+ tree, if it is, do nothing
    • If it's not, find the newest HNSWAdjancencyPage and add a new block of 4 bytes representing its adjacencies, then add this to the B+ tree.

On the last item, it may be beneficial to write a page manager sort of like how we have the page file, or maybe it's too much boilerplate. Up to you. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants