Build KNN index in parallel #386

Merged (5 commits) on Dec 27, 2023

@small-turtle-1 (Contributor) commented Dec 27, 2023

What problem does this PR solve?

Build the HNSW index in parallel.

Issue link:
#341

What is changed and how it works?

Note: To apply this PR, use the round-robin scheduling strategy in "task_scheduler.cpp".

  1. Add a mutex to the HNSW algorithm so the index can be accessed from multiple threads.
  2. Add three physical nodes: CreateIndexPrepare, CreateIndexDo, and CreateIndexFinish.
  3. The three physical nodes have different degrees of parallelism, so each is placed in its own fragment.
  4. Use QueueSink and QueueSource to ensure the tasks execute in topological order.
  5. Change the content type of the QueueSource/QueueSink queue to a base class, so the queue can carry different kinds of information.
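
To illustrate the idea behind the list above, here is a minimal, self-contained sketch (not the code in this PR). It assumes three stages with parallelism 1, N, and 1, connected by blocking queues whose element type is a base class. `QueueMessage`, `StageDone`, `MessageQueue`, `ToyIndex`, and `BuildIndexInParallel` are hypothetical names invented for this example, and the toy index is a plain `std::set`, not an HNSW graph.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical base class: the queue stores pointers to it, so different
// message types can travel through the same queue.
struct QueueMessage {
    virtual ~QueueMessage() = default;
};

// Hypothetical message meaning "one task of stage `stage` has finished".
struct StageDone : QueueMessage {
    explicit StageDone(int s) : stage(s) {}
    int stage;
};

// Minimal blocking queue standing in for a sink/source pair.
class MessageQueue {
public:
    void Push(std::unique_ptr<QueueMessage> msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        cv_.notify_one();
    }
    std::unique_ptr<QueueMessage> Pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        auto msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::unique_ptr<QueueMessage>> queue_;
};

// Toy stand-in for the index: a mutex serializes concurrent inserts.
class ToyIndex {
public:
    void Insert(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        ids_.insert(id);
    }
    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ids_.size();
    }
private:
    mutable std::mutex mutex_;
    std::set<int> ids_;
};

// Runs prepare (1 task) -> do (worker_count tasks) -> finish (1 task).
// worker_count must be at least 1.
std::size_t BuildIndexInParallel(int row_count, int worker_count) {
    ToyIndex index;
    MessageQueue prepare_to_do, do_to_finish;
    std::size_t result = 0;

    // Finish stage: waits for one message from every "do" task.
    std::thread finish([&] {
        for (int i = 0; i < worker_count; ++i) {
            auto msg = do_to_finish.Pop();
            auto* done = dynamic_cast<StageDone*>(msg.get());
            assert(done != nullptr && done->stage == 1);
            (void)done;
        }
        result = index.Size();
    });
    // Do stage: each task blocks until the prepare stage has signalled it.
    std::vector<std::thread> workers;
    for (int w = 0; w < worker_count; ++w) {
        workers.emplace_back([&, w] {
            prepare_to_do.Pop();
            for (int id = w; id < row_count; id += worker_count) index.Insert(id);
            do_to_finish.Push(std::make_unique<StageDone>(1));
        });
    }
    // Prepare stage: sends one message per downstream task.
    std::thread prepare([&] {
        for (int i = 0; i < worker_count; ++i)
            prepare_to_do.Push(std::make_unique<StageDone>(0));
    });

    prepare.join();
    for (auto& t : workers) t.join();
    finish.join();
    return result;
}
```

The threads are deliberately started in reverse stage order; the queues, not the start order, are what force prepare to complete before do, and do before finish. Calling `BuildIndexInParallel(1000, 4)` inserts every row exactly once across four workers.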

Code changes

  • Has Code change
  • Has CI related scripts change

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Note for reviewer

Add a SIMD function for residual-dimension embeddings.
Merge with yangzq. Remove some heap operations in the HNSW algorithm.
@yingfeng changed the title from "Concurrent index." to "Build KNN index in parallel" on Dec 27, 2023
@JinHai-CN merged commit f7dbfa2 into infiniflow:main on Dec 27, 2023
2 checks passed
@small-turtle-1 deleted the concurrent_index branch on December 27, 2023 06:15