generated from JetBrains/intellij-platform-plugin-template
Implement fast similarity search #200
Labels
help wanted
Extra attention is needed
Comments
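We implemented a similar feature in the VSCode version, but it made the plugin too large, and we don't currently have the bandwidth to port it to IDEA. For details, see https://github.com/unit-mesh/auto-dev-vscode. The ideal approach would probably be to use a separate embedding package and a vector database. PRs are welcome.
Are there any embedding packages or vector databases you would recommend?
You can refer to the VSCode version.
Option 1: use the TF-IDF algorithm. It is what Copilot mainly uses, and compared with embeddings and the like it is quite reliable.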
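To make the TF-IDF suggestion concrete, here is a minimal Kotlin sketch that ranks text snippets against a query by cosine similarity of TF-IDF vectors. It is an illustration under stated assumptions, not how Copilot or this plugin implements it: the function names (`tokenize`, `tfidfVectors`, `cosine`, `rankBySimilarity`), the simple tokenizer, and the smoothed IDF formula are all choices made for this example.

```kotlin
import kotlin.math.ln
import kotlin.math.sqrt

// Hypothetical helper: split text into lowercase alphanumeric tokens.
fun tokenize(text: String): List<String> =
    text.lowercase().split(Regex("[^a-z0-9_]+")).filter { it.isNotEmpty() }

// Build one sparse TF-IDF vector (term -> weight) per tokenized document.
fun tfidfVectors(docs: List<List<String>>): List<Map<String, Double>> {
    val n = docs.size.toDouble()
    // Document frequency: the number of documents that contain each term.
    val df = HashMap<String, Int>()
    for (doc in docs) for (term in doc.toSet()) df[term] = (df[term] ?: 0) + 1
    return docs.map { doc ->
        doc.groupingBy { it }.eachCount().mapValues { (term, count) ->
            val tf = count.toDouble() / doc.size
            // Smoothed IDF (an assumption of this sketch) keeps every weight positive.
            val idf = ln((1.0 + n) / (1.0 + df.getValue(term))) + 1.0
            tf * idf
        }
    }
}

// Cosine similarity between two sparse vectors; 0.0 if either vector is empty.
fun cosine(a: Map<String, Double>, b: Map<String, Double>): Double {
    val dot = a.entries.sumOf { (term, w) -> w * (b[term] ?: 0.0) }
    val normA = sqrt(a.values.sumOf { it * it })
    val normB = sqrt(b.values.sumOf { it * it })
    return if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

// Rank snippets by similarity to the query, most similar first.
fun rankBySimilarity(query: String, snippets: List<String>): List<Pair<String, Double>> {
    // The query is vectorized together with the snippets so they share one vocabulary.
    val vectors = tfidfVectors((snippets + query).map(::tokenize))
    val queryVector = vectors.last()
    return snippets.indices
        .map { snippets[it] to cosine(queryVector, vectors[it]) }
        .sortedByDescending { it.second }
}

fun main() {
    val snippets = listOf(
        "parse json text into object",
        "render button with label",
        "load json file from path",
    )
    rankBySimilarity("parse json text", snippets)
        .forEach { (snippet, score) -> println("%.3f  %s".format(score, snippet)) }
}
```

This approach needs no model file or vector database, which fits the plugin-size concern raised earlier in the thread. The trade-off is that it only matches shared tokens, so real code would likely need a tokenizer that splits identifiers such as camelCase names.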
phodal added a commit that referenced this issue on Aug 7, 2024
Deleted the `StandardTextChunk.kt` file from the `src/main/kotlin/cc/unitmesh/devti/agent/model` directory as it was not in use.
phodal added a commit that referenced this issue on Aug 7, 2024
This commit introduces the LocalEmbedding class, which provides functionality to generate text embeddings using an ONNX model and HuggingFace tokenizer. The class includes a suspendable embed function and supports parallel processing for embedding generation. It also features a companion object to create instances of the LocalEmbedding class with a default model.
phodal added a commit that referenced this issue on Aug 7, 2024
…search indices #200 This commit introduces two classes, `InMemoryEmbeddingSearchIndex` and `DiskSynchronizedEmbeddingSearchIndex`, which implement the `EmbeddingSearchIndex` interface. These classes provide methods for adding, updating, and deleting embedding entries, as well as searching for the closest embeddings to a given query embedding. The `InMemoryEmbeddingSearchIndex` stores all embeddings in memory, while the `DiskSynchronizedEmbeddingSearchIndex` synchronizes index changes with disk storage. Additionally, the commit includes a `LockedSequenceWrapper` class to safely iterate over embeddings under a lock, as well as utility functions for calculating embedding similarity and normalization. Overall, these classes provide efficient and thread-safe ways to manage and search embedding indices.
feat(embedding): add in-memory and disk-synced embedding search index This commit introduces a new in-memory and disk-synchronized embedding search index. The in-memory index stores embeddings in memory and supports concurrent read operations, while the disk-synchronized index maintains index synchronization with disk storage. Both indices implement the EmbeddingSearchIndex interface, providing methods for adding entries, saving/loading from disk, finding closest embeddings, and more. Additionally, the LockedSequenceWrapper ensures thread-safe iteration over the index.
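As a rough illustration of what an in-memory embedding index with closest-match search can look like, the Kotlin sketch below stores normalized vectors behind a read-write lock and ranks them by cosine similarity. The class `SimpleInMemoryIndex` and all of its members are hypothetical names made up for this example. It is not the plugin's `InMemoryEmbeddingSearchIndex`, and it does not implement the `EmbeddingSearchIndex` interface, whose exact signatures are not shown in this thread.

```kotlin
import java.util.concurrent.locks.ReentrantReadWriteLock
import kotlin.concurrent.read
import kotlin.concurrent.write
import kotlin.math.sqrt

// Hypothetical class for illustration only; not the plugin's implementation.
class SimpleInMemoryIndex {
    // Many readers may search at once; writers get exclusive access.
    private val lock = ReentrantReadWriteLock()
    private val entries = HashMap<String, FloatArray>()

    // Store a unit-length copy so that a dot product equals cosine similarity.
    fun add(id: String, embedding: FloatArray) {
        lock.write { entries[id] = normalized(embedding) }
    }

    fun remove(id: String) {
        lock.write { entries.remove(id) }
    }

    // Return the topK entries most similar to the query, best match first.
    fun findClosest(query: FloatArray, topK: Int): List<Pair<String, Float>> = lock.read {
        val q = normalized(query)
        entries.map { (id, vector) -> id to dot(q, vector) }
            .sortedByDescending { it.second }
            .take(topK)
    }

    private fun normalized(v: FloatArray): FloatArray {
        val norm = sqrt(v.fold(0f) { acc, x -> acc + x * x })
        return if (norm == 0f) v.copyOf() else FloatArray(v.size) { v[it] / norm }
    }

    private fun dot(a: FloatArray, b: FloatArray): Float {
        require(a.size == b.size) { "Embedding dimensions differ" }
        var sum = 0f
        for (i in a.indices) sum += a[i] * b[i]
        return sum
    }
}

fun main() {
    val index = SimpleInMemoryIndex()
    index.add("a", floatArrayOf(1f, 0f))
    index.add("b", floatArrayOf(0f, 1f))
    index.add("c", floatArrayOf(1f, 1f))
    println(index.findClosest(floatArrayOf(2f, 0.1f), topK = 2).map { it.first })
}
```

The search here is a linear scan over every stored vector, which is simple and usually adequate for a few thousand entries. A larger index would probably call for an approximate nearest-neighbor structure instead.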
phodal added a commit that referenced this issue on Aug 8, 2024
…urrentHashMap.newKeySet #200 This change replaces the usage of `ConcurrentCollectionFactory` with `ConcurrentHashMap.newKeySet` for creating a concurrent set of unchecked IDs, simplifying the dependency and utilizing a more direct approach for concurrent set creation in `DiskSynchronizedEmbeddingSearchIndex`.
phodal added a commit that referenced this issue on Aug 8, 2024
…th ConcurrentHashMap.newKeySet #200 This change improves the performance and memory usage by utilizing the built-in `ConcurrentHashMap.newKeySet()` for the `uncheckedIds` set, which provides a more efficient concurrent implementation.
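For readers unfamiliar with `ConcurrentHashMap.newKeySet()`, the short Kotlin sketch below shows the pattern the two commits describe: a thread-safe set created directly from the JDK, with no extra factory dependency. The function `collectIdsConcurrently` and the ID format are hypothetical. Only the variable name `uncheckedIds` is taken from the commit message, and how the plugin actually uses that set is not shown here.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import kotlin.concurrent.thread

// Hypothetical demo: several threads add IDs to one shared concurrent set.
fun collectIdsConcurrently(workers: Int, idsPerWorker: Int): Set<String> {
    // newKeySet() returns a Set view backed by a ConcurrentHashMap,
    // so concurrent add/remove/contains calls need no external locking.
    val uncheckedIds: MutableSet<String> = ConcurrentHashMap.newKeySet<String>()
    val threads = (0 until workers).map { w ->
        thread { for (i in 0 until idsPerWorker) uncheckedIds.add("id-$w-$i") }
    }
    threads.forEach { it.join() }
    return uncheckedIds
}

fun main() {
    println(collectIdsConcurrently(workers = 4, idsPerWorker = 1000).size)
}
```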
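Awesome, it's really being implemented. I'll give it a try.
I took a look at the code and it's still a work in progress. Keep it up!
Only the interface is supported so far; the functionality hasn't been implemented yet.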
Closed