Fix cluster selection #182

Merged
merged 3 commits into from
Oct 5, 2023
15 changes: 15 additions & 0 deletions docs/guides/keyllm.md
@@ -18,6 +18,7 @@ This data was chosen to show the different use cases and techniques. As you migh

Let's start with `KeyLLM` only.


# Use Cases

If you want the full performance and easiest method, you can skip the use cases below and go straight to number 5 where you will combine `KeyBERT` with `KeyLLM`.
@@ -180,6 +181,13 @@ If you have embeddings of your documents, you could use those to find documents
--8<-- "docs/images/efficient.svg"
</div>

+!!! Tip
+    Before you get started, it might be worthwhile to uninstall sentence-transformers and re-install it from the main branch.
+    There is an issue with community detection (cluster) that might make the model run without finishing. It is as straightforward as:
+    `pip uninstall sentence-transformers`
+    `pip install --upgrade git+https://github.com/UKPLab/sentence-transformers`
+
+
```python
import openai
from keybert.llm import OpenAI
@@ -224,6 +232,13 @@ This is the best of both worlds. We use `KeyBERT` to generate a first pass of ke
--8<-- "docs/images/keybert_keyllm.svg"
</div>

+!!! Tip
+    Before you get started, it might be worthwhile to uninstall sentence-transformers and re-install it from the main branch.
+    There is an issue with community detection (cluster) that might make the model run without finishing. It is as straightforward as:
+    `pip uninstall sentence-transformers`
+    `pip install --upgrade git+https://github.com/UKPLab/sentence-transformers`
+
+
```python
import openai
from keybert.llm import OpenAI
2 changes: 1 addition & 1 deletion keybert/__init__.py
@@ -1,4 +1,4 @@
from keybert._llm import KeyLLM
from keybert._model import KeyBERT

-__version__ = "0.8.2"
+__version__ = "0.8.3"
2 changes: 1 addition & 1 deletion keybert/_llm.py
@@ -101,7 +101,7 @@ def extract_keywords(
 if in_cluster:
     selected_docs = [docs[cluster[0]] for cluster in clusters]
     if candidate_keywords is not None:
-        selected_keywords = [candidate_keywords[cluster[0]] for cluster in in_cluster]
+        selected_keywords = [candidate_keywords[cluster[0]] for cluster in clusters]
     else:
         selected_keywords = None
     in_cluster_keywords = self.llm.extract_keywords(
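Note on the `keybert/_llm.py` change: the sketch below illustrates why iterating over `clusters` rather than `in_cluster` matters. It is not the library code. It assumes `clusters` is a list of index lists (one list per community) and that `in_cluster` is the flattened set of all clustered indices; neither definition is visible in this hunk. The `docs`, `candidate_keywords`, and `select_representatives` names and values are hypothetical stand-ins.

```python
# Hypothetical data; in the real method these come from the caller and
# from community detection on the document embeddings.
docs = ["doc a", "doc b", "doc c", "doc d", "doc e"]
candidate_keywords = [["a"], ["b"], ["c"], ["d"], ["e"]]

# Assumption: each cluster is a list of document indices.
clusters = [[0, 2], [1, 4]]
# Assumption: in_cluster is the flat set of every index that is in some cluster.
in_cluster = {index for cluster in clusters for index in cluster}


def select_representatives(docs, candidate_keywords, clusters):
    """Pick the first document (and its candidate keywords) of each cluster."""
    selected_docs = [docs[cluster[0]] for cluster in clusters]
    selected_keywords = None
    if candidate_keywords is not None:
        # Fixed behaviour: index with the first member of each cluster.
        selected_keywords = [candidate_keywords[cluster[0]] for cluster in clusters]
    return selected_docs, selected_keywords


selected_docs, selected_keywords = select_representatives(
    docs, candidate_keywords, clusters
)

# Pre-fix behaviour: the loop variable is an int taken from the flat set,
# so subscripting it with [0] raises TypeError.
old_error = None
try:
    [candidate_keywords[cluster[0]] for cluster in in_cluster]
except TypeError as exc:
    old_error = exc
```

Under these assumptions the old line could only fail when `candidate_keywords` was supplied together with embeddings, which matches the combined `KeyBERT` + `KeyLLM` use case described in the docs change above.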
2 changes: 1 addition & 1 deletion setup.py
@@ -37,7 +37,7 @@
setup(
name="keybert",
packages=find_packages(exclude=["notebooks", "docs"]),
-version="0.8.2",
+version="0.8.3",
author="Maarten Grootendorst",
author_email="[email protected]",
description="KeyBERT performs keyword extraction with state-of-the-art transformer models.",