Zero-shot predefined Topics #2164

Open
mahmawad opened this issue Oct 3, 2024 · 1 comment

mahmawad commented Oct 3, 2024

Hi, thanks again for your great tool.

I have a question regarding predefined topics. Whenever I pass a zeroshot_topic_list, I get different generated topics rather than the ones I added. Is there a way to do topic modeling based only on the topics in zeroshot_topic_list?

Code:

from bertopic import BERTopic
# Initialize and train BERTopic model
topic_model = BERTopic(
    embedding_model=embedding_model,
    vectorizer_model=vectorizer_model,
    umap_model=umap_model,
    calculate_probabilities=True,
    #hdbscan_model=hdbscan_model,
    representation_model=representation_model,
    verbose=True,
    nr_topics=15,
    min_topic_size=25,
    zeroshot_topic_list=zeroshot_topic_list,
    zeroshot_min_similarity=.85,
)

# Fit the topic model and transform the data
topics, probs = topic_model.fit_transform(df['PreprocessedText'].values)

MaartenGr (Owner) commented

Yes, you only need to set zeroshot_min_similarity to 0 and it will select topics only from zeroshot_topic_list.
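
To illustrate why lowering the threshold has this effect, here is a minimal sketch of threshold-based assignment. It is a simplified model and not BERTopic's actual implementation: the helper names (`cosine`, `assign_zeroshot`) and the toy 2-D vectors are hypothetical, and it assumes each document is compared to the predefined topic labels by cosine similarity, with documents below the threshold left over for regular clustering (which is where the extra generated topics come from). In the snippet from the question, the corresponding change would be `zeroshot_min_similarity=0`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def assign_zeroshot(doc_vecs, topic_vecs, min_similarity):
    """Hypothetical helper: map each document to its closest predefined topic
    if the similarity reaches min_similarity; otherwise mark it as leftover."""
    assigned, leftover = {}, []
    for i, doc in enumerate(doc_vecs):
        sims = [cosine(doc, topic) for topic in topic_vecs]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= min_similarity:
            assigned[i] = best
        else:
            leftover.append(i)  # would go to regular clustering
    return assigned, leftover

# Toy embeddings (assumed for illustration only).
topic_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[0.9, 0.1], [0.6, 0.5], [0.2, 0.8]]

# With a strict threshold, the ambiguous middle document is left over.
strict_assigned, strict_leftover = assign_zeroshot(doc_vecs, topic_vecs, 0.85)

# With a threshold of 0, every document here lands in a predefined topic.
loose_assigned, loose_leftover = assign_zeroshot(doc_vecs, topic_vecs, 0.0)

print(strict_assigned, strict_leftover)
print(loose_assigned, loose_leftover)
```

With the threshold at 0.85, the second toy document matches neither label closely enough and is left over; with the threshold at 0, nothing is left over in this example.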
