
Contextual_top2vec error #362

Open
kirrat975 opened this issue Nov 14, 2024 · 2 comments

Comments


kirrat975 commented Nov 14, 2024

@ddangelov
@ddangelov
An error has been occurring since the new contextual_top2vec was added. When I use embedding_model='doc2vec', this error occurs at the topic-finding stage:
2024-11-14 14:56:44,195 - top2vec - INFO - Pre-processing documents for training
INFO:top2vec:Pre-processing documents for training
2024-11-14 14:56:45,567 - top2vec - INFO - Creating joint document/word embedding
INFO:top2vec:Creating joint document/word embedding
2024-11-14 15:00:05,393 - top2vec - INFO - Creating lower dimension embedding of documents
INFO:top2vec:Creating lower dimension embedding of documents
/usr/local/lib/python3.10/dist-packages/umap/umap_.py:1952: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism.
warn(
2024-11-14 15:00:20,966 - top2vec - INFO - Finding dense areas of documents
INFO:top2vec:Finding dense areas of documents
2024-11-14 15:00:21,130 - top2vec - INFO - Finding topics
INFO:top2vec:Finding topics
AttributeError                            Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/top2vec/top2vec.py in init(self, documents, contextual_top2vec, c_top2vec_smoothing_window, min_count, topic_merge_delta, ngram_vocab, ngram_vocab_args, embedding_model, embedding_model_path, embedding_batch_size, split_documents, document_chunker, chunk_length, max_num_chunks, chunk_overlap_ratio, chunk_len_coverage_ratio, sentencizer, speed, use_corpus_file, document_ids, keep_documents, workers, tokenizer, use_embedding_model_tokenizer, umap_args, gpu_umap, hdbscan_args, gpu_hdbscan, index_topics, verbose)
780 self.topics_indexed = False
781
--> 782 self.compute_topics(umap_args=umap_args,
783 hdbscan_args=hdbscan_args,
784 topic_merge_delta=topic_merge_delta,

/usr/local/lib/python3.10/dist-packages/top2vec/top2vec.py in compute_topics(self, umap_args, hdbscan_args, topic_merge_delta, gpu_umap, gpu_hdbscan, index_topics, contextual_top2vec, c_top2vec_smoothing_window)
1597 self.hierarchy = None
1598
-> 1599 if self.contextual_top2vec & contextual_top2vec:
1600
1601 # smooth document token embeddings

AttributeError: 'Top2Vec' object has no attribute 'contextual_top2vec'.
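For reference, the failure pattern in the traceback is an attribute being read in compute_topics before __init__ ever assigns it. A minimal stand-alone sketch of the pattern and the obvious fix (class and attribute layout here are illustrative, not the library's actual code):

```python
class Model:
    """Minimal stand-in reproducing the failure pattern (not Top2Vec itself)."""

    def __init__(self, contextual_top2vec=False):
        # The traceback shows compute_topics() reading self.contextual_top2vec,
        # so the attribute must be assigned *before* that call; otherwise an
        # AttributeError is raised, exactly as above.
        self.contextual_top2vec = contextual_top2vec
        self.mode = self.compute_topics(contextual_top2vec)

    def compute_topics(self, contextual_top2vec):
        # `and` is the idiomatic boolean operator here; `&` (as in the
        # traceback) is the bitwise operator, though it also works on bools.
        if self.contextual_top2vec and contextual_top2vec:
            return "contextual"
        return "standard"
```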
Kindly suggest a way to resolve this; I am using this model in a project and the deadline is near.

ddangelov (Owner) commented:

I just pushed a fix, good catch. Any specific reason you are using doc2vec as the embedding model?

kirrat975 (Author) commented:

@ddangelov At first I was using the Universal Sentence Encoder, but it caused an import error, so I used this instead. Can you suggest an embedding model that is good for domain-specific (technical) topics?
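For domain-specific text, one commonly suggested route (an assumption here, not maintainer advice) is a sentence-transformers model fine-tuned on similar text; Top2Vec also allows passing a callable as embedding_model, so any domain-specific encoder can be plugged in. A sketch of the callable contract, with a dummy embed function standing in for a real encoder:

```python
import numpy as np

def embed(documents):
    # Stand-in for a real domain-specific encoder (e.g. a sentence-transformers
    # model): the callable must return one fixed-width vector per document.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(documents), 512)).astype(np.float32)

# Usage with Top2Vec (assumes top2vec is installed):
# from top2vec import Top2Vec
# model = Top2Vec(documents, embedding_model=embed)

vectors = embed(["first doc", "second doc"])
```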
