Parallel problem in MALLET LDA (gensim wrapper) #176
Comments
I have the same problem: for 12077 files (~5 GB) it takes 4 hours. It doesn't seem to be utilizing all the cores.
Unless this can be replicated in the Java-only version, there's not much to do here. I'd check with gensim.
@thisray This thread has been dormant for a while, but have you checked how many cores/threads your computer has? If it has fewer than 16, setting 16 could slow you down.
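To make the core-count check above concrete, here is a minimal sketch that reads the number of logical CPUs and caps a requested worker count at that number. The helper name clamp_workers is hypothetical (it is not part of gensim or MALLET), and the assumption is that the result would then be passed as the workers argument of the wrapper.

```python
import os


def clamp_workers(requested, available=None):
    """Cap a requested thread count at the number of logical CPUs.

    `clamp_workers` is a hypothetical helper for illustration only;
    it is not part of gensim or MALLET.
    """
    if available is None:
        # os.cpu_count() can return None when the count is undetermined.
        available = os.cpu_count() or 1
    # Never go below one worker, never above what the machine offers.
    return max(1, min(requested, available))


if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count())
    print("workers to request:", clamp_workers(16))
```

Note that os.cpu_count() reports logical CPUs for the whole machine, which on a hosted notebook may be far below 16, so it is worth printing it in each environment before comparing timings.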
Hi,
I use the gensim wrapper, LdaMallet() [link], to run MALLET.
The gensim library provides a workers parameter that sets the --num-threads argument in MALLET. (Ref: Gensim Code - line 274)
But workers does not seem to be working. Here are the different settings and running times:
Whether I run this on my computer:
or on Colab:
the results are similar: more workers take more time.
(I have also tried mallet-2.0.8 and mallet-2.0.7.)
Does this mean I am not running MALLET LDA in parallel the proper way?
Thanks!
reference code: