A request for evaluation script #2

Open
lei-liu1 opened this issue Sep 16, 2022 · 5 comments

@lei-liu1

Hello,
Could you please share the script you used for evaluation? The results obtained may differ depending on the evaluation script used.
Thanks a lot!

Best wishes,
Lei

@yumeng5 (Owner) commented Sep 16, 2022

Hi,

The evaluation code should be very straightforward: for topic coherence evaluation, you could pass the output topics to the evaluation pipeline of the gensim library (roughly as in the sketch below). For document clustering, you could refer to the README file. I hope these help!
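
A minimal sketch of that pipeline (the names here are placeholders: `topics` stands for the list of top words per topic, and `docs` for the tokenized corpus):

from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

# Placeholder inputs: `topics` is a list of word lists (one per topic),
# `docs` is a list of tokenized documents.
dct = Dictionary(docs)
cm = CoherenceModel(topics=topics, texts=docs, dictionary=dct, coherence='c_uci')
print(cm.get_coherence())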

Best,
Yu

@lei-liu1 (Author)

Hello,

Thanks for your reply.
I wrote a function using the gensim library, as you suggested, and added it to the TopClusUtils class:

import os

from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

def compute_coherence(self, data_dir, res_dir):
    # Parse the top words of each topic from the TopClus output file.
    topics = []
    with open(os.path.join(res_dir, 'topics_final.txt'), encoding='utf-8') as f:
        for line in f:
            topics.append(line.split()[2].split(','))

    # Tokenize the corpus with the same tokenizer used by the model.
    docs = []
    with open(os.path.join(data_dir, 'texts.txt'), encoding='utf-8') as f:
        for line in f:
            docs.append(self.tokenizer.tokenize(line))

    # Build the gensim dictionary and bag-of-words corpus.
    dct = Dictionary(docs)
    corpus = [dct.doc2bow(doc) for doc in docs]

    # UMass coherence: intrinsic, based on document co-occurrence counts.
    umass_cm = CoherenceModel(topics=topics, corpus=corpus, dictionary=dct, coherence='u_mass')
    umass = umass_cm.get_coherence()

    # UCI coherence: based on sliding-window co-occurrence over the texts.
    uci_cm = CoherenceModel(topics=topics, texts=docs, dictionary=dct, coherence='c_uci')
    uci = uci_cm.get_coherence()
    print('UMass:', umass, 'UCI:', uci)

I called it via trainer.utils.compute_coherence(trainer.data_dir, trainer.res_dir), but got UMass: -5.8793029613305725 and UCI: -3.1203332131498644.
I wonder if there is an issue in the code. Thanks in advance.

Best wishes,
Lei

@yumeng5 (Owner) commented Sep 19, 2022

Hi,

The results look quite far off from those reported in the paper, and there are a few points you could double-check:

  • Check the output topics of TopClus; if they do not make sense, the problem is likely in model training;
  • UCI is an extrinsic metric that evaluates the topics on an external general corpus such as Wikipedia, rather than on the training corpus, so you'll need to pass in a general corpus;
  • We evaluated the results with the top-5 words of each topic, which you'll need to specify in the gensim call (see the sketch below).
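
A rough sketch of such a call (here `wiki_docs` is a placeholder for a tokenized external reference corpus, and `topics` is the parsed topic-word list as before):

from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

# `wiki_docs` is a hypothetical tokenized external corpus (list of token lists);
# topn=5 restricts the evaluation to the top-5 words of each topic.
dct = Dictionary(wiki_docs)
cm = CoherenceModel(topics=topics, texts=wiki_docs, dictionary=dct,
                    coherence='c_uci', topn=5)
print(cm.get_coherence())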

I hope these are helpful.

Best,
Yu

@lei-liu1 (Author)

Hello,

Thanks for your quick reply.
I used the training corpus to evaluate the UCI score, and the top-10 words of each topic. I will fix both and check the results.
Thanks for your tips again.

Best,
Lei

@lei-liu1 (Author)

Hello,

I have fixed the issue by evaluating topic coherence with the top-5 words of each topic, but the UMass score dropped to -6.49. As for the UCI score, I cannot figure out how to evaluate the topics on Wikipedia with the gensim library. Should I pass the whole Wikipedia corpus into the texts parameter of CoherenceModel, e.g., as in the sketch below?
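
(Just to illustrate what I have in mind; the dump file name is a placeholder, and I am not sure this is the intended usage:)

from gensim.corpora import Dictionary, WikiCorpus
from gensim.models.coherencemodel import CoherenceModel

# Stream tokenized articles from a Wikipedia dump (file name is a placeholder);
# dictionary={} skips building a vocabulary during parsing.
wiki = WikiCorpus('enwiki-latest-pages-articles.xml.bz2', dictionary={})
wiki_docs = [tokens for tokens in wiki.get_texts()]  # would likely need subsampling

dct = Dictionary(wiki_docs)
cm = CoherenceModel(topics=topics, texts=wiki_docs, dictionary=dct,
                    coherence='c_uci', topn=5)
print(cm.get_coherence())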
Looking forward to your reply. Thank you very much!

Best,
Lei
