Word cloud: add number of documents containing each word to Word Count output #1040

wvdvegte · 2024-03-07T11:01:59Z

Is your feature request related to a problem? Please describe.
It would be useful to know not only the total number of occurrences of each word in a corpus, but also the number of documents in which the word appears at least once. The Document Frequency filter in Preprocess Text already uses this number, but it would be nice to have it in a table as well.
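To make the distinction concrete, here is a minimal sketch of the two numbers, assuming the corpus is already tokenized into lists of words. The function name `word_and_document_counts` and the toy corpus are hypothetical and are not part of Orange's API.

```python
from collections import Counter


def word_and_document_counts(documents):
    """Return (word_counts, document_counts) for a list of tokenized documents.

    Hypothetical helper for illustration only; not an Orange function.
    """
    word_counts = Counter()
    document_counts = Counter()
    for tokens in documents:
        # Every occurrence contributes to the total word count.
        word_counts.update(tokens)
        # A set keeps each word at most once per document,
        # so this counts documents rather than occurrences.
        document_counts.update(set(tokens))
    return word_counts, document_counts


# Toy corpus (made up for this example).
corpus = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["apple"],
]
word_counts, document_counts = word_and_document_counts(corpus)
```

Here "apple" occurs three times in total but in only two documents, which is the extra column this request asks for.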

Describe the solution you'd like
The most obvious place to include this is in the Word Count output of Word Cloud, I think.

Describe alternatives you've considered
The numbers are implicitly contained in the output of Bag of Words when Term Frequency is set to Binary and Document Frequency and Regularization are both set to None, but I have no idea how to extract them from the sparse data.
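For what it's worth, with a binary bag-of-words matrix the document count of a word is just the sum of its column. Below is a rough sketch of that idea, assuming the sparse data is available as coordinate lists of its nonzero entries. The layout (`rows`, `cols`, `vocabulary`) and the function name are assumptions made for illustration, not what Orange actually exposes.

```python
from collections import Counter


def document_frequencies(rows, cols, vocabulary):
    """Column sums of a binary document-term matrix stored as coordinates.

    rows[i], cols[i] mark one nonzero entry: document rows[i] contains
    the word vocabulary[cols[i]]. Hypothetical layout for illustration.
    """
    # Deduplicate entries so a repeated coordinate is not counted twice.
    nonzeros = set(zip(rows, cols))
    # Counting nonzeros per column equals summing a binary column.
    per_column = Counter(col for _, col in nonzeros)
    return {word: per_column.get(i, 0) for i, word in enumerate(vocabulary)}


# Toy data (made up): 3 documents, 3 words.
vocabulary = ["apple", "banana", "cherry"]
rows = [0, 0, 1, 1, 2]
cols = [0, 1, 1, 2, 0]
doc_freq = document_frequencies(rows, cols, vocabulary)
```

If the matrix is a SciPy sparse matrix, the analogous operation would be a sum over axis 0, but having the count directly in the Word Count output would avoid this detour.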

ajdapretnar self-assigned this Mar 19, 2024

ajdapretnar mentioned this issue Mar 22, 2024

Word Cloud: add document count to Word Count output #1050

Merged

3 tasks
