Word cloud: add number of documents containing each word to Word Count output #1040

Open
wvdvegte opened this issue Mar 7, 2024 · 0 comments
Is your feature request related to a problem? Please describe.
It would be useful to know not only the total number of occurrences of each word in a corpus, but also the number of documents in which each word appears at least once. This number is already used by the Document Frequency filter in Preprocess text, but it would be nice to have it in a table as well.
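To make the distinction concrete, here is a minimal sketch in plain Python of the two numbers being asked for. It does not use Orange's API; the function name `word_and_document_counts` and the tiny tokenized corpus are hypothetical and only serve as illustration.

```python
from collections import Counter


def word_and_document_counts(documents):
    """Return (word_count, document_count) for a list of tokenized documents.

    word_count[w]     = total occurrences of w across the corpus
    document_count[w] = number of documents containing w at least once
    """
    word_count = Counter()
    document_count = Counter()
    for tokens in documents:
        word_count.update(tokens)           # every occurrence counts
        document_count.update(set(tokens))  # at most once per document
    return word_count, document_count


# Hypothetical three-document corpus, already tokenized.
corpus = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["apple"],
]

word_count, document_count = word_and_document_counts(corpus)
for word in sorted(word_count):
    print(word, word_count[word], document_count[word])
```

In this toy corpus "apple" occurs three times but in only two documents, which is the kind of difference the extra column would expose.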

Describe the solution you'd like
The most obvious place to include this is in the Word Count output of Word Cloud, I think.

Describe alternatives you've considered
The numbers are implicitly contained in the output of Bag of Words when Term Frequency is set to Binary and Document Frequency and Regularization are both set to None, but I have no idea how to extract them from the sparse data.
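For what it is worth, if the binary bag-of-words output can be obtained as a sparse matrix with one row per document and one column per word, the document count of a word is the number of rows with a stored entry in that word's column. The sketch below assumes a CSR-style layout given as plain lists (`indptr`, `indices`), which is the layout SciPy's `csr_matrix` exposes through attributes of the same names. How to get such a matrix out of Orange is not shown here, and the function name and example data are hypothetical.

```python
def document_frequency_from_csr(indptr, indices, n_columns):
    """Count, per column, how many rows store an entry in that column.

    Assumes a CSR-style layout: the column indices of row r are
    indices[indptr[r]:indptr[r + 1]]. Also assumes every stored entry
    is nonzero (an explicitly stored zero would be counted as well).
    """
    counts = [0] * n_columns
    for row in range(len(indptr) - 1):
        # set() guards against a column index stored twice within one row.
        for col in set(indices[indptr[row]:indptr[row + 1]]):
            counts[col] += 1
    return counts


# Hypothetical binary bag-of-words matrix for three documents:
#   doc 0 contains apple and banana, doc 1 banana and cherry, doc 2 apple.
vocabulary = ["apple", "banana", "cherry"]
indptr = [0, 2, 4, 5]
indices = [0, 1, 1, 2, 0]

df = document_frequency_from_csr(indptr, indices, len(vocabulary))
for word, count in zip(vocabulary, df):
    print(word, count)
```

This only illustrates that the information is recoverable from the sparse data; having it directly in the Word Count output would still be far more convenient.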
