some inconsistencies in generating word clouds? #1073

mw0000 · 2024-03-15T13:43:12Z

Dear Lexos team :)

I recently started using Lexos and have noticed some inconsistencies in the word clouds it generates. I am attaching a file with the data; in the Scrub options, I chose to remove spaces. The system generates different word clouds for the same file (see lexus4.png and lexus5.png). The words 'woman' and 'man' are the most frequent in the text.

Did I make any mistake in preparing the data? The BubbleViz visualization works correctly.

Thanks!

words.txt

scottkleinman · 2024-03-16T16:32:20Z

Thank you for reporting this. The issue relates to our use of the d3.js library, which sometimes drops high-frequency terms from the word cloud if they don't fit the layout. The issue was addressed for older versions of Lexos here; however, it may have been overlooked in the latest release. From what I can tell from the discussion in the d3 repo, the issue was not fixed in the d3 library itself, and individual users have come up with their own workarounds. We will investigate whether we have one and, if so, whether it can be improved.
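As an illustration only, one general shape such a workaround can take is to rerun the layout with smaller fonts until nothing is dropped. The sketch below is not the Lexos fix and does not call d3; `layout_with_shrinking` and `toy_layout` are hypothetical names, and the toy layout simply assumes a word is dropped when its approximate width exceeds the canvas.

```python
from typing import Callable, Dict, Set

# Hypothetical layout signature: takes {word: font_size} and returns
# the set of words that were actually placed.
Layout = Callable[[Dict[str, float]], Set[str]]


def toy_layout(sizes: Dict[str, float], width: float = 120.0) -> Set[str]:
    """Stand-in for a real layout engine (assumption, not d3 behaviour):
    a word is dropped when font_size * len(word) is wider than the canvas."""
    return {w for w, s in sizes.items() if s * len(w) <= width}


def layout_with_shrinking(
    sizes: Dict[str, float],
    layout: Layout = toy_layout,
    shrink: float = 0.9,
    max_attempts: int = 20,
) -> Dict[str, float]:
    """Scale all font sizes down until the layout places every word."""
    current = dict(sizes)
    for _ in range(max_attempts):
        if len(layout(current)) == len(current):
            return current
        # At least one word was dropped, so shrink everything and retry.
        current = {w: s * shrink for w, s in current.items()}
    raise RuntimeError("could not place every word")


sizes = {"woman": 30.0, "man": 28.0, "tree": 10.0}
fitted = layout_with_shrinking(sizes)
print(sorted(toy_layout(sizes)))   # 'woman' is missing at the original sizes
print(sorted(toy_layout(fitted)))  # all three words are placed after shrinking
```

Shrinking every size by the same factor keeps the relative ordering of the terms, so the largest words remain the most frequent ones.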

In the meantime, the problem can often be solved by regenerating the word cloud (click the "Generate" button). If you are unsure whether the top terms are represented in the word cloud, you can check on the Prepare > Tokenize screen, which you can open in a new browser tab. Select the Raw and Descending radio buttons, and then click "Generate" to see the most frequent terms. If they don't match what you are seeing in the word cloud, try regenerating the word cloud until they do.
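The same check can also be done outside Lexos. The following is a minimal sketch, assuming a simple lowercase, word-character tokenizer, which may not match how Lexos tokenizes; `top_terms` and `missing_from_cloud` are hypothetical helper names, and the list of words visible in the cloud has to be typed in by hand.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def top_terms(text: str, n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent terms with raw counts, in descending order."""
    # Assumption: lowercase everything and split on runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens).most_common(n)


def missing_from_cloud(text: str, cloud_words: Iterable[str], n: int = 10) -> List[str]:
    """List the top-n terms that do not appear among the words shown in the cloud."""
    shown = {w.lower() for w in cloud_words}
    return [term for term, _ in top_terms(text, n) if term not in shown]


sample = "woman man woman man woman tree"
print(top_terms(sample, 2))
print(missing_from_cloud(sample, ["man", "tree"], 2))
```

If `missing_from_cloud` returns any terms, the word cloud has dropped them and is worth regenerating.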

scottkleinman added the bug label Mar 16, 2024
