Add readability scores functions #293

Open
javaberlin opened this issue Dec 10, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@javaberlin

It seems there is no readability tool for processing Danish texts. I tried spacy_readability with a Danish language model loaded, but spacy_readability appears to work only with English (it has a hardcoded list of English words in it).

Can DaCy do something similar to spacy_readability?

Or can DaCy count Danish syllables in a word? That would already be a great help; from there one could implement the readability scores (Flesch-Kincaid, etc.).
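
To illustrate the "count syllables, then compute a score" idea, here is a minimal sketch in plain Python. It is not a DaCy or spacy_readability API: the vowel-group heuristic and the helper names `count_syllables` and `flesch_kincaid_grade` are assumptions made for this example. The heuristic treats every run of vowel letters as one syllable, so it will miscount words where adjacent vowels belong to different syllables, and the Flesch-Kincaid constants were calibrated on English, so the result is only a rough indicator for Danish.

```python
import re

# Danish vowel letters. This list is an assumption for the heuristic below.
DANISH_VOWELS = "aeiouyæøå"
_VOWEL_GROUP = re.compile(f"[{DANISH_VOWELS}]+", re.IGNORECASE)
_WORD = re.compile(r"[^\W\d_]+")  # runs of letters, including æ, ø, å
_SENTENCE_END = re.compile(r"[.!?]+")


def count_syllables(word: str) -> int:
    """Approximate the syllable count as the number of vowel groups (min. 1)."""
    return max(1, len(_VOWEL_GROUP.findall(word)))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level, using the approximate syllable counter."""
    words = _WORD.findall(text)
    sentences = [s for s in _SENTENCE_END.split(text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    print(count_syllables("læsbarhed"))
    print(flesch_kincaid_grade("Jeg har en hund. Den er sød."))
```

A proper tokenizer and sentence splitter (for example from a spaCy pipeline) could replace the regular expressions; they are used here only to keep the sketch free of dependencies.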

@javaberlin javaberlin added the enhancement New feature or request label Dec 10, 2024
@KennethEnevoldsen
Collaborator

I think you are looking for this tutorial:
https://centre-for-humanities-computing.github.io/DaCy/tutorials/textdescriptives.html

It uses DaCy together with textdescriptives, which calculates readability metrics.

@javaberlin
Author

I am having trouble installing compatible versions of spacy, dacy and textdescriptives. Do you know which versions are compatible? I tried many combinations, but there is always a conflict.

Thanks

'''
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy-experimental 0.6.4 requires spacy<3.8.0,>=3.3.0, but you have spacy 3.8.3 which is incompatible.
da-dacy-small-trf 0.2.0 requires spacy<3.6.0,>=3.5.2, but you have spacy 3.8.3 which is incompatible

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
textdescriptives 2.8.0 requires pydantic>=2.0, but you have pydantic 1.10.19 which is incompatible.
textdescriptives 2.8.0 requires spacy[lookups]>=3.6.0, but you have spacy 3.5.3 which is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
textdescriptives 2.4.0 requires spacy[lookups]<3.5.0,>=3.1.0, but you have spacy 3.5.3 which is incompatible.
Successfully installed spacy-3.5.3

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy-transformers 1.2.5 requires spacy<4.0.0,>=3.5.0, but you have spacy 3.4.4 which is incompatible.
da-dacy-small-trf 0.2.0 requires spacy<3.6.0,>=3.5.2, but you have spacy 3.4.4 which is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
textdescriptives 2.8.2 requires pydantic>=2.0, but you have pydantic 1.10.19 which is incompatible.
da-dacy-small-trf 0.2.0 requires spacy<3.6.0,>=3.5.2, but you have spacy 3.7.5 which is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
textdescriptives 2.8.2 requires pydantic>=2.0, but you have pydantic 1.10.19 which is incompatible.
textdescriptives 2.8.2 requires spacy[lookups]>=3.6.0, but you have spacy 3.5.2 which is incompatible.

import spacy
import dacy
nlp = dacy.load("small") # load the latest version of the small model
C:\Users\kh\PycharmProjects\EntropySentiments.venv39\lib\site-packages\spacy\util.py:910: UserWarning: [W095] Model 'da_dacy_small_trf' (0.2.0) was trained with spaCy v3.5.2 and may not be 100% compatible with the current version (3.8.3). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
'''

@KennethEnevoldsen
Collaborator

Hmm, on a Google Colab instance this seems to work without issue:

# install dependencies (if you do this from the terminal, remove the !)
!pip install dacy textdescriptives
# ...
# dacy-2.7.8 ftfy-6.3.1 pyphen-0.17.0 spacy-alignments-0.9.1 spacy-experimental-0.6.4 spacy-lookups-data-1.0.5 spacy-transformers-1.3.5 spacy-wrap-1.4.5 textdescriptives-2.8.2 tokenizers-0.15.2 transformers-4.36.2

# load model
import dacy
nlp = dacy.load("small")

@javaberlin
Author

Thanks! It works now with:

pip install dacy==2.7.8 ftfy==6.3.1 pyphen==0.17.0 spacy-alignments==0.9.1 spacy-experimental==0.6.4 spacy-lookups-data==1.0.5 spacy-transformers==1.3.5 spacy-wrap==1.4.5 textdescriptives==2.8.2 tokenizers==0.15.2 transformers==4.36.2
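
To confirm that an environment actually matches these pins, a small check along the following lines may help. It is only a sketch using the standard library's `importlib.metadata`; the helper names `installed_version` and `find_mismatches` are made up for this example, and `PINS` copies just a subset of the versions from the command above.

```python
from importlib.metadata import PackageNotFoundError, version

# A subset of the pins from the working install reported in this thread.
PINS = {
    "dacy": "2.7.8",
    "textdescriptives": "2.8.2",
    "spacy-transformers": "1.3.5",
    "spacy-experimental": "0.6.4",
    "transformers": "4.36.2",
    "tokenizers": "0.15.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running it inside the virtual environment prints one line per package whose installed version differs from the pin, and nothing when everything matches.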
