Add kur with Arabic unicharset (similar to old tessdata) #23
Comments
This is kur_ara.lstm-unicharset in both tessdata_best and tessdata_fast.
This is the kur.unicharset from tessdata
Shreeshrii changed the title from "kur_ara does not have Arabic unicharset." to "Add kur with Arabic unicharset (similar to old tessdata)" on Sep 17, 2018
Kurdish in Latin script is supported as kmr. Kurdish in Arabic script (which was kur in tessdata) is missing from tessdata_best and tessdata_fast.
MerlijnWajer added a commit to MerlijnWajer/tesseract that referenced this issue on Dec 1, 2020:
"kur" no longer exists, might be named "kur_ara" (the old "kur_ara" is now "kmr", which is actually Latin) now, but "kur" is not present in tessdata_fast nor in tessdata_best. [1] [2]
"kmr" is the Latin version (Kurmanji).
"tgl" (Tagalog) is now named "fil" (Filipino). [3]
[1] tesseract-ocr/langdata#124
[2] tesseract-ocr/tessdata_best#23
[3] tesseract-ocr/langdata#84
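For anyone hitting the same renames in their own tooling, here is a minimal sketch of a lookup that maps the legacy codes mentioned in the commit message to what is currently published. The names `LEGACY_TO_CURRENT` and `resolve_lang` are hypothetical and not part of Tesseract; the table only contains the two cases discussed in this thread.

```python
# Hypothetical lookup table, not part of Tesseract.
# Entries come only from the renames described in this thread.
LEGACY_TO_CURRENT = {
    "tgl": "fil",  # Tagalog is now published as Filipino
    "kur": None,   # Arabic-script Kurdish: not in tessdata_best or tessdata_fast
}


def resolve_lang(code):
    """Return the current code for a legacy code.

    Returns None when the thread reports that no model is published,
    and returns the code unchanged when it is not a known legacy code.
    """
    return LEGACY_TO_CURRENT.get(code, code)
```

A caller would treat `None` as "no model available" rather than silently substituting `kmr`, since `kmr` covers Latin-script Kurdish only.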
See related issue in langdata: tesseract-ocr/langdata#116