Implement Syriac / Assyrian Support #239

Open
wants to merge 8 commits into
base: master


@esoleyman esoleyman commented Sep 29, 2022

Description

Adds Syriac / Assyrian support

Type of PR

  • Feature implementation

Testing

  • pytest test/test_format_syr.py
  • pytest test/test_parse_syr.py

Notes

Syriac support was added in glibc 2.36 under the syr locale. This may be a stumbling block to getting Syriac tested properly, since systems with an older glibc will not have the locale. I can vouch that this works on my system currently.
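
One way to make the glibc requirement explicit when testing is to check for it up front. The following is a minimal sketch using only the Python standard library, not code from this PR. The helper names (`parse_version`, `glibc_at_least`, `locale_available`) are hypothetical, and the locale name `syr` is taken from the note above; the exact name installed on a given system may differ.

```python
import itertools
import locale
import platform


def parse_version(text):
    """Turn a version string such as '2.36' into a tuple of ints, e.g. (2, 36)."""
    parts = []
    for piece in text.split("."):
        # Keep only the leading digits so suffixes like '36-ubuntu' still parse.
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def glibc_at_least(minimum=(2, 36)):
    """Return True only when the interpreter reports glibc >= minimum."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        # Not glibc (or undetectable): treat the requirement as unmet.
        return False
    return parse_version(version) >= minimum


def locale_available(name="syr"):
    """Return True if setlocale accepts `name`; the previous locale is restored."""
    previous = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, name)
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_TIME, previous)
    return True


if __name__ == "__main__":
    print("glibc >= 2.36:", glibc_at_least())
    print("syr locale available:", locale_available("syr"))
```

If the tests do depend on the system locale, a check like `locale_available("syr")` could be passed to `pytest.mark.skipif` so that they are skipped with a clear reason on older systems instead of failing.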

Fork the Farsi language implementation and begin the Syriac implementation.
More changes towards support
It is easier to pronounce words with vowels than without.
Re-do some of the datetime and number extraction functions.
Pass more tests.
@devs-mycroft added the "CLA: Yes" label (Contributor License Agreement exists, see https://github.com/MycroftAI/contributors) on Sep 30, 2022
@krisgesling
Contributor

Hi Emil,

First, congratulations on what I know is a lot of work!

Right now, we're in the midst of releasing the first production Mark II devices. So we're pretty swamped, but will definitely make some time to review this in detail over the next few weeks.

I'll also have a think about what we can do around the need for the latest glibc. This might just be documentation so that users know what's required.

Beyond Lingua Franca, do you have any recommendations for STT and TTS options?

@esoleyman
Author

Given that our language is low-resource, we don't have any STT or TTS options at present. We are working on creating a language corpus, among other initiatives, including updating the Unicode CLDR and getting Syriac onto Mozilla Common Voice.

@krisgesling
Contributor

Yeah, there are a lot of barriers to getting started, unfortunately, but it sounds like you are ticking them off one by one!

For TTS, if you have or know of any openly licensed voice datasets that we might be able to use, then we could look at training a Mimic 3 model.

@esoleyman
Author

esoleyman commented Oct 4, 2022 via email
