-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improved localization #52
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for submitting this PR, this looks super useful!
The comments I've made are all minor issues and the code looks really good in general!
The thing that may be a problem is the iso-639 to OWM region code. Looks like it works in many cases but may fail in certain regions.
__init__.py
Outdated
elif lang[0:2] in owmMulti: | ||
owmLang = lang[0:2] | ||
LOG.debug("lang: %s owmLang: %s" % (lang, owmLang)) | ||
return owmLang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this will work 100% accurately.
for swedish the language code in mycroft is sv-se
which will not work with the lang[0:2].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the lang codes in OWM and most will work.
Maybe you could add special handling for the ones that doesn't work:
ISO OWM Language
sv-* se Swedish
cs-cz cz Czech
ko-kr kr Korean
lv-lv la Latvian
uk-ua ua Ukrainan
__init__.py
Outdated
@@ -194,6 +197,7 @@ def handle_forecast(self, message): | |||
report = self.__initialize_report(message) | |||
|
|||
# Get a date from spoken request | |||
LOG.debug("lang %s" % self.lang) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a critical thing but I think it's better to use the new style 'lang {}'.format(self.lang)
instead of the old % notation.
__init__.py
Outdated
@@ -152,13 +156,12 @@ def handle_current_weather(self, message): | |||
# Get a date from requests like "weather for next Tuesday" | |||
today = extract_datetime(" ")[0] | |||
when, _ = extract_datetime( | |||
message.data.get('utterance'), lang=self.lang) | |||
message.data.get('utterance'), lang=self.lang) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra space at the end.
pushed some changes with a931d47 |
No, you don't need to resolve it. |
Apparently Swedish will still be a problem. Sami has the code se-se, andI think that would be mapped to Swedish in this solution? I think the simplest way to handle this would be to use a lookup table instead of looking at the second part of the lang code. lookup : {
'sv': 'se',
'cs': 'cz',
'ko': 'kr',
'lv': 'la'
'uk': 'ua'
}
if lang[0] in lookup:
return lookup[lang[0]] |
I've tested it and other than that it seems to work excellent! |
Made a small change to the lang-code detection: a match in the second part of the lang-code (after hyphen) overwrites the match from the first one. The problem is that what OWM calls "lang-code" is more a country-code than a lang-code. The BCP47 specification which underlies Mycroft's langcode-tagging is way too complex to get a exact mapping for the purpose here... And yes, this will still not cover all cases - my apologies to all people speaking Sami. |
Yeah, which may be a problem in of itself. sv-fi for example, Swedish spoken in Finland. It's not likely we'd have that support in the near future so you can leave it as it is and we can replace it with some more accurate iso -> owm conversion later. |
included your suggestion with lookup for special cases... |
Note that the code "se" for Sámi is a macrocode and doesn't refer to any particular sámi language (or maybe it just refers to North Sámi). The right codes to use for the sámi languages are the three letter ones: sme, smj, smn, sma, etc. which allow you to distinguish between North/Lule/Inari etc. |
@forslund Is there anything remaining to do to merge this pull request? |
@domcross I needed to make a couple of changes to get everything working.
As I dropped the ball on reviewing this I've gone ahead and fixed these to do a quick merge. Thanks for the work it's quite an improvement |
Rebase of #52 Improved Localization
This PR adresses issue #49
Use OWM weather condition with Mycrofts language settings when language is supported by OWM. (Use English otherwise). This will save us from a lot of translation work as OWM can answer with more than 50 different weather conditions.