Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improved localization #52

Closed
wants to merge 6 commits into from
Closed

improved localization #52

wants to merge 6 commits into from

Conversation

domcross
Copy link
Contributor

This PR adresses issue #49
Use OWM weather condition with Mycrofts language settings when language is supported by OWM. (Use English otherwise). This will save us from a lot of translation work as OWM can answer with more than 50 different weather conditions.

Copy link
Collaborator

@forslund forslund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for submitting this PR, this looks super useful!

The comments I've made are all minor issues and the code looks really good in general!

The thing that may be a problem is the iso-639 to OWM region code. Looks like it works in many cases but may fail in certain regions.

__init__.py Outdated
elif lang[0:2] in owmMulti:
owmLang = lang[0:2]
LOG.debug("lang: %s owmLang: %s" % (lang, owmLang))
return owmLang
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this will work 100% accurately.

for swedish the language code in mycroft is sv-se which will not work with the lang[0:2].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the lang codes in OWM and most will work.

Maybe you could add special handling for the ones that doesn't work:

ISO	OWM	Language
sv-* 	se	Swedish
cs-cz	cz	Czech
ko-kr	kr	Korean
lv-lv	la	Latvian
uk-ua	ua	Ukrainan

__init__.py Outdated Show resolved Hide resolved
__init__.py Outdated
@@ -194,6 +197,7 @@ def handle_forecast(self, message):
report = self.__initialize_report(message)

# Get a date from spoken request
LOG.debug("lang %s" % self.lang)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a critical thing but I think it's better to use the new style 'lang {}'.format(self.lang) instead of the old % notation.

__init__.py Outdated
@@ -152,13 +156,12 @@ def handle_current_weather(self, message):
# Get a date from requests like "weather for next Tuesday"
today = extract_datetime(" ")[0]
when, _ = extract_datetime(
message.data.get('utterance'), lang=self.lang)
message.data.get('utterance'), lang=self.lang)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space at the end.

__init__.py Outdated Show resolved Hide resolved
@domcross
Copy link
Contributor Author

pushed some changes with a931d47
I am still a bit unfamiliar with PR-handling in Github, do i still need to resolve above conversation?

@forslund
Copy link
Collaborator

No, you don't need to resolve it.

@forslund
Copy link
Collaborator

Apparently Swedish will still be a problem.

Sami has the code se-se, andI think that would be mapped to Swedish in this solution? I think the simplest way to handle this would be to use a lookup table instead of looking at the second part of the lang code.

    lookup : {
        'sv': 'se',
        'cs': 'cz',
        'ko': 'kr',
        'lv': 'la'
        'uk': 'ua'
    }
    if lang[0] in lookup:
        return lookup[lang[0]]

@forslund
Copy link
Collaborator

I've tested it and other than that it seems to work excellent!

@domcross
Copy link
Contributor Author

domcross commented Nov 19, 2018

Made a small change to the lang-code detection: a match in the second part of the lang-code (after hyphen) overwrites the match from the first one.

The problem is that what OWM calls "lang-code" is more a country-code than a lang-code. The BCP47 specification which underlies Mycroft's langcode-tagging is way too complex to get a exact mapping for the purpose here...

And yes, this will still not cover all cases - my apologies to all people speaking Sami.

@forslund
Copy link
Collaborator

Yeah, which may be a problem in of itself. sv-fi for example, Swedish spoken in Finland.

It's not likely we'd have that support in the near future so you can leave it as it is and we can replace it with some more accurate iso -> owm conversion later.

@domcross
Copy link
Contributor Author

included your suggestion with lookup for special cases...

@ftyers
Copy link

ftyers commented Dec 3, 2018

Apparently Swedish will still be a problem.

Sami has the code se-se, andI think that would be mapped to Swedish in this solution? I think the simplest way to handle this would be to use a lookup table instead of looking at the second part of the lang code.

Note that the code "se" for Sámi is a macrocode and doesn't refer to any particular sámi language (or maybe it just refers to North Sámi). The right codes to use for the sámi languages are the three letter ones: sme, smj, smn, sma, etc. which allow you to distinguish between North/Lule/Inari etc.

@JuliaC29
Copy link

JuliaC29 commented Dec 3, 2018

@forslund Is there anything remaining to do to merge this pull request?

@forslund
Copy link
Collaborator

forslund commented Dec 3, 2018

@JuliaC29 @domcross I think it should be good now. I'll try it out tomorrow and merge if it's ok

@forslund
Copy link
Collaborator

forslund commented Dec 4, 2018

@domcross I needed to make a couple of changes to get everything working.

  • My example lookup table was not correct syntax so that didn't work.
  • For Swedish the encoding on the condition got messed up so I've added an encoding conversion step

As I dropped the ball on reviewing this I've gone ahead and fixed these to do a quick merge. Thanks for the work it's quite an improvement

@forslund forslund mentioned this pull request Dec 4, 2018
@forslund forslund closed this in #55 Dec 4, 2018
forslund added a commit that referenced this pull request Dec 4, 2018
Rebase of #52 Improved Localization
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants