Add support for ISO 639-2 languages #13

Open
piranna opened this issue Jun 17, 2019 · 9 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments


piranna commented Jun 17, 2019

Expand the databases to add translations for ISO 639-2 languages, that is, the ones with three-letter codes and no equivalent two-letter code. The list of 639-2 language codes is available at https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes.

@e110c0 e110c0 added enhancement New feature or request help wanted Extra attention is needed labels Jun 17, 2019
Member

e110c0 commented Jun 17, 2019

This means a major refactoring of the library, since the translations are currently based on the 2-letter codes. Generally speaking, I think it's a good idea, but there is one thing to consider first:

How would non-existent 2-letter codes be handled? Right now there is always a match between the different codes, since the library uses the smallest group.

Author

piranna commented Jun 17, 2019

From what I've investigated, it seems new 639-1 codes are not added when a language already has a 639-2 code, so the 639-1 list doesn't need to change for systems that support both standards and give priority to 639-1. Taking this into account, I would follow the same path and only add 639-2 codes for the missing languages. That is, just expand the current language list; no library refactor is needed at all.
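
A minimal sketch of that "just expand the list" idea, assuming the translation table is a plain object keyed by language code. The table contents and the `getName` helper are hypothetical and not taken from the library; the point is only that two-letter keys stay untouched while languages without a 639-1 code are added under their three-letter 639-2 code.

```javascript
// Hypothetical English translation table (illustrative subset).
// Existing 639-1 keys are unchanged; languages that have no 639-1 code
// are added under their 639-2 code.
const en = {
  de: "German",
  fr: "French",
  ast: "Asturian", // no 639-1 code exists
  fil: "Filipino", // no 639-1 code exists
};

// Hypothetical lookup helper, not part of the library's API.
// Returns undefined for codes that are not in the table.
function getName(table, code) {
  return Object.prototype.hasOwnProperty.call(table, code)
    ? table[code]
    : undefined;
}
```

With this layout a consumer that only lists all names keeps working, but any code path that assumes every key is a two-letter code would need a special case.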

Author

piranna commented Jun 17, 2019

Are you talking about https://github.com/cospired/i18n-iso-languages/blob/master/index.js#L165? Yes, that conversion could be problematic... :-/ Since that is the only place where it is used internally, maybe it could be treated as a special case. Honestly, I'm currently only using getNames(), so with the full list I only need to access the correct translation by its index...

Member

e110c0 commented Jun 17, 2019

The library keeps the list of translations as an object of the form [two-letter code]: [translation] and uses it to look up the translation of a language code. In addition, it allows conversion between the different ISO 639 parts.

In order to support translations for 639-2 languages that do not have a 2-letter code, the translation files have to be based on the 639-2 three-letter codes.

So the work I see here:

  • convert all translation files to be based on 3-letter codes (and there: either 639-2 Alpha-3 B or T!)
  • extend the translations to include the 3-letter-code languages that are currently missing
  • extend codes.json to be able to map between codes and translations. Since entry 0 of each array is the 2-letter code, it might now become undefined (see next point)
  • add handling for the case where a given 3-letter code has no equivalent 2-letter code
  • refactor the code so the new translation files are used
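
The steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: the thread confirms that entry 0 of each codes.json array is the 2-letter code, but the rest of the row layout, the choice of 639-2/T as the translation key, and both helper names are made up here. Since JSON has no `undefined`, the missing 2-letter code is written as `null`.

```javascript
// Assumed row layout: [alpha2 or null, alpha3T, alpha3B].
// Only "entry 0 is the 2-letter code" comes from the discussion.
const codes = [
  ["de", "deu", "ger"],
  ["fr", "fra", "fre"],
  [null, "ast", "ast"], // Asturian has no 639-1 code
];

// Translations keyed by the 639-2/T code (an assumed choice).
const en = { deu: "German", fra: "French", ast: "Asturian" };

// Hypothetical helper: 3-letter (T) code to 2-letter code.
// Returns undefined when the language is unknown or has no 2-letter code.
function alpha3TToAlpha2(alpha3T) {
  const row = codes.find((r) => r[1] === alpha3T);
  return row && row[0] !== null ? row[0] : undefined;
}

// Hypothetical helper: look up a name from a 2-letter code by first
// resolving it to the 3-letter key used by the translation table.
function getNameByAlpha2(table, alpha2) {
  const row = codes.find((r) => r[0] !== null && r[0] === alpha2);
  return row ? table[row[1]] : undefined;
}
```

Returning `undefined` for a missing 2-letter code is one possible policy; throwing or falling back to the 3-letter code would be alternatives to decide on.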

Author

piranna commented Jun 17, 2019

The main problem I see here is that I'm not fully sure whether 639-2 is an expansion of 639-1. I think there are some 639-1 codes that don't have an equivalent 639-2 code; I hope I'm wrong so that your roadmap can be valid...

Member

e110c0 commented Jun 17, 2019

Well, I hadn't even considered this part yet. My roadmap assumes that the 639-1 list is fully included in the 639-2 list. Hopefully, the B and T sets also map onto each other.

Author

piranna commented Jun 17, 2019

Yes, it seems 639-2 is a superset of 639-1, or at least it was originally designed that way, so it seems we are safe here and can just use the three-letter 639-2 codes as indexes. Regarding B/T, it seems the B (bibliographic) code is derived from the English name and the T (terminology) code from the name in the language itself. Not sure which would be the correct one to use... maybe English?
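
For context, the B (bibliographic) and T (terminology) sets differ only for a small number of languages, so whichever one is chosen as the key, the other can be handled with a short alias map. A rough sketch, where the map is a non-exhaustive sample and `toTerminologic` is a hypothetical helper name:

```javascript
// Sample of languages whose 639-2/B and 639-2/T codes differ (not exhaustive).
const bToT = { ger: "deu", fre: "fra", dut: "nld", chi: "zho" };

// Hypothetical helper: normalize a B code to its T form.
// Codes that are identical in both sets pass through unchanged.
function toTerminologic(code) {
  return Object.prototype.hasOwnProperty.call(bToT, code) ? bToT[code] : code;
}
```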

Member

e110c0 commented Jun 17, 2019

I'd say English would be a valid choice


gajus commented Oct 12, 2019

Just in case, our use case requires converting between ISO 639-1, ISO 639-2/T, ISO 639-2/B and ISO 639-3. It would be nice if this library supported all of the above.
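
One way such a conversion could be shaped, purely as a sketch: keep one record per language with a field per standard, and convert by looking up the record. The field names, the `rows` sample, and the `convert` function are all assumptions for illustration; none of them exist in the library as far as this thread shows.

```javascript
// Hypothetical per-language records (illustrative subset).
// alpha2 = ISO 639-1, alpha3T = ISO 639-2/T, alpha3B = ISO 639-2/B, iso3 = ISO 639-3.
const rows = [
  { alpha2: "de", alpha3T: "deu", alpha3B: "ger", iso3: "deu" },
  { alpha2: "fr", alpha3T: "fra", alpha3B: "fre", iso3: "fra" },
  { alpha2: null, alpha3T: "ast", alpha3B: "ast", iso3: "ast" },
];

// Convert `code` from one standard to another, e.g.
// convert("ger", "alpha3B", "alpha2"). Returns undefined when the code is
// unknown or the language has no code in the target standard.
function convert(code, from, to) {
  const row = rows.find((r) => r[from] !== null && r[from] === code);
  if (!row || row[to] === null || row[to] === undefined) return undefined;
  return row[to];
}
```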
