Skip to content

Latest commit

 

History

History
executable file
·
280 lines (198 loc) · 21.6 KB

admin-language-support.md

File metadata and controls

executable file
·
280 lines (198 loc) · 21.6 KB
copyright lastupdated keywords subcollection
years
2015, 2024
2024-10-10
global support, universal language, universal model, another language
watson-assistant

{{site.data.keyword.attribute-definition-list}}

Adding support for global audiences

{: #admin-language-support}

Your customers come from all around the globe. You need an assistant that can talk to them in their own language and in a familiar style. Choose the approach that best fits your business needs.

  • Quickest solution: The simplest way to add language support is to author the assistant in a single language. You can translate each message that is sent to your assistant from the customer's local language to the assistant language. Later you can translate each response from the assistant language back to the customer's local language.

    This approach simplifies the process of authoring and maintaining the conversation. You can build one assistant and use it for all languages. However, the intention and meaning of the customer message can be lost in the translation.

    For more information about webhooks you can use for translation, see Webhook overview.

  • Most precise solution: If you have the time and resources, the best user experience can be achieved when you build multiple assistants, one for each language that you want to support. {{site.data.keyword.conversationfull}} has built-in support for all languages. Use one of 13 language-specific models or the universal model, which adapts to any other language you want to support.

    When you build an assistant that is dedicated to a language, a language-specific classifier model is used by the assistant. The precision of the model means that your assistant can better understand and recognize the goals of even the most colloquial message from a customer.

    Use the universal language model to create an assistant that is fluent even in languages that {{site.data.keyword.conversationshort}} doesn't support with built-in models.

    To deploy, use the web chat integration with your French-speaking assistant to deploy to a French-language page on your website. Deploy your German-speaking assistant to the German page of your website. Maybe you have a support phone number for French customers. You can configure your French-speaking assistant to answer those calls, and configure another phone number that German customers can use.

    You can enable the download of language data files, in CSV format, so you can translate training examples and assistant responses from English into other languages and use in other assistants. For more information, see Using multilingual downloads for translation.

Understanding the universal language model

{: #admin-language-support-universal}

An assistant that uses the universal language model applies a set of shared linguistic characteristics and rules from multiple languages as a starting point. It then learns from the training data that you add to it.

The universal language classifier can adapt to a single language per assistant. It cannot be used to support multiple languages within a single assistant. However, you can use the universal language model in one assistant to support one language, such as Russian, and in another assistant to support another language, such as Hindi. The key is to add enough training examples or intent user examples in your target language to teach the model about the unique syntactic and grammatical rules of the language.

Use the universal language model when you want to create a conversation in a language where no model is available, and which is unique enough that an existing model is insufficient.

As you follow the normal steps to design a conversational flow, you teach the universal language model about the language you want your skill to support. It is by adding training data that is written in the target language that the universal model is constructed.

For more information about feature support in the universal language model, see Supported languages.

Integration considerations

{: #admin-language-support-integrations}

Keep these tips in mind for integrations:

  • Web chat: Web chat has some hardcoded strings that you can customize to reflect your target language. For more information, see Supporting global audiences in web chat.
  • Phone integration: If you want to deploy an assistant that uses the universal language model, you must connect to custom Speech service language models that can understand the language you're using. For more information about supported language models, see the Speech to Text{: external} and Text to Speech{: external} documentation.
  • Search integration: If you build an assistant that specializes in a single language, be sure to connect it to data collections that are written in that language. For more information about the languages that are supported by {{site.data.keyword.discoveryshort}}, see Language support{: external}.

Supported languages

{: #admin-language-support-codes}

{{site.data.keyword.conversationfull}} supports individual features to varying degrees per language. It has classifier models that are designed specifically to support conversations in the following languages:

Language Code
English en-us
Arabic ar
Chinese (Simplified) zh-cn
Chinese (Traditional) zh-tw
Czech cs
Dutch nl
French fr
German de
Italian it
Japanese ja
Korean ko
Portuguese (Brazilian) pt-br
Spanish es
Universal* xx
{: caption="Supported languages" caption-side="top"}

*If you want to support conversations in a language for which {{site.data.keyword.conversationshort}} does not have a dedicated model, such as Russian, use Universal.

Changing an assistant language

{: #admin-language-support-change-language}

After an assistant is created, its language cannot be modified.

Working with accented characters

{: #admin-language-support-accents}

In a conversational setting, users might or might not use accents with the {{site.data.keyword.conversationshort}} service. As such, both accented and non-accented versions of words might be treated the same for intent detection and entity recognition.

However, for some languages like Spanish, some accents can alter the meaning of the entity. Thus, for entity detection, although the original entity might implicitly have an accent, your assistant can also match the non-accented version of the same entity, but with a slightly lower confidence score.

For example, for the word "barrió", which has an accent and corresponds to the past tense of the verb "barrer" (to sweep), your assistant can also match the word "barrio" (neighborhood), but with a slightly lower confidence.

The system provides the highest confidence scores in entities with exact matches. For example, barrio isn't detected if barrió is in the training set; and barrió isn't detected if barrio is in the training set.

You are expected to train the system with the proper characters and accents. For example, if you are expecting barrió as a response, put barrió into the training set.

Although not an accent mark, the same applies to words that use the Spanish letter ñ versus the letter n, such as "uña" versus "una". In this case, the letter ñ is not an n with an accent; it is a unique, Spanish-specific letter.

Using multilingual downloads for translation

{: #admin-language-support-multilingual}

You can enable the download of language data files, in CSV format, so you can translate training examples and assistant responses into other languages and use in other assistants.

Each CSV file includes translatable_string data that you can use with a machine or human translation service.

Each CSV file also includes id, resource_type, and locator data that {{site.data.keyword.conversationshort}} can use in another assistant to re-create your source assistant. You don't need to edit this information.

The overview of the multilingual process is:

Enabling multilingual download

{: #admin-language-support-multilingual-enable}

To enable multilingual download:

  1. Open Assistant settings.

  2. In the Download/Upload section, click Enable multilingual download.

    Enabling multilingual might take a few minutes to process, but you can work elsewhere in the assistant. The Download/Upload button is disabled until this process finishes. After the multilingual download is enabled in an assistant, it can't be disabled. {: note}

  3. Click Download/Upload files.

  4. On the Download tab, choose Multilingual file package.

  5. Select a published version, then click Download. You need at least one version for the download to be available.

    Downloading your first file might take a few minutes to process, but you can work elsewhere in the assistant. The Download/Upload button is disabled until this process finishes. {: note}

  6. When the download finishes, your .zip file contains:

    • action-responses.csv
    • action-training.csv
    • Data (Do not edit) folder, which contains assistant.bin

    If you are using dialog, the .zip file also contains:

    • dialog-responses.csv
    • dialog-training.csv

Translating content

{: #admin-language-support-multilingual-translate}

To translate content:

  1. Translate the content in the CSV files, for example, from English to French. Save files with translations as CSV UTF-8 (Comma Delimited) to account for any language-specific characters.

  2. When you are finished with the translation process, create a new .zip file that includes:

    • action-responses.csv with your translations. Don't change the name of the file.
    • action-training.csv with your translations. Don't change the name of the file.
    • The original, untouched Data (Do not edit) folder, which contains assistant.bin

    If you translated dialog content, also include dialog-responses.csv and dialog-training.csv in the .zip file.

Uploading to language-specific assistants

{: #admin-language-support-multilingual-upload}

To upload to a language-specific assistant:

  1. Create or switch to a destination assistant that uses the language for your translations.

  2. In the destination assistant, open Assistant settings.

  3. If you translated dialog training and responses, ensure that dialog is active in the destination assistant.

  4. In the Download/Upload section, click Download/Upload files.

    You don't need to enable multilingual download in the destination assistant. {: note}

  5. On the Upload tab, choose Multilingual file package.

  6. Attach your multilingual .zip file package, then click Upload. The translated content is added to your draft environment, so you can work on publishing the translated assistant.

Content language support

{: #admin-language-support-content}

These languages are supported for content in actions, dialog, and the search integration.

Language Actions Dialog Search integration
English (en) checkmark icon checkmark icon checkmark icon
Arabic (ar) checkmark icon checkmark icon checkmark icon
Chinese (Simplified) (zh-cn) checkmark icon checkmark icon checkmark icon
Chinese (Traditional) (zh-tw) checkmark icon checkmark icon checkmark icon
Czech (cs) checkmark icon checkmark icon checkmark icon
Dutch (nl) checkmark icon checkmark icon checkmark icon
French (fr) checkmark icon checkmark icon checkmark icon
German (de) checkmark icon checkmark icon checkmark icon
Italian (it) checkmark icon checkmark icon checkmark icon
Japanese (ja) checkmark icon checkmark icon checkmark icon
Korean (ko) checkmark icon checkmark icon checkmark icon
Portuguese (Brazilian) (pt-br) checkmark icon checkmark icon checkmark icon
Spanish (es) checkmark icon checkmark icon checkmark icon
Universal (xx) checkmark icon checkmark icon checkmark icon
{: caption="Content support details" caption-side="top"}

The {{site.data.keyword.conversationshort}} service supports multiple languages as noted, but the user interface itself (such as descriptions and labels) is in English. All supported languages can be input and trained through the English interface. {: note}

GB18030 compliance: GB18030 is a Chinese standard that specifies an extended code page for use in the Chinese market. This code page standard is important for the software industry because the China National Information Technology Standardization Technical Committee mandates that any software application that is released for the Chinese market after September 1, 2001, be enabled for GB18030. The {{site.data.keyword.conversationshort}} service supports this encoding, and is certified GB18030-compliant

Dialog language support

{: #language-support-dialog}

For these dialog features, language support differs depending on the language.

Intent feature support

{: #language-support-intents}

Language Content Catalog Algorithm version
English (en) checkmark icon checkmark icon
Arabic (ar) checkmark icon (except Covid-19) checkmark icon
Chinese (Simplified) (zh-cn) checkmark icon
Chinese (Traditional) (zh-tw) checkmark icon
Czech (cs) checkmark icon
Dutch (nl) checkmark icon
French (fr) checkmark icon checkmark icon
German (de) checkmark icon (except Covid-19) checkmark icon
Italian (it) checkmark icon (except Covid-19) checkmark icon
Japanese (ja) checkmark icon (except Covid-19) checkmark icon
Korean (ko) checkmark icon
Portuguese (Brazilian) (pt-br) checkmark icon checkmark icon
Spanish (es) checkmark icon checkmark icon
Universal (xx)
{: caption="Intent feature support details" caption-side="bottom"}

User input processing support

{: #language-support-input}

Language Dictionary-based entity support Fuzzy matching (Misspelling) Fuzzy matching (Stemming) Fuzzy matching (Partial match) Autocorrection
English (en) checkmark icon checkmark icon checkmark icon checkmark icon checkmark icon
Arabic (ar) checkmark icon checkmark icon
Chinese (Simplified) (zh-cn) checkmark icon
Chinese (Traditional) (zh-tw) checkmark icon
Czech (cs) checkmark icon checkmark icon checkmark icon
Dutch (nl) checkmark icon checkmark icon
French (fr) checkmark icon checkmark icon checkmark icon Beta
German (de) checkmark icon checkmark icon checkmark icon
Italian (it) checkmark icon checkmark icon
Japanese (ja) checkmark icon checkmark icon
Korean (ko) checkmark icon checkmark icon
Portuguese (Brazilian) (pt-br) checkmark icon checkmark icon
Spanish (es) checkmark icon checkmark icon
Universal (xx) checkmark icon checkmark icon
{: caption="User input processing support details" caption-side="bottom"}

Entity feature support

{: #language-support-entities}

Language Contextual entities System entities
English (en) checkmark icon checkmark icon
Arabic (ar) checkmark icon
Chinese (Simplified) (zh-cn) checkmark icon
Chinese (Traditional) (zh-tw) checkmark icon
Czech (cs) checkmark icon
Dutch (nl) checkmark icon
French (fr) Beta checkmark icon
German (de) checkmark icon
Italian (it) checkmark icon
Japanese (ja) checkmark icon
Korean (ko) checkmark icon
Portuguese (Brazilian) (pt-br) checkmark icon
Spanish (es) checkmark icon
Universal (xx) checkmark icon
{: caption="Entity feature support details" caption-side="bottom"}