General lack of clarity about input/output/context languages #16

domenic · 2024-11-25T06:40:47Z

If you try to summarize Japanese text, should you expect a Japanese summary? Or an English summary?

What if you provide your { context } or { sharedContext } in a third or fourth language?

How do the answers to these questions interact with summarizerCapabilities.languageAvailable()? Currently it's only intended to give an answer for input language support.

Should we allow web developers to specify the output language more tightly? If so, how could we guarantee the result? Would we pass it through translation APIs behind the scenes, or just fail if it's not supported and let developers do the translation themselves?

These solve the problem discussed in webmachinelearning/prompt-api#29 and #16. They provide a mechanism for web developers to tell the browser to download additional material to support additional languages, and for web developers to get early errors if they know they will be trying to use a language that isn't supported. It also clearly separates input, context, and output languages, with a requirement on how the output language is produced by default (match the input). This removes the languageAvailable() API, folding it into createOptionsAvailable(). Further work might remove the AISummarizerCapabilities object altogether, since now it's mostly a wrapper around the single createOptionsAvailable() method.
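As a rough illustration of folding per-language checks into a single options check, here is a minimal sketch. It is a self-contained model, not the browser API: the function name `mockCreateOptionsAvailable`, its signature, the option names, the `"readily"` / `"after-download"` / `"no"` values, and the support table are all assumptions made for the example.

```typescript
// Hypothetical model only: every name and value below is an assumption,
// not the real createOptionsAvailable() signature.
type Availability = "readily" | "after-download" | "no";

interface LanguageOptions {
  expectedInputLanguages?: string[];
  expectedContextLanguages?: string[];
  outputLanguage?: string;
}

// Hypothetical table of what one implementation has on hand, keyed by language tag.
const exampleSupport: Record<string, Availability> = {
  en: "readily",
  ja: "after-download",
};

function mockCreateOptionsAvailable(
  options: LanguageOptions,
  support: Record<string, Availability>
): Availability {
  // Input, context, and output languages are declared separately,
  // but each one has to pass the same support lookup.
  const tags = [
    ...(options.expectedInputLanguages ?? []),
    ...(options.expectedContextLanguages ?? []),
    ...(options.outputLanguage !== undefined ? [options.outputLanguage] : []),
  ];

  let result: Availability = "readily";
  for (const tag of tags) {
    const availability = support[tag] ?? "no";
    if (availability === "no") return "no"; // early error: one unsupported language fails the whole set
    if (availability === "after-download") result = "after-download";
  }
  return result;
}

const japaneseToEnglish = mockCreateOptionsAvailable(
  { expectedInputLanguages: ["ja"], outputLanguage: "en" },
  exampleSupport
);
console.log(japaneseToEnglish);
```

In this model the weakest language determines the overall answer, so a developer learns up front whether a download is needed or whether the combination cannot work at all.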

etiennenoel · 2024-12-09T22:17:03Z

I assume that this would use the LLM and not the translate API to do the translation, right?

In this case, what is the difference between the rewrite API and the translate API if you can use the rewrite API simply for translation purposes?

domenic · 2024-12-10T04:24:11Z

I agree this is confusing and unsatisfactory.

One could argue that there's a difference between "rewriting" and "translating", similar to the difference between "summarizing" and "rewriting". But I'm not sure the argument is very solid.

I think the more practical issue is just about expected language support in current implementations, and how that affects the combinations. Currently we'd expect:

Rewrite supports the various options (tone, format, length, context). It supports 1-5 input and output languages. It supports multiple input languages in the same string. This kind of capability is naturally emergent from the language model we plan to use.
Translate supports zero options. It supports many languages. It only supports a single input language per string. This kind of capability is naturally emergent from the translation model we plan to use.

Our current strategy is to signal this clearly via different API entrypoints: translate doesn't have any configurable options, for example, and expectedInputLanguages is optional for rewriter.

You could imagine an alternate strategy where we try to fit everything into the rewriter API. This would have some sharp edges, though. For example:

Even if both the language and the translation models support a given language pair, you could see dramatically different translations by just tweaking the options slightly. E.g., if you use the { context } option, which only the language model supports, your translation will suddenly change in ways that are not related to the context.
Translating between a given language pair might be supported using the translation model, but then when you ask it to make the result shorter, we either fail (because we can't do everything with the language model) or we have to do something like translate to English, make shorter, translate to destination language, which could introduce unexpected artifacts.
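To make the second sharp edge concrete, here is a minimal sketch of how an implementation might choose between the direct path, the pivot-through-English path, and failure. Everything in it is hypothetical: `planRewrite`, the `Plan` type, the step strings, and the way language support is passed in are illustration only and do not describe any actual implementation.

```typescript
// Hypothetical planner: all names and shapes here are assumptions for illustration.
type Plan =
  | { kind: "language-model-only" }
  | { kind: "pivot"; steps: string[] }
  | { kind: "unsupported" };

function planRewrite(
  inputLanguage: string,
  outputLanguage: string,
  languageModelLanguages: string[],
  translationPairs: Array<[string, string]>,
  pivot: string = "en"
): Plan {
  const modelSupports = (tag: string) => languageModelLanguages.indexOf(tag) !== -1;
  const canTranslate = (from: string, to: string) =>
    from === to || translationPairs.some(([a, b]) => a === from && b === to);

  // Best case: the language model handles both ends itself.
  if (modelSupports(inputLanguage) && modelSupports(outputLanguage)) {
    return { kind: "language-model-only" };
  }

  // Fallback: translate to the pivot, rewrite there, translate to the destination.
  if (
    modelSupports(pivot) &&
    canTranslate(inputLanguage, pivot) &&
    canTranslate(pivot, outputLanguage)
  ) {
    const steps: string[] = [];
    if (inputLanguage !== pivot) steps.push(`translate ${inputLanguage} -> ${pivot}`);
    steps.push(`rewrite in ${pivot}`);
    if (outputLanguage !== pivot) steps.push(`translate ${pivot} -> ${outputLanguage}`);
    return { kind: "pivot", steps };
  }

  // Otherwise the request fails.
  return { kind: "unsupported" };
}

const examplePlan = planRewrite("ja", "fr", ["en"], [["ja", "en"], ["en", "fr"]]);
console.log(examplePlan);
```

Each extra step in the pivot plan is a place where artifacts can creep in, which is the concern raised above.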

Do we think it might be worth pursuing this road anyway?

An alternate strategy would be to get rid of the outputLanguage setting for the writing assistance APIs and say that the output language is always derived from the input language. Then, web developers would have to explicitly use the translation APIs if they want output in a different language. But that runs into the issue where it's not clear what kind of results should occur for multilingual input; as #22 states,

If the outputLanguage is not supplied, the default behavior is to produce the output in "the same language as the input". For the multilingual input case, what this means is left implementation-defined for now, and implementations should err on the side of rejecting with a "NotSupportedError" DOMException. For this reason, it's strongly recommended that developers supply outputLanguage.
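The default rule quoted above can be sketched as a small resolver. This is a model under stated assumptions, not spec text: `resolveOutputLanguage` and the `NotSupportedError` class are hypothetical stand-ins (a real implementation would reject with a DOMException), language detection is assumed to have already happened, and rejecting when no language is detected is a choice made for this sketch rather than something the quote specifies.

```typescript
// Hypothetical stand-in for the "NotSupportedError" DOMException named in the quote.
class NotSupportedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NotSupportedError";
  }
}

// Hypothetical resolver: detectedInputLanguages is assumed to come from some
// earlier language-detection step that is not modeled here.
function resolveOutputLanguage(
  detectedInputLanguages: string[],
  outputLanguage?: string
): string {
  // An explicit outputLanguage always wins.
  if (outputLanguage !== undefined) return outputLanguage;

  // Default: match the input, which is only well defined for a single language.
  const distinct = detectedInputLanguages.filter(
    (tag, index, all) => all.indexOf(tag) === index
  );
  const [only, ...rest] = distinct;
  if (only !== undefined && rest.length === 0) return only;

  // Multilingual input is left implementation-defined; this sketch errs on the
  // side of rejecting. Rejecting on zero detected languages is an assumption.
  throw new NotSupportedError(
    "Cannot infer an output language from the input; supply outputLanguage."
  );
}

console.log(resolveOutputLanguage(["ja"]));
console.log(resolveOutputLanguage(["ja", "en"], "en"));
```

Supplying outputLanguage sidesteps the ambiguous branch entirely, which is why the quoted text recommends it.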


michaelwasserman · 2024-12-12T18:35:11Z

WDYT about combining input, context, and shared context languages into a single list?
Those lists will likely undergo identical impl support checks, and most dev usage will likely have identical lists. Is there a clear user/dev/impl benefit to splitting them up?
Also, WDYT about handling a list for output languages?
Perhaps a single response could be multi-lingual, or separate responses could be in separate languages? This might also help coalesce dev inquiries and creation requests for multiple output languages, say for translation.
It might be nice if responses include a string description or codes regarding incompatibilities (e.g. "No multi-lingual output", or NotSupportedInputLanguage, even NotSupportedLengthAndToneCombination or similar)

domenic linked a pull request Dec 6, 2024 that will close this issue

Overhaul availability testing and add expected language options #22

Open
