[ENHANCEMENT] Avoid un-necessary predictions in ingest processors #2413

mingshl · 2024-05-07T16:37:42Z

Is your feature request related to a problem?
When re-indexing, the text_embedding and ml_inference processors run the prediction again even when the inference fields already exist in the document. The text_embedding processor overwrites the inference field, while the ml_inference processor does not write to it and instead either throws an exception or skips writing to the document.

What solution would you like?
In ingest processors that use ML inference, we should check whether the model output field (for example, the text_embedding field) already exists in the document before the prediction task runs. If the field already exists, skip the prediction.

That way, when a re-index happens, it won't run the prediction tasks again if the field already exists.
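
A rough sketch of the check, written in Python for brevity rather than as the processor's actual code. `process_document`, `predict`, and the field names passed in are hypothetical and are not ml-commons APIs:

```python
from typing import Any, Callable, Dict, List


def process_document(
    doc: Dict[str, Any],
    input_field: str,
    output_field: str,
    predict: Callable[[str], List[float]],
) -> Dict[str, Any]:
    """Run inference only when the output field is missing (hypothetical helper)."""
    # Skip the prediction when the inference field is already populated,
    # for example when the document is being re-indexed.
    if doc.get(output_field) is not None:
        return doc
    # Otherwise run the model on the input field and store the result.
    if input_field in doc:
        doc[output_field] = predict(doc[input_field])
    return doc
```

With this shape, a document that already carries `text_embedding` passes through untouched and the model is never called for it.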

What alternatives have you considered?
Any other suggestions are welcome.

IanMenendez · 2024-05-07T17:11:17Z

Maybe it would be nice to add a parameter on the ML ingest processors called overwrite: Option[Boolean]. If true, it overwrites the current embeddings upon reindexing; if false, it does not overwrite them.

It is nice to still have the option to overwrite in case we change the ML model to one that outputs different embeddings
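
Something like the following, as a sketch only. It is in Python rather than the plugin's own code, `should_run_prediction` is a hypothetical helper, and the default of `False` is an assumption for illustration, not something decided in this thread:

```python
from typing import Any, Dict


def should_run_prediction(
    doc: Dict[str, Any],
    output_field: str,
    overwrite: bool = False,  # assumed default, for illustration only
) -> bool:
    """Decide whether the processor should call the model (hypothetical helper)."""
    # overwrite=True always re-runs inference, e.g. after switching to a
    # model that outputs different embeddings.
    if overwrite:
        return True
    # overwrite=False keeps an existing embedding and only fills in missing ones.
    return doc.get(output_field) is None
```

Here `should_run_prediction({"text_embedding": [0.1]}, "text_embedding")` returns `False`, while passing `overwrite=True` returns `True` for the same document.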

ylwu-amzn · 2024-06-14T22:54:06Z

Added in this PR #2508

mingshl added enhancement New feature or request untriaged labels May 7, 2024

mingshl changed the title ~~[ENHANCEMENT]~~ [ENHANCEMENT] Avoid un-necessary predictions in ingest processors May 7, 2024

dhrubo-os added this to ml-commons projects May 7, 2024

dhrubo-os moved this to On-deck in ml-commons projects May 7, 2024

dhrubo-os assigned mingshl May 7, 2024

dhrubo-os added v2.15.0 and removed untriaged labels May 7, 2024

ylwu-amzn assigned rbhavna Jun 14, 2024

ylwu-amzn closed this as completed Jun 14, 2024

github-project-automation bot moved this from On-deck to Done in ml-commons projects Jun 14, 2024
