
Support ML Inference Search Processor Writing to Search Extension #3061

Open · wants to merge 1 commit into base: main
Conversation

@mingshl (Collaborator) commented Oct 3, 2024

Description

Previously, the ML inference search processor wrote prediction results only to the search hits. This change adds support for the ML inference search processor writing to the search extension for many-to-one inference.

Note that this is not supported for one-to-one inference: the order of results matters for one-to-one inference, and other processors might rerank the hits, breaking the correspondence between model inputs and prediction results.

Related Issues

#2878

Sample Test Case


PUT /review_string_index/_doc/1
{
  "review": "Dr. Eric Goldberg is a fantastic doctor who has correctly diagnosed every issue that my wife and I have had. Unlike many of my past doctors, Dr. Goldberg is very accessible and we have been able to schedule appointments with him and his staff very quickly. We are happy to have him in the neighborhood and look forward to being his patients for many years to come." ,
  "label":"5 stars"
}

PUT /review_string_index/_doc/2
{
  "review": "happy visit" ,
  "label":"5 stars"
}


PUT /review_string_index/_doc/3
{
  "review": "sad place" ,
  "label":"1 stars"
}

PUT /_search/pipeline/my_pipeline_request_review_llm
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor is going to run llm",
        "model_id": "uhkETJIB5-xYSMo_SPet",
        "function_name": "REMOTE",
        "input_map": [
          {
            "context": "review"
          }
        ],
        "output_map": [
          {
            "ext.ml_inference.params.model_response": "response" 
          }
        ],
        "model_config": {
          "prompt":"\n\nHuman: You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context.toString()}. \n\n Human: please summarize the documents \n\n Assistant:"
        },
        "ignore_missing": false,
        "ignore_failure": false,
        "one_to_one":false
      }
    }
  ]
}

GET /review_string_index/_search?search_pipeline=my_pipeline_request_review_llm
{"query":{
  "match_all": {}
}
}

Returning:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "review_string_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "review": "Dr. Eric Goldberg is a fantastic doctor who has correctly diagnosed every issue that my wife and I have had. Unlike many of my past doctors, Dr. Goldberg is very accessible and we have been able to schedule appointments with him and his staff very quickly. We are happy to have him in the neighborhood and look forward to being his patients for many years to come.",
          "label": "5 stars"
        }
      },
      {
        "_index": "review_string_index",
        "_id": "2",
        "_score": 1,
        "_source": {
          "review": "happy visit",
          "label": "5 stars"
        }
      },
      {
        "_index": "review_string_index",
        "_id": "3",
        "_score": 1,
        "_source": {
          "review": "sad place",
          "label": "1 stars"
        }
      }
    ]
  },
  "ext": {
    "ml_inference": {
      "llm_response": """ Based on the context provided:

- The first document is a positive review of Dr. Eric Goldberg from a patient. It praises Dr. Goldberg for correctly diagnosing issues for the patient and their wife. It also notes that Dr. Goldberg is very accessible and appointments can be scheduled quickly with him and his staff. The patient expresses happiness that Dr. Goldberg is in their neighborhood and looks forward to being his patient for many years.

- The second document just says "happy visit". 

- The third document says "sad place".

- In summary, the first document positively reviews a doctor, Dr. Eric Goldberg. The other two documents don't provide much context on their own, just mentioning a "happy visit" and "sad place"."""
    }
  }
}

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ohltyler (Member) commented Oct 3, 2024

Does this change the default functionality of many-to-one, such that the outputs in the output map will always be placed outside of the individual document _sources, or is that still an option? I don't know if there's any good use cases to support the latter.

@mingshl (Collaborator, Author) commented Oct 3, 2024

> Does this change the default functionality of many-to-one, such that the outputs in the output map will always be placed outside of the individual document _sources, or is that still an option? I don't know if there's any good use cases to support the latter.

The output mapping lets users choose whether to write to the search extension or to the document source. If an output mapping key starts with the prefix ext.ml_inference, the model output is written to the search extension; all other mappings default to writing to the document _source.
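
For example, a single output_map could route the same model response to both destinations (a hypothetical sketch based on the prefix rule described above; the field name summary_in_source is illustrative, not part of this PR):

```json
{
  "output_map": [
    {
      "ext.ml_inference.summary": "response",
      "summary_in_source": "response"
    }
  ]
}
```

Here the first mapping lands under ext.ml_inference in the search response, while the second is written into each hit's _source as usual.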

@ohltyler (Member) commented Oct 3, 2024

> Does this change the default functionality of many-to-one, such that the outputs in the output map will always be placed outside of the individual document _sources, or is that still an option? I don't know if there's any good use cases to support the latter.
>
> The output mapping lets users choose whether to write to the search extension or to the document source. If an output mapping key starts with the prefix ext.ml_inference, the model output is written to the search extension; all other mappings default to writing to the document _source.

Got it, thanks.
