[BUG] ML Inference Search Processors should have different model input format when input mapping has dollar symbols #2974
Comments
Will add documentation for both the ingest and search pipelines.
Need to check the JSONPath logic in other APIs, for example the Predict API and connector creation.
Checked ConnectorUtils: it returns the object in its original form (Line 222 in 6a6cac1).
Instead of allowing different configurations depending on whether the dollar symbol is present, it is better to use a standard configuration across the ml-commons repo. If users would like a different JSONPath configuration, we should open up a new parameter to change the JSONPath configuration settings.
Fixed in #2985.
What is the bug?
In ML Inference Processors:
On the ingest side, we consider two use cases, getting a value from a nested object and getting a value from a plain object, so we support both dot path notation to get a field value and JSON path notation, which can get nested values in a list.
For search processors, we support JSON path to read the object, but the behavior is different: it always returns the original data format. We want to have two ways of reading the object, similar to the ingest side.
How can one reproduce the bug?
In 2.14, using the ml_inference ingest processor:
For a nested document, like this sample book index, the input map can be configured as
{"input": "$.book.*.chunk.text.*.context"}
which fetches the model input from the nested context fields.
For a simple object, like this sample item index:
If input_map is configured as
{"input": "item.text"}
the model input will be in string representation.
If input_map is configured as
{"input": "$.item.text"}
the model input will be in list representation.
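The difference between the two notations on the ingest side can be sketched in Python. This is a minimal illustration, not the ml-commons implementation: `read_dot_path` and `read_json_path` are hypothetical helpers, the evaluator handles only a small subset of JSONPath (`$`, `.key`, `.*`), and the sample documents and their field values are made up.

```python
from typing import Any, List


def read_dot_path(doc: dict, path: str) -> Any:
    """Walk a dot path such as 'item.text' and return the value unchanged."""
    current: Any = doc
    for key in path.split("."):
        current = current[key]
    return current


def read_json_path(doc: Any, path: str) -> List[Any]:
    """Evaluate a tiny JSONPath subset and always return a list of matches."""
    if not path.startswith("$"):
        raise ValueError("JSONPath expressions must start with '$'")
    matches = [doc]
    for step in [s for s in path[1:].split(".") if s]:
        next_matches = []
        for node in matches:
            if step == "*":
                # Wildcard: fan out over list elements or dict values.
                if isinstance(node, list):
                    next_matches.extend(node)
                elif isinstance(node, dict):
                    next_matches.extend(node.values())
            elif isinstance(node, dict) and step in node:
                next_matches.append(node[step])
        matches = next_matches
    return matches


# Hypothetical documents; the structure is inferred from the paths above.
item_doc = {"item": {"text": "red shoes"}}
book_doc = {
    "book": [
        {"chunk": {"text": [{"context": "a"}, {"context": "b"}]}},
        {"chunk": {"text": [{"context": "c"}]}},
    ]
}

print(read_dot_path(item_doc, "item.text"))     # red shoes
print(read_json_path(item_doc, "$.item.text"))  # ['red shoes']
print(read_json_path(book_doc, "$.book.*.chunk.text.*.context"))  # ['a', 'b', 'c']
```

The dot path hands back the string as stored, while the `$` path wraps even a single match in a list, which is the behavior the ingest processor shows in the report.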
But in search processors:
For the same item index, if input_map is configured as
{"input": "$.item.text"}
the model input will be in string representation.
What is the expected behavior?
We would like logic similar to the ingest processors: when using dot path notation without the dollar symbol '$', it should return the original data format, but when using the dollar symbol '$', it should return a list representation of the values.
In search processors:
For the same item index, if input_map is configured as
{"input": "$.item.text"}
the model input will be in list representation.
For the same item index, if input_map is configured as
{"input": "item.text"}
the model input will be in its original string representation.
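The rule requested under expected behavior (a dot path keeps the original data format, a `$` path yields a list) could be sketched as a single dispatch on the leading dollar symbol. This is an assumption-laden illustration rather than the actual search processor code: `resolve_input` and `build_model_input` are hypothetical names, only definite paths to scalar fields are handled, and the sample document is made up.

```python
from typing import Any, Dict, List


def _walk(doc: Any, keys: List[str]) -> Any:
    """Follow a sequence of keys through nested dicts."""
    current = doc
    for key in keys:
        current = current[key]
    return current


def resolve_input(doc: dict, path: str) -> Any:
    """Return a list for '$.'-prefixed paths, the raw value otherwise."""
    if path.startswith("$."):
        # JSON path notation: wrap the matched value in a list.
        return [_walk(doc, path[2:].split("."))]
    # Dot path notation: keep the original data format.
    return _walk(doc, path.split("."))


def build_model_input(doc: dict, input_map: Dict[str, str]) -> Dict[str, Any]:
    """Apply every model-field -> document-path entry of an input map."""
    return {field: resolve_input(doc, path) for field, path in input_map.items()}


# Hypothetical search hit for the item index.
doc = {"item": {"text": "red shoes"}}

print(build_model_input(doc, {"input": "$.item.text"}))  # {'input': ['red shoes']}
print(build_model_input(doc, {"input": "item.text"}))    # {'input': 'red shoes'}
```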
What is your host/environment?