
[BUG] Neural search: 4xx error ingesting data with Sagemaker external model #12774

Closed
tiagoshin opened this issue Mar 19, 2024 · 5 comments
Labels
bug Something isn't working Plugins untriaged

Comments

@tiagoshin

tiagoshin commented Mar 19, 2024

Describe the bug

I'm trying to use a model hosted on a SageMaker endpoint in the same AWS account as the OpenSearch cluster to perform a neural search.
While ingesting data into the index, I observe the following error for many documents:

{
    "index": {
        "_index": "my-index",
        "_id": "id",
        "status": 400,
        "error": {
            "type": "status_exception",
            "reason": "Error from remote service: {\"message\":null}"
        }
    }
}

I don't see anything in the OpenSearch error logs, and I don't see any 4xx or 5xx requests in SageMaker.
This error only happens when bulk ingesting a reasonable amount of data, in this case 250 records; when I ingest only 20 records, it works.
I already took some of the documents that failed and ingested them separately, and that worked. So the issue is not with the documents or with the SageMaker model.

Related component

Plugins

To Reproduce

  1. First, deploy the bge-base-en-v1.5 embedding model in SageMaker using this Python script:
from sagemaker.jumpstart.model import JumpStartModel

model_id = "huggingface-sentencesimilarity-bge-base-en-v1-5"
env = {
    'MMS_JOB_QUEUE_SIZE': '100000',
}
text_embedding_model = JumpStartModel(
    model_id=model_id,
    env=env,
    role="<YOUR-SAGEMAKER-ROLE>",
)

predictor = text_embedding_model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.xlarge'
)
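Once the endpoint is up, it can be worth invoking it directly with the same payload shape the connector below will send ("text_inputs" plus "mode": "embedding") to confirm it returns an embedding outside of OpenSearch. A minimal sketch with boto3; the endpoint name and region are placeholders, and the exact response schema depends on the JumpStart container:

import json
import boto3

# Sketch: call the endpoint with the same payload the OpenSearch connector sends.
runtime = boto3.client("sagemaker-runtime", region_name="<YOUR-REGION>")
response = runtime.invoke_endpoint(
    EndpointName="<YOUR-SAGEMAKER-ENDPOINT-NAME>",
    ContentType="application/json",
    Body=json.dumps({"text_inputs": "hello world", "mode": "embedding"}),
)
# The container is expected to return an "embedding" field (an assumption,
# mirrored by the post_process_function in the connector below).
print(json.loads(response["Body"].read()))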

  2. Once it's deployed, get the SageMaker endpoint.
  3. Create a SageMaker connector in OpenSearch:
POST {{host}}/_plugins/_ml/connectors/_create
{
  "name": "Amazon Sagemaker connector",
  "description": "The connector to Sagemaker",
  "version": 1,
  "protocol": "aws_sigv4",
  "credential": {
    "roleArn": "<YOUR-ROLE>"
  },
  "parameters": {
    "region": "<YOUR-REGION>",
    "service_name": "sagemaker"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "URL": "<YOUR-SAGEMAKER-ENDPOINT>",
      "request_body": "{ \"text_inputs\": \"${parameters.text_inputs}\", \"mode\": \"embedding\" }",
      "pre_process_function": "\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"\\\"\");\n    String first = params.text_docs[0];\n    builder.append(first);\n    builder.append(\"\\\"\");\n    def parameters = \"{\" +\"\\\"text_inputs\\\":\" + builder + \"}\";\n    return  \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
      "post_process_function": "\n      def name = \"sentence_embedding\";\n      def dataType = \"FLOAT32\";\n      if (params.embedding == null || params.embedding.length == 0) {\n        return params.message;\n      }\n      def shape = [params.embedding.length];\n      def json = \"{\" +\n                 \"\\\"name\\\":\\\"\" + name + \"\\\",\" +\n                 \"\\\"data_type\\\":\\\"\" + dataType + \"\\\",\" +\n                 \"\\\"shape\\\":\" + shape + \",\" +\n                 \"\\\"data\\\":\" + params.embedding +\n                 \"}\";\n      return json;\n    "
    }
  ]
}
  4. Get the connector id from the response.
  5. Create a model group:
POST {{host}}/_plugins/_ml/model_groups/_register
{
    "name": "sagemaker-model-group",
    "description": "Semantic search model group sagemaker"
}
  6. Get the model group id from the response.
  7. Register the model in OpenSearch:
POST {{host}}/_plugins/_ml/models/_register
{
    "name": "bge-base",
    "function_name": "remote",
    "model_group_id": "<YOUR-MODEL-GROUP-ID>",
    "description": "test model",
    "connector_id": "<YOUR-CONNECTOR-ID>"
}
  8. Get the model_id.
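If the register call returns a task_id rather than the model_id directly, the model_id can be looked up through the tasks API (a sketch, assuming the standard ML Commons tasks endpoint):

GET {{host}}/_plugins/_ml/tasks/<YOUR-TASK-ID>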
  9. Load the model:
POST {{host}}/_plugins/_ml/models/<YOUR-MODEL-ID>/_load
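Before wiring the model into a pipeline, it can help to call the predict API directly and confirm the connector round trip works (a sketch; the input text is arbitrary and the response shape depends on the pre/post process functions defined in the connector above):

POST {{host}}/_plugins/_ml/models/<YOUR-MODEL-ID>/_predict
{
    "parameters": {
        "text_inputs": "hello world"
    }
}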
  10. Create an ingestion pipeline:
PUT {{host}}/_ingest/pipeline/{{pipeline_name}}
{
    "description": "pipeline",
    "processors": [
        {
            "set": {
                "field": "passage_text",
                "value": "{{{field1}}}, {{{field2}}}"
            }
        },
        {
            "text_embedding": {
                "model_id": "<YOUR-MODEL-ID>",
                "field_map": {
                    "passage_text": "passage_embedding"
                }
            }
        }
    ]
}
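The pipeline can be dry-run with the simulate API before it is attached to any index (a sketch; field1 and field2 stand in for whatever source fields the set processor above concatenates):

POST {{host}}/_ingest/pipeline/{{pipeline_name}}/_simulate
{
    "docs": [
        {
            "_source": {
                "field1": "some text",
                "field2": "more text"
            }
        }
    ]
}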
  11. Create an index:
PUT {{host}}/<YOUR-INDEX-NAME>

{
    "settings": ...,
    "mappings": {
        ...<YOUR-MAPPINGS>...,
        "properties": {
            "passage_embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "method": {
                    "engine": "nmslib",
                    "space_type": "cosinesimil",
                    "name": "hnsw",
                    "parameters": {
                        "ef_construction": 512,
                        "m": 16
                    }
                }
            },
            "passage_text": {
                "type": "text"
            }
        }
    }
}
  12. Bulk ingest the data:
PUT {{host}}/<YOUR-INDEX-NAME>/_bulk
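For reference, the bulk body is NDJSON: one action line followed by one document line per record. A sketch is below; attaching the ingest pipeline via the ?pipeline request parameter is an assumption, since it could equally be set as index.default_pipeline in the elided index settings:

PUT {{host}}/<YOUR-INDEX-NAME>/_bulk?pipeline={{pipeline_name}}
{ "index": { "_id": "1" } }
{ "field1": "some text", "field2": "more text" }
{ "index": { "_id": "2" } }
{ "field1": "other text", "field2": "further text" }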

Expected behavior

It's expected that all documents have the following status after ingestion:

{
    "index": {
        "_index": "index",
        "_id": "id",
        "_version": 1,
        "result": "created",
        "_shards": {
            "total": 2,
            "successful": 2,
            "failed": 0
        },
        "_seq_no": 1,
        "_primary_term": 1,
        "status": 201
    }
}

Additional Details

Plugins
Neural Search plugin

Host/Environment:
I'm running the AWS-managed OpenSearch service, version 2.11.

@tiagoshin tiagoshin added bug Something isn't working untriaged labels Mar 19, 2024
@navneet1v
Contributor

This issue needs to be moved to @opensearch-project/ml-commons.

@chishui
Contributor

chishui commented Mar 20, 2024

Your requests most likely got throttled by SageMaker, either because they hit a rate limit or because CPU usage on the endpoint was high.

@tiagoshin
Author

Moved to @opensearch-project/ml-commons at opensearch-project/ml-commons#2249.

@tiagoshin
Author

@chishui the SageMaker rate limit for endpoint requests is 10,000 per second, and we're ingesting only 250 documents.
CPU, GPU, and memory usage are very low during the execution, and SageMaker doesn't register any 4xx or 5xx requests.

@andrross
Member

[Triage - attendees 1 2 3 4 5 6]
Thanks @tiagoshin, closing this issue since it is now a duplicate of opensearch-project/ml-commons#2249.
