
http connector with hugging face: enabling rag pipeline #2325

Closed
MiaNCSU opened this issue Apr 16, 2024 · 5 comments
MiaNCSU commented Apr 16, 2024

I am trying to implement a RAG pipeline using a pretrained Hugging Face model, but I am having trouble building a custom connector blueprint for it. I adapted this connector, with modifications, from a previous issue #1468:

{
    "name": "sentence-transformers/all-MiniLM-L6-v2",
    "description": "The connector to Hugging Face GPT Language Model",
    "version": "1.0.1",
    "protocol": "http",
    "parameters": {
        "endpoint": "api-inference.huggingface.co",
        "model": "gpt2",
        "temperature": 1.0
    },
    "credential": {
        "HF_key": "my_API_key"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2",
            "headers": {
                "Authorization": "Bearer ${credential.HF_key}"
            },
            "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages}, \"temperature\": ${parameters.temperature}, \"inputs\": { \"source_sentence\": \"source\", \"sentences\": [\"sentence\"] } }"
        }
    ]
}
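For reference, here is roughly how I create the connector and register/deploy the model with opensearch-py (a sketch only; `client` is my OpenSearch client, the endpoint paths are the standard ml-commons ones, and the helper names are mine):

```python
def build_register_body(connector_id: str) -> dict:
    """Request body for POST /_plugins/_ml/models/_register (remote model)."""
    return {
        "name": "hf-remote-model",
        "function_name": "remote",
        "description": "Remote model backed by the HTTP connector above",
        "connector_id": connector_id,
    }


def register_and_deploy(client, connector_body: dict) -> str:
    """Create the connector, register the remote model, then deploy it."""
    # Create the connector from the blueprint shown above.
    resp = client.transport.perform_request(
        "POST", "/_plugins/_ml/connectors/_create", body=connector_body
    )
    connector_id = resp["connector_id"]
    # Register a remote model that points at the connector.
    resp = client.transport.perform_request(
        "POST", "/_plugins/_ml/models/_register", body=build_register_body(connector_id)
    )
    model_id = resp["model_id"]
    # Deploy the model so the pipeline can invoke it.
    client.transport.perform_request("POST", f"/_plugins/_ml/models/{model_id}/_deploy")
    return model_id
```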

I have set up the cluster settings, registered and deployed the model using this connector, and created the RAG search pipeline.
I have configured the index to use RAG by default like this:

index_body = {
    "settings": {
        "index.number_of_shards": 4,
        "index.search.default_pipeline": "rag_pipeline"
    },
    "mappings": {
        "properties": {
            "text": {
                "type": "text"
            }
        }
    }
}

response = self.client.indices.create(index=name, body=index_body)

I then run a search like this:

query = {
    "query": {
        "match": {
            "text": q
        }
    },
    "ext": {
        "generative_qa_parameters": {
            "llm_question": q,
            "llm_model": self.llm_model,
            "context_size": 10,
            "timeout": 60
        }
    }
}
response = self.client.search(
    body = query,
    index = index_name
)
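For context, once the search succeeds I expect to read the generated answer out of the `ext` section of the response, roughly like this (a sketch; the field layout is my understanding of the RAG response processor's output, and the helper name is mine):

```python
def extract_rag_answer(response: dict) -> str:
    """Pull the generated answer out of a RAG search response.

    The retrieval_augmented_generation processor places its output under
    response["ext"]["retrieval_augmented_generation"]["answer"]; returns
    an empty string if any level is missing.
    """
    return (
        response.get("ext", {})
        .get("retrieval_augmented_generation", {})
        .get("answer", "")
    )
```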

And I get the following error:
opensearchpy.exceptions.TransportError: TransportError(500, 'null_pointer_exception', 'Cannot invoke "java.util.Map.get(Object)" because "error" is null')

I think this is an error with the connector, specifically with how it calls Hugging Face's Inference API endpoints. How do I configure the connector to interact correctly with the Hugging Face API? According to the HF documentation, the request body contains all the parameters needed to make a request. What should I fix?

@MiaNCSU MiaNCSU added enhancement New feature or request untriaged labels Apr 16, 2024
@ylwu-amzn
Collaborator

I don't quite follow: the connector you shared is for a text embedding model. Are you using it as the LLM in the RAG pipeline? Can you share the RAG pipeline configuration?

@ylwu-amzn ylwu-amzn removed the enhancement New feature or request label Apr 26, 2024
@MiaNCSU
Author

MiaNCSU commented Apr 26, 2024

Yes, I am using it as the LLM in the RAG pipeline. I sent the following request body to the /_search/pipeline/rag_pipeline endpoint:

data = {
    "response_processors": [
        {
            "retrieval_augmented_generation": {
                "tag": "rag_pipeline",
                "description": "HuggingFace Connector pipeline",
                "model_id": MODEL_ID,
                "context_field_list": ["text"],
                "system_prompt": "You are a dataset search tool",
                "user_instructions": "For a given search query, list the top 10 most relevant datasets as answers."
            }
        }
    ]
}
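Concretely, I send that body with a PUT to the search pipeline endpoint, roughly like this (a sketch using opensearch-py's low-level transport; `client` and the helper names are mine):

```python
def build_rag_pipeline_body(model_id: str) -> dict:
    """Body for PUT /_search/pipeline/rag_pipeline."""
    return {
        "response_processors": [
            {
                "retrieval_augmented_generation": {
                    "tag": "rag_pipeline",
                    "description": "HuggingFace Connector pipeline",
                    "model_id": model_id,
                    "context_field_list": ["text"],
                    "system_prompt": "You are a dataset search tool",
                    "user_instructions": "For a given search query, list the top 10 most relevant datasets as answers.",
                }
            }
        ]
    }


def create_rag_pipeline(client, model_id: str) -> None:
    """Create (or overwrite) the rag_pipeline search pipeline."""
    client.transport.perform_request(
        "PUT", "/_search/pipeline/rag_pipeline", body=build_rag_pipeline_body(model_id)
    )
```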

The connector I shared is for the huggingface/sentence-transformers/all-MiniLM-L6-v2 model, which is one of the pretrained models listed in the OpenSearch documentation:
supported pretrained models

Is this model not suitable for the RAG pipeline?

@ylwu-amzn
Collaborator

@MiaNCSU, huggingface/sentence-transformers/all-MiniLM-L6-v2 is not an LLM for the RAG pipeline; it is a model for generating text embeddings. You can try an LLM such as OpenAI GPT or Anthropic Claude 3.
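For example, a connector for a chat LLM would look roughly like this (a sketch only, based on OpenAI's chat completions API; OPENAI_API_KEY is a placeholder, and you should check the blueprint against the provider's actual API before using it):

```python
# Sketch of a connector blueprint for a chat LLM (OpenAI as an example).
# The RAG response processor needs a text-generation model, not an
# embedding model like all-MiniLM-L6-v2.
openai_connector = {
    "name": "OpenAI Chat Connector",
    "description": "Connector to an OpenAI chat model for the RAG pipeline",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "api.openai.com",
        "model": "gpt-3.5-turbo",
        "temperature": 0.7,
    },
    "credential": {"openAI_key": "OPENAI_API_KEY"},  # placeholder
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://${parameters.endpoint}/v1/chat/completions",
            "headers": {"Authorization": "Bearer ${credential.openAI_key}"},
            # messages is filled in by ml-commons at predict time
            "request_body": '{ "model": "${parameters.model}", "messages": ${parameters.messages}, "temperature": ${parameters.temperature} }',
        }
    ],
}
```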

@ylwu-amzn
Collaborator

@MiaNCSU, I will close this issue since there has been no reply for two weeks. Feel free to reply and reopen if you still have questions.

@github-project-automation github-project-automation bot moved this from In Progress to Done in ml-commons projects May 23, 2024
@MiaNCSU
Author

MiaNCSU commented May 24, 2024 via email
