Is your feature request related to a problem?
I'm trying to build a chatbot experience using the OpenSearch tutorials online. The memory system seems to hold only the "final response" from the LLM, not the full context it retrieved from the tools. This is a problem because the chatbot can't answer follow-up questions: the final response doesn't include the entirety of the documents it referenced.
For example:
1. I ask to show documents from the year 2023. The system uses the PPLTool for retrieval, gets ~5 documents, and summarizes them for the user (usually just the document titles in a bullet list).
2. I ask a follow-up question: tell me more about document number 1. The system only has the final message it sent last time, so it no longer has the details it already retrieved in step 1. It then takes the new user question, which carries no context of its own, and tries to perform RAG again with the various tools I've created. "Tell me more about document number 1" is useless as a query because it has no keywords. At this point the chat experience fails and the LLM starts to hallucinate.
What solution would you like?
I would like the "Chain of Thought" internal system to analyze the user question as part of its workflow. It should determine whether the user is asking about the previous response, and if so, add the "internal context" from the previous RAG lookup to the context of the current question. The chain-of-thought process could also skip performing RAG again when the question refers to the previous turn. This would reduce LLM usage, since more RAG lookups could be short-circuited in favor of the context the system is already holding.
For example -
{
"thought": "The user is asking about something I just responded with, I should include internal context from my last response."
}
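A rough sketch of the proposed flow, in Python pseudocode with stand-in pieces. Every name here (`is_followup`, the `memory` dict, the `retrieve` callable) is hypothetical for illustration, not an existing ML Commons API; the follow-up check stands in for the chain-of-thought classification step:

```python
def is_followup(question: str) -> bool:
    # Stand-in for the chain-of-thought classification step; a real agent
    # would ask the LLM whether the question refers to the previous response.
    q = question.lower()
    return "tell me more" in q or "document number" in q

def answer(question: str, memory: dict, retrieve) -> str:
    if is_followup(question) and memory.get("internal_context"):
        # Short-circuit: reuse the context already retrieved last turn
        # instead of running RAG on a keyword-free follow-up question.
        context = memory["internal_context"]
    else:
        # Fresh question: run retrieval and store the raw tool output,
        # not just the summarized final response.
        context = retrieve(question)
        memory["internal_context"] = context
    response = f"answer based on {len(context)} documents"
    memory["final_response"] = response
    return response

# Toy retrieval tool returning five 2023 documents.
docs_2023 = [f"doc {i} (2023): full text..." for i in range(1, 6)]
memory = {}
first = answer("show documents from 2023", memory, lambda q: docs_2023)
# The follow-up reuses memory["internal_context"]; the retriever is never
# called, so its (empty) result can't clobber the stored context.
second = answer("Tell me more about document number 1", memory, lambda q: [])
```

The key design point is that the memory keeps the raw retrieved documents alongside the final response, so a follow-up turn has something to answer from.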
What alternatives have you considered?
Including the entire context from the previous RAG lookups
Do you have any additional context?
n/a