
[FEATURE] Memory Enhancement to Support more LLM applications #1614

Closed
Zhangxunmt opened this issue Nov 9, 2023 · 6 comments
Assignees
Labels
enhancement New feature or request

Comments

Zhangxunmt (Collaborator) commented Nov 9, 2023

In the Agent Framework, we want to enhance ml-commons so that it can interact with all kinds of LLMs, such as Claude, Bedrock, etc. The use cases of the agent framework will not be limited to the search scenario; they also include chatbots, forecasting, and more. To this end, the memory component needs to be extended and refactored to support more applications as a new data layer in ml-commons between the public APIs and the system indices. This document focuses on the design of this memory layer supporting the Agent Framework in ml-commons.

Architecture

A memory system needs to support two basic actions: reading and writing. Recall that every agent defines some core execution logic that expects certain inputs. Some of these inputs come directly from the user, but some of these inputs can come from memory. An agent will interact with its memory system twice in a given run.

  1. AFTER receiving the initial user inputs but BEFORE executing the core logic, an agent will READ from its memory system and augment the user inputs.
  2. AFTER executing the core logic but BEFORE returning the answer, an agent will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
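The two-step read/write flow above can be sketched in Python. This is a minimal, hypothetical in-process `Memory` class for illustration only; it is not the actual ml-commons implementation or API.

```python
# Sketch of the two memory interactions an agent performs in one run.
# The Memory class and run_agent function are illustrative assumptions.

class Memory:
    def __init__(self):
        self.interactions = []  # list of (input, response) pairs

    def read(self, n=10):
        """READ: return up to the n most recent interactions."""
        return self.interactions[-n:]

    def write(self, user_input, response):
        """WRITE: persist the current run's input and output."""
        self.interactions.append((user_input, response))

def run_agent(memory, user_input, core_logic):
    # 1. READ from memory and augment the user input before executing.
    history = memory.read()
    augmented = {"input": user_input, "history": history}
    # 2. Execute the core logic, then WRITE the run to memory
    #    so future runs can refer to it.
    response = core_logic(augmented)
    memory.write(user_input, response)
    return response
```

The agent itself never touches the storage layer directly; it only reads before and writes after its core logic, which is what lets the backing store (here a list, in ml-commons a system index) change without touching agent code.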

As outlined in the high-level design, CRUD-like APIs will be added to the Memory system for two resources: Conversation and Interaction. Conversations are made up of interactions; an interaction represents a pair of messages, a human input and an artificial intelligence (AI) response. These RESTful APIs already exist in conversational search. To support the Agent Framework, the mappings and schema of Conversation and Interaction will be updated with backward compatibility.

The sequential flow of these APIs can be summarized as follows. We plan to support three types of query algorithms.

  • (P0) Most Recent Top N - store all interactions and return the most recent ones. The number of interactions returned is configurable.
  • (P2) Summary - generate a summary of the most recent interactions. The user can configure the number of interactions; a value of -1 means all historical messages.
  • (P2) VectorSearch - return the most relevant interactions via neural search.
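The three query algorithms above could be sketched roughly as follows. This is illustrative Python only: the interaction record shape, the `summarize` callback, and the client-side cosine similarity are assumptions, not the ml-commons implementation (which would run these queries against the system index).

```python
import math

def most_recent_top_n(interactions, n=10):
    """(P0) Return the n most recent interactions; n is configurable."""
    return interactions[-n:]

def summary(interactions, n, summarize):
    """(P2) Summarize the n most recent interactions; n == -1 means
    all historical messages. summarize is a caller-supplied function
    (in practice, an LLM call)."""
    selected = interactions if n == -1 else interactions[-n:]
    return summarize(selected)

def vector_search(interactions, query_vec, k=3):
    """(P2) Return the k interactions whose embeddings are most
    similar to the query embedding (cosine similarity stands in for
    the neural-search scoring here)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    return sorted(interactions,
                  key=lambda i: cos(i["embedding"], query_vec),
                  reverse=True)[:k]
```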


System Index and Mappings

We will reuse the two system indices that were created for conversational search. The index “conversation-meta” stores the conversation metadata, and “conversation-interactions” stores every interaction between the user inputs and the LLM responses. To support more applications, a new field “application_type” is added to the “conversation-meta” mapping to distinguish conversations from different applications. For example, a chatbot calls Fractal/Agent to create new conversations, and the chatbot agent writes “chatbot” in the “application_type” field of each conversation. Conversations created in Conversational Search have a null/empty “application_type”, since their APIs do not include this new field. When ingesting new interactions into a conversation, ml-commons needs to make sure that chatbot interactions reference only chatbot conversations, pipeline interactions reference only pipeline conversations, and so on.

In the “conversation-interactions” index, the new fields are mostly flat objects that are general enough to fit use cases beyond just the chatbot. The new mappings of these two system indices are listed below, with the newly added fields highlighted.

conversation-meta

.plugins-ml-conversation-meta
{
    "_meta": {
        "schema_version": 1
    },
    "properties": {
        "name": {"type": "keyword"},
        "create_time": {"type": "date", "format": "strict_date_time||epoch_millis"},
        "user": {"type": "keyword"},
        "application_type": {"type": "keyword"}
    }
}

The “application_type” value is a keyword such as “chatbot” or another type supplied by the agent.

conversation-interactions

.plugins-ml-conversation-interactions
{
    "_meta": {
        "schema_version": 1
    },
    "properties": {
        "conversation_id": {"type": "keyword"},
        "create_time": {"type": "date", "format": "strict_date_time||epoch_millis"},
        "input": {"type": "text"},
        "prompt_template": {"type": "text"},
        "response": {"type": "text"},
        "origin": {"type": "keyword"},
        "additional_info": {"type": "flat_object"},
        "parent_interaction_id": {"type": "keyword"},
        "trace_number": {"type": "long"}
    }
}
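For illustration, a document indexed into “conversation-interactions” under this mapping might look like the following. All field values here are hypothetical.

```json
{
    "conversation_id": "abc123",
    "create_time": "2023-11-09T12:00:00.000Z",
    "input": "What is the weather today?",
    "prompt_template": "Answer the question: [QUESTION]",
    "response": "I cannot access live weather data.",
    "origin": "chatbot",
    "additional_info": {"reference": {}},
    "parent_interaction_id": "root001",
    "trace_number": 1
}
```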

New APIs:

Update Interactions: (needs revisiting)

The chatbot needs to use the Update Interaction API to add or update content in interactions, including adding new fields such as “notes” and “post_process_response”.

PUT /_plugins/_ml/memory/<memory_id>/<interaction_id>
{
    "input": "How do I make an interaction?",
    "prompt_template": "Hello OpenAI, can you answer this question? Here's some extra info that may help. [INFO] \n [QUESTION]",
    "response": "Hello, this is OpenAI. Here is the answer to your question.",
    "origin": "MyFirstOpenAIWrapper",
    "additional_info": {
        "suggestion": { ... },
        "reference": { ... },
        "post_process_response": {}
    }
}

The “additional_info” field holds additional text related to the answer as a JSON or other semi-structured response.

Update Conversations:

This is to allow users to update the name of the conversation.

PUT /_plugins/_ml/memory/<memory_id>
{
   "name": "new conversation name",
   "description": "this is a memory for chatbot" 
}

Zhangxunmt added the enhancement (New feature or request) label and removed the untriaged label on Nov 9, 2023
HenryL27 (Collaborator) commented Nov 9, 2023

Thanks @Zhangxunmt! I have a couple questions:

  1. For interaction-level vector search, is the plan to turn the interactions index into a k-NN index? What embedding model will you use? I guess to perform the search itself you'll use the APIs introduced in “Add search and singular APIs to conversation memory” #1504?
  2. The interaction-level "origin" field represents almost the same thing as the new "application_type" field (or is meant to). Maybe we can have the names agree with each other to make that more clear? (e.g. "application_type" -> "origin_type")
  3. The "additional_info" field is meant as a catch-all for other application-specific information. Is it feasible to pack and unpack "trace_number", "references", and "post_process_response" into a single string? I guess if you need to search over those fields, then maybe not. btw, what does the trace number do?
  4. Let's follow the endpoint naming conventions from [FEATURE] Singular and _search api for Conversational Memory #1268 (implemented in the above PR) and use PUT /_plugins/_ml/memory/conversation/{conversation_id}/_update and PUT /_plugins/_ml/memory/conversation/{conversation_id}/{interaction_id}/_update
  5. I also worry a little about allowing arbitrary field additions via update, but it's probably fine.

austintlee (Collaborator) commented:
For the update API, I think certain parts of an interaction should be immutable, e.g. user input and LLM response. I don't know if versioning interactions is the way to go, but we should think about the immutability aspect.

Also, how important is it to support role-based access control for conversations and interactions? Is that going to be a blocker for this work?

ylwu-amzn (Collaborator) commented:
Also, how important is it to support role-based access control for conversations and interactions? Is that going to be a blocker for this work?

I think we don't have strong requirements for role-based access control for now. We can always add it in the future; it is not a one-way door.

navneet1v (Contributor) commented:
@austintlee , @HenryL27 , @ylwu-amzn , @Zhangxunmt

Given that the index names are closely tied to the conversation use case, I was thinking of stripping “conversation” from the index names to make these indices available for other use cases that require memory. With that change, this can become a pure memory layer for any kind of ML use case.

Please let me know your thoughts.

Zhangxunmt (Collaborator, Author) commented:

How about renaming the index to the following names?

plugins-ml-conversation-meta -> plugins-ml-memory-meta
plugins-ml-conversation-interactions -> plugins-ml-memory-message

This change will be a breaking change: anyone who has created conversations will lose all their data after the name change. Do you all agree? @austintlee @HenryL27 @navneet1v

navneet1v (Contributor) commented:

This change will be a breaking change: anyone who has created conversations will lose all their data after the name change. Do you all agree?

I agree, given that the feature was in preview. I think we discussed this in the last ML call.

For users who are already using it, can we provide a way to migrate their old data to the new indices after an upgrade?
