
Modify context before LLM #1034

Open
tpy37 opened this issue Nov 4, 2024 · 6 comments

Comments

@tpy37

tpy37 commented Nov 4, 2024

Thank you for the great package. I am looking for a way to modify the context before it is sent to the LLM in the MultimodalAgent class. I believe this exists in VoicePipelineAgent, and I am wondering how I could implement it with the OpenAI Realtime API.

@tpy37
Author

tpy37 commented Nov 5, 2024

I would be happy if you could implement the RAG part of the MultimodalAgent, as in the documentation! :)
`before_llm_cb=_enrich_with_rag`

[screenshot: VoicePipelineAgent documentation, "modify context before LLM" example]
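The docs pattern is roughly this (a sketch; `my_rag_lookup` stands in for the actual retrieval step):

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent


async def my_rag_lookup(query: str) -> str:
    # Placeholder for a real retrieval backend (vector DB, search API, ...).
    return "...passages relevant to: " + query


async def _enrich_with_rag(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # Take the latest user message, retrieve context related to it, and
    # append it to the chat context before the LLM is called.
    user_msg = chat_ctx.messages[-1]
    rag_content = await my_rag_lookup(user_msg.content)
    chat_ctx.messages.append(
        llm.ChatMessage.create(text="Context:\n" + rag_content, role="assistant")
    )
```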

@davidzhao
Member

It's a bit difficult with MultimodalAgent, because it goes directly from voice input to voice output.

The way to handle RAG is with function calling. If you define a function for the LLM to look up information with the user's query, it should be straightforward to pick up the function call and return the RAG results that way.
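For illustration, a rough sketch of that approach using `llm.FunctionContext` (the retrieval step here is a placeholder, not a real API):

```python
from typing import Annotated

from livekit.agents import llm


async def my_retrieval_backend(query: str) -> str:
    # Placeholder for an actual RAG lookup (vector DB, search API, ...).
    return "...retrieved passages for: " + query


class RagFunctions(llm.FunctionContext):
    @llm.ai_callable(description="Look up information related to the user's question")
    async def lookup_info(
        self,
        query: Annotated[str, llm.TypeInfo(description="the user's question")],
    ) -> str:
        # Whatever this returns is sent back to the model as the function
        # result, and the model speaks an answer grounded in it.
        return await my_retrieval_backend(query)
```

It would then be passed to the agent, e.g. `MultimodalAgent(model=model, fnc_ctx=RagFunctions())`.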

@tpy37
Author

tpy37 commented Nov 5, 2024

Thank you very much David!
I see... I was trying function calling, but changing the tool setting to "required" seemed to completely halt the process and break the conversation, so I stopped using it.

Since we have the transcribed text from the user's audio in `openai.realtime.RealtimeResponse`, I was thinking that we could analyze the transcribed text, use it to do RAG, and then send the results asynchronously to the OpenAI API as text? For example:
```js
const event = {
  type: 'conversation.item.create',
  item: {
    type: 'message',
    role: 'user',
    content: [{ type: 'input_text', text: 'Hello!' }],
  },
};
ws.send(JSON.stringify(event));
ws.send(JSON.stringify({ type: 'response.create' }));
```
https://platform.openai.com/docs/guides/realtime?text-generation-quickstart-example=text
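Roughly the same idea in Python (only the two event types above come from the OpenAI docs; `run_rag` is a placeholder for the retrieval step):

```python
import json


async def run_rag(transcript: str) -> str:
    # Placeholder: look up documents related to the user's transcript.
    return "Relevant context: ..."


async def inject_rag_results(ws, transcript: str) -> None:
    # `ws` is an already-open WebSocket connection to the Realtime API.
    rag_text = await run_rag(transcript)

    # Add the retrieved text to the conversation as a text item...
    await ws.send(json.dumps({
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": rag_text}],
        },
    }))
    # ...then ask the model to respond with that context available.
    await ws.send(json.dumps({"type": "response.create"}))
```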

Just some thoughts...

@prashantmetadome

@tpy37 can you guide me on how to change the context in VoicePipelineAgent?

@tpy37
Author

tpy37 commented Nov 6, 2024

I am sorry, I haven't done it myself in VoicePipelineAgent, but the example is available at https://docs.livekit.io/agents/voice-agent/voice-pipeline/#modify-context-before-llm

I think there was also an example of hooking it up to RAG using this before_llm_cb in one of the GitHub repositories:
`examples/voice-pipeline-agent/simple-rag/assistant.py`
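The callback is wired in via the agent constructor, roughly like this (a sketch assuming the usual plugin setup from the examples, with `_enrich_with_rag` defined as in the docs snippet earlier in this thread):

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

agent = VoicePipelineAgent(
    vad=silero.VAD.load(),
    stt=deepgram.STT(),
    llm=openai.LLM(),
    tts=openai.TTS(),
    chat_ctx=llm.ChatContext(),
    before_llm_cb=_enrich_with_rag,  # runs each turn, right before the LLM call
)
```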

Hope it helps!

@prashantmetadome

prashantmetadome commented Nov 6, 2024

@tpy37 thank you very much, it does help. But I am still unsure... my use case is that I want to manipulate the prompt based on a tool call.

To go deeper into the specific requirement: the conversation has outgrown the current prompt and gone into different territory that needs to be handled by a different prompt.

I do not need to manipulate the prompt inside the tool call itself; if I could just extract some metadata from the tool call and access it in the callback function, that should solve the problem, but I am not sure how to do that.
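Roughly what I have in mind, as a sketch (all names are hypothetical; the idea is that the tool and `before_llm_cb` share state through the function-context object):

```python
from typing import Annotated, Optional

from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent

# Hypothetical prompts for the different conversation territories.
PROMPTS = {
    "billing": "You are a billing support agent...",
    "technical": "You are a technical support agent...",
}


class TopicFunctions(llm.FunctionContext):
    def __init__(self) -> None:
        super().__init__()
        self.detected_topic: Optional[str] = None  # metadata written by the tool

    @llm.ai_callable(description="Call this when the conversation moves to a new topic")
    async def switch_topic(
        self,
        topic: Annotated[str, llm.TypeInfo(description="the new topic")],
    ) -> str:
        self.detected_topic = topic  # stash metadata for the callback to read
        return f"OK, switching to {topic}"


fnc_ctx = TopicFunctions()


async def _swap_prompt(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # If the tool recorded a topic change, replace the system prompt
    # (assuming the system message is the first entry in the context).
    if fnc_ctx.detected_topic in PROMPTS:
        chat_ctx.messages[0] = llm.ChatMessage.create(
            text=PROMPTS[fnc_ctx.detected_topic], role="system"
        )
```

Both would then be passed to the agent, e.g. `VoicePipelineAgent(..., fnc_ctx=fnc_ctx, before_llm_cb=_swap_prompt)`. Is something like this workable?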
@davidzhao any help would be appreciated.
