Modify context before LLM #1034
Thank you for the great package. I am looking for a method to change the context before sending it to the LLM in the MultimodalAgent class. I think it exists in VoicePipelineAgent, and I am wondering how I could implement it with the OpenAI Realtime API.

Comments
It's a bit difficult with the multimodal agent, because it goes from voice input directly to voice output. The way to handle RAG is with function calling: if you define a function for the LLM to look up information with the user's query, it should be straightforward to pick up the function call and return the RAG results that way.
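A minimal sketch of that approach, assuming the livekit-agents v0.x function-calling API (`llm.FunctionContext` with `@llm.ai_callable`); the `lookup_info` tool and the `retrieve_docs` helper are hypothetical names standing in for your own retrieval backend:

```python
from typing import Annotated

from livekit.agents import llm


async def retrieve_docs(query: str) -> str:
    # Hypothetical stand-in for your RAG backend (vector DB, search API, ...).
    return f"...passages retrieved for: {query}"


class RagFnc(llm.FunctionContext):
    @llm.ai_callable(description="Look up information relevant to the user's question")
    async def lookup_info(
        self,
        query: Annotated[str, llm.TypeInfo(description="The user's question")],
    ) -> str:
        # The realtime model decides to call this tool; whatever we return is
        # fed back to it as the function result, and it answers using that context.
        return await retrieve_docs(query)
```

The function context would then be passed to the agent, e.g. `MultimodalAgent(model=model, fnc_ctx=RagFnc())` in the v0.x examples.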
Thank you very much, David! Since we have the transcribed text from the user's audio, I was thinking that we could analyze it, use that to do RAG, and then send the results asynchronously to the OpenAI API as text. Just some thoughts...
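A rough sketch of that idea, assuming the v0.x `MultimodalAgent` emits a `user_speech_committed` event carrying the transcribed `llm.ChatMessage`, and that the realtime session exposes `conversation.item.create()` as in the livekit-plugins-openai examples; `run_rag` is a hypothetical retrieval helper:

```python
import asyncio

from livekit.agents import llm


async def run_rag(query: str) -> str:
    # Hypothetical retrieval helper; replace with your own backend.
    return f"...context retrieved for: {query}"


def setup_rag_injection(agent, session):
    # agent: a MultimodalAgent; session: the realtime session (model.sessions[0])
    async def inject_context(msg: llm.ChatMessage) -> None:
        context = await run_rag(str(msg.content))
        # Push the retrieved text into the realtime conversation as a plain
        # text item, so the model can use it when generating its reply.
        session.conversation.item.create(
            llm.ChatMessage(role="assistant", content=f"Relevant context:\n{context}")
        )

    @agent.on("user_speech_committed")
    def on_user_speech(msg: llm.ChatMessage):
        asyncio.create_task(inject_context(msg))
```

The timing caveat from the previous comment still applies: the realtime model may start answering from the audio before the injected text lands, so the retrieved context may only take effect on the following response.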
@tpy37 can you guide me on how to change the context in VoicePipelineAgent?
I am sorry, I haven't done it myself in VoicePipelineAgent, but an example is available at https://docs.livekit.io/agents/voice-agent/voice-pipeline/#modify-context-before-llm I think there was also an example that feeds RAG results through this before_llm_cb in one of the GitHub repositories. Hope it helps!
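For reference, a minimal sketch of what the linked docs describe, assuming the v0.x `VoicePipelineAgent` signature where `before_llm_cb` receives the agent and the mutable `llm.ChatContext`; `retrieve_context` is a hypothetical RAG helper:

```python
from livekit.agents import llm
from livekit.agents.pipeline import VoicePipelineAgent


async def retrieve_context(query: str) -> str:
    # Hypothetical stand-in for your retrieval backend.
    return f"...passages retrieved for: {query}"


async def _enrich_with_rag(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    # The last message in the context is the user's transcribed turn.
    user_msg = chat_ctx.messages[-1]
    rag_text = await retrieve_context(str(user_msg.content))
    # Inject the retrieved passages just before the user's message; mutating
    # chat_ctx here changes what the LLM sees on this turn.
    rag_msg = llm.ChatMessage.create(text=f"Context:\n{rag_text}", role="assistant")
    chat_ctx.messages.insert(-1, rag_msg)


# Wiring it up (vad/stt/llm/tts plugin instances elided):
# agent = VoicePipelineAgent(
#     vad=vad, stt=stt, llm=llm_model, tts=tts,
#     chat_ctx=llm.ChatContext(),
#     before_llm_cb=_enrich_with_rag,
# )
```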
@tpy37 thank you very much, it does help, but I am still unsure. My use case is that I want to change the prompt based on a tool call. To go deeper into the specific requirement: the conversation has outgrown the current prompt and moved into territory that needs to be handled by a different prompt. I do not need to change the prompt inside the tool call itself; if I could just extract some metadata from the tool call and access it in the callback function, that would solve the problem, but I am not sure how to do that.
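One possible pattern (not an official API, just a sketch): keep a shared mutable object that the tool call writes and `before_llm_cb` reads on the next turn. This assumes the v0.x API where `ChatMessage.role` is a plain string; the `switch_topic` tool and the prompt table are hypothetical:

```python
from dataclasses import dataclass
from typing import Annotated

from livekit.agents import llm

PROMPTS = {
    "general": "You are a general-purpose assistant.",
    "billing": "You are a billing specialist. ...",
}


@dataclass
class SharedState:
    active_prompt: str = PROMPTS["general"]


state = SharedState()


class TopicFnc(llm.FunctionContext):
    @llm.ai_callable(description="Switch the conversation to a different topic")
    async def switch_topic(
        self,
        topic: Annotated[str, llm.TypeInfo(description="One of: general, billing")],
    ) -> str:
        # Record the metadata here; before_llm_cb picks it up on the next turn.
        state.active_prompt = PROMPTS.get(topic, state.active_prompt)
        return f"Switched to the {topic} prompt."


async def swap_prompt(agent, chat_ctx: llm.ChatContext):
    # Replace (or insert) the system message based on what the tool recorded.
    if chat_ctx.messages and chat_ctx.messages[0].role == "system":
        chat_ctx.messages[0].content = state.active_prompt
    else:
        chat_ctx.messages.insert(
            0, llm.ChatMessage.create(text=state.active_prompt, role="system")
        )
```

You would pass `fnc_ctx=TopicFnc()` and `before_llm_cb=swap_prompt` to the same `VoicePipelineAgent`; a module-level object is fine for a single session, while multiple concurrent sessions would need per-session state (e.g. hung off the agent or the job context).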