This project demonstrates how to deliver ultra-low latency access to OpenAI with exceptional audio quality using Agora's SD-RTN and OpenAI's Realtime API. By integrating Agora's SDK with OpenAI's Realtime API, it ensures seamless performance and minimal delay across the globe.
Before running the demo, ensure you have the following installed and configured:
-
Python 3.11 or above
-
Agora account:
- Login to Agora
- Create a New Project, using
Secured mode: APP ID + Token
to obtain an App ID and App Certificate.
-
OpenAI account:
- Login to OpenAI
- Go to Dashboard and obtain your API key.
-
Additional Packages:
- On macOS:
brew install ffmpeg portaudio
- On Ubuntu (verified on versions 22.04 & 24.04):
sudo apt install portaudio19-dev python3-dev build-essential sudo apt install ffmpeg
- On macOS:
realtimeAgent/realtime
contains the Python implementation for interacting with the Realtime API.realtimeAgent/agent.py
includes a demo agent that leverages therealtime
module and the agora-realtime-ai-api package to build a simple application.realtimeAgent/main.py
provides a web server that allows clients to start and stop AI-driven agents.
- Create a
.env
file for the backend. Copy.env.example
to.env
in the root of the repo and fill in the required values:cp .env.example .env
- Create a virtual environment:
python3 -m venv venv && source venv/bin/activate
- Install the required dependencies:
pip install -r requirements.txt
- Run the demo agent:
python -m realtime_agent.main agent --channel_name=<channel_name> --uid=<agent_uid>
- Run the http server to start demo agent via restful service
The server provides a simple layer for managing agent processes.
python -m realtime_agent.main server
This api starts an agent with given graph and override properties. The started agent will join into the specified channel, and subscribe to the uid which your browser/device's rtc use to join.
Param | Description |
---|---|
channel_name | (string) channel name, it needs to be the same with the one your browser/device joins, agent needs to stay with your browser/device in the same channel to communicate |
uid | (int)the uid which ai agent use to join |
system_instruction | The system instruction for the agent |
voice | The voice of the agent |
Example:
curl 'http://localhost:8080/start_agent' \
-H 'Content-Type: application/json' \
--data-raw '{
"channel_name": "test",
"uid": 123
}'
This api stops the agent you started
Param | Description |
---|---|
channel_name | (string) channel name, the one you used to start the agent |
Example:
curl 'http://localhost:8080/stop_agent' \
-H 'Content-Type: application/json' \
--data-raw '{
"channel_name": "test"
}'
To test agents, use Agora's Voice Call Demo.