The main idea is to send a request to ChatGPT to get a text response and then use this output as input to Charactr TTS. Follow these small steps:
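To use the gemelo.ai API for TTS and the OpenAI API for Whisper and ChatGPT, install the following packages. The code below calls openai.ChatCompletion, which exists only in openai versions before 1.0, so pin that package accordingly:
pip install charactr-api-sdk "openai<1.0"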
pip install requests ipython
import json
from typing import Dict, List
import IPython.display
import openai
import requests
from charactr_api import CharactrAPISDK, Credentials
openai_api_key = 'xxxx'
charactr_client_key = 'yyyy'
charactr_api_key = 'zzzz'
openai.api_key = openai_api_key
credentials = Credentials(client_key=charactr_client_key, api_key=charactr_api_key)
charactr_api = CharactrAPISDK(credentials)
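Hardcoded keys are easy to leak when a notebook is shared. As a minimal sketch, assuming you have exported the keys as environment variables, the helper below reads them and fails early if one is missing. The helper name load_key and the variable names in the commented lines are hypothetical, not part of either SDK.

```python
import os
from typing import Mapping, Optional


def load_key(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of environment variable `name`, or fail with a clear message."""
    env = os.environ if env is None else env
    value = env.get(name, '').strip()
    if not value:
        raise RuntimeError(f'Environment variable {name} is not set')
    return value


# Hypothetical variable names; use whatever names you exported in your shell.
# openai_api_key = load_key('OPENAI_API_KEY')
# charactr_client_key = load_key('CHARACTR_CLIENT_KEY')
# charactr_api_key = load_key('CHARACTR_API_KEY')
```

The optional env argument exists only so the helper can be exercised with a plain dictionary instead of the real environment.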
charactr_api.tts.get_voices()
This returns a list of available voices. Choose one and set its voice_id:
voice_id = 136
model = 'gpt-3.5-turbo'
parameters = {
    'temperature': 0.8,
    'max_tokens': 150,
    'top_p': 1,
    'presence_penalty': 0,
    'frequency_penalty': 0,
    'stop': None
}
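def generate(request: str) -> str:
    """Generate a text response with ChatGPT."""
    messages = [{'role': 'user', 'content': request}]
    # openai.ChatCompletion is the pre-1.0 interface of the openai package.
    result = openai.ChatCompletion.create(model=model,
                                          messages=messages,
                                          **parameters)
    try:
        # Take the first choice and strip surrounding whitespace.
        response = result['choices'][0]['message']['content'].strip()
    except (KeyError, IndexError, AttributeError) as e:
        raise RuntimeError('Unexpected ChatGPT response format') from e
    return response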
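Calls to a hosted model can fail transiently, for example on rate limits or network hiccups. The sketch below is one possible way to wrap generate in a retry loop with exponential backoff. The helper with_retries and its parameters are illustrative assumptions and are not part of the OpenAI or Charactr SDKs.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar('T')


def with_retries(call: Callable[[], T],
                 attempts: int = 3,
                 base_delay: float = 1.0,
                 retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying with exponential backoff on the given exceptions."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
    raise ValueError('attempts must be at least 1')
```

With the function above, a call would look like response = with_retries(lambda: generate(text)). Retrying on every Exception is deliberately broad for a sketch; with the pre-1.0 openai package you would probably narrow retry_on to errors such as openai.error.RateLimitError.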
text = 'Tell me a joke'
response = generate(text)
tts_result = charactr_api.tts.convert(voice_id, response)
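The two calls above can be folded into one function so the text-to-speech step always receives the model's reply. This is a rough sketch: speak is a hypothetical helper, and both steps are passed in as callables so the wiring can be tried with stand-ins before spending API credits.

```python
from typing import Any, Callable


def speak(request: str,
          generate_fn: Callable[[str], str],
          tts_fn: Callable[[str], Any]) -> Any:
    """Generate a text reply for `request` and convert it to speech."""
    reply = generate_fn(request)
    if not reply.strip():
        # Avoid sending an empty string to the TTS service.
        raise ValueError('Empty text response; nothing to synthesize')
    return tts_fn(reply)
```

With the objects defined earlier, this would be called as tts_result = speak('Tell me a joke', generate, lambda t: charactr_api.tts.convert(voice_id, t)).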
To listen to the voice response in a notebook:
IPython.display.Audio(tts_result['data'])
To save the audio to a file:
with open('output.wav', 'wb') as f:
    f.write(tts_result['data'])
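Before saving, it can be useful to check that the returned bytes really are playable audio. The sketch below assumes tts_result['data'] holds PCM WAV bytes, as the output.wav filename suggests; that is an assumption, and other encodings will make the standard-library wave module raise an error. The helper name wav_duration_seconds is hypothetical.

```python
import io
import wave


def wav_duration_seconds(data: bytes) -> float:
    """Return the duration of in-memory WAV audio.

    Raises wave.Error if the bytes are not PCM WAV, or EOFError if they
    are truncated.
    """
    with wave.open(io.BytesIO(data), 'rb') as wav:
        return wav.getnframes() / wav.getframerate()
```

For example, wav_duration_seconds(tts_result['data']) gives the clip length in seconds, which is a quick sanity check that the conversion produced real audio.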