LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A simple, high-quality voice conversion tool focused on ease of use and performance
High-quality and streaming speech-to-speech interactive agent in a single file: a full-duplex voice interaction prototype implemented in just one file!
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering, but not limited to, end-to-end speech interaction, end-to-end speech translation, and speech recognition.
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
If you've ever wanted to talk to your AI waifu with high-quality characters and voices for character voicing, try Soul of Waifu. Don't miss the chance to bring your dream to life!
💬 "Realtime" voice transcription and cloning using ElevenLabs's API.
Svelte component for using the OpenAI Realtime API.
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
Speech-to-text-to-speech using ElevenLabs.
Swift SDK for Prototyping AI Speech Generation
Chatter Box is an Android app capable of voice, text, and image text translation, as well as end-to-end chat translation.
A user-friendly interface for ElevenLabs' API with added audio transcription capability.
End-to-end AI voice assistant pipeline with Whisper for speech-to-text, a Hugging Face LLM for response generation, and Edge-TTS for text-to-speech. Features include Voice Activity Detection (VAD), tunable parameters for pitch, gender, and speed, and real-time response with latency optimization (see the pipeline sketch after this list).
Systems submitted to IWSLT 2022 by the MT-UPC group.
Speech-to-Speech translation dataset for German and English (text and speech quadruplets).
CtrlSpeak is a voice assistant activated with [Control]+Q, listening and responding only when you want.
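For readers new to the STT → LLM → TTS pattern used by several of the projects above, here is a minimal, hypothetical sketch. It assumes the openai-whisper, transformers, and edge-tts packages are installed; the model names, voice, and file paths are placeholders and are not taken from any of the repositories listed here.

```python
# Minimal, hypothetical sketch of the STT -> LLM -> TTS pattern described above.
# Assumes: pip install openai-whisper transformers edge-tts
# Model names, voice, and file paths below are placeholders.
import asyncio

import whisper                     # speech-to-text
from transformers import pipeline  # response generation
import edge_tts                    # text-to-speech


def transcribe(audio_path: str) -> str:
    # Whisper returns a dict whose "text" field holds the transcript.
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"]


def generate_reply(prompt: str) -> str:
    # Any instruction-tuned Hugging Face model can be dropped in here;
    # "gpt2" is only a lightweight placeholder. Note that for causal LMs
    # the returned text includes the prompt.
    generator = pipeline("text-generation", model="gpt2")
    return generator(prompt, max_new_tokens=64)[0]["generated_text"]


async def synthesize(text: str, out_path: str) -> None:
    # edge-tts writes the synthesized speech to an MP3 file.
    communicate = edge_tts.Communicate(text, "en-US-AriaNeural")
    await communicate.save(out_path)


if __name__ == "__main__":
    question = transcribe("question.wav")        # speech in
    reply = generate_reply(question)             # text reply
    asyncio.run(synthesize(reply, "reply.mp3"))  # speech out
```

A full assistant like the one described above would additionally use VAD to segment the live microphone stream and stream the synthesized audio back as it is generated to reduce latency.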