Async client doesn't support streaming input #344

Open
BackSlasher opened this issue Aug 14, 2024 · 9 comments · Fixed by #346 · May be fixed by #358

Comments

@BackSlasher
Contributor

- text: str. The string that will get converted into speech. The Async client does not support streaming.

Anyone who wants to provide a stream as input has to use the sync client, and so must choose between complicating their codebase and losing async's benefits.
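
To make the trade-off concrete, here is a minimal sketch of the kind of bridging code an async application might need in order to feed an async text stream into a blocking consumer. It uses only the standard library; `llm_tokens` and `fake_sync_tts` are hypothetical stand-ins (assumptions for illustration, not part of the elevenlabs package), and the real sync client call is not shown.

```python
import asyncio
import queue
from typing import AsyncIterator, Iterator, List

_DONE = object()  # sentinel marking the end of the text stream


async def llm_tokens() -> AsyncIterator[str]:
    """Hypothetical async text source, e.g. tokens arriving from an LLM."""
    for token in ["Hello ", "from ", "an ", "async ", "stream."]:
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield token


def blocking_iter(q: queue.Queue) -> Iterator[str]:
    """Expose a thread-safe queue as the plain iterator a sync API expects."""
    while True:
        item = q.get()  # blocks the worker thread, not the event loop
        if item is _DONE:
            return
        yield item


def fake_sync_tts(text: Iterator[str]) -> List[str]:
    """Hypothetical stand-in for a blocking call that consumes a text iterator."""
    return [chunk.upper() for chunk in text]


async def main() -> List[str]:
    q: queue.Queue = queue.Queue()
    # Run the blocking consumer in a worker thread so the event loop stays free.
    worker = asyncio.create_task(
        asyncio.to_thread(fake_sync_tts, blocking_iter(q))
    )
    try:
        async for token in llm_tokens():
            q.put(token)  # unbounded queue, so this never blocks the loop
    finally:
        q.put(_DONE)  # always unblock the worker, even if the source fails
    return await worker


result = asyncio.run(main())
print(result)
```

The extra thread, queue, and sentinel are the "complication" in question; an async client that accepts a text stream directly would make all of it unnecessary.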

@BackSlasher
Contributor Author

@dsinghvi I see you were involved in #67. Do you know whether the team is working on this?
Will they be accepting contributions?

nitz-uglabs added a commit to BackSlasher/elevenlabs-python that referenced this issue Aug 15, 2024
Why:
Allowing the async client to utilize incoming text streams when generating voice.
Very useful when feeding the realtime output of an LLM into the TTS.

Closes elevenlabs#344

What:
1. Copied `RealtimeTextToSpeechClient` and `text_chunker` into `AsyncRealtimeTextToSpeechClient` and `async_text_chunker`
   Most of the logic is intact, aside from async stuff
2. Added `AsyncRealtimeTextToSpeechClient` into `AsyncElevenLabs` just like `RealtimeTextToSpeechClient` is in `ElevenLabs`
3. Added rudimentary testing

The code is basically a copy-paste of what I found in the repo. We can rewrite it to be more elegant, but I figured parity with the sync code is more important.
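
As a rough illustration of what an async counterpart to a text chunker can look like, here is a simplified sketch. It is not the code from the repository or the PR: the name `async_text_chunker_sketch`, the splitter set, and the buffering rule are all assumptions chosen for brevity. The only point is that the sync `for` loop becomes `async for` and the function becomes an async generator.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, List

# Assumed boundary characters; the real chunker may use a different set.
SPLITTERS = (".", ",", "?", "!", ";", ":", " ")


async def async_text_chunker_sketch(
    chunks: AsyncIterable[str],
) -> AsyncIterator[str]:
    """Buffer incoming text and emit it at natural boundaries."""
    buffer = ""
    async for text in chunks:
        buffer += text
        if buffer.endswith(SPLITTERS):
            # Make sure every emitted chunk ends with a space.
            yield buffer if buffer.endswith(" ") else buffer + " "
            buffer = ""
    if buffer:
        yield buffer + " "  # flush whatever is left when the input ends


async def _demo() -> List[str]:
    async def tokens() -> AsyncIterator[str]:
        # Hypothetical LLM output arriving in arbitrary fragments.
        for fragment in ["Hel", "lo,", " wor", "ld", "!", " Bye"]:
            yield fragment

    return [chunk async for chunk in async_text_chunker_sketch(tokens())]


chunks = asyncio.run(_demo())
print(chunks)  # ['Hello, ', ' world! ', ' Bye ']
```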
BackSlasher added a commit to BackSlasher/elevenlabs-python that referenced this issue Aug 15, 2024
wiger3 pushed a commit to wiger3/elevenlabs-python that referenced this issue Aug 21, 2024
@dsinghvi
Collaborator

@BackSlasher I'll take a look at your PR this weekend!

@BackSlasher
Contributor Author

@dsinghvi hey, any news? :)

dsinghvi reopened this Sep 11, 2024
@dsinghvi
Collaborator

@BackSlasher looks like there was a compile issue and the test failed to run in CI, so I had to revert. Do you mind recreating the PR?

@BackSlasher
Contributor Author

Resubmit the same one? You betcha

BackSlasher added a commit to BackSlasher/elevenlabs-python that referenced this issue Sep 11, 2024
@BackSlasher
Contributor Author

@dsinghvi any news? :)

@BackSlasher
Contributor Author

@dsinghvi sorry for bothering you, could we get this merged?

@zhudotexe

👋 I tested this locally on my own fork and it works great, though I would make one change: currently you're checking for `AsyncIterator` (https://github.com/elevenlabs/elevenlabs-python/pull/358/files#diff-75cbcf7bd37482fe5efdb55e35469453cc1380dfd25609b7f9e52bb5a9808bc1R416), which works fine for async generators but can exclude some valid async iterables (e.g. a custom class that defines `__aiter__` but not `__anext__`). Changing this to `AsyncIterable` instead allows these kinds of iterables. Thanks!
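
The difference can be checked with the standard library alone. In the sketch below, `TextSource` is a hypothetical class invented for illustration: it defines `__aiter__` but not `__anext__`, so it passes an `AsyncIterable` check and fails an `AsyncIterator` check, even though `async for` consumes it without trouble.

```python
import asyncio
from collections.abc import AsyncIterable, AsyncIterator


class TextSource:
    """Hypothetical async iterable: has __aiter__ but no __anext__."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def __aiter__(self):
        # Hand back a fresh async generator on every iteration.
        return self._generate()

    async def _generate(self):
        for chunk in self._chunks:
            yield chunk


async def token_gen():
    yield "hi"


async def consume(text):
    return [chunk async for chunk in text]


source = TextSource(["Hello, ", "world."])

source_is_iterator = isinstance(source, AsyncIterator)  # False
source_is_iterable = isinstance(source, AsyncIterable)  # True

gen = token_gen()
gen_is_iterator = isinstance(gen, AsyncIterator)  # True
gen_is_iterable = isinstance(gen, AsyncIterable)  # True

# Both objects work with `async for`, whichever check they pass.
collected_source = asyncio.run(consume(source))
collected_gen = asyncio.run(consume(gen))
```

Checking against `AsyncIterable` therefore accepts everything the `AsyncIterator` check accepts, plus classes like `TextSource`.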

@gwpl

gwpl commented Oct 29, 2024

I had an issue where the streaming code did not generate audio until the whole input had been read (#395).
I wonder whether it's related to this issue (#344) or to something else, and therefore whether your PR will help 🤞🏻.

BackSlasher added a commit to BackSlasher/elevenlabs-python that referenced this issue Nov 12, 2024