SeamlessM4T support in ASR #190
Regarding TTS support, I believe the reason it cuts short and breaks at longer files is that our chunking does not use a stride (see Asana task regarding "Better chunking (with overlap), see hugging face - https://huggingface.co/blog/asr-chunking"). This also applies to ASR and the other SeamlessM4T functionality but is less obvious to observe in this case. I think it would be better to fix the gooey-gpu side chunking rather than remove TTS. Here's one of the ASR outputs I got due to the chunking being off (running on Hugging Face with correct chunking works): "the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone" lol |
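For reference, here is a rough sketch of what chunking with a stride could look like, in the spirit of the Hugging Face blog post linked above. It is not the gooey-gpu or transformers implementation: the function name `chunk_with_stride` and its parameters are hypothetical, and it only computes sample index ranges. Each chunk overlaps its neighbours by `stride` samples on each side, and only the inner "keep" region of each chunk would be used when stitching outputs back together.

```python
def chunk_with_stride(n_samples, chunk_len, stride):
    """Return a list of (start, end, keep_start, keep_end) index tuples.

    Hypothetical helper, for illustration only.
    [start, end) is the window fed to the model; [keep_start, keep_end) is
    the part of that window whose output is kept. The overlapping margins
    only give the model context and are discarded.
    """
    if stride < 0 or chunk_len <= 2 * stride:
        raise ValueError("chunk_len must be greater than 2 * stride")

    step = chunk_len - 2 * stride  # how far each new window advances
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_len, n_samples)
        is_first = start == 0
        is_last = end == n_samples
        # The first chunk has no left margin, the last has no right margin.
        left = 0 if is_first else stride
        right = 0 if is_last else stride
        chunks.append((start, end, start + left, end - right))
        if is_last:
            break
        start += step
    return chunks


if __name__ == "__main__":
    # 20 samples, windows of 6 with a stride of 1 on each side.
    for chunk in chunk_with_stride(20, chunk_len=6, stride=1):
        print(chunk)
```

The kept regions tile the input without gaps or overlap, so no audio is dropped or transcribed twice at chunk boundaries. In practice the lengths would be given in seconds and multiplied by the sampling rate.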
How did you run it on Hugging Face?
Oh, transformers supports SeamlessM4T now: https://huggingface.co/facebook/hf-seamless-m4t-large
Could we repurpose the Whisper code? https://github.com/dara-network/gooey-gpu/blob/3cf4d4393ca719410ae19820e36bd829fab5243c/common/whisper.py#L16
@devxpy I looked into it, and the SeamlessM4T code isn't merged into Hugging Face transformers yet: https://github.com/huggingface/transformers/pull/25693/files. Would it be okay to pull the package from this HF branch until it is released?
Branched from #168
Removed TTS support because in TTS:
Q/A checklist
poetry export -o requirements.txt