SeamlessM4T support in ASR #190

Merged: 8 commits, Nov 17, 2023

@nikochiko nikochiko commented Oct 19, 2023

Branched from #168

Removed TTS support because, in TTS:

  • non-English TTS output was sometimes cut short
  • English TTS output broke off abruptly at 20 seconds

Q/A checklist

  • Do a code review of the changes
  • Add any new dependencies to poetry & export to requirements.txt (poetry export -o requirements.txt)
  • Carefully think about the stuff that might break because of this change
  • The relevant pages still run when you press submit
  • If you added new settings / knobs, the values get saved if you save it on the UI
  • The API for those pages still works (API tab)
  • The public API interface doesn't change if you didn't want it to (check API tab > docs page)
  • Do your UI changes (if applicable) look acceptable on mobile?

@SanderGi
Member

Regarding TTS support, I believe the reason it cuts short and breaks at longer files is that our chunking does not use a stride (see Asana task regarding "Better chunking (with overlap), see hugging face - https://huggingface.co/blog/asr-chunking"). This also applies to ASR and the other SeamlessM4T functionality but is less obvious to observe in this case. I think it would be better to fix the gooey-gpu side chunking rather than remove TTS. Here's one of the ASR outputs I got due to the chunking being off (running on Hugging Face with correct chunking works): "the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone the bone" lol
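For reference, here is a minimal sketch of what chunking with a stride (overlap) could look like. It is not the gooey-gpu implementation; `Chunk`, `chunk_with_stride`, `chunk_len` and `stride` are hypothetical names chosen for illustration. Each chunk is fed to the model with extra context on both sides, and only the output for the inner "keep" region is retained, so the kept regions tile the audio without gaps or duplicates.

```python
from typing import List, NamedTuple


class Chunk(NamedTuple):
    start: int       # first sample fed to the model
    end: int         # one past the last sample fed to the model
    keep_start: int  # first sample whose output is kept
    keep_end: int    # one past the last sample whose output is kept


def chunk_with_stride(n_samples: int, chunk_len: int, stride: int) -> List[Chunk]:
    """Split n_samples into overlapping chunks (hypothetical helper).

    Consecutive chunks overlap by 2 * stride samples. The stride on each
    side is context only, except at the very start and end of the audio.
    """
    if stride < 0 or chunk_len <= 2 * stride:
        raise ValueError("need 0 <= stride and chunk_len > 2 * stride")
    if n_samples <= 0:
        return []

    step = chunk_len - 2 * stride  # how far each new chunk advances
    chunks: List[Chunk] = []
    start = 0
    while True:
        end = min(start + chunk_len, n_samples)
        is_first = start == 0
        is_last = end == n_samples
        # Drop the left stride unless this is the first chunk,
        # and the right stride unless this is the last chunk.
        keep_start = start if is_first else start + stride
        keep_end = end if is_last else end - stride
        chunks.append(Chunk(start, end, keep_start, keep_end))
        if is_last:
            break
        start += step
    return chunks
```

For 16 kHz audio, something like `chunk_with_stride(len(audio), 30 * 16000, 5 * 16000)` would give 30-second windows with 5 seconds of context on each side (these values are assumptions, not tuned settings). The linked blog post describes this approach for CTC models, where the per-frame outputs in the stride regions can simply be dropped; how overlapping outputs should be merged for SeamlessM4T is a separate question this sketch does not answer.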

@devxpy
Member

devxpy commented Oct 20, 2023

running on Hugging Face with correct chunking works

How did you run it on huggingface?

@devxpy
Member

devxpy commented Oct 20, 2023

Oh, transformers supports seamlessm4t now - https://huggingface.co/facebook/hf-seamless-m4t-large

@devxpy
Member

devxpy commented Oct 20, 2023

@nikochiko
Member Author

@devxpy I looked into it; the Seamless code isn't merged into Hugging Face yet -- https://github.com/huggingface/transformers/pull/25693/files.

Would it be okay to pull the package from this HF branch until it is released?

@devxpy devxpy merged commit 97afb22 into master Nov 17, 2023
1 check passed
@devxpy devxpy deleted the seamlessm4t-asr branch November 17, 2023 11:47
3 participants