Problem for App Store edition #818
Comments
If you need millisecond precision, AI models like Whisper will not be able to give you the precision you need. This is a limitation of how they are built, and there is nothing we can do about it. Most likely no AI tool will be able to get millisecond precision right. When I needed millisecond precision, I used https://github.com/echogarden-project/echogarden. It uses a different algorithm to align the text to the audio, and its precision is much better. I prepared a text file with one sentence per line and used the "forced alignment" feature of the Regarding errors in the transcript, try
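The "one sentence per line" preparation step mentioned above can be sketched in Python. This is only an illustration under assumptions: the function name `one_sentence_per_line` is hypothetical, the split is a naive regex on `.`, `!` and `?` followed by whitespace, and it is not necessarily how the commenter prepared their file. Abbreviations such as "Mr. Smith" will be split incorrectly, so check the result by hand before aligning.

```python
import re


def one_sentence_per_line(text: str) -> str:
    """Return the text with one sentence per line (naive punctuation split)."""
    # Collapse every run of whitespace, including newlines, into one space.
    normalized = " ".join(text.split())
    # Split wherever '.', '!' or '?' is followed by whitespace.
    # The lookbehind keeps the punctuation attached to its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", normalized)
    return "\n".join(s for s in sentences if s)


if __name__ == "__main__":
    print(one_sentence_per_line("Hello world. How are\nyou? Fine!"))
```

The resulting text can be saved to a file and passed to the alignment tool together with the audio; see the Echogarden documentation for the exact command.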
Oh, thank you for such a detailed answer. One more question: I use the App Store version, so how can I use the large-v3 model in the Buzz Captain app? Can I just download the large-v3 model file and use it in the Buzz Captain app?
On the existing App Store version you may be able to use the Hugging Face Whisper type with Please note that Alternatively, you can try the latest development version from a GitHub Action run. Log in to GitHub and look at the bottom of the Action page, for example here: https://github.com/chidiwilliams/buzz/actions/runs/9656460007
Please try the latest open source version as a temporary solution while the App Store version issue gets sorted out.
I am using version 1.0.2 (137) on macOS 14.6 (23G5052d).
I uploaded an MP3 file, about 44'30" long and 45 MB in size. I used the large model with CoreML.
When the transcription finished, I found some errors in the generated SRT file:
1. The subtitle timeline is wrong: the timestamps are only precise to the second, not to the millisecond. I cannot use the file directly and had to adjust the timeline manually.
2. Three passages were not transcribed correctly; in the SRT file they contain only a large amount of repetitive text. No matter how many times I rerun the transcription, the result is the same. In total, about 7 minutes of audio in this file were not transcribed correctly.
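Both symptoms can be checked on the generated SRT file with a short script. The following is a minimal sketch, not part of Buzz: the helper names `parse_srt`, `second_aligned_ratio` and `repeated_runs` are hypothetical, and the only thing assumed about the file is the standard SRT timing line `HH:MM:SS,mmm --> HH:MM:SS,mmm`. It reports what fraction of cue boundaries fall exactly on a whole second and lists runs of consecutive cues with identical text.

```python
import re
from typing import List, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text)

TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt(srt: str) -> List[Cue]:
    """Parse an SRT string into (start_ms, end_ms, text) cues."""
    cues: List[Cue] = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
                end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((start, end, text))
                break
    return cues


def second_aligned_ratio(cues: List[Cue]) -> float:
    """Fraction of cue boundaries whose millisecond part is exactly 000."""
    if not cues:
        return 0.0
    bounds = [t for start, end, _ in cues for t in (start, end)]
    return sum(t % 1000 == 0 for t in bounds) / len(bounds)


def repeated_runs(cues: List[Cue], min_run: int = 3) -> List[Tuple[int, int, str, int]]:
    """Return (start_ms, end_ms, text, count) for runs of identical consecutive cues."""
    runs = []
    i = 0
    while i < len(cues):
        j = i
        while j + 1 < len(cues) and cues[j + 1][2] == cues[i][2]:
            j += 1
        count = j - i + 1
        if count >= min_run:
            runs.append((cues[i][0], cues[j][1], cues[i][2], count))
        i = j + 1
    return runs
```

A ratio close to 1.0 means nearly all timestamps are rounded to whole seconds, and the start and end times of each reported run show which parts of the audio need to be transcribed again.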