Materials for the "Speech Recognition with Python" lecture at the PyConPL 2023 conference.
The materials include usage examples for the following tools:
- SpeechRecognition (Python module supporting several speech-to-text engines and APIs)
- AssemblyAI (API)
- OpenAI's Whisper (speech-to-text model)
- Transformers (pretrained speech-to-text models)
- Go to speech_recognition_with_python.ipynb and click the Open in Colab button at the top of the notebook.
- Copy the .wav files from the audio_files directory or the Google Drive folder into your personal Google Drive. The suggested path for the directory with the audio files is: Colab Notebooks/speech_recognition_with_python. Note: If you store these files in a different location, be sure to change the PATH constant in the notebook.
- Mount your Google Drive to your Google Colab notebook.
- And... that's all! 🥳 Have a great learning experience!
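
The setup steps above can be sketched in a notebook cell roughly as follows. This is a minimal sketch, not the notebook's actual code: it assumes Drive is mounted at /content/drive (the usual Colab mount point) and that the audio files sit in the suggested directory. The helper names mount_drive and list_wav_files are hypothetical, and the real value of PATH in the notebook may differ.

```python
from pathlib import Path

# Assumption: Drive is mounted at /content/drive and the files live in the
# suggested directory; the notebook's actual PATH value may differ.
MOUNT_POINT = "/content/drive"
PATH = Path(MOUNT_POINT) / "MyDrive" / "Colab Notebooks" / "speech_recognition_with_python"


def mount_drive(mount_point=MOUNT_POINT):
    """Mount Google Drive when running in Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only available inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


def list_wav_files(directory):
    """Return the .wav files in `directory`, sorted by name (empty if missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.suffix.lower() == ".wav")


if __name__ == "__main__":
    if not mount_drive():
        print("Not running in Colab; skipping the Drive mount.")
    for wav in list_wav_files(PATH):
        print(wav.name)
```

If the listing comes back empty, the files are probably in a different Drive folder, in which case PATH needs to point there instead.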
Please note: To use AssemblyAI, you need to create your own account at https://www.assemblyai.com/. After creating your account, you will receive an AssemblyAI API key, which you need to copy into the notebook (change the value of the ASSEMBLY_AI_API_KEY constant in the notebook).
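
One way to catch a forgotten key early is a small check before any AssemblyAI call. The sketch below is an illustration only: the placeholder value, the resolve_api_key helper, and the ASSEMBLYAI_API_KEY environment variable name are all assumptions, not part of the notebook or of the AssemblyAI SDK.

```python
import os

PLACEHOLDER = "YOUR_API_KEY"  # hypothetical placeholder value
ASSEMBLY_AI_API_KEY = PLACEHOLDER  # replace with the key from your AssemblyAI account


def resolve_api_key(constant_value, environ=None, env_var="ASSEMBLYAI_API_KEY"):
    """Return a usable API key, preferring an environment variable if one is set.

    The environment variable name is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    key = (environ.get(env_var) or constant_value or "").strip()
    if not key or key == PLACEHOLDER:
        raise ValueError("Set ASSEMBLY_AI_API_KEY to your AssemblyAI API key.")
    return key
```

Calling resolve_api_key(ASSEMBLY_AI_API_KEY) at the top of the notebook would then fail with a clear message instead of an authentication error later on. Reading the key from an environment variable is optional; it just keeps the key out of a notebook you might share.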