This example shows how to use ORT to perform speech recognition with the Wav2Vec 2.0 model.
It is heavily inspired by this PyTorch example.
The application lets the user make an audio recording, then recognizes the speech from that recording and displays a transcript.
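For orientation, here is a minimal sketch of how such an app might run the model through the Swift-accessible ORT API from the onnxruntime-objc pod (ORTEnv, ORTSession, ORTValue). The input/output tensor names and the assumption of a 16 kHz mono float input are illustrative and not taken from this example's code; check the generated model's metadata for the real values.

```swift
import Foundation
import onnxruntime_objc

// Runs Wav2Vec 2.0 on a buffer of 16 kHz mono PCM samples and returns the raw logits.
// The tensor names "input" and "output" are assumptions made for illustration.
func recognize(samples: [Float], modelPath: String) throws -> [Float] {
    let env = try ORTEnv(loggingLevel: .warning)
    let session = try ORTSession(env: env, modelPath: modelPath, sessionOptions: nil)

    // Wrap the audio samples in a float tensor of shape [1, sampleCount].
    let data = NSMutableData(bytes: samples,
                             length: samples.count * MemoryLayout<Float>.stride)
    let shape: [NSNumber] = [1, NSNumber(value: samples.count)]
    let input = try ORTValue(tensorData: data, elementType: .float, shape: shape)

    // Run inference; Wav2Vec 2.0 produces per-frame logits over its character vocabulary,
    // which a decoding step would then turn into a transcript.
    let outputs = try session.run(withInputs: ["input": input],
                                  outputNames: ["output"],
                                  runOptions: nil)
    guard let output = outputs["output"] else {
        throw NSError(domain: "SpeechRecognition", code: 1, userInfo: nil)
    }
    let outputData = try output.tensorData() as Data
    return outputData.withUnsafeBytes { Array($0.bindMemory(to: Float.self)) }
}
```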
See the general prerequisites here.
Additionally, you will need to be able to record audio, either on a simulator or a device.
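On a device, recording also requires microphone permission. A small illustrative snippet (not taken from this example) for requesting it is shown below; the app's Info.plist must additionally include an NSMicrophoneUsageDescription entry.

```swift
import AVFoundation
import Foundation

// Ask the user for microphone access before starting a recording.
func requestRecordPermission(completion: @escaping (Bool) -> Void) {
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        DispatchQueue.main.async {
            completion(granted)
        }
    }
}
```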
The model should be generated in this location: <this directory>/SpeechRecognition/model
See the instructions here for how to generate the model.
For example, with the model generation script's dependencies installed, run the following from this directory:
../model/gen_model.sh ./SpeechRecognition/model
From this directory, run:
pod install
Open the generated SpeechRecognition.xcworkspace file in Xcode to build and run the example.