Usage
$ python speech2text.py -i ~/temp/transcription/2001 -b ~/temp/transcription/ -o=/tmp/transcription/testout/ --verbose
In the above command:
- input folder is ~/temp/transcription/2001
- base folder is ~/temp/transcription
- output folder is /tmp/transcription/testout
- verbose mode is enabled.
Reading from flash drive:
$ ls /Volumes/Samsung\ USB/
$ python speech2text.py -i /Volumes/Samsung\ USB/AudioJournals_TEST/ICD-BP100\ 2002_PARTIAL/ -b /Volumes/Samsung\ USB/AudioJournals_TEST/ -o=/tmp/transcription/stt_test2
Reading from and writing back to flash drive:
$ python speech2text.py -i /Volumes/Samsung\ USB/AudioJournals_TEST/ICD-BP100\ 2002_PARTIAL/ -b /Volumes/Samsung\ USB/AudioJournals_TEST/ -o=/Volumes/Samsung\ USB/AudioJournals_TEST/ibm_stt
The -b option allows specifying a prefix of the input filepath that is used to organize the output results.
This lets me process a subset of files while keeping the results organized within a larger context. An early challenge was processing the data piecemeal and selectively while keeping the output organized so that it (a) mirrors the organization of the input data, which makes it easy to locate the input file that corresponds to an output file (important when audio filenames are not unique), and (b) simplifies later processing steps such as reprocessing, tallying results, and performing file-level analyses.
The following command handles a small subset of files deeply nested in a folder while organizing the output to fit into a larger dataset that was processed previously or will be processed later, whether in large batches, small batches, or one file at a time.
$ python speech2text.py -i /Volumes/Samsung\ USB/AudioJournals_TEST/2016_Partial/Family/ -b /Volumes/Samsung\ USB/AudioJournals_TEST/ -o /Volumes/Samsung\ USB/AudioJournals_TEST/ibm_stt
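The path mapping that -b enables can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in speech2text.py: the function name `output_path_for` and the `.txt` suffix are hypothetical, and the script may name its output files differently.

```python
import os


def output_path_for(input_file, base_dir, output_dir, suffix=".txt"):
    """Map an input file to an output path that mirrors its place under base_dir.

    Hypothetical helper: the name and the suffix are assumptions for illustration.
    """
    # Path of the input file relative to the base folder, e.g. "2016/Family/a.wav"
    rel = os.path.relpath(input_file, start=base_dir)
    # Reject files that do not live under the base folder
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("input file is not inside the base folder")
    # Recreate the same relative layout under the output folder
    return os.path.join(output_dir, rel + suffix)
```

With this mapping, two recordings that share a filename but sit in different input folders end up in different output folders, so each result can be traced back to its source file.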
The -s option allows specifying the speech-to-text client filepath, overriding the default location.
The following example uses a Python client script on the laptop drive instead of on the flash drive where the data resides:
$ python speech2text.py -i /Volumes/Samsung\ USB/AudioJournals_TEST/2016_Partial/Family/ -b /Volumes/Samsung\ USB/AudioJournals_TEST/ -o /Volumes/Samsung\ USB/AudioJournals_TEST/ibm_stt -s ~/code/speech-to-text/speech-to-text-websockets-python/sttClient.py
The -m option specifies a maximum number of items to be processed. This is handy for development, or where you want to avoid overusing a processing allowance.
The -k option tells the script not to overwrite previous results. This avoids reprocessing files that have already been transcribed successfully.
The following command processes at most 10 items and does not overwrite previous results:
$ time python speech2text.py \
-i /Volumes/Samsung\ USB/AudioJournals_TEST/2016_Partial/ \
-b /Volumes/Samsung\ USB/AudioJournals_TEST/ \
-o /Volumes/Samsung\ USB/AudioJournals_TEST/ibm_stt \
-s ~/code/speech-to-text/speech-to-text-websockets-python/sttClient.py \
-m 10 \
-k
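How -m and -k might combine when choosing which files to process can be sketched as below. This is only an assumption about the behavior, not the script's actual logic: the function `select_items` and its parameters are hypothetical, and the sketch assumes that files skipped because of -k do not count toward the -m limit.

```python
import os


def select_items(input_files, output_for, max_items=None, keep_existing=False):
    """Return the input files to process, honoring -m and -k style limits.

    Hypothetical helper: output_for is any function that maps an input path
    to the path where its transcription would be written.
    """
    selected = []
    for path in input_files:
        # -k style: skip files whose output already exists
        if keep_existing and os.path.exists(output_for(path)):
            continue
        # -m style: stop once the maximum number of items is reached
        if max_items is not None and len(selected) >= max_items:
            break
        selected.append(path)
    return selected
```

Under this assumption, rerunning the same command with -k and a small -m value works through a large folder in successive batches without repeating earlier work.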
$ time python speech2text.py \
-i /Volumes/Samsung\ USB/AudioJournals/ICD-BP100\ 2003/ \
-b /Volumes/Samsung\ USB/AudioJournals/ \
-o /Volumes/Samsung\ USB/AudioJournals/ibm_stt \
-s ~/code/speech-to-text/speech-to-text-websockets-python/sttClient.py
$ time python speech2text.py \
-i /Volumes/Samsung\ USB/AudioJournals/2009/ \
-b /Volumes/Samsung\ USB/AudioJournals/ \
-o /Volumes/Samsung\ USB/AudioJournals/ibm_stt \
-s ~/code/speech-to-text/speech-to-text-websockets-python/sttClient.py
$ time python /Users/mark/code/speech-to-text/speech2text.py \
-i ~/temp/transcription/2001/ \
-b ~/temp/transcription/ \
-o ~/temp/transcription/google_stt \
-s ~/temp/gcloud/python-docs-samples/speech/grpc/transcribe_async.py \
-g -k -m8
$ time python /Users/mark/code/speech-to-text/speech2text.py \
-i /Volumes/Samsung\ USB/AudioJournals/ICD-BP100\ 2003/ \
-b /Volumes/Samsung\ USB/AudioJournals/ \
-o /Volumes/Samsung\ USB/AudioJournals/google_stt \
-s ~/temp/gcloud/python-docs-samples/speech/grpc/transcribe_async.py \
-g -k -m100
compare.py depends on the output of the following scripts:
- text2stats.py
- tally_audio.py
$ python compare.py -f /temp/stt/AudioJournals/text2stats