Commit
Showing 1 changed file with 27 additions and 47 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,63 +2,43 @@ | |
|
||
usage() { | ||
cat << EOF | ||
Aalto speech2text app. | ||
This app does speech2text with diarization. | ||
Usage: | ||
Example run on a single file: | ||
0) Load the speech2text app | ||
export [email protected] | ||
export SPEECH2TEXT_LANGUAGE=finnish | ||
speech2text audiofile.mp3 | ||
Load the speech2text app with | ||
Example run on a folder containing one or more audio files: | |
module load speech2text | ||
export [email protected] | ||
export SPEECH2TEXT_LANGUAGE=finnish | ||
speech2text audiofiles/ | ||
This needs to be done once per login session. | |
The audio files can be in any common audio (.wav, .mp3, .aiff, etc.) or video (.mp4, .mov, etc.) format. | |
The speech2text app writes result files to a subfolder results/ next to each audio file. | ||
Result filenames are the audio filename with .txt and .csv extensions. For example, result files | ||
corresponding to audiofile.mp3 are written to results/audiofile.txt and results/audiofile.csv. | ||
Result files in a folder audiofiles/ will be written to folder audiofiles/results/. | ||
1) Set environment variables | ||
Notification emails will be sent to SPEECH2TEXT_EMAIL. If SPEECH2TEXT_EMAIL is left | ||
unspecified, no notifications are sent. | ||
Set email (for Slurm job notifications) and audio language environment variables: | ||
Supported languages are: | ||
export [email protected] | ||
export SPEECH2TEXT_LANGUAGE=my-language | ||
afrikaans, arabic, armenian, azerbaijani, belarusian, bosnian, bulgarian, catalan, | ||
chinese, croatian, czech, danish, dutch, english, estonian, finnish, french, galician, | ||
german, greek, hebrew, hindi, hungarian, icelandic, indonesian, italian, japanese, | ||
kannada, kazakh, korean, latvian, lithuanian, macedonian, malay, marathi, maori, nepali, | ||
norwegian, persian, polish, portuguese, romanian, russian, serbian, slovak, slovenian, | ||
spanish, swahili, swedish, tagalog, tamil, thai, turkish, ukrainian, urdu, vietnamese, | ||
welsh | ||
For example: | ||
export [email protected] | ||
export SPEECH2TEXT_LANGUAGE=finnish | ||
The following variables are already set by the lmod .lua script. Users can ignore them. | |
HF_HOME | ||
TORCH_HOME | ||
WHISPER_CACHE | ||
PYANNOTE_CONFIG | ||
NUMBA_CACHE | ||
MPLCONFIGDIR | ||
SPEECH2TEXT_TMP | ||
SPEECH2TEXT_MEM | ||
SPEECH2TEXT_CPUS_PER_TASK | ||
SPEECH2TEXT_TIME | ||
2a) Process a single audio file | ||
speech2text audio-file | ||
The audio file can be in any common audio (.wav, .mp3, .aiff, etc.) or video (.mp4, .mov, etc.) format. | |
The transcription and diarization results (.txt and .csv files) corresponding to each audio file | ||
will be written to results/ next to the file. | ||
2b) Process multiple audio files in a folder | ||
speech2text audio-files/ | ||
Each audio file can be in any common audio (.wav, .mp3, .aiff, etc.) or video (.mp4, .mov, etc.) format. | |
The transcription and diarization results (.txt and .csv files) corresponding to each audio file | ||
will be written to audio-files/results. | ||
See also: https://github.com/AaltoRSE/speech2text | ||
You can leave the language variable SPEECH2TEXT_LANGUAGE unspecified, in which case | ||
speech2text tries to detect the language automatically. Specifying the language | ||
explicitly is, however, recommended. | ||
EOF | ||
} | ||
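The result-file naming rule in the usage text (a results/ subfolder next to each audio file, with the audio filename carrying .txt and .csv extensions) can be sketched as a small shell helper. This is only an illustration: `result_paths` is a hypothetical function that is not part of speech2text, and it assumes the final extension is the only part of the filename that is dropped.

```shell
#!/usr/bin/env bash
# Hypothetical helper (NOT part of speech2text): print the result paths that
# the usage text describes for a given audio file.
result_paths() {
    local audio_file=$1
    local dir base stem
    dir=$(dirname -- "$audio_file")    # folder that holds the audio file
    base=$(basename -- "$audio_file")  # filename without the folder
    stem=${base%.*}                    # drop the last extension, e.g. ".mp3"
    printf '%s\n' "$dir/results/$stem.txt" "$dir/results/$stem.csv"
}

# Prints audiofiles/results/interview.txt and audiofiles/results/interview.csv
result_paths audiofiles/interview.mp3
```

For a file given without a folder, such as audiofile.mp3, `dirname` yields `.`, so the helper prints ./results/audiofile.txt and ./results/audiofile.csv, which is the same location as the results/audiofile.txt in the usage text.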
|
||
|
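Because the usage text lists the supported languages and says SPEECH2TEXT_LANGUAGE may be left unset for automatic detection, a user could check the variable before submitting a job. The sketch below is an assumption-laden illustration: `check_language` and `SUPPORTED_LANGUAGES` are hypothetical names, not part of speech2text, and the exact-match, lowercase comparison is a guess at how strictly the app treats the value.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (NOT part of speech2text).
# The language names are copied from the usage text in this diff.
SUPPORTED_LANGUAGES="afrikaans arabic armenian azerbaijani belarusian bosnian
bulgarian catalan chinese croatian czech danish dutch english estonian finnish
french galician german greek hebrew hindi hungarian icelandic indonesian
italian japanese kannada kazakh korean latvian lithuanian macedonian malay
marathi maori nepali norwegian persian polish portuguese romanian russian
serbian slovak slovenian spanish swahili swedish tagalog tamil thai turkish
ukrainian urdu vietnamese welsh"

check_language() {
    local lang=${1:-}
    local candidate
    # Unset or empty means automatic detection, per the usage text.
    if [ -z "$lang" ]; then
        return 0
    fi
    # Exact, case-sensitive match against the list (an assumption).
    for candidate in $SUPPORTED_LANGUAGES; do
        if [ "$candidate" = "$lang" ]; then
            return 0
        fi
    done
    return 1
}

if check_language "${SPEECH2TEXT_LANGUAGE:-}"; then
    echo "language setting accepted"
else
    echo "unsupported language: ${SPEECH2TEXT_LANGUAGE:-}" >&2
fi
```

Treating an empty value as valid mirrors the note that speech2text tries to detect the language automatically when the variable is unspecified, although setting it explicitly is recommended.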