Update documentation #9

Merged
merged 3 commits into from
Feb 21, 2024
74 changes: 27 additions & 47 deletions bin/speech2text
@@ -2,63 +2,43 @@

usage() {
cat << EOF
Aalto speech2text app.
This app does speech2text with diarization.

Usage:
Example run on a single file:

0) Load the speech2text app
export SPEECH2TEXT_EMAIL=[email protected]
export SPEECH2TEXT_LANGUAGE=finnish
speech2text audiofile.mp3

Load the speech2text app with
Example run on a folder containing one or more audio files:

module load speech2text
export SPEECH2TEXT_EMAIL=[email protected]
export SPEECH2TEXT_LANGUAGE=finnish
speech2text audiofiles/

This needs to be done once every login.
The audio files can be in any common audio (.wav, .mp3, .aff, etc.) or video (.mp4, .mov, etc.) format.

The speech2text app writes result files to a subfolder results/ next to each audio file.
Result filenames are the audio filename with .txt and .csv extensions. For example, result files
corresponding to audiofile.mp3 are written to results/audiofile.txt and results/audiofile.csv.
Result files in a folder audiofiles/ will be written to folder audiofiles/results/.

1) Set environment variables
Notification emails will be sent to SPEECH2TEXT_EMAIL. If SPEECH2TEXT_EMAIL is left
unspecified, no notifications are sent.

Set email (for Slurm job notifications) and audio language environment variables:
Supported languages are:

export SPEECH2TEXT_EMAIL=[email protected]
export SPEECH2TEXT_LANGUAGE=my-language
afrikaans, arabic, armenian, azerbaijani, belarusian, bosnian, bulgarian, catalan,
chinese, croatian, czech, danish, dutch, english, estonian, finnish, french, galician,
german, greek, hebrew, hindi, hungarian, icelandic, indonesian, italian, japanese,
kannada, kazakh, korean, latvian, lithuanian, macedonian, malay, marathi, maori, nepali,
norwegian, persian, polish, portuguese, romanian, russian, serbian, slovak, slovenian,
spanish, swahili, swedish, tagalog, tamil, thai, turkish, ukrainian, urdu, vietnamese,
welsh

For example:

export SPEECH2TEXT_EMAIL=[email protected]
export SPEECH2TEXT_LANGUAGE=finnish

The following variables are already set by the lmod .lua script. They can be ignored by user.

HF_HOME
TORCH_HOME
WHISPER_CACHE
PYANNOTE_CONFIG
NUMBA_CACHE
MPLCONFIGDIR
SPEECH2TEXT_TMP
SPEECH2TEXT_MEM
SPEECH2TEXT_CPUS_PER_TASK
SPEECH2TEXT_TIME


2a) Process a single audio file

speech2text audio-file

The audio file can be in any common audio (.wav, .mp3, .aff, etc.) or video (.mp4, .mov, etc.) format.
The transcription and diarization results (.txt and .csv files) corresponding to each audio file
will be written to results/ next to the file.


2b) Process multiple audio files in a folder

speech2text audio-files/

The audio file can be in any common audio (.wav, .mp3, .aff, etc.) or video (.mp4, .mov, etc.) format.
The transcription and diarization results (.txt and .csv files) corresponding to each audio file
will be written to audio-files/results.

See also: https://github.com/AaltoRSE/speech2text
You can leave the language variable SPEECH2TEXT_LANGUAGE unspecified, in which case
speech2text tries to detect the language automatically. Specifying the language
explicitly is, however, recommended.
EOF
}
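
The result-path convention described in the usage text above can be sketched in Python as follows. This is a minimal illustration, not code from this PR: the helper name `result_paths` is hypothetical, and it assumes results always go to a `results/` subfolder beside the audio file, named after the audio file's stem with `.txt` and `.csv` extensions.

```python
from pathlib import Path


def result_paths(audio_file: str) -> tuple[Path, Path]:
    """Return the assumed (.txt, .csv) result paths for one audio file.

    Hypothetical helper: mirrors the convention in the usage text, where
    results land in a results/ subfolder next to the audio file.
    """
    audio = Path(audio_file)
    results_dir = audio.parent / "results"
    return results_dir / f"{audio.stem}.txt", results_dir / f"{audio.stem}.csv"


if __name__ == "__main__":
    # File names below are illustrative only.
    for name in ("audiofile.mp3", "audiofiles/interview.wav"):
        txt, csv = result_paths(name)
        print(f"{name} -> {txt}, {csv}")
```

Under these assumptions, a file inside `audiofiles/` maps to `audiofiles/results/`, which matches the folder case described in the usage text.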

2 changes: 2 additions & 0 deletions docs/source/user_guide.md
@@ -54,6 +54,8 @@ Go to [Open On Demand](http://ood.triton.aalto.fi) and log in with your Aalto us
>
> Subsequently, the shell will ask you for a password. This is your Aalto password. Note that your key presses do not show - just write your password and press enter.
>
> You can skip the following question about `.zshrc` file creation by pressing "q".
>
> Afterwards, you can close this tab. Your Triton account is now fully operational.


6 changes: 6 additions & 0 deletions src/submit.py
@@ -235,6 +235,9 @@ def submit_dir(args, job_name):
args.SPEECH2TEXT_TMP,
)

# Log
print(f"Results will be written to folder: {output_dir}\n")

# Submit
cmd = f"sbatch {tmp_file_sh.absolute()}"
cmd = shlex.split(cmd)
@@ -292,6 +295,9 @@ def submit_file(args, job_name):
args.SPEECH2TEXT_TMP,
)

# Log
print(f"Results will be written to folder: {output_dir}\n")

# Submit
cmd = f"sbatch {tmp_file_sh.absolute()}"
cmd = shlex.split(cmd)
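
Both hunks build the `sbatch` command as a string and then split it with `shlex.split`. A minimal sketch of that pattern follows; the function name `build_sbatch_command` and the example script path are hypothetical and not part of this PR. It also illustrates one design point: splitting an unquoted f-string breaks the path into several arguments if it contains spaces, so the sketch quotes the path with `shlex.quote` first.

```python
import shlex
from pathlib import Path


def build_sbatch_command(script: Path) -> list[str]:
    """Build the argument list for submitting a batch script with sbatch.

    Hypothetical helper, not part of this PR. Quoting the path keeps it a
    single argument even if it contains spaces.
    """
    cmd = f"sbatch {shlex.quote(str(script.absolute()))}"
    return shlex.split(cmd)


if __name__ == "__main__":
    # Example path chosen for illustration only; nothing is submitted here.
    print(build_sbatch_command(Path("/tmp/my job/submit.sh")))
```

The returned list could then be passed to `subprocess.run` without `shell=True`, assuming `sbatch` is available on the `PATH`.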