Migrate zipformer model to other Chinese datasets (#1216)
added zipformer recipe for AISHELL-1
JinZr authored Oct 24, 2023
1 parent 3fb9940 commit d76c3fe
Showing 42 changed files with 3,741 additions and 18 deletions.
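To reproduce the new CI check locally, one can mirror the steps of the workflow added below. This is a sketch, not part of the commit: it assumes you run from the icefall repository root and that kaldifeat is installed and on PYTHONPATH (for example via .github/scripts/install-kaldifeat.sh, as the workflow does).

# Run from the icefall repository root (assumption); requires git-lfs, tree, and kaldifeat.
sudo apt-get install git-lfs tree
export PYTHONPATH=$PWD:$PYTHONPATH
export PYTHONPATH=~/tmp/kaldifeat/kaldifeat/python:$PYTHONPATH   # adjust to your kaldifeat checkout
export PYTHONPATH=~/tmp/kaldifeat/build/lib:$PYTHONPATH
.github/scripts/run-aishell-zipformer-2023-10-24.sh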
@@ -18,8 +18,8 @@ log "Downloading pre-commputed fbank from $fbank_url"
git clone https://huggingface.co/csukuangfj/aishell-test-dev-manifests
ln -s $PWD/aishell-test-dev-manifests/data .

-log "Downloading pre-trained model from $repo_url"
repo_url=https://huggingface.co/csukuangfj/icefall-aishell-pruned-transducer-stateless3-2022-06-20
+log "Downloading pre-trained model from $repo_url"
git clone $repo_url
repo=$(basename $repo_url)

103 changes: 103 additions & 0 deletions .github/scripts/run-aishell-zipformer-2023-10-24.sh
@@ -0,0 +1,103 @@
#!/usr/bin/env bash

set -e

log() {
# This function is from espnet
local fname=${BASH_SOURCE[1]##*/}
echo -e "$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*"
}

cd egs/aishell/ASR

git lfs install

fbank_url=https://huggingface.co/csukuangfj/aishell-test-dev-manifests
log "Downloading pre-computed fbank from $fbank_url"

git clone https://huggingface.co/csukuangfj/aishell-test-dev-manifests
ln -s $PWD/aishell-test-dev-manifests/data .

log "======================="
log "CI testing large model"
repo_url=https://huggingface.co/zrjin/icefall-asr-aishell-zipformer-large-2023-10-24/
log "Downloading pre-trained model from $repo_url"
git clone $repo_url
repo=$(basename $repo_url)

log "Display test files"
tree $repo/
ls -lh $repo/test_wavs/*.wav

for method in modified_beam_search greedy_search fast_beam_search; do
log "$method"

./zipformer/pretrained.py \
--method $method \
--context-size 1 \
--checkpoint $repo/exp/pretrained.pt \
--tokens $repo/data/lang_char/tokens.txt \
--num-encoder-layers 2,2,4,5,4,2 \
--feedforward-dim 512,768,1536,2048,1536,768 \
--encoder-dim 192,256,512,768,512,256 \
--encoder-unmasked-dim 192,192,256,320,256,192 \
$repo/test_wavs/BAC009S0764W0121.wav \
$repo/test_wavs/BAC009S0764W0122.wav \
$repo/test_wavs/BAC009S0764W0123.wav
done

log "======================="
log "CI testing medium model"
repo_url=https://huggingface.co/zrjin/icefall-asr-aishell-zipformer-2023-10-24/
log "Downloading pre-trained model from $repo_url"
git clone $repo_url
repo=$(basename $repo_url)

log "Display test files"
tree $repo/
ls -lh $repo/test_wavs/*.wav


for method in modified_beam_search greedy_search fast_beam_search; do
log "$method"

./zipformer/pretrained.py \
--method $method \
--context-size 1 \
--checkpoint $repo/exp/pretrained.pt \
--tokens $repo/data/lang_char/tokens.txt \
$repo/test_wavs/BAC009S0764W0121.wav \
$repo/test_wavs/BAC009S0764W0122.wav \
$repo/test_wavs/BAC009S0764W0123.wav
done


log "======================="
log "CI testing small model"
repo_url=https://huggingface.co/zrjin/icefall-asr-aishell-zipformer-small-2023-10-24/
log "Downloading pre-trained model from $repo_url"
git clone $repo_url
repo=$(basename $repo_url)

log "Display test files"
tree $repo/
ls -lh $repo/test_wavs/*.wav


for method in modified_beam_search greedy_search fast_beam_search; do
log "$method"

./zipformer/pretrained.py \
--method $method \
--context-size 1 \
--checkpoint $repo/exp/pretrained.pt \
--tokens $repo/data/lang_char/tokens.txt \
--num-encoder-layers 2,2,2,2,2,2 \
--feedforward-dim 512,768,768,768,768,768 \
--encoder-dim 192,256,256,256,256,256 \
--encoder-unmasked-dim 192,192,192,192,192,192 \
$repo/test_wavs/BAC009S0764W0121.wav \
$repo/test_wavs/BAC009S0764W0122.wav \
$repo/test_wavs/BAC009S0764W0123.wav
done
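
The script above only runs inference with pretrained checkpoints released on Hugging Face. Training the corresponding models with the newly added recipe presumably follows the pattern of other icefall zipformer recipes; the sketch below is an assumption based on those recipes, with illustrative flag values not taken from this commit (check ./zipformer/train.py --help in the new recipe for the actual options).

cd egs/aishell/ASR
./prepare.sh
# Hypothetical training command; --context-size 1 matches the decoding commands above.
./zipformer/train.py \
  --world-size 2 \
  --num-epochs 40 \
  --exp-dir zipformer/exp \
  --context-size 1 \
  --max-duration 600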

95 changes: 95 additions & 0 deletions .github/workflows/run-aishell-zipformer-2023-10-24.yml
@@ -0,0 +1,95 @@
# Copyright 2023 Zengrui Jin (Xiaomi Corp.)

# See ../../LICENSE for clarification regarding multiple authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: run-aishell-zipformer-2023-10-24

on:
push:
branches:
- master
pull_request:
types: [labeled]

schedule:
# minute (0-59)
# hour (0-23)
# day of the month (1-31)
# month (1-12)
# day of the week (0-6)
# nightly build at 15:50 UTC time every day
- cron: "50 15 * * *"

concurrency:
group: run_aishell_zipformer_2023_10_24-${{ github.ref }}
cancel-in-progress: true

jobs:
run_aishell_zipformer_2023_10_24:
if: github.event.label.name == 'ready' || github.event.label.name == 'zipformer' || github.event.label.name == 'run-decode' || github.event_name == 'push' || github.event_name == 'schedule'
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest]
python-version: [3.8]

fail-fast: false

steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0

- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
cache-dependency-path: '**/requirements-ci.txt'

- name: Install Python dependencies
run: |
grep -v '^#' ./requirements-ci.txt | xargs -n 1 -L 1 pip install
pip uninstall -y protobuf
pip install --no-binary protobuf protobuf==3.20.*
- name: Cache kaldifeat
id: my-cache
uses: actions/cache@v2
with:
path: |
~/tmp/kaldifeat
key: cache-tmp-${{ matrix.python-version }}-2023-05-22

- name: Install kaldifeat
if: steps.my-cache.outputs.cache-hit != 'true'
shell: bash
run: |
.github/scripts/install-kaldifeat.sh
- name: Inference with pre-trained model
shell: bash
env:
GITHUB_EVENT_NAME: ${{ github.event_name }}
GITHUB_EVENT_LABEL_NAME: ${{ github.event.label.name }}
run: |
sudo apt-get -qq install git-lfs tree
export PYTHONPATH=$PWD:$PYTHONPATH
export PYTHONPATH=~/tmp/kaldifeat/kaldifeat/python:$PYTHONPATH
export PYTHONPATH=~/tmp/kaldifeat/build/lib:$PYTHONPATH
.github/scripts/run-aishell-zipformer-2023-10-24.sh
4 changes: 3 additions & 1 deletion egs/aidatatang_200zh/ASR/prepare.sh
@@ -7,6 +7,8 @@ set -eou pipefail

stage=-1
stop_stage=100
perturb_speed=true


# We assume dl_dir (download dir) contains the following
# directories and files. If not, they will be downloaded
@@ -77,7 +79,7 @@ if [ $stage -le 4 ] && [ $stop_stage -ge 4 ]; then
log "Stage 4: Compute fbank for aidatatang_200zh"
if [ ! -f data/fbank/.aidatatang_200zh.done ]; then
mkdir -p data/fbank
-./local/compute_fbank_aidatatang_200zh.py --perturb-speed True
+./local/compute_fbank_aidatatang_200zh.py --perturb-speed ${perturb_speed}
touch data/fbank/.aidatatang_200zh.done
fi
fi
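
With perturb_speed now a variable, speed perturbation can be switched off without editing the fbank script. A hypothetical invocation (assuming this prepare.sh sources shared/parse_options.sh as most icefall recipes do, so top-level variables can be overridden from the command line; otherwise set the variable at the top of the script):

cd egs/aidatatang_200zh/ASR
# Recompute fbank features without speed perturbation (stage 4 only).
./prepare.sh --stage 4 --stop-stage 4 --perturb-speed false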
@@ -102,7 +102,7 @@ def add_arguments(cls, parser: argparse.ArgumentParser):
group.add_argument(
"--bucketing-sampler",
type=str2bool,
-default=True,
+default=False,
help="When enabled, the batches will come from buckets of "
"similar duration (saves padding frames).",
)
@@ -289,6 +289,7 @@ def train_dataloaders(
shuffle=self.args.shuffle,
num_buckets=self.args.num_buckets,
drop_last=True,
buffer_size=50000,
)
else:
logging.info("Using SimpleCutSampler.")
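
The two datamodule changes above adjust how training batches are formed: the dynamic bucketing sampler now uses a larger shuffling buffer (buffer_size=50000), while --bucketing-sampler defaults to False. Because that argument is registered on the datamodule parser, it can presumably be re-enabled from the training command line; a hypothetical example (the train.py path below is illustrative and depends on which recipe uses this datamodule):

cd egs/aidatatang_200zh/ASR
# Re-enable the bucketing sampler explicitly now that its default is False.
./pruned_transducer_stateless2/train.py \
  --bucketing-sampler true \
  --num-buckets 30 \
  --max-duration 250
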
6 changes: 4 additions & 2 deletions egs/aishell/ASR/README.md
@@ -1,10 +1,12 @@

# Introduction

-Please refer to <https://icefall.readthedocs.io/en/latest/recipes/Non-streaming-ASR/aishell/index.html>
-for how to run models in this recipe.
+Please refer to <https://k2-fsa.github.io/icefall/recipes/Non-streaming-ASR/aishell/index.html> for how to run models in this recipe.

Aishell is an open-source Chinese Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd.
400 people from different accent areas in China are invited to participate in the recording, which is conducted in a quiet indoor environment using high fidelity microphone and downsampled to 16kHz. The manual transcription accuracy is above 95%, through professional speech annotation and strict quality inspection. The data is free for academic use. We hope to provide moderate amount of data for new researchers in the field of speech recognition.

(From [Open Speech and Language Resources](https://www.openslr.org/33/))

# Transducers

(The remaining changed files in this commit are not shown here.)