Skip to content

Commit

Permalink
Merge branch 'k2-fsa:master' into k2ssl
Browse files Browse the repository at this point in the history
  • Loading branch information
yfyeung authored Feb 22, 2024
2 parents 911bfac + 2483b8b commit f2f102d
Show file tree
Hide file tree
Showing 98 changed files with 13,476 additions and 181 deletions.
2 changes: 2 additions & 0 deletions .github/scripts/docker/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ ARG _KALDIFEAT_VERSION="${KALDIFEAT_VERSION}+cpu.torch${TORCH_VERSION}"

RUN apt-get update -y && \
apt-get install -qq -y \
cmake \
ffmpeg \
git \
git-lfs \
Expand Down Expand Up @@ -50,6 +51,7 @@ RUN pip install --no-cache-dir \
onnxruntime \
pytest \
sentencepiece>=0.1.96 \
pypinyin==0.50.0 \
six \
tensorboard \
typeguard
Expand Down
21 changes: 13 additions & 8 deletions .github/scripts/docker/generate_build_matrix.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,8 @@


def version_gt(a, b):
a_major, a_minor = a.split(".")[:2]
b_major, b_minor = b.split(".")[:2]
a_major, a_minor = list(map(int, a.split(".")))[:2]
b_major, b_minor = list(map(int, b.split(".")))[:2]
if a_major > b_major:
return True

Expand All @@ -18,8 +18,8 @@ def version_gt(a, b):


def version_ge(a, b):
a_major, a_minor = a.split(".")[:2]
b_major, b_minor = b.split(".")[:2]
a_major, a_minor = list(map(int, a.split(".")))[:2]
b_major, b_minor = list(map(int, b.split(".")))[:2]
if a_major > b_major:
return True

Expand All @@ -43,11 +43,12 @@ def get_torchaudio_version(torch_version):


def get_matrix():
k2_version = "1.24.4.dev20231220"
kaldifeat_version = "1.25.3.dev20231221"
version = "1.2"
python_version = ["3.8", "3.9", "3.10", "3.11"]
k2_version = "1.24.4.dev20240218"
kaldifeat_version = "1.25.4.dev20240218"
version = "1.4"
python_version = ["3.8", "3.9", "3.10", "3.11", "3.12"]
torch_version = ["1.13.0", "1.13.1", "2.0.0", "2.0.1", "2.1.0", "2.1.1", "2.1.2"]
torch_version += ["2.2.0"]

matrix = []
for p in python_version:
Expand All @@ -57,6 +58,10 @@ def get_matrix():
if version_gt(p, "3.10") and not version_gt(t, "2.0"):
continue

# only torch>=2.2.0 supports python 3.12
if version_gt(p, "3.11") and not version_gt(t, "2.1"):
continue

matrix.append(
{
"k2-version": k2_version,
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/build-docker-image.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
image: ["torch2.1.0-cuda12.1", "torch2.1.0-cuda11.8", "torch2.0.0-cuda11.7", "torch1.13.0-cuda11.6", "torch1.12.1-cuda11.3", "torch1.9.0-cuda10.2"]
image: ["torch2.2.0-cuda12.1", "torch2.2.0-cuda11.8", "torch2.1.0-cuda12.1", "torch2.1.0-cuda11.8", "torch2.0.0-cuda11.7", "torch1.13.0-cuda11.6", "torch1.12.1-cuda11.3", "torch1.9.0-cuda10.2"]

steps:
# refer to https://github.com/actions/checkout
Expand Down
9 changes: 8 additions & 1 deletion .github/workflows/run-docker-image.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,13 +14,20 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
image: ["torch2.1.0-cuda12.1", "torch2.1.0-cuda11.8", "torch2.0.0-cuda11.7", "torch1.13.0-cuda11.6", "torch1.12.1-cuda11.3", "torch1.9.0-cuda10.2"]
image: ["torch2.2.0-cuda12.1", "torch2.2.0-cuda11.8", "torch2.1.0-cuda12.1", "torch2.1.0-cuda11.8", "torch2.0.0-cuda11.7", "torch1.13.0-cuda11.6", "torch1.12.1-cuda11.3", "torch1.9.0-cuda10.2"]
steps:
# refer to https://github.com/actions/checkout
- uses: actions/checkout@v2
with:
fetch-depth: 0

- name: Free space
shell: bash
run: |
df -h
rm -rf /opt/hostedtoolcache
df -h
- name: Run the build process with Docker
uses: addnab/docker-run-action@v3
with:
Expand Down
3 changes: 3 additions & 0 deletions .github/workflows/yesno.yml
Original file line number Diff line number Diff line change
Expand Up @@ -59,4 +59,7 @@ jobs:
cd /icefall
git config --global --add safe.directory /icefall
python3 -m torch.utils.collect_env
python3 -m k2.version
.github/scripts/yesno/ASR/run.sh
4 changes: 2 additions & 2 deletions docker/torch1.12.1-cuda11.3.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@ ENV LC_ALL C.UTF-8
ARG DEBIAN_FRONTEND=noninteractive

# python 3.7
ARG K2_VERSION="1.24.4.dev20230725+cuda11.3.torch1.12.1"
ARG KALDIFEAT_VERSION="1.25.1.dev20231022+cuda11.3.torch1.12.1"
ARG K2_VERSION="1.24.4.dev20240211+cuda11.3.torch1.12.1"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda11.3.torch1.12.1"
ARG TORCHAUDIO_VERSION="0.12.1+cu113"

LABEL authors="Fangjun Kuang <[email protected]>"
Expand Down
4 changes: 2 additions & 2 deletions docker/torch1.13.0-cuda11.6.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@ ENV LC_ALL C.UTF-8
ARG DEBIAN_FRONTEND=noninteractive

# python 3.9
ARG K2_VERSION="1.24.4.dev20231021+cuda11.6.torch1.13.0"
ARG KALDIFEAT_VERSION="1.25.1.dev20231022+cuda11.6.torch1.13.0"
ARG K2_VERSION="1.24.4.dev20240211+cuda11.6.torch1.13.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda11.6.torch1.13.0"
ARG TORCHAUDIO_VERSION="0.13.0+cu116"

LABEL authors="Fangjun Kuang <[email protected]>"
Expand Down
4 changes: 2 additions & 2 deletions docker/torch1.9.0-cuda10.2.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@ ENV LC_ALL C.UTF-8
ARG DEBIAN_FRONTEND=noninteractive

# python 3.7
ARG K2_VERSION="1.24.3.dev20230726+cuda10.2.torch1.9.0"
ARG KALDIFEAT_VERSION="1.25.1.dev20231022+cuda10.2.torch1.9.0"
ARG K2_VERSION="1.24.4.dev20240211+cuda10.2.torch1.9.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda10.2.torch1.9.0"
ARG TORCHAUDIO_VERSION="0.9.0"

LABEL authors="Fangjun Kuang <[email protected]>"
Expand Down
4 changes: 2 additions & 2 deletions docker/torch2.0.0-cuda11.7.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@ ENV LC_ALL C.UTF-8
ARG DEBIAN_FRONTEND=noninteractive

# python 3.10
ARG K2_VERSION="1.24.4.dev20231021+cuda11.7.torch2.0.0"
ARG KALDIFEAT_VERSION="1.25.1.dev20231022+cuda11.7.torch2.0.0"
ARG K2_VERSION="1.24.4.dev20240211+cuda11.7.torch2.0.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda11.7.torch2.0.0"
ARG TORCHAUDIO_VERSION="2.0.0+cu117"

LABEL authors="Fangjun Kuang <[email protected]>"
Expand Down
4 changes: 2 additions & 2 deletions docker/torch2.1.0-cuda11.8.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@ ENV LC_ALL C.UTF-8
ARG DEBIAN_FRONTEND=noninteractive

# python 3.10
ARG K2_VERSION="1.24.4.dev20231021+cuda11.8.torch2.1.0"
ARG KALDIFEAT_VERSION="1.25.1.dev20231022+cuda11.8.torch2.1.0"
ARG K2_VERSION="1.24.4.dev20240211+cuda11.8.torch2.1.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda11.8.torch2.1.0"
ARG TORCHAUDIO_VERSION="2.1.0+cu118"

LABEL authors="Fangjun Kuang <[email protected]>"
Expand Down
4 changes: 2 additions & 2 deletions docker/torch2.1.0-cuda12.1.dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,8 @@ ENV LC_ALL C.UTF-8
ARG DEBIAN_FRONTEND=noninteractive

# python 3.10
ARG K2_VERSION="1.24.4.dev20231021+cuda12.1.torch2.1.0"
ARG KALDIFEAT_VERSION="1.25.1.dev20231022+cuda12.1.torch2.1.0"
ARG K2_VERSION="1.24.4.dev20240211+cuda12.1.torch2.1.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda12.1.torch2.1.0"
ARG TORCHAUDIO_VERSION="2.1.0+cu121"

LABEL authors="Fangjun Kuang <[email protected]>"
Expand Down
70 changes: 70 additions & 0 deletions docker/torch2.2.0-cuda11.8.dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
FROM pytorch/pytorch:2.2.0-cuda11.8-cudnn8-devel

ENV LC_ALL C.UTF-8

ARG DEBIAN_FRONTEND=noninteractive

# python 3.10
ARG K2_VERSION="1.24.4.dev20240211+cuda11.8.torch2.2.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda11.8.torch2.2.0"
ARG TORCHAUDIO_VERSION="2.2.0+cu118"

LABEL authors="Fangjun Kuang <[email protected]>"
LABEL k2_version=${K2_VERSION}
LABEL kaldifeat_version=${KALDIFEAT_VERSION}
LABEL github_repo="https://github.com/k2-fsa/icefall"

RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
vim \
libssl-dev \
autoconf \
automake \
bzip2 \
ca-certificates \
ffmpeg \
g++ \
gfortran \
git \
libtool \
make \
patch \
sox \
subversion \
unzip \
valgrind \
wget \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*

# Install dependencies
RUN pip install --no-cache-dir \
torchaudio==${TORCHAUDIO_VERSION} -f https://download.pytorch.org/whl/torch_stable.html \
k2==${K2_VERSION} -f https://k2-fsa.github.io/k2/cuda.html \
git+https://github.com/lhotse-speech/lhotse \
kaldifeat==${KALDIFEAT_VERSION} -f https://csukuangfj.github.io/kaldifeat/cuda.html \
kaldi_native_io \
kaldialign \
kaldifst \
kaldilm \
sentencepiece>=0.1.96 \
tensorboard \
typeguard \
dill \
onnx \
onnxruntime \
onnxmltools \
multi_quantization \
typeguard \
numpy \
pytest \
graphviz

RUN git clone https://github.com/k2-fsa/icefall /workspace/icefall && \
cd /workspace/icefall && \
pip install --no-cache-dir -r requirements.txt

ENV PYTHONPATH /workspace/icefall:$PYTHONPATH

WORKDIR /workspace/icefall
70 changes: 70 additions & 0 deletions docker/torch2.2.0-cuda12.1.dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
FROM pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel

ENV LC_ALL C.UTF-8

ARG DEBIAN_FRONTEND=noninteractive

# python 3.10
ARG K2_VERSION="1.24.4.dev20240211+cuda12.1.torch2.2.0"
ARG KALDIFEAT_VERSION="1.25.4.dev20240210+cuda12.1.torch2.2.0"
ARG TORCHAUDIO_VERSION="2.2.0+cu121"

LABEL authors="Fangjun Kuang <[email protected]>"
LABEL k2_version=${K2_VERSION}
LABEL kaldifeat_version=${KALDIFEAT_VERSION}
LABEL github_repo="https://github.com/k2-fsa/icefall"

RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
vim \
libssl-dev \
autoconf \
automake \
bzip2 \
ca-certificates \
ffmpeg \
g++ \
gfortran \
git \
libtool \
make \
patch \
sox \
subversion \
unzip \
valgrind \
wget \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/*

# Install dependencies
RUN pip install --no-cache-dir \
torchaudio==${TORCHAUDIO_VERSION} -f https://download.pytorch.org/whl/torch_stable.html \
k2==${K2_VERSION} -f https://k2-fsa.github.io/k2/cuda.html \
git+https://github.com/lhotse-speech/lhotse \
kaldifeat==${KALDIFEAT_VERSION} -f https://csukuangfj.github.io/kaldifeat/cuda.html \
kaldi_native_io \
kaldialign \
kaldifst \
kaldilm \
sentencepiece>=0.1.96 \
tensorboard \
typeguard \
dill \
onnx \
onnxruntime \
onnxmltools \
multi_quantization \
typeguard \
numpy \
pytest \
graphviz

RUN git clone https://github.com/k2-fsa/icefall /workspace/icefall && \
cd /workspace/icefall && \
pip install --no-cache-dir -r requirements.txt

ENV PYTHONPATH /workspace/icefall:$PYTHONPATH

WORKDIR /workspace/icefall
6 changes: 3 additions & 3 deletions docs/source/decoding-with-langugage-models/LODR.rst
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ of langugae model integration.
First, let's have a look at some background information. As the predecessor of LODR, Density Ratio (DR) is first proposed `here <https://arxiv.org/abs/2002.11268>`_
to address the language information mismatch between the training
corpus (source domain) and the testing corpus (target domain). Assuming that the source domain and the test domain
are acoustically similar, DR derives the following formular for decoding with Bayes' theorem:
are acoustically similar, DR derives the following formula for decoding with Bayes' theorem:

.. math::
Expand All @@ -41,7 +41,7 @@ are acoustically similar, DR derives the following formular for decoding with Ba
where :math:`\lambda_1` and :math:`\lambda_2` are the weights of LM scores for target domain and source domain respectively.
Here, the source domain LM is trained on the training corpus. The only difference in the above formular compared to
Here, the source domain LM is trained on the training corpus. The only difference in the above formula compared to
shallow fusion is the subtraction of the source domain LM.

Some works treat the predictor and the joiner of the neural transducer as its internal LM. However, the LM is
Expand All @@ -58,7 +58,7 @@ during decoding for transducer model:
In LODR, an additional bi-gram LM estimated on the source domain (e.g training corpus) is required. Compared to DR,
the only difference lies in the choice of source domain LM. According to the original `paper <https://arxiv.org/abs/2203.16776>`_,
LODR achieves similar performance compared DR in both intra-domain and cross-domain settings.
LODR achieves similar performance compared to DR in both intra-domain and cross-domain settings.
As a bi-gram is much faster to evaluate, LODR is usually much faster.

Now, we will show you how to use LODR in ``icefall``.
Expand Down
Loading

0 comments on commit f2f102d

Please sign in to comment.