From 8259fa442402de7cf216bbd16de2de8af8f1b30c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Drago=C8=99?= <balandragos5555@gmail.com>
Date: Mon, 22 Jul 2024 19:12:31 +0200
Subject: [PATCH] Add RTF results for cts_nl

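For reference, the RTF values in the tables below can be reproduced from
the timing matrices with a minimal sketch like the following. The function
names are hypothetical and not part of this repository, and the sketch
assumes durations are written in the same style as the timing tables
(e.g. `21m:29s`, with an optional leading hours part such as `1h:`).

```python
# Hypothetical helpers for illustration only; not part of the repository.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> float:
    """Convert a duration such as '21m:29s' into seconds.

    Assumes colon-separated parts, each ending in h, m, or s.
    """
    total = 0.0
    for part in text.split(":"):
        part = part.strip()
        value, unit = part[:-1], part[-1]
        total += float(value) * UNIT_SECONDS[unit]
    return total


def real_time_factor(process_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration, rounded to 4 decimals."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return round(process_seconds / audio_seconds, 4)


if __name__ == "__main__":
    # WhisperX, large-v2 with float16, unlabelled data: 21m:29s over 5.69 hours.
    process = parse_duration("21m:29s")
    print(real_time_factor(process, 5.69 * 3600))
```

With the WhisperX large-v2 `float16` timing for the unlabelled data
(21m:29s over 5.69 hours of speech), this gives 0.0629, matching the
corresponding table entry.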
---
 NISV/cts_nl/res_labelled.md   | 14 +++++++-------
 NISV/cts_nl/res_unlabelled.md | 14 +++++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/NISV/cts_nl/res_labelled.md b/NISV/cts_nl/res_labelled.md
index dc888cd..03d109d 100644
--- a/NISV/cts_nl/res_labelled.md
+++ b/NISV/cts_nl/res_labelled.md
@@ -42,17 +42,17 @@ And a matrix with the **time** spent in total by each implementation **to load a
 
 \* For WhisperX, a separate alignment model based on wav2vec 2.0 has been applied in order to obtain word-level timestamps. Therefore, the time measured contains the time to load the model, time to transcribe, and time to align to generate timestamps. Speaker diarization has also been applied for WhisperX, which is measured separately and covered in [this section](./whisperx.md).
 
-<!-- <br>
+<br>
 
-Here's also a matrix with the **Real-Time Factor or RTF** for short (defined as time to process all of the input divided by the duration of the input) for transcribing **2.23 hours of speech** (rounded to 4 decimals):
+The matrix below shows the **Real-Time Factor (RTF)**, defined as the total processing time divided by the duration of the input audio, for transcribing **2.18 hours of speech** (values rounded to 4 decimals):
 
 |RTF (process time/duration of audio)|large-v2 with `float16`|large-v2 with `float32`|large-v3 with `float16`|large-v3 with `float32`|
 |---|---|---|---|---|
-|[OpenAI](https://github.com/openai/whisper)|0.2698|0.2443|0.3149|0.2273|
-|[Huggingface (`transformers`)](https://huggingface.co/openai/whisper-large-v2#long-form-transcription)|0.1629|0.1436|0.1746|0.1647|
-|[faster-whisper](https://github.com/SYSTRAN/faster-whisper/)|0.0871|0.168|0.0827|0.1799|
-|**[faster-whisper w/ batching](https://github.com/SYSTRAN/faster-whisper/pull/856)**|**0.0355**|**0.0663**|**0.033**|**0.0633**|
-|[WhisperX](https://github.com/m-bain/whisperX/)\*|0.0864|0.114|0.0823|0.1126| -->
+|[OpenAI](https://github.com/openai/whisper)|0.2695|0.2189|0.2396|0.2460|
+|[Huggingface (`transformers`)](https://huggingface.co/openai/whisper-large-v2#long-form-transcription)|0.1606|0.1083|0.1311|0.1604|
+|[faster-whisper](https://github.com/SYSTRAN/faster-whisper/)|0.0791|0.1497|0.0963|0.1483|
+|**[faster-whisper w/ batching](https://github.com/SYSTRAN/faster-whisper/pull/856)**|**0.0522**|**0.0898**|**0.0487**|**0.0882**|
+|[WhisperX](https://github.com/m-bain/whisperX/)\*|0.2380|0.2551|0.2176|0.2575|
 
 <br>
 
diff --git a/NISV/cts_nl/res_unlabelled.md b/NISV/cts_nl/res_unlabelled.md
index c785de4..b826610 100644
--- a/NISV/cts_nl/res_unlabelled.md
+++ b/NISV/cts_nl/res_unlabelled.md
@@ -31,18 +31,18 @@ Here's a matrix with the **time** spent in total by each implementation **to loa
 |[WhisperX](https://github.com/m-bain/whisperX/)*|21m:29s|22m:14s|20m:28s|21m:36s|
 
 \* For WhisperX, a separate alignment model based on wav2vec 2.0 has been applied in order to obtain word-level timestamps. Therefore, the time measured contains the time to load the model, time to transcribe, and time to align to generate timestamps. Speaker diarization has also been applied for WhisperX, which is measured separately and covered in a different section.
-<!-- 
+
 <br>
 
-Here's also a matrix with the **Real-Time Factor or RTF** for short (defined as time to process all of the input divided by the duration of the input) for transcribing **9.02 hours of speech** (rounded to 4 decimals):
+The matrix below shows the **Real-Time Factor (RTF)**, defined as the total processing time divided by the duration of the input audio, for transcribing **5.69 hours of speech** (values rounded to 4 decimals):
 
 |RTF (process time/duration of audio)|large-v2 with `float16`|large-v2 with `float32`|large-v3 with `float16`|large-v3 with `float32`|
 |---|---|---|---|---|
-|[OpenAI](https://github.com/openai/whisper)|0.1918|0.1487|0.2164|0.1641|
-|[Huggingface (`transformers`)](https://huggingface.co/openai/whisper-large-v2#long-form-transcription)|0.0796|0.1206|0.077|0.1141|
-|[faster-whisper](https://github.com/SYSTRAN/faster-whisper/)|0.0718|0.1434|0.0728|0.1559|
-|**[faster-whisper w/ batching](https://github.com/SYSTRAN/faster-whisper/pull/856)**|**0.0231**|**0.0436**|**0.02**|**0.0412**|
-|[WhisperX](https://github.com/m-bain/whisperX/)\*|0.0459|0.0592|0.0475|0.058| -->
+|[OpenAI](https://github.com/openai/whisper)|0.2115|0.1675|0.2604|0.2049|
+|[Huggingface (`transformers`)](https://huggingface.co/openai/whisper-large-v2#long-form-transcription)|0.1139|0.1848|0.0830|0.1835|
+|[faster-whisper](https://github.com/SYSTRAN/faster-whisper/)|0.0817|0.1647|0.0932|0.1559|
+|**[faster-whisper w/ batching](https://github.com/SYSTRAN/faster-whisper/pull/856)**|**0.0227**|**0.0433**|**0.0195**|**0.0402**|
+|[WhisperX](https://github.com/m-bain/whisperX/)\*|0.0629|0.0651|0.0599|0.0633|
 
 <br>