Timestamps returned with streaming transcription are wrong. #777
-
Which Deepgram product feature are you using?
Deepgram API - STT Streaming

Details
The start and end timestamps on a subset of transcription results are incorrect. We have noticed this across multiple inputs; I am attaching a simple example here. The code and the input file that reproduce the issue are in the attached zip, and I am also including the request ID for the affected request.

This is the specific result with the incorrect timestamp. The speech for "Alright" does not start at 21.485; it is almost finished by that point.
{
"type": "Results",
"channel_index": [
0,
1
],
"duration": 0.9699993,
"start": 21.0, # This timestamp is usually too early (but in this case it is correct), so we cannot use this.
"is_final": true,
"speech_final": true,
"channel": {
"alternatives": [
{
"transcript": "Alright.",
"confidence": 0.9297402,
"words": [
{
"word": "alright",
"start": 21.485, # This timestamp is usually correct, but in this case it is not.
"end": 21.97,
"confidence": 0.9297402,
"punctuated_word": "Alright."
}
]
}
]
},
"metadata": {
"request_id": "ecd69ba3-dd6e-4dc0-9abf-91d649530d5c",
"model_info": {
"name": "2-phonecall-nova",
"version": "2024-02-05.31606",
"arch": "nova-2"
},
"model_uuid": "9c7ae805-e600-4e0f-a6a2-725be88b7ede"
}
}

Note that we have found that the outer "start" timestamp (21.0 in this case) is frequently too early, so we believe it is better to treat the start timestamp of the first word in the "words" array as the true start timestamp.

In the same call, the previous result looks like the one below. Its outer start timestamp is 17.23, which is about 2.8 seconds before the first word's start timestamp of 20.029999. The latter is correct, and we have generally found this to be the case. However, as pointed out above, the first word's start timestamp is also occasionally wrong, which leaves us unable to determine the true timestamp of the speech.
{
"type": "Results",
"channel_index": [
0,
1
],
"duration": 3.7700005,
"start": 17.23, # This timestamp is too early, the below timestamp is the right one.
"is_final": true,
"speech_final": true,
"channel": {
"alternatives": [
{
"transcript": "I got it.",
"confidence": 0.9349395,
"words": [
{
"word": "i",
"start": 20.029999, # This timestamp is correct.
"end": 20.189999,
"confidence": 0.61845225,
"punctuated_word": "I"
},
{
"word": "got",
"start": 20.189999,
"end": 20.59,
"confidence": 0.9349395,
"punctuated_word": "got"
},
{
"word": "it",
"start": 20.59,
"end": 21.0,
"confidence": 0.99053323,
"punctuated_word": "it."
}
]
}
]
},
"metadata": {
"request_id": "ecd69ba3-dd6e-4dc0-9abf-91d649530d5c",
"model_info": {
"name": "2-phonecall-nova",
"version": "2024-02-05.31606",
"arch": "nova-2"
},
"model_uuid": "9c7ae805-e600-4e0f-a6a2-725be88b7ede"
}
}

If you are making a request to the Deepgram API, what is the full Deepgram URL you are making a request to?
No response

If you are making a request to the Deepgram API and have a request ID, please paste it below:
ecd69ba3-dd6e-4dc0-9abf-91d649530d5c

If possible, please attach your code or paste it into the text box.
import sys
def callback(_, result: LiveResultResponse, **kwargs):
if __name__ == "__main__":

If possible, please attach an example audio file to reproduce the issue.
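
For reference, here is a minimal sketch of the workaround described in this post: prefer the first word's start timestamp over the outer "start" field. It assumes each result has already been converted to a plain dict with the same shape as the JSON quoted above. The helper names (`first_word_start`, `effective_start`, `leading_gap`) are hypothetical and are not part of the Deepgram SDK. It does not detect the "Alright" case, where the word-level timestamp itself is off.

```python
from typing import Optional


def first_word_start(result: dict) -> Optional[float]:
    """Return the start time of the first word in the top alternative, or None."""
    alternatives = result.get("channel", {}).get("alternatives", [])
    if not alternatives:
        return None
    words = alternatives[0].get("words", [])
    if not words:
        return None
    return words[0]["start"]


def effective_start(result: dict) -> float:
    """Prefer the first word's start; fall back to the outer "start" if there are no words."""
    word_start = first_word_start(result)
    return result["start"] if word_start is None else word_start


def leading_gap(result: dict) -> float:
    """Seconds between the outer "start" and the start we would actually use."""
    return effective_start(result) - result["start"]


# Trimmed copies of the two results quoted in this post.
alright = {
    "start": 21.0,
    "duration": 0.9699993,
    "channel": {"alternatives": [{"words": [
        {"word": "alright", "start": 21.485, "end": 21.97},
    ]}]},
}
i_got_it = {
    "start": 17.23,
    "duration": 3.7700005,
    "channel": {"alternatives": [{"words": [
        {"word": "i", "start": 20.029999, "end": 20.189999},
        {"word": "got", "start": 20.189999, "end": 20.59},
        {"word": "it", "start": 20.59, "end": 21.0},
    ]}]},
}

for name, result in (("alright", alright), ("i_got_it", i_got_it)):
    print(name, effective_start(result), round(leading_gap(result), 3))
```

Logging `leading_gap` for every final result is one way to see how often the outer "start" runs ahead of the first word, which may be useful when reporting further examples.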
Replies: 6 comments 1 reply
-
Thanks for asking your question about Deepgram! If you didn't already include it in your post, please be sure to add as much detail as possible so we can assist you efficiently, such as:
-
Thank you for your report. I'm sharing this information with our engineers and will respond once I have found out more for you.
-
The engineers are currently working on a fix for this issue. It is still in testing, so it will not be released quite yet, but I will post an update once it has been released. I hope this issue isn't too much of a problem for you and that you can be patient while we get this right. Thank you!
-
Is the fix in production?
On Tue, Jun 4, 2024 at 4:22 PM John Vajda (JV) wrote:
Closed #777 as resolved.
-
I also have the same problem.
-
Related issue: https://github.com/orgs/deepgram/discussions/545

For those following this thread: Deepgram is still looking into this issue. Thank you for your patience.