Mixing audio and video with matroskamux causing deadlocks #59
Comments
It is also reproducible when I run the example pipeline from README.md, "Record website video + audio (with audiomixer)". You just need to run some CPU-heavy task in the background to trigger lags. After some time of lagging, it hangs forever.
Both pipelines work with no issue on my end (gst main). What version of GStreamer are you using? Have you tried removing either audio or video ? |
GStreamer 1.20.2 (GIT). It is reproducible when cefbin starts dropping audio or video data due to CPU overload. I've tried removing the audio/video tracks from the muxer, and also tried removing the cef.audio pad from audiomixer, leaving only silence. In both cases the issue was not reproducible. The only thing that kept my pipeline with audio and video from hanging on overload was making the queue between cef.audio and audiomixer leaky.
Just a guess, but did you monitor memory usage? Maybe the queue fills up and the operating system starts to swap memory?
Why would cefbin drop data? |
Yes, I can confirm that this happens when I have high CPU/memory usage. This causes cefbin to start lagging. But at some point the pipeline gets completely stuck and never resumes playing, even if the CPU goes back to idle.
Sorry, I meant it starts lagging. But after a few seconds of lagging, the pipeline completely hangs. Just a guess: can the "lagging" of the audio/video pads on cefbin cause some PTS desync between the two pads?
In |
I experimented a bit in this branch, maybe some folks would like to give feedback! https://github.com/philn/gstcefsrc/tree/audio-things |
I tested your branch. It changes the previous behavior: audio does not drift as before (however, it needs more testing). I have attached two files for comparison; "out_new.mp4" is with the latest changes. I tested on VirtualBox with 2 CPUs at 2.5 GHz to show the problem faster. With the new changes you can clearly hear crackling in the audio stream. Sometimes I also heard modulations of the sound. out_new.mp4 out_old.mp4
Maybe I'm wrong, but it looks like the audio is synchronized to the video, and it should be the other way around. Maybe the audio PTS from CEF needs to be set as the video PTS. |
https://bitbucket.org/chromiumembedded/cef/issues/2995/audio-capture-susceptible-to-packet-loss#comment-59277083 sounds (hah) extremely related |
I don't remember to be honest, but design wise my approach was: attach whatever audio was produced since the last video frame was produced, and assume CEF provides us with live, synchronized audio and video data. |
I think the changes made by @philn improve a lot. The most important improvement is that using audio PTS from CEF allows the audio to fix itself after a heavy CPU load. The previous implementation had no way to fix itself. I suggest that these changes should be merged. |
I think the patch makes sense too, but I would like to test too if you don't mind, using virtualbox is a nice trick thanks @hevc :) @pldin601 can you please try Phil's patch on your end? I've added this to my TODO list in any case :) |
Yep, going to check.
UPD. It worked much better! I heard small cracks when CPU usage was intensive, but the sound seems to recover rapidly. One thing I noticed: when the first audio track ended, the next one started very quickly, and it sounded like the first second of audio was played instantly. https://drive.google.com/file/d/1V_rjj8VyP0bqgX-jRdWdMfINXDwyWpjB/view?usp=sharing Rewind to 06:00 to hear this. Maybe that happened because there was a time gap between the first and next audio tracks, but the PTS logic assumes that audio plays endlessly?
UPD2. Tested my pipeline again with the same test. It still freezes at a random time when I compile something in the background...
Also add an audio meta to buffers, for good measure, and set discont flag when needed. This should help with A/V sync issues and audio cracks reported in centricular#59.
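The idea of setting a discont flag "when needed" can be illustrated with a small sketch. This is not the code from the patch; it is a toy model that assumes each audio packet carries its own timestamp and frame count, and that a packet counts as discontinuous when its timestamp does not line up with the end of the previous one. `Packet`, `mark_discontinuities` and the 1 ms tolerance are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass

NS_PER_SECOND = 1_000_000_000


@dataclass
class Packet:
    pts_ns: int  # timestamp reported by the source (hypothetical field)
    frames: int  # number of audio frames in this packet


def mark_discontinuities(packets, rate, tolerance_ns=1_000_000):
    """Return a list of (pts_ns, duration_ns, discont) tuples.

    A packet is flagged when its timestamp differs from the end of the
    previous packet by more than tolerance_ns. The first packet is
    always flagged, since there is nothing to be contiguous with.
    """
    expected_pts = None
    result = []
    for packet in packets:
        duration_ns = packet.frames * NS_PER_SECOND // rate
        discont = (
            expected_pts is None
            or abs(packet.pts_ns - expected_pts) > tolerance_ns
        )
        result.append((packet.pts_ns, duration_ns, discont))
        # The next packet should start where this one ends.
        expected_pts = packet.pts_ns + duration_ns
    return result
```

With 480-frame packets at 48 kHz (10 ms each), a packet arriving 40 ms after the previous one would be flagged, which is the kind of gap that can show up between two audio tracks or after a CPU stall.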
I've got a new patch in #60 -- testing welcome ;) |
Audio crackles can still occur in recordings. Look at ~7s. Tested on 4x2.5 GHz. unb.mp4 |
Might be another bug. |
I imagine that's exactly what https://bitbucket.org/chromiumembedded/cef/issues/2995/audio-capture-susceptible-to-packet-loss#comment-59277083 is referring to. To be honest, I'm completely happy with a patch that causes us to not deadlock in these situations, if audio crackles do occur and are caused by CEF, I would conclude that:
@pldin601 can you test the updated version for freezes ? |
From what I can see the chrome issue is fixed https://chromium.googlesource.com/chromium/src/+/5b5632260b8fb40a07b350bb5a6b4e21b6bfda2e%5E%21/#F1 year and a half ago. I wonder if we should test audio crackling in isolation (only Chrome and CEF). That would help to pinpoint where the issue comes from.
@MathieuDuponchelle Just tested with latest @philn patch. No freezes so far. Tested a few times. Will test in application for audio cracks soon... |
Thanks a lot for checking @reinismu :) Testing in isolation would be useful yes, for instance using the cef client example, I wonder if it was updated to also output audio. @pldin601 that sounds like good news, were you able to reproduce reliably without the patch? If so, then I'm OK with getting the patch in, we can still keep tracking the exact source of the crackles after that. Everyone OK with this? |
@MathieuDuponchelle The cpu looked like this when the audio crackles occurred: 2.5 GHz is typical for virtual machines. |
@hevc that's good to know, can you try with the sample CEF application? |
After a few more retries and after tuning the CPU limits, I am still able to reproduce the freeze. :( Here is another backtrace of the process when it freezes: Are we sure that these freezes are related to the cefsrc behavior, or could it be a matroska muxer bug? Why does it stop freezing when I add the "leaky" parameter to the queue? UPD: I probably found the reason why the freezes are happening.
Maybe I'm wrong, but it looks like this happens because matroskamux waits for a video buffer to arrive on muxer:video_0 so it can align it with the audio buffer it peeked on the muxer:audio_0 pad. That video buffer will never come from the cef.video pad, because the cef element is completely blocked: the audio buffer queue is full. So the pipeline has a lot of audio data, the muxer blocks the pipeline waiting for a video buffer, and cef blocks the pipeline waiting for audio data to be consumed. This is probably caused by the audiomixer element. It produces "silence" even while the cef element is lagging, and this causes the audio queues to overrun. That explains why the leaky property helps: it drops audio data coming from cef when audiomixer is not able to consume it in time.
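The blocking pattern hypothesized above can be reproduced in a toy model. The sketch below is not GStreamer code and does not model audiomixer or matroskamux internals; it assumes a single source thread that serves both pads, a bounded audio queue, and a muxer that only outputs once it holds a buffer on every pad. `run_model` and its parameters are hypothetical names for illustration.

```python
from collections import deque


def run_model(emitted, audio_capacity, leaky):
    """Simulate a single-threaded source feeding a two-pad muxer.

    emitted: list of ("audio" | "video", timestamp) in emission order.
    Returns (muxed, deadlocked).
    """
    queues = {"audio": deque(), "video": deque()}
    # The video queue is made large so that only the audio queue can fill.
    capacity = {"audio": audio_capacity, "video": 1_000}
    pending = deque(emitted)
    muxed = []

    while True:
        progressed = False

        # Source step: one thread serves both pads, so a full queue
        # on one pad stalls the other pad as well.
        if pending:
            kind, ts = pending[0]
            queue = queues[kind]
            if len(queue) < capacity[kind]:
                queue.append(ts)
                pending.popleft()
                progressed = True
            elif leaky:
                queue.popleft()  # drop the oldest buffer to make room
                queue.append(ts)
                pending.popleft()
                progressed = True

        # Muxer step: it needs a buffer on every pad before it can
        # pick the earliest one.
        if queues["audio"] and queues["video"]:
            kind = min(queues, key=lambda k: queues[k][0])
            muxed.append((kind, queues[kind].popleft()))
            progressed = True

        if not progressed:
            # Nothing moved: a deadlock if the source still has data.
            return muxed, bool(pending)
```

Feeding it a burst of audio that exceeds the audio queue capacity before the next video frame, for example three audio buffers followed by one video buffer with `audio_capacity=2`, stalls the non-leaky variant, while the leaky variant keeps going at the cost of dropped audio.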
@pldin601 this is surprising, why is the audio queue going full in the first place? |
@MathieuDuponchelle Yes, it's unclear to me too. Here is another pipeline where the freeze is 100% reproducible. I just added names to each element so the gdb backtrace is easier to debug:
And here is the gdb backtrace when the pipeline was blocked: Look at Thread 5 (q4:src). It's the thread of the queue between cef.video and the muxer:
It seems to be idle because there is no data in it. Look at Thread 4 (q1:src). This is the thread of the queue between audiomixer and the muxer.
It looks like it is blocked on pushing data to the muxer's audio_0 pad. And finally, Thread 7 of the queue between cef.audio and audiomixer:
It is stuck on pushing data to audiomixer. There are some |
Maybe the answer is somewhere in this thread:
It looks like audio and video from cefsrc are handled in a single thread, and |
That is the case yes. That doesn't tell us why the audio queue runs full however, have you figured out which queue was running full exactly, and upon reaching which limit? |
I'm not sure how I can monitor queue levels with
UPD: I've tested the last @philn patch in our application and haven't heard any audio cracks so far. Maybe I'm lucky. I've worked around the possible freezes with a leaky queue.
@pldin601 you can monitor queue levels with appropriately picked log levels and filtering, I can't tell you the exact command but you should be able to find the relevant filter by looking at the queue:6 logs. The other question is what limit gets triggered? Is it |
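Besides reading the queue logs, the levels can also be read from the queue element's properties. The sketch below assumes a pipeline object created through the GStreamer Python bindings (PyGObject) and queues named as in the pipeline from this thread (for example `q1` and `q3`). It only uses the queue's `current-level-*` and `max-size-*` properties and its `overrun` signal; `queue_report` and `watch_overruns` are hypothetical helper names.

```python
LEVEL_PROPS = ("current-level-buffers", "current-level-bytes", "current-level-time")
LIMIT_PROPS = ("max-size-buffers", "max-size-bytes", "max-size-time")


def queue_report(queue):
    """Return {limit_name: (current, maximum, reached)} for a queue element."""
    report = {}
    for level, limit in zip(LEVEL_PROPS, LIMIT_PROPS):
        current = queue.get_property(level)
        maximum = queue.get_property(limit)
        # A limit of 0 disables that check on the queue element.
        reached = maximum > 0 and current >= maximum
        report[limit] = (current, maximum, reached)
    return report


def watch_overruns(pipeline, names):
    """Print a level report each time one of the named queues overruns."""
    for name in names:
        queue = pipeline.get_by_name(name)
        # The overrun signal is emitted from a streaming thread.
        queue.connect("overrun", lambda q: print(q.get_name(), queue_report(q)))
```

Calling `watch_overruns(pipeline, ["q1", "q3"])` before setting the pipeline to PLAYING would show which of the three limits (buffers, bytes or time) is the one being hit.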
Here's in queue logs:
If I understood, q3 and q1 were triggered with |
The plot thickens :) If there is data in both queues feeding the muxer ( |
I found a difference between main and the proposed MR patch. Run: where
The pipeline will show a blue screen on the main branch. The pipeline hangs with the MR patch. Adding audio will unblock the MR pipeline. e.g.
@hevc probably best to comment on the MR :) |
Also add an audio meta to buffers, for good measure, and set discont flag when needed. This should help with A/V sync issues reported in centricular#59.
@hevc I updated the PR, please test and report issues there ;) |
We have the same issue on the latest master using Under heavy CPU load, the pipeline will stall and not recover. We are using an audiomixer per the example pipeline. We'd like to determine how to get this resolved, as this is used in a commercial application. |
Also add an audio meta to buffers, for good measure, and set discont flag when needed. This should help with A/V sync issues reported in #59.
I ran into this issue again today. I hacked a solution to fix my initial problem a while ago, but the issue is still there. This command will deadlock. Wait around 5 seconds and then send EOS. Note the FPS is 10.
If this were code and you connected to the overrun signal on the audio queues, you would see a lot of callbacks. Running at 30 FPS is OK(ish), and the pipeline will at least not deadlock. Running this command is fine
The only difference here is the use of nvh264enc. I have not had time to dig into the debug logs. If I get a chance I will at some point.
I have not debugged this yet, but setting |
Test pipeline:
When I launch this pipeline with gst-launch-1.0, it gets stuck at random time positions. Here's the gdb backtrace when it happens:
I can't reproduce the issue with GST_DEBUG=5 or higher, so I can't provide detailed logs.