Rate adaptation seemingly unstable #44
Hi
BTW... replace p4p1 with the applicable interface name :-)
Hello, Thanks for getting back to me. I had done some testing with the tbf qdisc instead of just straight netem, but that didn't perform that well either. I've tried using the htb qdisc as you suggested, but I'm still seeing a lot of variability, including SCReAM's target bitrate shooting way beyond the 8Mbit/s that I'm trying to set:
https://user-images.githubusercontent.com/9945958/162723673-9c3a2c57-8805-4595-9446-9d38166ce13d.png (scream-gst-x264enc-htb)
I'm wondering if this reaction is due to some bursting behaviour in htb that I don't know how to control. I've tried adding a hard ceiling of 8Mbit/s with a burst of 6000 and a cburst of 1500, but that doesn't seem to do much to help:
https://user-images.githubusercontent.com/9945958/162723698-cad9c6c7-3d57-4187-b98a-028263d8e5d9.png (scream-gst-x264enc-htb-8m-ciel-burst)
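For reference, a hypothetical reconstruction of the htb setup being described; the interface name, handles and class IDs are illustrative, and only the rate/ceil/burst/cburst values come from the comment above:

```sh
# Assumed htb shaping: hard ceiling of 8Mbit/s with burst 6000 and cburst 1500
sudo tc qdisc add dev enp0s31f6 root handle 1: htb default 10
sudo tc class add dev enp0s31f6 parent 1: classid 1:10 htb \
    rate 8mbit ceil 8mbit burst 6000 cburst 1500
```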
Hi
This looks really strange, and I have not seen these large variations before. One observation: you run with quite a large resolution (1920*1080), and the test computers that I use cannot handle x264enc at this high resolution without overloading the CPU. Can you try the same experiment with a lower resolution, e.g. 640*480?
/Ingemar
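For that experiment only the caps filter needs to change; a minimal standalone sketch based on the SENDPIPELINE quoted at the bottom of this thread (host and port are illustrative, and the SCReAM elements are omitted for brevity):

```sh
# Hypothetical 640x480 variant of the sender, without the screamtx element
gst-launch-1.0 videotestsrc is-live=true pattern=smpte horizontal-speed=10 ! \
  video/x-raw,format=I420,width=640,height=480,framerate=25/1 ! \
  x264enc threads=4 speed-preset=ultrafast tune=fastdecode+zerolatency ! \
  rtph264pay ssrc=1 ! udpsink host=10.0.0.168 port=5000
```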
This is really strange. I probably need to try this out myself, but I don't believe I have time to do it until next week at the earliest. Is it possible for you to try the SCReAM BW test application on the same bottleneck?
Just ran that, and here's the result:
In case it helps, here's the CSV from my screamtx run above:
OK, this looks more reasonable. I have to look into what causes the problems with the plugin.
Can you by chance (with the plugin example) also log the RTP bitrate, i.e. the bitrate that comes from the video encoder?
By "entire log", do you mean the whole GStreamer log? And I can probably get the actual bit rate using GstShark, I'm not sure if the x264enc element directly reports the bit rate output. If that's what you have in mind, I'll give that a go. |
You should get a quite verbose log on stdout; if you collect it, then I should be able to dig up the necessary info (I hope).
/Ingemar
Here's the GStreamer log with GstShark bitrate logging as well as GST_DEBUG="*:2,scream:9". I didn't seem to get much output from GStreamer by itself, so I turned up the element debugging. Hopefully this gets you what you're looking for. gst-x264enc-htb-5m-640x480-gstshark-dbg9.csv
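For anyone reproducing this, a sketch of how such a trace can be captured, assuming GstShark is installed (GST_TRACERS and the GST_TRACER:7 debug level are per GstShark's documentation; the pipeline here is abbreviated):

```sh
# Assumed capture setup: GstShark bitrate tracer plus verbose scream logging
export GST_TRACERS="bitrate"
export GST_DEBUG="GST_TRACER:7,*:2,scream:9"
gst-launch-1.0 videotestsrc num-buffers=500 ! x264enc ! fakesink 2> trace.log
```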
That is possible. Before raising this ticket, I had also performed testing with vaapih264enc, but that had even worse performance (combined results with the first graph on this issue, although the colours are different):
I'm using the version of libx264 that came with my distribution, version 160. The parameters which I'm passing into x264enc are the ones from my original sending pipeline. If I run another test without any tuning, then the graph looks like this:
However, I'm almost certain this is because the encoder is hitting its internal encoding bit rate limit for a 640x480 picture; if I up the frame size to 1024x576 then we're back to the yo-yo:
OK, thanks. In the vaapi example it looks like an I-frame is generated every 4 seconds or so. You can perhaps try to set keyframe-period=100; the spikes should then occur less often.
I've tried playing with the qp-step and vbv-buf-capacity parameters like you suggested, but cranking the qp-step up to 32 and the vbv-buf-capacity down to 100 milliseconds doesn't make much difference. Sadly, I don't have any immediate access to a Jetson Nano, nor any other NVidia encoding/decoding hardware.
OK.
OK. Yes, one should expect more sparse peaks. I was hoping that it would reduce the drops, but that does not work. Not sure what more can be done. It is difficult to handle such cases when the video coder rate control loops are this slow. Somehow I believe that it must be some setting. Perhaps it is the ultrafast preset that makes things bad? But I am just speculating here.
It's better to have TC set on a dedicated device. When this is not possible:
The default and max socket send buffer sizes are set using sysctl. To make the change permanent, add the corresponding lines to the /etc/sysctl.conf file, which is read during the boot process.
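The exact commands were lost from this comment; a sketch with the standard net.core keys, where the sizes are illustrative rather than the original values:

```sh
# Assumed reconstruction: raise the default and maximum socket buffer sizes
sudo sysctl -w net.core.wmem_default=1000000
sudo sysctl -w net.core.wmem_max=16000000
sudo sysctl -w net.core.rmem_default=1000000
sudo sysctl -w net.core.rmem_max=16000000
# To make the change permanent, add the equivalent lines to /etc/sysctl.conf:
#   net.core.wmem_max = 16000000
#   net.core.rmem_max = 16000000
```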
Apologies for being quiet here for a while; a mixture of other projects taking time as well as some personal leave last month means it's taken longer to get this response out than I'd hoped. I ended up going away and trying to engage with both the GStreamer and x264 developers to see if there was any way of reducing the latency of the rate controller within the encoder, but this exercise did not bear much fruit.
However, as a part of this effort I did end up writing a simple test that would take the SCReAM algorithm out of the loop and just allow me to see how the encoder reacted to standalone bit rate changes. I note that, especially during times of observed congestion or rate limiting, the SCReAM algorithm could update the bitrate property several times a second, making it difficult to actually observe the reaction to any one change. Here's an example of x264enc changing from 10Mbit/s to 8Mbit/s, when running with a GOP length of 10:
The data shown in the graph above is taken from gst-shark's buffer and bitrate tracers: each blue cross is the size of an encoded frame (against the left y-axis), the golden line is the bitrate set on the encoder (against the right y-axis), and the red line is a per-second average bitrate of the buffers that are flowing from the encoder. It takes at least a second for the x264 encoder to even begin ramping down its encoded bit rate, and over two seconds before the average has reached the requested bit rate. Interestingly, it doesn't even seem to track with the GOP length; I'd expect the encoder to use that as a natural point to break to its new target bit rate, but x264enc doesn't seem to do this.
As you suggested (#44 (comment)) I managed to get some testing done with an NVidia GPU (RTX 3080) using nvenc. Using the same test as earlier, I can see that nvenc reacts quite differently to x264enc, with a reaction to the bit rate change occurring almost immediately:
However, something that I did notice that is different to the behaviour of x264enc is that whenever the encoder is reconfigured with a new bit rate, it abandons the current GOP and creates a new IDR. The first few buffers of this are then fairly large, and no matter how much I try to tune nvenc, I can't seem to tame that behaviour. The encoder certainly does its best to keep the average bitrate down after this, and the average paced out over a second is well below the requested bit rate.
I then moved on to running screamtx with nvenc, and I feel that the issue whereby every reconfiguration with a new bit rate produces a new IDR starts to cause serious problems. I restricted the bandwidth to 8Mbit/s overall again, with a maximum allowed bandwidth of 10Mbit/s (with buffer sizes set as Jacob suggests). (Top graph is plotted from the SCReAM CSV file, bottom graph is plotted using output from gst-shark similarly to the ones above.)
It looks like the SCReAM rate controller tries to set its initial bandwidth, the encoder massively overshoots, and then the rate controller tries to turn down the rate, which causes the encoder to keep overshooting. This keeps happening so much that the rate controller seems to just keep trending along the bottom of the allowed bit rate. Is there any way of backing off the SCReAM congestion controller so that it doesn't do quite so many updates? I feel that this might solve this particular problem.
Hi Sam,
The problem with nvenc that you’ve noticed:
“that whenever the encoder is reconfigured with a new bit rate, it abandons the current GOP and creates a new IDR.”
is related to this change in https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad.git:
09fd34dbb0 sys/nvcodec/gstnvbaseenc.c (Seungha Yang 2019-08-31 17:34:13 +0900 1705) reconfigure_params.resetEncoder = TRUE;
09fd34dbb0 sys/nvcodec/gstnvbaseenc.c (Seungha Yang 2019-08-31 17:34:13 +0900 1706) reconfigure_params.forceIDR = TRUE;
09fd34dbb0 sys/nvcodec/gstnvbaseenc.c (Seungha Yang 2019-08-31 17:34:13 +0900 1707) reconfigure = TRUE;
09fd34dbb0 sys/nvcodec/gstnvbaseenc.c (Seungha Yang 2019-08-31 17:34:13 +0900 2439) GST_VIDEO_CODEC_FRAME_SET_FORCE_KEYFRAME (frame);
You might want to try to revert that patch, or part of it, and rebuild the nvenc plugins.
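A minimal sketch of what that partial revert amounts to in gstnvbaseenc.c, based only on the blame lines above (untested, not a verified patch):

```c
/* In the bitrate-reconfiguration path of sys/nvcodec/gstnvbaseenc.c:
 * stop forcing a full encoder reset and a new IDR on every rate change. */
reconfigure_params.resetEncoder = FALSE; /* was TRUE */
reconfigure_params.forceIDR = FALSE;     /* was TRUE */
/* ...and correspondingly drop the forced keyframe further down:
 * GST_VIDEO_CODEC_FRAME_SET_FORCE_KEYFRAME (frame); */
```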
“Is there any way of backing off the SCReAM congestion controller so that it doesn't do quite so many updates? I feel that this might solve this particular problem.”
Let us think about this issue.
Regards,
Hi
Thanks to Jacob's pointer, I've modified the nvbaseenc code to allow me to set those values to FALSE, and the NVidia encoder no longer generates a new IDR every time there's a reconfiguration. This fixes the issue I was seeing with the rate never getting off the bottom of the graph, as it now behaves fairly normally:
However, I'm still seeing the oscillation that I was seeing with x264enc, even though this encoder is much better at reacting to bit rate changes.
Hi Jacob,
The green line on my graph indicates the targetBitrate as set by the SCReAM congestion controller. I should have specified that my testing was performed with a network-level limitation of 8Mbit/s, and I'm trying to understand the behaviour of the SCReAM congestion controller when faced with a network that has less bandwidth available than the SCReAM congestion controller was originally configured to use. For example, a mobile user streaming video who moves to a new mast that has higher levels of congestion and/or a lower peak throughput available for that user.
From my previous discussions with Ingemar, it seemed like the expected behaviour would be that the congestion controller would trend towards the network limit, and not keep going over and then dropping ~25% of the bit rate in reaction to network congestion. Currently, the only way I get a flat line for the target bitrate is if the configured SCReAM maximum bit rate is lower than the bandwidth available (i.e. the network has 9Mbit/s of bandwidth and SCReAM is configured with a maximum of 8Mbit/s).
-Sam
Hi Sam. As SCReAM adapts against the detection of increased queue delay, you'll indeed get the behaviour shown in your figure. The reason is that once the queue starts to grow, you are essentially one round trip behind with the rate reduction, and thus you'll get an overshoot. There are ways to reduce the magnitude of this oscillation. Try for instance these extra options:
Hi
I have now added a -hysteresis option that should reduce the number of small rate changes quite considerably. For instance, with -hysteresis 0.1 the bitrate must increase by more than 10% or decrease by more than 2.5% (1/4 of the value) for a new rate value to be presented to the encoder. If that condition is not met, then the previous rate value is returned by the getTargetBitrate(..) function.
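A minimal sketch of that hysteresis behaviour; the names here are assumptions for illustration, not the actual ScreamTx.cpp code:

```cpp
// Illustrative sketch of the -hysteresis logic described above.
float getTargetBitrate(float newRate, float hysteresis /* e.g. 0.1 */) {
    static float lastRate = 0.0f; // bitrate most recently handed to the encoder
    // Expose the new rate only if it rose by more than `hysteresis` (10%)
    // or fell by more than a quarter of that (2.5%).
    if (newRate > lastRate * (1.0f + hysteresis) ||
        newRate < lastRate * (1.0f - hysteresis / 4.0f)) {
        lastRate = newRate;
    }
    return lastRate; // otherwise the encoder keeps its previous setting
}
```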
Hi Ingemar,
With what you say about the increased queueing delay, would a potential fix be to make the queue larger so that it covers multiple round trips? I could also try making the round trip itself longer using tc, and experiment with that. At the moment, I'm running on two hosts connected directly to one another, so the round trip time is a couple of milliseconds at worst.
I've tried adding the options you described, but all it appears to do is decrease the frequency of the overshoots that I see, not the amplitude of the reaction from the congestion controller.
Many thanks for all your help to date, by the way; this has all been quite helpful. I look forward to testing the hysteresis to see if that helps matters.
-Sam
Hi
Yes, a very short RTT will itself increase the rate adaptation speed in SCReAM; we have mostly tried it on links with an RTT of 10ms or more. /Ingemar
I've since tried again with additional delay added to my test network to simulate longer round trip times, including setting the qdisc limits as described in the netem documentation. The graph below shows the target bitrate for a test run with the following screamtx settings:
With round trip times of ~1ms (blue line), 20ms (green line) and 100ms (red line):
Running with a more realistic RTT seems to reduce the amplitude of the bit rate back-offs a bit, but they're still pretty extreme. Here are the average requested bitrates over the three tests:
So is this just what is to be expected with the setup that I've got at the moment?
Hi
Hi Ingemar, Here's the SCReAM CSV which should have the values in it from my 20ms test run: firebrand-nvh264enc-scream-sender-no-idr-on-reconfigure-ratescale-8Mbit-20msRTT-limit40.csv
Thanks. I plotted some extras, see below. It seems like packet loss occurs at regular intervals, and that explains the large reductions in bitrate. What is also noticeable is that the queue delay increases rapidly on occasions; that may be attributed to how the netem rate policing/shaping is implemented?
Hi Ingemar,
I've been doing some more investigation with different queuing disciplines to perform the rate policing, but all of the ones I have tried (netem, htb, tbf, cake) result in basically the same plotted graph, with the large downward swings when the target bit rate exceeds the rate set by the traffic shaping, even when introducing additional round trip time as described above.
As an aside, you mentioned in your previous message that you thought that the specific pattern of packet loss might be the cause of the swings. I know that SCReAM is designed to be used with ECN as a forewarning of packet loss, but GStreamer currently doesn't support ECN in udpsink and udpsrc. So I spent a bit of time adding ECN support to GStreamer so I could test SCReAM with that instead of having packet losses. In case you're interested, I am contributing my patch back to the GStreamer project; you can find the merge request at https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/2717. You should also find attached below a basic patch adding support for reading the GstNetEcnMeta from the buffers in screamrx, as well as a patch adding it to the gstreamer-rs rust bindings (renamed to .txt so GitHub will let me attach them).
0001-Add-ECN-support-to-screamrx.patch.txt
0001-Add-GstNetEcnMeta-binding.patch.txt
I performed some testing with these ECN changes and the ECN-aware CAKE queuing discipline, and it doesn't seem to have made any difference to the actual results:
https://user-images.githubusercontent.com/9945958/177733920-7c4fac79-7fbc-4e2b-a7d4-08ae735ddf93.png
In the above graph, the blue line is running with no ECN, the green line is running with ECN, and the red line is running with ECN and a ~100ms RTT.
Hi Sam,
Hi Jacob, packetsCe does increase over the duration of the test, so that implies that ECN is working. The final value (144) doesn't quite correlate with the Wireshark capture that I took alongside the test (which only counts 133 packets as being marked with CE), but the frequency and timing of the increases as shown by rateCe does seem to correlate with what I see in Wireshark. In case it helps, here is the CSV file from the above test run: firebrand-nvh264enc-scream-sender-max10Mbit-cake9Mbit-ecn-with-bigbufs.csv
And I currently only set TC once, before the beginning of the test, so the bit rate ceiling is constant throughout, in an effort to understand this specific oscillating behaviour when exceeding the bit rate ceiling. -Sam
Hi Sam.
By which I mean the maximum bit rate configured in tc. I'm trying to understand SCReAM's behaviour when the available network bandwidth drops below that configured as the maximum in screamtx. At the moment, in order to reduce the number of variables in play, I'm keeping the maximum bit rate of the test network static throughout the test.
Hi Sam,
Hello, Apologies for taking a while to get back to this; I have been pulled away onto other things.
I went looking to understand why the packet loss was happening in the groups that Ingemar observed. As a first step, I moved away from my local test network using tc and managed to set up a test that runs over the public internet, so as to have a more representative test. The VDSL2 connection doing the upload has a reliable maximum throughput of about 17Mbit/s, so that would be my target. However, I discovered that, even after increasing the send and receive buffers at both ends of the test, and even running at only 6Mbit/s, I was still seeing those same bursts of packet loss.
After analysing the traffic in Wireshark, it is clear that the RTP packets for each frame are being bursted out together. All of the analysed packet loss occurred on packets at the end of these bursts. The following Wireshark I/O graph shows the number of packets transmitted every 10ms, with the video running at 25fps, so at high enough resolution to see the peaks every 40ms. Using the RTP analysis tools, you can see that the delta time between packets is very small within a given frame, but between frames the delta is large:
I note that the SCReAM library (in code/ScreamTx.cpp) has a packet pacing algorithm in it, but I'm not sure it's working effectively in this instance. I started analysing the flow of buffers back out of the SCReAM library into GStreamer land, and ended up graphing the latency of buffers being passed into the screamtx callback function using the GST_TRACE message there. The following graph shows the time that buffers pass back to screamtx on the x axis, with the time since the last buffer was received on the y axis. The graph is zoomed into about 3 seconds of a stream, well after the 5 second backoff period has elapsed inside the pacing algorithm:
This shows that there are clusters of buffers with large, often ~40ms (1 frame time) differences between them. Some of them are even more than that, but I'm not sure why that's happening at the moment. I think this isn't related to the problem at hand, but I note that the screamtx element does not change the pts or dts timestamps on any of the GstBuffers that it sends onwards. This means that GStreamer may well end up buffering all those packets up before releasing them to udpsink, but I note that all your examples set the
Hi Sam,
I started doing some more poking about in ScreamTx.cpp, adding some debug prints to print various values so I could try to understand what the maths is doing in the algorithm around pacing, and see if I could figure out why it wasn't working. During some of my testing, I noticed a few periods where it was actually pacing the packets out as I'd expect it to, so I went looking. The following graph is an expanded form of the one I showed in my last comment, with the blue line again showing the time that buffers pass back into screamtx against the time since the last buffer was received:
https://user-images.githubusercontent.com/9945958/183113614-650a555e-8a57-40b1-870b-28d610c64930.png
I've also added two new lines which track values inside ScreamTx.cpp: red for nextTransmitT_rtp within isOkToTransmit at line 524, before it performs this if statement (https://github.com/EricssonResearch/scream/blob/master/code/ScreamTx.cpp#L524); and green for paceInterval_ntp at line 606, before it is used to update nextTransmitT_ntp (https://github.com/EricssonResearch/scream/blob/master/code/ScreamTx.cpp#L606).
The blue and red lines should be more tightly synchronised, but it's basically impossible to synchronise the timestamps between GStreamer's logging for the blue line and any timestamps in the SCReAM code for the red and green lines; it's safe to imagine that the blue and red spikes should be overlaid on one another.
What I notice from this is that paceInterval_ntp is always 0 whenever it's not pacing correctly, but when it is non-zero then the pacing happens correctly. I think this is because paceInterval is reset to kMinPaceInterval here (https://github.com/EricssonResearch/scream/blob/master/code/ScreamTx.cpp#L1220), which in my case always seems to be 0, and then the following if statement never evaluates to true because queueDelayFractionAvg is < 0.02. I haven't been able to get my head around why this is the case yet, but I'll have another look next week if I can. In the meantime, any insight would be welcome to help understand this a bit better.
Hi
Hello, Thanks for the response, even while on vacation.
Turning off the packet pacing when the estimated bit rate is higher than the actual throughput is a very interesting decision. I can't find anything in RFC 8298 that describes such behaviour, so I'm assuming this is just a feature of this particular implementation? I'm not sure that it's worth it for that 10-15ms gain, because of the problems that it can cause on any network. The application at each end doesn't really have any visibility into the actual levels of congestion on the network, outside of packet losses and ECN flags. If the buffers in the switches and routers on the network path are already fairly full, then the bursts of traffic could easily overwhelm them and cause those packet loss events. I'd guess this is why RFC 8085 specifies that you should always pace UDP traffic.
I've tried fiddling with the code to try and force packet pacing to be on all the time, but I haven't been successful thus far.
-Sam
Hi
It should be possible to force packet pacing on by changing line
I cannot guarantee that it will work right out of the box; there is for instance a risk that the rate ramp-up will slow down, at least initially. It will be a few days before I can try it out myself. /Ingemar
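The exact line reference was lost above. Going by the earlier description of the pacing code (paceInterval being reset to kMinPaceInterval, then only raised when queueDelayFractionAvg exceeds 0.02), the experiment would be something like the following sketch; the guard shown is an assumption, not a verified quote of ScreamTx.cpp:

```cpp
// Hypothetical sketch: force pacing on by making the guard always true.
paceInterval = kMinPaceInterval;
if (true /* was, roughly: queueDelayFractionAvg > 0.02f (assumed) */) {
    // Compute a real pace interval exactly as the existing code does;
    // only the guard condition changes for this experiment.
    paceInterval = std::max(kMinPaceInterval, computedPaceInterval /* hypothetical */);
}
```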
Hi Ingemar, I tried your suggestion, and while it resulted in
-Sam
It is so that you'll get a gap between the last RTP packet of the previous frame and the first RTP packet of the following frame. This is because the pacing bitrate is a bit higher than the nominal (video coder) bitrate; the reason is that the video frame sizes fluctuate a bit +/- around the average frame size, and it is undesirable to let RTP packets sit in the RTP queue just because a video frame was slightly larger than normal.
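In other words, a sketch of the relationship being described, where headroom is an assumed name for the multiplier rather than the actual ScreamTx.cpp variable:

```cpp
// Illustrative only: pace slightly faster than the encoder's nominal rate so
// that an occasionally oversized frame drains before the next frame arrives.
float pacingBitrate = (1.0f + headroom) * nominalVideoBitrate; // headroom e.g. 0.1 (assumed)
float paceIntervalSeconds = (8.0f * rtpPacketSizeBytes) / pacingBitrate;
```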
Hello Ingemar, Jacob;
Thanks again for all your comments and help so far. I've been wondering if you've managed to have a look into this yourselves? I was also wondering if it might be worth trying to set up a meeting where we can discuss things in real time rather than via GitHub. If you think that might be a good idea, do you perhaps have some availability next week?
Best regards,
Hi
A short reply, still on vacation with only a cellphone at hand.
Packet pacing is actually turned off when the link capacity is considerably higher than the transmitted bitrate, and the estimated queue delay is then very low. This is to avoid video frames being delayed unnecessarily when there is no congestion.
To be honest, I believe that the 10-15ms gained in e2e latency by turning off the packet pacing is more of academic importance.
Thanks anyway for digging in the code like this. I definitely believe there is potential for improvement here.
/Ingemar
Hi Ingemar, It looks like your previous message is a copy of a previous message sent to this issue - was this intentional? Or did your mail client eat the message you wanted to reply with? -Sam
The message 9 days ago was a follow-up to the message posted August 16.
Hi Sam,
What happened to your request?
Regards,
Jacob
Hello,
I've been working with SCReAM a bit more lately, and I've encountered an issue where, when using the GStreamer elements over a constrained network link, the rate adaptation seems unstable. I'm trying to simulate what happens when, during a stream, the network capacity drops below the value configured as the maximum bit rate for the SCReAM sender. The sender is configured with a maximum bit rate of 10Mbit/s, using the settings -initrate 2500 -minrate 500 -maxrate 10000 -nosummary. The full GStreamer sending pipeline is below:
export SENDPIPELINE="videotestsrc is-live=true pattern=\"smpte\" horizontal-speed=10 ! video/x-raw,format=I420,width=1920,height=1080,framerate=25/1 ! x264enc name=video threads=4 speed-preset=ultrafast tune=fastdecode+zerolatency ! queue ! rtph264pay ssrc=1 ! queue max-size-buffers=2 max-size-bytes=0 max-size-time=0 ! screamtx name=screamtx params=\" -initrate 2500 -minrate 500 -maxrate 10000 -nosummary\" ! udpsink host=10.0.0.168 port=5000 sync=true rtpbin name=r udpsrc port=6001 address=10.0.0.194 ! queue ! screamtx.rtcp_sink screamtx.rtcp_src ! r.recv_rtcp_sink_0 "
I'm using the netem tool to add 40ms of latency at each end of the link (i.e. 80ms RTT) and to limit the sending rate of both machines to 8Mbit/s:
sudo tc qdisc add dev enp0s31f6 root netem delay 40ms rate 8Mbit limit 40
This is what a graph of the actual transmission rate (green) and the target encoder bit rate (blue) looks like with the network restriction applied, over a full five minutes:
I think it's safe to say that the target bit rate selected is quite erratic, and it doesn't seem to match up with the graphs shown in README.md, where the line does wobble a bit but stays tightly bound around one point. I've also run the scream_bw_test_tx/rx application, and I get results like this, which show a still-unstable target encoder bitrate, but one that is a lot more closely grouped.
Using iperf3 in UDP mode, I see that the actual performance of the network is fairly stable: sending 10Mbit/s of traffic results in a pretty uniform 7.77Mbit/s of actual throughput.
I suppose my real question is - is this expected behaviour? The huge swings in target bit rate cause a lot of decoding artifacts in the video stream, and I see a lot of packet loss as it keeps bouncing off the limit. If this is not expected behaviour, can you tell me how best to optimise my sending pipeline to suit?