fix: Do CC reaction before largest_acked
#2117
base: main
Conversation
Packets are only declared as lost relative to `largest_acked`. If we hit a PTO while we don't have a `largest_acked` yet, also do a congestion control reaction (because otherwise none would happen). Broken out of mozilla#1998
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@ Coverage Diff @@
##             main   #2117   +/- ##
====================================
  Coverage   95.35%  95.35%
====================================
  Files         112     112
  Lines       36336   36353    +17
====================================
+ Hits        34648   34665    +17
  Misses       1688    1688
```

View full report in Codecov by Sentry.
Failed Interop Tests
QUIC Interop Runner, client vs. server:
- neqo-latest as client
- neqo-latest as server

All results

Succeeded Interop Tests
QUIC Interop Runner, client vs. server:
- neqo-latest as client
- neqo-latest as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server:
- neqo-latest as client
- neqo-latest as server
Benchmark results
Performance differences relative to 55e3a93.

- coalesce_acked_from_zero 1+1 entries: 💚 Performance has improved. time: [99.172 ns 99.508 ns 99.849 ns], change: [-12.518% -12.118% -11.754%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 3+1 entries: 💚 Performance has improved. time: [116.98 ns 117.37 ns 117.77 ns], change: [-33.509% -33.205% -32.883%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 10+1 entries: 💚 Performance has improved. time: [116.41 ns 116.86 ns 117.39 ns], change: [-39.701% -34.919% -31.350%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1000+1 entries: 💚 Performance has improved. time: [97.101 ns 97.244 ns 97.401 ns], change: [-32.461% -32.017% -31.593%] (p = 0.00 < 0.05)
- RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [111.73 ms 111.87 ms 112.09 ms], change: [+0.2883% +0.4298% +0.6314%] (p = 0.00 < 0.05)
- transfer/pacing-false/varying-seeds: No change in performance detected. time: [26.276 ms 27.355 ms 28.450 ms], change: [-9.3568% -4.2882% +0.8907%] (p = 0.12 > 0.05)
- transfer/pacing-true/varying-seeds: No change in performance detected. time: [34.967 ms 36.613 ms 38.279 ms], change: [-10.397% -4.4172% +1.7587%] (p = 0.16 > 0.05)
- transfer/pacing-false/same-seed: Change within noise threshold. time: [24.989 ms 25.912 ms 26.848 ms], change: [-9.6160% -5.2548% -0.8232%] (p = 0.02 < 0.05)
- transfer/pacing-true/same-seed: No change in performance detected. time: [40.251 ms 42.323 ms 44.460 ms], change: [-9.4232% -3.2176% +3.2552%] (p = 0.33 > 0.05)
- 1-conn/1-100mb-resp (aka. Download)/client: 💚 Performance has improved. time: [112.15 ms 112.55 ms 112.92 ms], thrpt: [885.59 MiB/s 888.51 MiB/s 891.65 MiB/s]; change: time: [-3.3400% -2.9120% -2.4998%] (p = 0.00 < 0.05), thrpt: [+2.5639% +2.9994% +3.4554%]
- 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected. time: [313.64 ms 317.26 ms 320.89 ms], thrpt: [31.163 Kelem/s 31.520 Kelem/s 31.884 Kelem/s]; change: time: [-2.6213% -0.9663% +0.6976%] (p = 0.25 > 0.05), thrpt: [-0.6928% +0.9757% +2.6919%]
- 1-conn/1-1b-resp (aka. HPS)/client: Change within noise threshold. time: [33.964 ms 34.154 ms 34.363 ms], thrpt: [29.101 elem/s 29.279 elem/s 29.443 elem/s]; change: time: [+0.4719% +1.2255% +2.0762%] (p = 0.00 < 0.05), thrpt: [-2.0340% -1.2106% -0.4697%]

Client/server transfer results
Transfer of 33554432 bytes over loopback.
@martinthomson I'd appreciate a review, since the code I am touching is pretty complex.
The structure looks fine, but this seems overly conservative to me.
```rust
@@ -337,6 +337,41 @@ impl<T: WindowAdjustment> CongestionControl for ClassicCongestionControl<T> {
        congestion || persistent_congestion
    }

    /// Handle a congestion event.
```
Is this just a move? It looks like it.
Yes, but from `ClassicCongestionControl` to `CongestionControl`.
```rust
// Packets are only declared as lost relative to `largest_acked`. If we hit a PTO while
// we don't have a largest_acked yet, also do a congestion control reaction (because
// otherwise none would happen).
```
Are we sure that we want to slow down? If we haven't received an ACK, we should be at a low send rate. Do we really want to slow down further?
One thing that concerns me here is that a fast PTO (if we ever enable that) will hit this condition more often and that might not be good for performance. The same goes for long RTT connections, which might be OK while we are still retransmitting within the RTT, but once we get an RTT estimate that is long, we'll slow our send rate by a huge amount for the false PTOs we've hit.
I think the argument is that the combination of initial cwnd and initial RTT (= PTO) were made with some sort of safety criteria in mind. A PTO is an indication that either the RTT is much longer than the default assumption, or there is quite a bit of loss. In either of these two cases, we probably want to slow down?
I guess my main concern is that if we want to be more aggressive about PTO, such as sending another Initial, we create a situation where slowing down is not the result of a misunderstanding of the RTT, but a deliberate choice to send more to start with.