* [Ecn-sane] quick question
@ 2023-08-26 11:48 Sebastian Moeller
  2023-08-26 12:06 ` Jonathan Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Moeller @ 2023-08-26 11:48 UTC (permalink / raw)
  To: ECN-Sane, bloat

Dear ECN-experts,

I have a quick question: when running downloads, say x flows (with MTU ~1500), over a Y Mbps link with Z ms RTT, what kind of CE rate should I expect?

I ask because the kids started to play some games on Steam, and when these do their multi-GB updates I accumulate quite a lot of CE marks for the respective cake tin:

Before:
qdisc cake 810a: dev ifb4pppoe-wan root refcnt 2 bandwidth 105Mbit diffserv3 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100ms noatm overhead 34 mpu 88 memlimit 32Mb
 Sent 21471648426 bytes 16712159 pkt (dropped 9179, overlimits 27961170 requeues 0)
 backlog 2984b 2p requeues 0
 memory used: 864Kb of 32Mb
 capacity estimate: 105Mbit
 min/max network layer size:           28 /    1492
 min/max overhead-adjusted size:       88 /    1526
 average network hdr offset:            0

                   Bulk  Best Effort        Voice
  thresh       6562Kbit      105Mbit    26250Kbit
  target            5ms          5ms          5ms
  interval        100ms        100ms        100ms
  pk_delay        116us       2.68ms         88us
  av_delay         31us       1.29ms         18us
  sp_delay          6us         57us          6us
  backlog            0b        2984b           0b
  pkts           170434     16500879        50027
  bytes        73422115  21408109703      3480229
  way_inds           56      1320752         2152
  way_miss         3879       101928          341
  way_cols            0            0            0
  drops              12         9167            0
  marks              34          559            0
  ack_drop            0            0            0
  sp_flows            1            5            1
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len          1492         1492          192
  quantum           300         1514          801

After:
qdisc cake 810a: dev ifb4pppoe-wan root refcnt 2 bandwidth 105Mbit diffserv3 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100ms noatm overhead 34 mpu 88 memlimit 32Mb
 Sent 26301367224 bytes 19973069 pkt (dropped 9381, overlimits 34351737 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 1025408b of 32Mb
 capacity estimate: 105Mbit
 min/max network layer size:           28 /    1492
 min/max overhead-adjusted size:       88 /    1526
 average network hdr offset:            0

                   Bulk  Best Effort        Voice
  thresh       6562Kbit      105Mbit    26250Kbit
  target            5ms          5ms          5ms
  interval        100ms        100ms        100ms
  pk_delay        336us        238us        180us
  av_delay         62us         96us         36us
  sp_delay          2us          2us          8us
  backlog            0b           0b           0b
  pkts           171622     19760656        50172
  bytes        73494415  26237802841      3493005
  way_inds           56      1321363         2237
  way_miss         3904       102830          345
  way_cols            0            0            0
  drops              12         9369            0
  marks              34      2346888            0
  ack_drop            0            0            0
  sp_flows            1           19            1
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len          1492         1492          192
  quantum           300         1514          801

# Note: there was other traffic ongoing as well, so some drops/marks are unrelated, but the bulk of the marks was correlated with the Steam download. All deltas below are for the Best Effort tin.

delta packets: (19760656 - 16500879) = 3259777
delta bytes: (26237802841 - 21408109703) / 1000^3 = 4.829693138 GB
average packet size: (26237802841 - 21408109703) / (19760656 - 16500879) = 1481.60231145
delta drops: (9369 - 9167) = 202
delta marks: (2346888 - 559) = 2346329
fraction of packets marked: 2346329 / 3259777 = 0.719782058711
percentage of packets marked: 100 * (2346329 / 3259777) = 72%

This seems like too high a marking rate to me. I would naively expect that a flow, on getting a mark, scales back its cwnd by 20-50% and then slowly increases it again, so I expect the actual marking rate to be considerably below 50% per flow...

My gut feeling is that these Steam flows do not obey RFC3168 ECN (or something along the path wipes the CE marks my router sends upstream)... but without a good model of what marking rate I should expect this is very hand-wavy, so if anybody could help me out with an easy derivation of the expected average marking rate I would be grateful.

Regards
	Sebastian

^ permalink raw reply [flat|nested] 6+ messages in thread
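The delta arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation in the message, not a tool used in the thread: the `TinCounters` class and the `delta_stats` function are hypothetical names, and the counter values are copied from the Best Effort column of the two `tc -s qdisc` snapshots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinCounters:
    """Cumulative counters of one cake tin (hypothetical helper type)."""
    pkts: int
    bytes: int
    drops: int
    marks: int


def delta_stats(before: TinCounters, after: TinCounters) -> dict:
    """Difference between two snapshots plus derived per-packet ratios."""
    d_pkts = after.pkts - before.pkts
    if d_pkts <= 0:
        raise ValueError("no packets between the two snapshots")
    d_bytes = after.bytes - before.bytes
    d_marks = after.marks - before.marks
    return {
        "pkts": d_pkts,
        "bytes": d_bytes,
        "drops": after.drops - before.drops,
        "marks": d_marks,
        "avg_pkt_size": d_bytes / d_pkts,
        "mark_fraction": d_marks / d_pkts,
    }


# Best Effort tin, values taken from the "Before" and "After" output above.
before = TinCounters(pkts=16500879, bytes=21408109703, drops=9167, marks=559)
after = TinCounters(pkts=19760656, bytes=26237802841, drops=9369, marks=2346888)

stats = delta_stats(before, after)
print(f"{100 * stats['mark_fraction']:.1f}% of packets marked")
# prints: 72.0% of packets marked
```

Since the counters are cumulative and shared with other traffic in the same tin, the ratio is an upper-level estimate for the tin, not a per-flow marking rate.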
* Re: [Ecn-sane] quick question
  2023-08-26 11:48 [Ecn-sane] quick question Sebastian Moeller
@ 2023-08-26 12:06 ` Jonathan Morton
  2023-08-26 12:34   ` Sebastian Moeller
  2023-08-26 12:42   ` [Ecn-sane] [Bloat] " Erik Auerswald
  0 siblings, 2 replies; 6+ messages in thread
From: Jonathan Morton @ 2023-08-26 12:06 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: ECN-Sane, bloat

[-- Attachment #1: Type: text/plain, Size: 1595 bytes --]

> On 26 Aug, 2023, at 2:48 pm, Sebastian Moeller via Ecn-sane <ecn-sane@lists.bufferbloat.net> wrote:
>
> percentage of packets marked: 100 * (2346329 / 3259777) = 72%
>
> This seems like too high a marking rate to me. I would naively expect that a flow, on getting a mark, scales back its cwnd by 20-50% and then slowly increases it again, so I expect the actual marking rate to be considerably below 50% per flow...

> My gut feeling is that these Steam flows do not obey RFC3168 ECN (or something along the path wipes the CE marks my router sends upstream)... but without a good model of what marking rate I should expect this is very hand-wavy, so if anybody could help me out with an easy derivation of the expected average marking rate I would be grateful.

Yeah, that's definitely too much marking. We've actually seen this behaviour from Steam servers before, but they had fixed it at some point. Perhaps they've unfixed it again.

My best guess is that they're running an old version of BBR with ECN negotiation left on. BBRv1, at least, completely ignores ECE responses. Fortunately BBR itself does a good job of congestion control in the FQ environment which Cake provides, as you can tell by the fact that the queues never get full enough to trigger heavy dropping.

The CUBIC RFC offers an answer to your question:

Reading the table, for RTT of 100ms and throughput 100Mbps in a single flow, a "loss rate" (equivalent to a marking rate) of about 1 per 7000 packets is required. The formula can be rearranged to find a more general answer.
- Jonathan Morton [-- Attachment #2.1: Type: text/html, Size: 2381 bytes --] [-- Attachment #2.2: Screenshot 2023-08-26 at 3.03.03 pm.png --] [-- Type: image/png, Size: 29386 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Ecn-sane] quick question
  2023-08-26 12:06 ` Jonathan Morton
@ 2023-08-26 12:34   ` Sebastian Moeller
  2023-08-26 12:42   ` [Ecn-sane] [Bloat] " Erik Auerswald
  1 sibling, 0 replies; 6+ messages in thread
From: Sebastian Moeller @ 2023-08-26 12:34 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: ECN-Sane, bloat

Hi Jonathan,

much appreciated!

> On Aug 26, 2023, at 14:06, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 26 Aug, 2023, at 2:48 pm, Sebastian Moeller via Ecn-sane <ecn-sane@lists.bufferbloat.net> wrote:
>>
>> percentage of packets marked: 100 * (2346329 / 3259777) = 72%
>>
>> This seems like too high a marking rate to me. I would naively expect that a flow, on getting a mark, scales back its cwnd by 20-50% and then slowly increases it again, so I expect the actual marking rate to be considerably below 50% per flow...
>
>> My gut feeling is that these Steam flows do not obey RFC3168 ECN (or something along the path wipes the CE marks my router sends upstream)... but without a good model of what marking rate I should expect this is very hand-wavy, so if anybody could help me out with an easy derivation of the expected average marking rate I would be grateful.
>
> Yeah, that's definitely too much marking. We've actually seen this behaviour from Steam servers before, but they had fixed it at some point. Perhaps they've unfixed it again.

Hmm, I guess the next time around I will take a packet capture and see whether I see ECE (expected) and especially CWR flags in these streams... my side is a recent Ubuntu (kernel from the 5.15 series, I believe) with sysctl configured to negotiate ECN...

>
> My best guess is that they're running an old version of BBR with ECN negotiation left on. BBRv1, at least, completely ignores ECE responses. Fortunately BBR itself does a good job of congestion control in the FQ environment which Cake provides, as you can tell by the fact that the queues never get full enough to trigger heavy dropping.

Yes, good point! Next time I see a big download I will take a packet capture to look for additional markers of ECN activity... Side note: BBRv1 ignoring ECN while not making sure to veto ECN negotiation really seems like a sub-optimal combination...

>
> The CUBIC RFC offers an answer to your question:
>
> <Screenshot 2023-08-26 at 3.03.03 pm.png>
>
> Reading the table, for RTT of 100ms and throughput 100Mbps in a single flow, a "loss rate" (equivalent to a marking rate) of about 1 per 7000 packets is required. The formula can be rearranged to find a more general answer.

Thanks! Will look into this...

Regards
	Sebastian

>
> - Jonathan Morton

^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Ecn-sane] [Bloat] quick question
  2023-08-26 12:06 ` Jonathan Morton
  2023-08-26 12:34   ` Sebastian Moeller
@ 2023-08-26 12:42   ` Erik Auerswald
  2023-08-26 12:51     ` Jonathan Morton
  1 sibling, 1 reply; 6+ messages in thread
From: Erik Auerswald @ 2023-08-26 12:42 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Sebastian Moeller, ECN-Sane, bloat

Hi,

On Sat, Aug 26, 2023 at 03:06:09PM +0300, Jonathan Morton via Bloat wrote:
> > On 26 Aug, 2023, at 2:48 pm, Sebastian Moeller via Ecn-sane <ecn-sane@lists.bufferbloat.net> wrote:
> >
> > percentage of packets marked: 100 * (2346329 / 3259777) = 72%
> >
> > This seems like too high a marking rate to me. I would naively expect
> > that a flow, on getting a mark, scales back its cwnd by 20-50% and
> > then slowly increases it again, so I expect the actual marking rate
> > to be considerably below 50% per flow...
>
> > My gut feeling is that these Steam flows do not obey RFC3168 ECN
> > (or something along the path wipes the CE marks my router sends
> > upstream)... but without a good model of what marking rate I should
> > expect this is very hand-wavy, so if anybody could help me out with an
> > easy derivation of the expected average marking rate I would be grateful.
>
> Yeah, that's definitely too much marking. We've actually seen this
> behaviour from Steam servers before, but they had fixed it at some
> point. Perhaps they've unfixed it again.
>
> My best guess is that they're running an old version of BBR with ECN
> negotiation left on. BBRv1, at least, completely ignores ECE responses.
> Fortunately BBR itself does a good job of congestion control in the
> FQ environment which Cake provides, as you can tell by the fact that
> the queues never get full enough to trigger heavy dropping.
>
> The CUBIC RFC offers an answer to your question:
> [small screenshot attached to email]

I find the attached screenshot quite unreadable.
It seems to be taken starting from the paragraph above section 5.2 of RFC 9438 <https://www.rfc-editor.org/rfc/rfc9438#section-5.2>. In UTF-8 text it looks as follows:

------------------------------------------------------------------------
   _C_ determines the aggressiveness of CUBIC in competing with other
   congestion control algorithms for bandwidth. CUBIC is more friendly
   to Reno TCP if the value of _C_ is lower. However, it is NOT
   RECOMMENDED to set _C_ to a very low value like 0.04, since CUBIC
   with a low _C_ cannot efficiently use the bandwidth in fast and
   long-distance networks. Based on these observations and extensive
   deployment experience, _C_=0.4 seems to provide a good balance
   between Reno-friendliness and aggressiveness of window increase.
   Therefore, _C_ SHOULD be set to 0.4. With _C_ set to 0.4, Figure 7
   is reduced to

   $$\mathrm{AVG\_W}_{cubic} = 1.054 \cdot \frac{\sqrt[4]{RTT^{3}}}{\sqrt[4]{p^{3}}}$$

                                Figure 8

   Figure 8 is then used in the next subsection to show the scalability
   of CUBIC.

5.2. Using Spare Capacity

   CUBIC uses a more aggressive window increase function than Reno for
   fast and long-distance networks.

   Table 3 shows that to achieve the 10 Gbps rate, Reno TCP requires a
   packet loss rate of 2.0e-10, while CUBIC TCP requires a packet loss
   rate of 2.9e-8.

   +===================+===========+=========+=========+=========+
   | Throughput (Mbps) | Average W | Reno P  | HSTCP P | CUBIC P |
   +===================+===========+=========+=========+=========+
   |                 1 |       8.3 | 2.0e-2  | 2.0e-2  | 2.0e-2  |
   +-------------------+-----------+---------+---------+---------+
   |                10 |      83.3 | 2.0e-4  | 3.9e-4  | 2.9e-4  |
   +-------------------+-----------+---------+---------+---------+
   |               100 |     833.3 | 2.0e-6  | 2.5e-5  | 1.4e-5  |
   +-------------------+-----------+---------+---------+---------+
   |              1000 |    8333.3 | 2.0e-8  | 1.5e-6  | 6.3e-7  |
   +-------------------+-----------+---------+---------+---------+
   |             10000 |   83333.3 | 2.0e-10 | 1.0e-7  | 2.9e-8  |
   +-------------------+-----------+---------+---------+---------+

      Table 3: Required Packet Loss Rate for Reno TCP, HSTCP, and
                CUBIC to Achieve a Certain Throughput

   Table 3 describes the required packet loss rate for Reno TCP, HSTCP,
   and CUBIC to achieve a certain throughput, with 1500-byte packets
   and an _RTT_ of 0.1 seconds.
------------------------------------------------------------------------

(extracted using:
    wget -q -O- 'https://www.rfc-editor.org/rfc/rfc9438.txt' \
    | sed -En '/^ +_C_ determines the aggressiveness of CUBIC/,/of 0\.1 seconds\.$/p'
)

> Reading the table, for RTT of 100ms and throughput 100Mbps in a single
> flow, a "loss rate" (equivalent to a marking rate) of about 1 per
> 7000 packets is required. The formula can be rearranged to find a
> more general answer.

HTH,
Erik

^ permalink raw reply [flat|nested] 6+ messages in thread
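As a rough illustration of how the quoted formula can be rearranged, the Python sketch below solves Figure 8 of RFC 9438 for p, giving p = RTT * (1.054 / AVG_W)^(4/3), and derives AVG_W from throughput, RTT and packet size the same way the "Average W" column appears to be computed (throughput * RTT / packet size). The function names are hypothetical, and the 1500-byte default is the packet size stated under Table 3. For the 100 Mbps row this gives about 1.4e-5, that is, on the order of one mark per 70,000 packets. The 1 Mbps row of the table equals the Reno column and is not reproduced by this formula alone.

```python
def avg_window_packets(throughput_bps: float, rtt_s: float,
                       packet_bytes: int = 1500) -> float:
    """Average congestion window (in packets) needed to sustain a throughput."""
    return throughput_bps * rtt_s / (8 * packet_bytes)


def cubic_required_p(throughput_bps: float, rtt_s: float,
                     packet_bytes: int = 1500) -> float:
    """Loss/marking probability from AVG_W = 1.054 * (RTT / p)^(3/4), C = 0.4.

    Rearranged: p = RTT * (1.054 / AVG_W)^(4/3).
    Assumption: only meaningful where CUBIC is outside its Reno-friendly region.
    """
    w = avg_window_packets(throughput_bps, rtt_s, packet_bytes)
    return rtt_s * (1.054 / w) ** (4.0 / 3.0)


if __name__ == "__main__":
    rtt = 0.1  # seconds, as in Table 3
    for mbps in (10, 100, 1000, 10000):
        p = cubic_required_p(mbps * 1e6, rtt)
        print(f"{mbps:>6} Mbps: p = {p:.1e}  (about 1 in {1 / p:,.0f} packets)")
```

With several flows sharing the link, each flow carries less throughput and therefore tolerates a higher marking probability, so the function would be called with the per-flow share, not the link rate.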
* Re: [Ecn-sane] [Bloat] quick question 2023-08-26 12:42 ` [Ecn-sane] [Bloat] " Erik Auerswald @ 2023-08-26 12:51 ` Jonathan Morton 2023-08-26 15:35 ` Sebastian Moeller 0 siblings, 1 reply; 6+ messages in thread From: Jonathan Morton @ 2023-08-26 12:51 UTC (permalink / raw) To: Erik Auerswald; +Cc: Sebastian Moeller, ECN-Sane, bloat [-- Attachment #1: Type: text/plain, Size: 353 bytes --] > On 26 Aug, 2023, at 3:42 pm, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote: > > I find the attached screenshot quite unreadable. Yeah, I forgot to prevent Apple Mail from auto-shrinking it. Here's the original: I also rearranged the formula and made log-log plots over the range of likely RTTs and bandwidths: - Jonathan Morton [-- Attachment #2.1: Type: text/html, Size: 1053 bytes --] [-- Attachment #2.2: Screenshot 2023-08-26 at 3.03.03 pm.png --] [-- Type: image/png, Size: 130211 bytes --] [-- Attachment #2.3: Screenshot 2023-08-26 at 3.49.14 pm.png --] [-- Type: image/png, Size: 258957 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Ecn-sane] [Bloat] quick question
  2023-08-26 12:51     ` Jonathan Morton
@ 2023-08-26 15:35       ` Sebastian Moeller
  0 siblings, 0 replies; 6+ messages in thread
From: Sebastian Moeller @ 2023-08-26 15:35 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Erik Auerswald, ECN-Sane, bloat

Hi Jonathan, hi Erik,

that was helpful, thanks! I have now played around with tcpdump a bit, and apparently:

    tcpdump -i pppoe-wan -v -n 'tcp[tcpflags] & (tcp-ece|tcp-cwr) != 0' # TCP ECN flags, ECN in action

will allow me to quickly see whether I get ECE or CWR flags in my traffic, so I will use this for the next Steam download to see whether there is any ECN activity. I guess ECN echoes will be apparent, as these are sent by my host, so I might simply reduce the logging to CWR.

> On Aug 26, 2023, at 14:51, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 26 Aug, 2023, at 3:42 pm, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote:
>>
>> I find the attached screenshot quite unreadable.
>
> Yeah, I forgot to prevent Apple Mail from auto-shrinking it.

[SM] I run into this same issue from time to time ;), but even the reduced screenshot and the surrounding information were enough to find that "page" in the RFC.

Regards
	Sebastian

> Here's the original:
>
> <Screenshot 2023-08-26 at 3.03.03 pm.png>
>
> I also rearranged the formula and made log-log plots over the range of likely RTTs and bandwidths:
> <Screenshot 2023-08-26 at 3.49.14 pm.png>
>
> - Jonathan Morton

^ permalink raw reply [flat|nested] 6+ messages in thread
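To make explicit what the tcpdump filter above tests, here is a small Python sketch of the same bit check. It assumes the standard TCP header layout, where the flags byte sits at offset 13 and ECE and CWR are the bits 0x40 and 0x80; the helper names `make_tcp_header` and `ecn_flags` are hypothetical and only serve the illustration.

```python
import struct

TCP_FLAGS_OFFSET = 13  # byte offset of the flags field in the TCP header
TCP_ACK = 0x10
TCP_ECE = 0x40         # ECN-Echo: receiver reports that it saw a CE mark
TCP_CWR = 0x80         # Congestion Window Reduced: sender reacted to ECE


def make_tcp_header(flags: int) -> bytes:
    """Build a minimal 20-byte TCP header with the given flags (test data only)."""
    data_offset = 5 << 4  # header length of 5 32-bit words, no options
    return struct.pack("!HHIIBBHHH", 443, 50000, 0, 0,
                       data_offset, flags, 65535, 0, 0)


def ecn_flags(tcp_header: bytes) -> dict:
    """Equivalent of 'tcp[tcpflags] & (tcp-ece|tcp-cwr)', split per flag."""
    flags = tcp_header[TCP_FLAGS_OFFSET]
    return {"ece": bool(flags & TCP_ECE), "cwr": bool(flags & TCP_CWR)}


print(ecn_flags(make_tcp_header(TCP_ACK | TCP_ECE)))
print(ecn_flags(make_tcp_header(TCP_ACK | TCP_CWR)))
```

In a download, ECE would be set on segments leaving the local host and CWR on segments arriving from the server, so restricting the capture to 'tcp[tcpflags] & tcp-cwr != 0' should show only the sender's reaction, if there is one.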