* [Ecn-sane] quick question
@ 2023-08-26 11:48 Sebastian Moeller
2023-08-26 12:06 ` Jonathan Morton
0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Moeller @ 2023-08-26 11:48 UTC (permalink / raw)
To: ECN-Sane, bloat
Dear ECN-experts,
I have a quick question: when running downloads, say X flows (with MTU ~1500), over a Y Mbps link with Z ms RTT, what CE marking rate should I expect?
I ask because the kids started to play some games on Steam, and when these do their multi-GB updates I accumulate quite a lot of CE marks in the respective cake tin:
Before:
qdisc cake 810a: dev ifb4pppoe-wan root refcnt 2 bandwidth 105Mbit diffserv3 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100ms noatm overhead 34 mpu 88 memlimit 32Mb
Sent 21471648426 bytes 16712159 pkt (dropped 9179, overlimits 27961170 requeues 0)
backlog 2984b 2p requeues 0
memory used: 864Kb of 32Mb
capacity estimate: 105Mbit
min/max network layer size: 28 / 1492
min/max overhead-adjusted size: 88 / 1526
average network hdr offset: 0
                   Bulk  Best Effort        Voice
thresh         6562Kbit      105Mbit    26250Kbit
target              5ms          5ms          5ms
interval          100ms        100ms        100ms
pk_delay          116us       2.68ms         88us
av_delay           31us       1.29ms         18us
sp_delay            6us         57us          6us
backlog              0b        2984b           0b
pkts             170434     16500879        50027
bytes          73422115  21408109703      3480229
way_inds             56      1320752         2152
way_miss           3879       101928          341
way_cols              0            0            0
drops                12         9167            0
marks                34          559            0
ack_drop              0            0            0
sp_flows              1            5            1
bk_flows              0            1            0
un_flows              0            0            0
max_len            1492         1492          192
quantum             300         1514          801
After:
qdisc cake 810a: dev ifb4pppoe-wan root refcnt 2 bandwidth 105Mbit diffserv3 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100ms noatm overhead 34 mpu 88 memlimit 32Mb
Sent 26301367224 bytes 19973069 pkt (dropped 9381, overlimits 34351737 requeues 0)
backlog 0b 0p requeues 0
memory used: 1025408b of 32Mb
capacity estimate: 105Mbit
min/max network layer size: 28 / 1492
min/max overhead-adjusted size: 88 / 1526
average network hdr offset: 0
                   Bulk  Best Effort        Voice
thresh         6562Kbit      105Mbit    26250Kbit
target              5ms          5ms          5ms
interval          100ms        100ms        100ms
pk_delay          336us        238us        180us
av_delay           62us         96us         36us
sp_delay            2us          2us          8us
backlog              0b           0b           0b
pkts             171622     19760656        50172
bytes          73494415  26237802841      3493005
way_inds             56      1321363         2237
way_miss           3904       102830          345
way_cols              0            0            0
drops                12         9369            0
marks                34      2346888            0
ack_drop              0            0            0
sp_flows              1           19            1
bk_flows              0            1            0
un_flows              0            0            0
max_len            1492         1492          192
quantum             300         1514          801
# Note: other traffic was ongoing as well, so some drops/marks are unrelated, but the bulk of the marks correlated with the Steam download.
delta packets: (19760656 - 16500879) = 3259777
delta bytes: (26237802841 - 21408109703) / 1000^3 = 4.829693138 GB
average packet size: (26237802841 - 21408109703) / (19760656 - 16500879) = 1481.60231145
delta drops: (9369 - 9167) = 202
delta marks: (2346888 - 559) = 2346329
fraction of packets marked: 2346329 / 3259777 = 0.719782058711
percentage of packets marked: 100 * (2346329 / 3259777) = 72%
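The arithmetic above can be reproduced with a short script. This is a minimal sketch: the counter values are copied from the Best Effort column of the two cake dumps, while the dictionary keys and the helper name `tin_deltas` are made up for illustration.

```python
# Best Effort tin counters copied from the "Before" and "After" cake dumps.
before = {"pkts": 16500879, "bytes": 21408109703, "drops": 9167, "marks": 559}
after = {"pkts": 19760656, "bytes": 26237802841, "drops": 9369, "marks": 2346888}


def tin_deltas(before, after):
    """Return per-counter differences plus derived ratios (hypothetical helper)."""
    delta = {key: after[key] - before[key] for key in before}
    delta["avg_pkt_size"] = delta["bytes"] / delta["pkts"]
    delta["mark_fraction"] = delta["marks"] / delta["pkts"]
    delta["drop_fraction"] = delta["drops"] / delta["pkts"]
    return delta


d = tin_deltas(before, after)
print(f"delta packets: {d['pkts']}")
print(f"delta bytes:   {d['bytes'] / 1000**3:.3f} GB")
print(f"avg pkt size:  {d['avg_pkt_size']:.1f} bytes")
print(f"marked:        {100 * d['mark_fraction']:.1f} %")
print(f"dropped:       {100 * d['drop_fraction']:.4f} %")
```

Keeping drops and marks side by side makes it easy to see that marks dominate the interval by several orders of magnitude.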
This seems like too high a marking rate to me. I would naively expect that a flow, on getting a mark, scales back its cwnd by 20-50% and then slowly increases it again, so I expect the actual marking rate to be considerably below 50% per flow...
My gut feeling is that these Steam flows do not obey RFC3168 ECN (or something wipes the CE marks my router sends upstream along the path)... but without a good model of what marking rate I should expect this is very hand-wavy, so if anybody could help me out with an easy derivation of the expected average marking rate I would be grateful.
Regards
Sebastian
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Ecn-sane] quick question
From: Jonathan Morton @ 2023-08-26 12:06 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: ECN-Sane, bloat
> On 26 Aug, 2023, at 2:48 pm, Sebastian Moeller via Ecn-sane <ecn-sane@lists.bufferbloat.net> wrote:
>
> percentage of packets marked: 100 * (2346329 / 3259777) = 72%
>
> This seems like too high a marking rate to me. I would naively expect that a flow, on getting a mark, scales back its cwnd by 20-50% and then slowly increases it again, so I expect the actual marking rate to be considerably below 50% per flow...
> My gut feeling is that these Steam flows do not obey RFC3168 ECN (or something wipes the CE marks my router sends upstream along the path)... but without a good model of what marking rate I should expect this is very hand-wavy, so if anybody could help me out with an easy derivation of the expected average marking rate I would be grateful.
Yeah, that's definitely too much marking. We've actually seen this behaviour from Steam servers before, but they had fixed it at some point. Perhaps they've unfixed it again.
My best guess is that they're running an old version of BBR with ECN negotiation left on. BBRv1, at least, completely ignores ECE responses. Fortunately BBR itself does a good job of congestion control in the FQ environment which Cake provides, as you can tell by the fact that the queues never get full enough to trigger heavy dropping.
The CUBIC RFC offers an answer to your question:
Reading the table, for RTT of 100ms and throughput 100Mbps in a single flow, a "loss rate" (equivalent to a marking rate) of about 1 per 70,000 packets (1.4e-5) is required. The formula can be rearranged to find a more general answer.
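As a rough sketch of that rearrangement: solving the RFC 9438 formula AVG_W = 1.054 * (RTT / p)^(3/4) for p gives p = RTT * (1.054 / AVG_W)^(4/3), with AVG_W = rate * RTT / (8 * packet size). The function name and the 1500-byte default are assumptions for illustration, and the model only covers a single CUBIC flow in steady state.

```python
def cubic_required_mark_prob(rate_bps, rtt_s, pkt_bytes=1500):
    """Steady-state loss/mark probability for one CUBIC flow (C = 0.4)."""
    # Average window in packets needed to sustain rate_bps at this RTT.
    avg_w = rate_bps * rtt_s / (8 * pkt_bytes)
    # AVG_W = 1.054 * (RTT / p)**0.75, solved for p.
    return rtt_s * (1.054 / avg_w) ** (4 / 3)


p = cubic_required_mark_prob(100e6, 0.1)
observed = 2346329 / 3259777  # fraction of packets marked in the report above
print(f"required p ~ {p:.2e} (about 1 in {1 / p:,.0f} packets)")
print(f"observed marking fraction is ~{observed / p:,.0f}x that")
```

With several flows sharing the link each flow runs at a lower rate and needs a higher p, so the single-flow figure is a lower bound rather than a prediction. Even so, it sits orders of magnitude below the observed fraction.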
- Jonathan Morton
[-- Attachment #2.2: Screenshot 2023-08-26 at 3.03.03 pm.png --]
[-- Type: image/png, Size: 29386 bytes --]
* Re: [Ecn-sane] quick question
From: Sebastian Moeller @ 2023-08-26 12:34 UTC (permalink / raw)
To: Jonathan Morton; +Cc: ECN-Sane, bloat
Hi Jonathan,
much appreciated!
> On Aug 26, 2023, at 14:06, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 26 Aug, 2023, at 2:48 pm, Sebastian Moeller via Ecn-sane <ecn-sane@lists.bufferbloat.net> wrote:
>>
>> percentage of packets marked: 100 * (2346329 / 3259777) = 72%
>>
>> This seems like too high a marking rate to me. I would naively expect that a flow, on getting a mark, scales back its cwnd by 20-50% and then slowly increases it again, so I expect the actual marking rate to be considerably below 50% per flow...
>
>> My gut feeling is that these Steam flows do not obey RFC3168 ECN (or something wipes the CE marks my router sends upstream along the path)... but without a good model of what marking rate I should expect this is very hand-wavy, so if anybody could help me out with an easy derivation of the expected average marking rate I would be grateful.
>
> Yeah, that's definitely too much marking. We've actually seen this behaviour from Steam servers before, but they had fixed it at some point. Perhaps they've unfixed it again.
Hmm, I guess the next time around I will take a packet capture and check for ECE (expected) and especially CWR flags in these streams... My side is a recent Ubuntu (kernel from the 5.15 series, I believe) with sysctl configured to negotiate ECN...
>
> My best guess is that they're running an old version of BBR with ECN negotiation left on. BBRv1, at least, completely ignores ECE responses. Fortunately BBR itself does a good job of congestion control in the FQ environment which Cake provides, as you can tell by the fact that the queues never get full enough to trigger heavy dropping.
Yes, good point! Next time I see a big download I will take a packet capture to look for additional markers of ECN activity... Sidenote, BBRv1 ignoring ECN and not making sure to veto ECN negotiation really seems like a sub-optimal coincidence...
>
> The CUBIC RFC offers an answer to your question:
>
> <Screenshot 2023-08-26 at 3.03.03 pm.png>
>
> Reading the table, for RTT of 100ms and throughput 100Mbps in a single flow, a "loss rate" (equivalent to a marking rate) of about 1 per 70,000 packets (1.4e-5) is required. The formula can be rearranged to find a more general answer.
Thanks! Will look into this...
Regards
Sebastian
>
> - Jonathan Morton
* Re: [Ecn-sane] [Bloat] quick question
From: Erik Auerswald @ 2023-08-26 12:42 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Sebastian Moeller, ECN-Sane, bloat
Hi,
On Sat, Aug 26, 2023 at 03:06:09PM +0300, Jonathan Morton via Bloat wrote:
> > On 26 Aug, 2023, at 2:48 pm, Sebastian Moeller via Ecn-sane <ecn-sane@lists.bufferbloat.net> wrote:
> >
> > percentage of packets marked: 100 * (2346329 / 3259777) = 72%
> >
> > This seems like too high a marking rate to me. I would naively expect
> > that a flow, on getting a mark, scales back its cwnd by 20-50% and
> > then slowly increases it again, so I expect the actual marking rate
> > to be considerably below 50% per flow...
>
> > My gut feeling is that these Steam flows do not obey RFC3168 ECN
> > (or something wipes the CE marks my router sends upstream along the
> > path)... but without a good model of what marking rate I should expect
> > this is very hand-wavy, so if anybody could help me out with an easy
> > derivation of the expected average marking rate I would be grateful.
>
> Yeah, that's definitely too much marking. We've actually seen this
> behaviour from Steam servers before, but they had fixed it at some
> point. Perhaps they've unfixed it again.
>
> My best guess is that they're running an old version of BBR with ECN
> negotiation left on. BBRv1, at least, completely ignores ECE responses.
> Fortunately BBR itself does a good job of congestion control in the
> FQ environment which Cake provides, as you can tell by the fact that
> the queues never get full enough to trigger heavy dropping.
>
> The CUBIC RFC offers an answer to your question:
> [small screenshot attached to email]
I find the attached screenshot quite unreadable. It seems to be
taken starting from the paragraph above section 5.2 of RFC 9438
<https://www.rfc-editor.org/rfc/rfc9438#section-5.2>. In UTF-8 text it
looks as follows:
------------------------------------------------------------------------
_C_ determines the aggressiveness of CUBIC in competing with other
congestion control algorithms for bandwidth. CUBIC is more friendly
to Reno TCP if the value of _C_ is lower. However, it is NOT
RECOMMENDED to set _C_ to a very low value like 0.04, since CUBIC
with a low _C_ cannot efficiently use the bandwidth in fast and long-
distance networks. Based on these observations and extensive
deployment experience, _C_=0.4 seems to provide a good balance
between Reno-friendliness and aggressiveness of window increase.
Therefore, _C_ SHOULD be set to 0.4. With _C_ set to 0.4, Figure 7
is reduced to
$$ \mathrm{AVG\_W}_{cubic} = 1.054 \cdot \frac{\sqrt[4]{RTT^{3}}}{\sqrt[4]{p^{3}}} $$
Figure 8
Figure 8 is then used in the next subsection to show the scalability
of CUBIC.
5.2. Using Spare Capacity
CUBIC uses a more aggressive window increase function than Reno for
fast and long-distance networks.
Table 3 shows that to achieve the 10 Gbps rate, Reno TCP requires a
packet loss rate of 2.0e-10, while CUBIC TCP requires a packet loss
rate of 2.9e-8.
+===================+===========+=========+=========+=========+
| Throughput (Mbps) | Average W | Reno P | HSTCP P | CUBIC P |
+===================+===========+=========+=========+=========+
| 1 | 8.3 | 2.0e-2 | 2.0e-2 | 2.0e-2 |
+-------------------+-----------+---------+---------+---------+
| 10 | 83.3 | 2.0e-4 | 3.9e-4 | 2.9e-4 |
+-------------------+-----------+---------+---------+---------+
| 100 | 833.3 | 2.0e-6 | 2.5e-5 | 1.4e-5 |
+-------------------+-----------+---------+---------+---------+
| 1000 | 8333.3 | 2.0e-8 | 1.5e-6 | 6.3e-7 |
+-------------------+-----------+---------+---------+---------+
| 10000 | 83333.3 | 2.0e-10 | 1.0e-7 | 2.9e-8 |
+-------------------+-----------+---------+---------+---------+
Table 3: Required Packet Loss Rate for Reno TCP, HSTCP, and
CUBIC to Achieve a Certain Throughput
Table 3 describes the required packet loss rate for Reno TCP, HSTCP,
and CUBIC to achieve a certain throughput, with 1500-byte packets and
an _RTT_ of 0.1 seconds.
------------------------------------------------------------------------
(extracted using:
wget -q -O- 'https://www.rfc-editor.org/rfc/rfc9438.txt' \
| sed -En '/^ +_C_ determines the aggressiveness of CUBIC/,/of 0\.1 seconds\.$/p'
)
> Reading the table, for RTT of 100ms and throughput 100Mbps in a single
> flow, a "loss rate" (equivalent to a marking rate) of about 1 per
> 70,000 packets (1.4e-5) is required. The formula can be rearranged to
> find a more general answer.
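A sketch of that rearrangement, worked from Figure 8 of RFC 9438 as quoted above. The symbols B (rate in bits per second) and S (packet size in bytes) are introduced here for illustration and do not appear in the RFC; W stands for AVG_W_cubic in packets.

```latex
\begin{align*}
W &= 1.054 \left(\frac{RTT}{p}\right)^{3/4}
  && \text{(Figure 8)} \\
p &= RTT \left(\frac{1.054}{W}\right)^{4/3}
  && \text{(solved for } p\text{)} \\
W &= \frac{B \cdot RTT}{8\,S}
  && \text{(window needed for rate } B \text{ with } S\text{-byte packets)} \\
p &= RTT \left(\frac{1.054 \cdot 8\,S}{B \cdot RTT}\right)^{4/3}
   = \frac{(8.432\,S)^{4/3}}{B^{4/3}\,RTT^{1/3}}
\end{align*}
```

For B = 100 Mbit/s, S = 1500 bytes and RTT = 0.1 s this gives W of about 833.3 and p of about 1.4e-5, matching the CUBIC P column of Table 3.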
HTH,
Erik
* Re: [Ecn-sane] [Bloat] quick question
From: Jonathan Morton @ 2023-08-26 12:51 UTC (permalink / raw)
To: Erik Auerswald; +Cc: Sebastian Moeller, ECN-Sane, bloat
> On 26 Aug, 2023, at 3:42 pm, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote:
>
> I find the attached screenshot quite unreadable.
Yeah, I forgot to prevent Apple Mail from auto-shrinking it. Here's the original:
I also rearranged the formula and made log-log plots over the range of likely RTTs and bandwidths:
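The same rearrangement can also be tabulated instead of plotted. The sketch below is not the script behind the attached plots; the function name, the RTT and bandwidth grid, and the 1500-byte packet size are assumptions for illustration.

```python
def cubic_mark_prob(rate_bps, rtt_s, pkt_bytes=1500):
    """p = RTT * (1.054 / AVG_W)**(4/3), AVG_W in packets (RFC 9438, C = 0.4)."""
    avg_w = rate_bps * rtt_s / (8 * pkt_bytes)
    return rtt_s * (1.054 / avg_w) ** (4 / 3)


rtts_ms = [10, 30, 100, 300]      # assumed range of likely RTTs
rates_mbps = [1, 10, 100, 1000]   # assumed range of likely bandwidths

print("Mbit/s " + "".join(f"{rtt:>10}ms" for rtt in rtts_ms))
grid = {}
for rate in rates_mbps:
    # One row per bandwidth, one column per RTT.
    row = [cubic_mark_prob(rate * 1e6, rtt / 1000) for rtt in rtts_ms]
    grid[rate] = row
    print(f"{rate:>6} " + "".join(f"{p:>12.1e}" for p in row))
```

At small windows CUBIC operates in its Reno-friendly region, so the formula underestimates p there: Table 3 of RFC 9438 lists 2.0e-2 at 1 Mbps and 100 ms, where this expression yields a smaller value.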
- Jonathan Morton
[-- Attachment #2.2: Screenshot 2023-08-26 at 3.03.03 pm.png --]
[-- Type: image/png, Size: 130211 bytes --]
[-- Attachment #2.3: Screenshot 2023-08-26 at 3.49.14 pm.png --]
[-- Type: image/png, Size: 258957 bytes --]
* Re: [Ecn-sane] [Bloat] quick question
From: Sebastian Moeller @ 2023-08-26 15:35 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Erik Auerswald, ECN-Sane, bloat
Hi Jonathan, hi Erik,
that was helpful, thanks!
I have now played around with tcpdump a bit, and apparently:
tcpdump -i pppoe-wan -v -n 'tcp[tcpflags] & (tcp-ece|tcp-cwr) != 0' # TCP ECN flags, ECN in action
will allow me to quickly see whether I get ECE or CWR flags in my traffic, so I will use this for the next Steam download to see whether there is any ECN activity. I guess the ECE flags will show up regardless, as these are set by my own host, so I might simply reduce the logging to CWR.
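For reference, a small sketch of what that filter tests: `tcp[tcpflags]` is byte 13 of the TCP header, where CWR is bit 0x80 and ECE is bit 0x40 (RFC 3168). The helper names and the sample flag bytes below are made up for illustration.

```python
# TCP flag bits in header byte 13 (RFC 9293, with ECE/CWR from RFC 3168).
TCP_FLAGS = {
    0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH",
    0x10: "ACK", 0x20: "URG", 0x40: "ECE", 0x80: "CWR",
}
ECN_MASK = 0x40 | 0x80  # the bits selected by (tcp-ece|tcp-cwr)


def decode_flags(flag_byte):
    """Return the names of the flags set in a TCP flags byte."""
    return [name for bit, name in TCP_FLAGS.items() if flag_byte & bit]


def matches_filter(flag_byte):
    """Same test as 'tcp[tcpflags] & (tcp-ece|tcp-cwr) != 0'."""
    return (flag_byte & ECN_MASK) != 0


for sample in (0x10, 0x50, 0x90, 0xC2):  # hypothetical flag bytes
    print(f"0x{sample:02x} {decode_flags(sample)} match={matches_filter(sample)}")
```

Note that an ECN-setup SYN carries both ECE and CWR and the matching SYN-ACK carries ECE, so the filter also matches handshake packets, not only congestion feedback.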
> On Aug 26, 2023, at 14:51, Jonathan Morton <chromatix99@gmail.com> wrote:
>
>> On 26 Aug, 2023, at 3:42 pm, Erik Auerswald <auerswal@unix-ag.uni-kl.de> wrote:
>>
>> I find the attached screenshot quite unreadable.
>
> Yeah, I forgot to prevent Apple Mail from auto-shrinking it.
[SM] I run into this same issue from time to time ;), but even the reduced screenshot and the surrounding information were enough to find that "page" in the RFC.
Regards
Sebastian
> Here's the original:
>
> <Screenshot 2023-08-26 at 3.03.03 pm.png>
>
> I also rearranged the formula and made log-log plots over the range of likely RTTs and bandwidths:
> <Screenshot 2023-08-26 at 3.49.14 pm.png>
>
> - Jonathan Morton