[Cake] cake default target is too low for bbr?
Andy Furniss
adf.lists at gmail.com
Sat Apr 29 06:31:02 EDT 2017
Jonathan Morton wrote:
>
>> On 29 Apr, 2017, at 01:26, Andy Furniss <adf.lists at gmail.com>
>> wrote:
>>
>>>>> As I understand it, increase in RTT due to queueing of
>>>>> packets is the main feedback mechanism for BBR. So dropping
>>>>> packets, which I already consider harmful, is really harmful
>>>>> with BBR because you're not telling the sender to slow down.
>
> Actually, BBR considers mainly a measurement of “delivery rate”, and
> paces its sending to match that. It does *not* primarily rely on a
> congestion window as most TCPs do; one is provided only as a safety
> net of last resort.
>
> Measurements of RTT are mostly used for two purposes: to size the
> congestion window so that it doesn’t interfere with normal operation;
> and to estimate when “queue draining” is complete after a bandwidth
> probe cycle.
Interesting.
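Just to check I've understood the model, something like this rough
Python sketch (my own reading of the BBR description, not the actual
kernel code; the class and names are made up for illustration):

    from collections import deque

    class BbrModelSketch:
        """Very rough sketch: bandwidth is estimated from what was
        actually delivered, and sending is paced to that estimate.
        The congestion window is only a cap derived from the BDP."""
        def __init__(self):
            self.bw_samples = deque(maxlen=10)   # windowed max of delivery rate
            self.min_rtt = float('inf')          # windowed min RTT (simplified)

        def on_ack(self, delivered_bytes, interval_s, rtt_s):
            # delivery rate = bytes actually acked per unit time
            self.bw_samples.append(delivered_bytes / interval_s)
            self.min_rtt = min(self.min_rtt, rtt_s)

        def pacing_rate(self, gain=1.0):
            # gain > 1 while probing for bandwidth, < 1 while draining
            # the queue the probe created (that's where RTT comes in)
            return gain * max(self.bw_samples, default=0.0)

        def cwnd_cap(self, gain=2.0):
            # safety net only: roughly gain * BDP = bw * min_rtt
            return gain * max(self.bw_samples, default=0.0) * self.min_rtt

Note that loss doesn't appear anywhere in that model, which I guess is
the point of this whole thread.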
>
>>>> If BBR does not slow down when packets are dropped, it's too
>>>> hostile to use on a public network. The only way for a public
>>>> network to respond to a flood of traffic higher than what it
>>>> can handle is to drop packets (with a possible warning via ECN
>>>> shortly before packets get dropped). If BBR doesn't slow down,
>>>> it's just going to be wasting bandwidth.
>
>>> No it isn't. Packet loss does not equal congestion - it never
>>> did. Dropping packets to signal congestion is an ugly hack for
>>> implementations that are too dumb to understand any proper
>>> congestion control mechanism.
>>
>> Hmm, I bet a lot of carrier links are policed rather than smart
>> queue.
>
> Policing should theoretically produce a consistent delivery rate,
> which is what BBR needs to work effectively. A wrinkle here is that
> most policers and shapers to date rely on a token-bucket algorithm
> which permits bursts at rates well above the policy, and BBR has to
> attempt to infer the policy rate from the medium-term behaviour.
OK, but it's not really going to work (be fair) when a big link with
thousands of users is policed overall. Of course this shouldn't really
happen, but contention exists at ISP level and at local level for DOCSIS
cable users.
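For reference, the token-bucket behaviour being described is roughly
this (my own toy sketch, not any particular vendor's policer):

    import time

    class TokenBucketPolicer:
        """One-rate policer: packets pass while tokens remain, then are
        hard-dropped. No queue, no ECN, and a full bucket lets a burst
        through at line rate, well above the policy rate."""
        def __init__(self, rate_bps, burst_bytes):
            self.rate = rate_bps / 8.0    # token refill, bytes per second
            self.burst = burst_bytes      # bucket depth
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def accept(self, pkt_len):
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= pkt_len:
                self.tokens -= pkt_len
                return True   # forwarded
            return False      # dropped

which is presumably why BBR has to infer the policy rate from the
medium-term delivered rate rather than from any single burst.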
>> It also seems (OK, one quick and possibly flawed test) that bbr
>> ignores ECN as well as drops, in the sense that the marked count is
>> just as high as the dropped count.
>
> Yes, BBR ignores ECN, which I consider to be an unfortunate feature;
> it could quite reasonably be used to terminate bandwidth probes
> early, before they build up a significant queue (which then needs to
> be drained).
Now that is unfortunate - so ECN is effectively deprecated by BBR :-(
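As I understand it the only thing ECN changes at the qdisc is how the
signal is delivered, something like (trivial sketch, names made up):

    def aqm_signal(ecn_capable):
        # At the codel/cake drop decision point: mark if the flow
        # negotiated ECN, otherwise drop. Loss-based TCPs back off on
        # either signal; BBR reacts to neither directly, only to the
        # delivery rate it measures afterwards.
        return "mark" if ecn_capable else "drop"

so with BBR on the other end, marking seems to cost us the drop's
back-pressure and gain nothing in return.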
>
> Cake works very well with BBR, provided it is deployed at the
> upstream end of the bottleneck link. In this position, Cake happily
> absorbs the temporary standing queue caused by bandwidth probes, and
> the deficit-mode shaper means that BBR tends to see a consistent
> delivery rate, which it considers ideal. In practice it matters
> little whether the BBR sender negotiates ECN or not, in this case.
>
> When deployed at the downstream end of the bottleneck link, Cake
> works less well with BBR - but that’s true to some extent of all
> TCPs. In ingress mode, at least, dropping packets effectively causes
> a reduction in the delivery rate, which should influence BBR more
> strongly to correct itself. But if ECN is negotiated, these packet
> drops do not occur. In both cases, the temporary standing queue
> collects in the upstream dumb queue, unless Cake is configured to a
> conservative enough margin below the bottleneck rate. Cake does
> everything it reasonably can here, but the topology is fundamentally
> unfavourable.
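That matches what I think the ingress keyword does; roughly (my
understanding of the accounting, written out as a sketch rather than
the real cake code):

    def shaper_charge(pkt_len, dropped, ingress_mode):
        # In ingress mode the shaper charges bytes as they *arrive* at
        # the qdisc, so a dropped packet still consumes part of the
        # configured bandwidth. That seems honest on the downstream side
        # of the bottleneck (those bytes already crossed the link), and
        # it's why drops show up to BBR as a reduced delivery rate.
        if ingress_mode:
            return pkt_len                    # charge delivered or not
        return 0 if dropped else pkt_len      # egress: delivered only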
Further testing with the aim of reducing drops seems to indicate that
it's not so much target that matters as RTT.
Each output below covers two netperf (x5) runs, one marked CS1 and one
unmarked. Sending through Bulk with the higher target and through Best
Effort with the lower target isn't much different. Using the ingress
param for this test, TCP throughput is low, 1.6 Mbit (x5).
qdisc cake 1: dev ifb0 root refcnt 2 bandwidth 16Mbit diffserv3
dual-srchost ingress rtt 100.0ms atm overhead 40 via-ethernet
Sent 20770678 bytes 13819 pkt (dropped 9485, overlimits 36449 requeues 0)
backlog 0b 0p requeues 0
memory used: 153Kb of 4Mb
capacity estimate: 16Mbit
              Bulk  Best Effort        Voice
  thresh      1Mbit       16Mbit        4Mbit
  target     18.2ms        5.0ms        5.0ms
  interval  113.2ms      100.0ms       10.0ms
  pk_delay   13.0ms        9.6ms          0us
  av_delay   10.0ms        3.9ms          0us
  sp_delay      6us         13us          0us
  pkts        11654        11650            0
  bytes    17565164     17563044            0
  way_inds        0            0            0
  way_miss       10           10            0
  way_cols        0            0            0
  drops        4511         4974            0
  marks           0            0            0
  sp_flows        4            5            0
  bk_flows        2            0            0
  un_flows        0            0            0
  max_len      1514         1514            0
With RTT at 300ms, throughput is 2.33 Mbit with fewer drops for a similar target.
qdisc cake 1: dev ifb0 root refcnt 2 bandwidth 16Mbit diffserv3
dual-srchost ingress rtt 300.0ms atm overhead 40 via-ethernet
Sent 31265716 bytes 20758 pkt (dropped 2563, overlimits 43619 requeues 0)
backlog 0b 0p requeues 0
memory used: 153Kb of 4Mb
capacity estimate: 16Mbit
              Bulk  Best Effort        Voice
  thresh      1Mbit       16Mbit        4Mbit
  target     18.2ms       15.0ms       15.0ms
  interval  303.2ms      300.0ms       30.0ms
  pk_delay   21.2ms       20.1ms          0us
  av_delay   18.9ms       17.0ms          0us
  sp_delay      4us          7us          0us
  pkts        11656        11665            0
  bytes    17564952     17579378            0
  way_inds        0            0            0
  way_miss       10           10            0
  way_cols        0            0            0
  drops        1206         1357            0
  marks           0            0            0
  sp_flows        5            5            0
  bk_flows        0            1            0
  un_flows        0            0            0
  max_len      1514         1514            0
Loss/throughput is even better (2.44 Mbit x5) with a higher RTT.
qdisc cake 1: dev ifb0 root refcnt 2 bandwidth 16Mbit diffserv3
dual-srchost ingress rtt 400.0ms atm overhead 40 via-ethernet
Sent 32594660 bytes 21626 pkt (dropped 1677, overlimits 44556 requeues 0)
backlog 0b 0p requeues 0
memory used: 153Kb of 4Mb
capacity estimate: 16Mbit
              Bulk  Best Effort        Voice
  thresh      1Mbit       16Mbit        4Mbit
  target     20.0ms       20.0ms       20.0ms
  interval  400.0ms      400.0ms       40.0ms
  pk_delay   21.9ms       20.3ms          0us
  av_delay   19.7ms       18.8ms          0us
  sp_delay      4us          4us          0us
  pkts        11640        11663            0
  bytes    17550552     17583086            0
  way_inds        0            0            0
  way_miss       10           10            0
  way_cols        0            0            0
  drops         836          841            0
  marks           0            0            0
  sp_flows        5            5            0
  bk_flows        0            1            0
  un_flows        0            0            0
  max_len      1514         1514            0
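FWIW the target/interval values in all three outputs are consistent
with the per-tin calculation being roughly this (my reading of cake's
tin setup, so treat it as a sketch rather than gospel):

    MTU = 1514  # bytes, matching max_len above

    def tin_params(tin_rate_bps, rtt_s):
        # base target is ~rtt/20, floored by the time to serialise
        # about 1.5 MTUs at the tin's threshold rate; the interval is
        # stretched by the same amount the target was raised
        base_target = rtt_s / 20.0
        mtu_time = 1.5 * MTU * 8 / tin_rate_bps
        target = max(mtu_time, base_target)
        interval = max(rtt_s + target - base_target, 2 * target)
        return target, interval

    # Bulk tin, 1Mbit threshold:
    print(tin_params(1e6, 0.1))   # ~(0.0182, 0.1132) -> 18.2ms / 113.2ms
    print(tin_params(1e6, 0.4))   # ~(0.0200, 0.4000) -> 20.0ms / 400.0ms

so the Bulk tin's target is already pinned near 18ms by its 1Mbit
threshold whatever rtt I ask for, and it's really the interval (and the
Best Effort tin's target) that the rtt keyword is moving.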
I haven't yet tried cooking up a test that includes a double queue,
with a dumb FIFO 10% faster in front of cake, to see how much the
latency is affected.