[Ecn-sane] The state of l4s, bbrv2, sce?

Dave Taht dave.taht at gmail.com
Fri Jul 26 11:45:30 EDT 2019


On Fri, Jul 26, 2019 at 8:37 AM Neal Cardwell <ncardwell at google.com> wrote:
>
> On Fri, Jul 26, 2019 at 11:05 AM Dave Taht <dave.taht at gmail.com> wrote:
>>
>> 1) BBRv2 is now available for public hacking. I had a good readthrough
>> last night.
>>
>> The published tree applies cleanly (with a small patch) to net-next.
>> I've had a chance to read through the code (lots of good changes to
>> bbr!).
>>
>> Although neal was careful to say in iccrg the optional ecn mode uses
>> "dctcp/l4s-style signalling", he did not identify how that was
>> actually applied
>> at the middleboxes, and the supplied test scripts
>> (gtests/net/tcp/bbr/nsperf) don't do that. All we know is that it's
>> set to kick in at 20 packets. Is it fq_codel's ce_threshold? red? pie?
>> dualpi? Does it revert to drop on overload?
>
>
> As mentioned in the ICCRG session, the TCP source tree includes the scripts used to run the tests and generate the graphs in the slide deck. Here is the commit I was mentioning:
>
>    https://github.com/google/bbr/commit/e76d4f89b0c42d5409d34c48ee6f8d32407d4b8d
>
> So you can look at exactly how each test was run, and re-run those tests yourself, with the v2alpha code or any experimental tweaks you might make beyond that.
>
> To answer your particular question, the ECN marks were from a bottleneck qdisc configured as:
>
>   codel ce_threshold 242us limit 1000 target 100ms

thx neal! I missed that!
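For anyone wanting to reproduce that marking behavior locally, something like the following should work (the interface name `eth0` is a placeholder, and applying it via `tc` at the root is my assumption; the exact setup is in the scripts in the commit above):

```shell
# Hypothetical sketch: install codel as the bottleneck qdisc with the
# quoted parameters, so packets whose sojourn time exceeds 242us get
# CE-marked (dctcp-style) rather than dropped. Requires root; "eth0"
# is a placeholder device.
tc qdisc replace dev eth0 root codel ce_threshold 242us limit 1000 target 100ms
```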

> I'm not claiming that's necessarily the best mechanism or set of parameters to set ECN marks. The 20-packet number comes from the DCTCP SIGCOMM 2010 paper's recommendation for 1Gbps bottlenecks. I just picked this kind of approach because the bare metal router/switch hardware varies, so this is a simple and easy way for everyone to experiment with the exact same ECN settings.

ok!
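As a sanity check, the DCTCP paper's 20-packet recommendation at a 1 Gbps bottleneck works out to roughly the 242us ce_threshold quoted above (assuming full-size 1500-byte packets):

```shell
# 20 full-size packets at 1 Gbps, expressed as a sojourn time in us:
# 20 pkts * 1500 B * 8 b/B = 240000 bits; 240000 bits / 1e9 bps = 240 us.
echo $((20 * 1500 * 8 * 1000000 / 1000000000))   # prints 240
```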

>
>> Is it running on bare metal? 260us is at the bare bottom of what linux
>> can schedule reliably, vms are much worse.
>
>
> I have tried both VMs and bare metal with those scripts, and of course the VMs are quite noisy and the bare metal results much less noisy. So the graphs are from runs on bare metal x86 server-class machines.

Good to know. On the cloud I use (Linode), 1ms was the best I could
hope for, and even then it was dang jittery. (It was much worse 8 years
back, when the Xen layer underneath could add 10-20ms!)

There are major jitter issues on lower-end hardware too, but I don't know
how bad they are post-Spectre-fixes; I've been afraid to look.

Containers are a huge improvement over VMs, but they still break things like TSQ (TCP Small Queues).

>
> neal
>


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740