[Ecn-sane] The state of l4s, bbrv2, sce?

Dave Taht dave.taht at gmail.com
Fri Jul 26 11:32:15 EDT 2019


I did miss a couple of details.

On Fri, Jul 26, 2019 at 8:05 AM Dave Taht <dave.taht at gmail.com> wrote:
>
> Changing the title....
>
> I hope to be able to add some features and boxes to the worldwide
> flent fleet to gather up some more data. Simple stuff includes trying
> to verify more fully worldwide what happens when you twiddle the ecn
> bits; mildly longer term, look at what happens when conflicting
> interpretations of these bits are in play somewhere on the path; a bit
> longer than that, getting an openwrt build up as a middlebox and vm;
> and then finally, finally, see what happens on a couple kinds of wifi.
>
> There's now a flent server in mumbai, in particular, which I hope will
> shed some insight as to the state of networks in india, long term, on
> a variety of fronts. But none of it's ready, lacking a good release to
> freeze on.
>
> 1) BBRv2 is now available for public hacking. I had a good readthrough
> last night.
>
> The published tree applies cleanly (with a small patch) to net-next.
> I've had a chance to read through the code (lots of good changes to
> bbr!).
>
> Although neal was careful to say in iccrg the optional ecn mode uses
> "dctcp/l4s-style signalling", he did not identify how that was
> actually applied at the middleboxes, and the supplied test scripts
> (gtests/net/tcp/bbr/nsperf) don't do that. All we know is that it's
> set to kick in at 20 packets. Is it fq_codel's ce_threshold? red? pie?
> dualpi? Does it revert to drop on overload?
>
> Is it running on bare metal? 260us is at the bare bottom of what linux
> can schedule reliably, vms are much worse.
>
> Couple notes:
>
> BBRv2 doesn't use ect(1) as an identifier.
>
> The chromium release has no support for ecn at all.
>
> Adding back in the stuff I'd first done to rfc3168 bbrv1 looks
> straightforward, making it do sce, less so.

I note that at lower rates a cap of cwnd 2 instead of 4 seems feasible.
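
A rough back-of-envelope sketch of why the cwnd floor matters at low
rates, reading the "cap" above as the minimum cwnd. The function names,
the 1500 byte packet size, and the 20 ms RTT are assumptions for
illustration only, not anything taken from the BBRv2 tree. The idea is
that once the bandwidth-delay product drops below the floor, whatever
the flow cannot keep in flight sits in the bottleneck queue.

```python
def bdp_packets(rate_bps, rtt_s, packet_bytes=1500):
    """Bandwidth-delay product expressed in full-size packets."""
    return rate_bps * rtt_s / (8 * packet_bytes)


def standing_queue_packets(min_cwnd, rate_bps, rtt_s, packet_bytes=1500):
    """Packets a lone flow pinned at min_cwnd keeps queued at the bottleneck."""
    return max(0.0, min_cwnd - bdp_packets(rate_bps, rtt_s, packet_bytes))


if __name__ == "__main__":
    rtt_s = 0.020  # hypothetical 20 ms path, chosen for illustration
    for rate_bps in (1e6, 2e6, 5e6, 10e6):
        bdp = bdp_packets(rate_bps, rtt_s)
        q4 = standing_queue_packets(4, rate_bps, rtt_s)
        q2 = standing_queue_packets(2, rate_bps, rtt_s)
        print(f"{rate_bps / 1e6:4.0f} Mbit/s: bdp={bdp:5.2f} pkts, "
              f"queued with floor 4: {q4:4.2f}, with floor 2: {q2:4.2f}")
```

This only models a single flow with a fixed base RTT; with several flows
sharing the bottleneck, each one's floor adds to the standing queue.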

> 2) To clarify something from the l4s team, are the results you've been
> presenting for years all from the 3.19 kernel? bsd? microsoft? ns2?
> ns3? what?
>
> The code on github is not worth testing against currently? It does
> have some needed features like a setsockopt for using up ect(1).

Were these tests with gro/tso enabled?

> Should I use the issue tracker for that? I have some comments on
> dualpi in addition to my outstanding question about pie's default of
> drop at 10% mark rate vs dualpi's 0. Notably it's set to 1000 packets
> now (fq_codel defaults to 10,000 and we switched to memory limits both
> in it and cake given a modern packet's dynamic range of 64b to 64k).
> I've observed 10gige can be in the 2-3k packets range... has dualpi
> been tested above 1gige yet?
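
To put numbers on the packet-limit vs memory-limit point: a minimal
sketch, with hypothetical helper names, of how much memory and drain
time a 1000 packet limit can represent across the 64b to 64k range
mentioned above. I am taking "64k" as 65536 bytes and picking 1 and 10
Gbit/s purely as example rates.

```python
def limit_bytes(packet_limit, packet_bytes):
    """Worst-case bytes held by a queue limited only by packet count."""
    return packet_limit * packet_bytes


def drain_time_ms(packet_limit, packet_bytes, rate_bps):
    """Time to drain a full queue of equal-size packets, in milliseconds."""
    return limit_bytes(packet_limit, packet_bytes) * 8 / rate_bps * 1e3


if __name__ == "__main__":
    packet_limit = 1000
    for packet_bytes in (64, 1500, 65536):
        for rate_bps in (1e9, 10e9):
            print(f"{packet_limit} x {packet_bytes:5d} B = "
                  f"{limit_bytes(packet_limit, packet_bytes):9d} B, "
                  f"{drain_time_ms(packet_limit, packet_bytes, rate_bps):9.3f} ms "
                  f"at {rate_bps / 1e9:.0f} Gbit/s")
```

The same packet count spans three orders of magnitude in bytes, which is
the motivation for bounding the queue by memory instead.
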
>
> 3) The current patches for sce need to get rebased for net-next. The
> sch_cake mods are easy, but the dctcp code did morph a bit since the
> sce work forked it, as did the other tcps. I took a stab at forward
> porting it to net-next, but I figure that development is hot and heavy
> and some patches will land after ietf. I do not mind taking a stab
> again at cleaning it up (helps me to understand what's going on), as
> how the algos currently (as of, like, yesterday) work is clear to me...
> what I'd like to do at least is also add 'em to the out of tree
> fq_codel_fast implementation.

Another issue on the tcp front in this patchset was disabling iw10 as
a burst. I do strongly agree with that; pacing it, and/or reverting to
iw4 and then pacing (as it's not been taken up by netbsd or osx
either), would make this stuff gentler at lower rates.

Is the ramp function as needed with iw4 in play?
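
For a feel of the iw10 vs iw4 difference at lower rates, here is a
small sketch. The helper names and the 1500 byte packet size are my
assumptions, and spreading the window evenly over one RTT is just a
simple stand-in for pacing, not how any particular stack does it.

```python
def burst_drain_ms(iw_packets, rate_bps, packet_bytes=1500):
    """Time the bottleneck spends serializing an unpaced initial window."""
    return iw_packets * packet_bytes * 8 / rate_bps * 1e3


def paced_gap_ms(iw_packets, rtt_s):
    """Inter-packet gap if the initial window is spread evenly over one RTT."""
    return rtt_s / iw_packets * 1e3


if __name__ == "__main__":
    rtt_s = 0.100  # hypothetical 100 ms path
    for rate_bps in (1e6, 10e6, 100e6):
        for iw in (10, 4):
            print(f"{rate_bps / 1e6:5.0f} Mbit/s iw{iw:<2d}: "
                  f"burst occupies link {burst_drain_ms(iw, rate_bps):7.2f} ms, "
                  f"paced gap {paced_gap_ms(iw, rtt_s):5.1f} ms")
```

At 1 Mbit/s an unpaced iw10 holds the link for 120 ms against 48 ms for
iw4, which is the sort of gap that matters to everything else sharing
that queue.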

>
> Did I miss anything about the current state of things?
>
> My basic testbed is a string of containers on a couple 12 core boxes
> on bare metal, and more advanced is the openwrt stuff part of my wifi
> lab. That's presently almost all 4.14 based on arm, mips, and x86,
> running both on real hardware and in emulation.
>
> On Fri, Jul 26, 2019 at 6:10 AM Pete Heist <pete at heistp.net> wrote:
> >
> >
> > > On Jul 25, 2019, at 12:14 PM, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper at nokia-bell-labs.com> wrote:
> > >
> > > We have the testbed running our reference kernel version 3.19 with the drop patch. Let me know if you want to see the difference in behavior between the “good” DCTCP and the “deteriorated” DCTCP in the latest kernels too. There were several issues introduced which made DCTCP both more aggressive, and currently less aggressive. It calls for better regression tests (for Prague at least) to make sure its behavior is not changed too drastically by new updates. If enough people are interested, we can organize a session in one of the available rooms.
> > >
> > > Pete, Jonathan,
> > >
> > > Also for testing further your tests, let me know when you are available.
> >
> > Regarding testing, we now have a five node setup in our test environment running a mixture of tcp-prague and dualq kernels to cover the scenarios Jon outlined earlier. With what little time we’ve had for it this week, we’ve only done some basic tests, and seem to be seeing behavior similar to what we saw at the hackathon, but we can discuss specific results following IETF 105.
> >
> > Our intention is to coordinate a public effort to create reproducible test scenarios for L4S using flent. Details to follow post-conference. We do feel it’s important that all of our Linux testing be on modern 5.1+ kernels, as the 3.19 series was end of life as of May 2015 (https://lwn.net/Articles/643934/), so we'll try to keep up to date with any patches you might have for the newer kernels.
> >
> > Overall, I think we’ve improved the cooperation between the teams this week (from zero to a little bit :), which should hopefully help move both projects along...



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

