[Ecn-sane] [tsvwg] Comments on L4S drafts
De Schepper, Koen (Nokia - BE/Antwerp)
koen.de_schepper at nokia-bell-labs.com
Mon Jul 22 14:15:26 EDT 2019
Jonathan,
I'm a bit surprised to read what I read here... I had the impression that we had reached a much better level of understanding during the hackathon, and that:
- we both agreed that the latest updates in the Linux kernel had quite some impact on DCTCP's performance (burstiness), which both you and we are working on. Our testbed also showed that it had the same impact on DualPI2 and FQ-Codel (yes, we do understand FQ-Codel, and have compared DualQ with it extensively since the beginning of L4S).
- the current TCP-Prague we have in the public GitHub, which is DCTCP using accurate ECN and ECT(1) and is drop-compliant with Reno, is what SCE can use as well, and whatever you called SCE-TCP can be used for L4S: as I showed you mathematically, it works perfectly according to DCTCP's law of 1/p, because it is DCTCP with some simple pacing tweaks you made. I thought we agreed that there is no difference in the congestion control part, that we want the same thing, and that the only difference is how to use the code-point.
- related to the testbed setups, we have several running, the first since 2013. We support all kernel versions from 3.19 up to the latest 5.2-rc5. We have demonstrated L4S since 2015, at IETF93 and the L4S BoF, with real equipment and software that is still the same as we use today.
- as for the testbed I brought (5 laptops and a switch, which got broken during travel so that I had to buy a replacement in the nearest shop), I had to install it from scratch during the hackathon from our public GitHub (I arrived only at 14:00 on Saturday), and we made it immediately available for you guys to put the flent testing tools on.
- related to the flent testing, you might have expected to find big differences, but both measurements showed exactly the same results. I understood that you need to extend your tools to include some measurement parameters that were missing compared to ours.
- we planned to complete your test list during this week, and it may be best that we jointly report on the outcome to avoid different interpretations again.
- anybody who had an interest in L4S could have evaluated it since we made our DualPI2 code available in 2015 (actually many did). (To Dave Taht: if you wanted to evaluate DualPI2, you had plenty of opportunity, 4 years by now. I find it weird that suddenly you were not able to install a qdisc in Linux. Even if you wanted us to set up a testbed for you, you could have asked us.)
Maybe some good news too: we also had a successful (right first time) accurate ECN interop test between our Linux TCP-Prague and FreeBSD Reno (acc-ecn implementation provided by Richard Scheffenegger).
I hope these accusations of incompetence can stop now, and that we get to the point of finally getting a future-looking low-latency Internet deployed. Anybody else who doubts the performance/robustness of L4S, let me know and we will arrange a test session this week.
Koen.
-----Original Message-----
From: Jonathan Morton <chromatix99 at gmail.com>
Sent: Sunday, July 21, 2019 6:01 PM
To: Bob Briscoe <in at bobbriscoe.net>
Cc: Sebastian Moeller <moeller0 at gmx.de>; De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper at nokia-bell-labs.com>; Black, David <David.Black at dell.com>; ecn-sane at lists.bufferbloat.net; tsvwg at ietf.org; Dave Taht <dave at taht.net>
Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
> On 21 Jul, 2019, at 7:53 am, Bob Briscoe <in at bobbriscoe.net> wrote:
>
> Both teams brought their testbeds, and as of yesterday evening, Koen and Pete Heist had put the two together and started the tests Jonathan proposed. Usual problems: latest Linux kernel being used has introduced a bug, so need to wind back. But progressing.
>
> Nonetheless, altho it's included in the tests, I don't see the particular concern with this 'Cake' scenario. How can "L4S flows crowd out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM". Whenever it would be happening, FQ would prevent it.
>
> To ensure we're not continually being blown into the weeds, I thought the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.
I drew up a list of five network topologies to test, each with the SCE set of tests and tools, but using mostly L4S network components and focused on L4S performance and robustness.
1: L4S sender -> L4S middlebox (bottleneck) -> L4S receiver.
This is simply a sanity check to make sure the tools worked. Actually we fell over even at this stage yesterday, because we discovered problems in the system Bob and Koen had brought along to demo. These may or may not be improved today; we'll see.
2: L4S sender -> FQ-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.
This is the most favourable-to-L4S topology that incorporates a non-L4S component that we could easily come up with, and therefore . Apparently the L4S folks are also relatively unfamiliar with Codel, which is now the most widely deployed AQM in the world, and this would help to validate that L4S transports respond reasonably to it.
3: L4S sender -> single-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.
This is the topology of most concern, and is obtained from topology 2 by simply changing a parameter on our middlebox.
4: L4S sender -> ECT(1) mangler -> L4S middlebox (bottleneck) -> L4S receiver.
Exploring what happens if an adversary tries to game the system. We could also try an ECT(0) mangler or a Not-ECT mangler, in the same spirit.
5: L4S sender -> L4S middlebox (bottleneck 1) -> Dumb FIFO (bottleneck 2) -> FQ-AQM middlebox (bottleneck 3) -> L4S receiver.
This is Sebastian's scenario. We did have some discussion yesterday about the propensity of existing senders to produce line-rate bursts occasionally, and the way these bursts could collect in *all* of the queues at successively decreasing bottlenecks. This is a test which explores that scenario and measures its effects, and is highly relevant to best consumer practice on today's Internet.
Naturally, we have tried the equivalent of most of the above scenarios on our SCE testbed already. The only one we haven't explicitly tried out is #5; I think we'd need to use all of Pete's APUs plus at least one of my machines to set it up, and we were too tired for that last night.
- Jonathan Morton