[Ecn-sane] [Bloat] [tsvwg] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

Wed Mar 20 19:30:51 EDT 2019

Hi Jake,

> On Mar 21, 2019, at 00:11, Holland, Jake <jholland at akamai.com> wrote:
> 
> I think it's a fair point that even as close as the non-home side
> of the access network, fq would need a lot of queues, and if you
> want something in hardware it's going to be tricky.  I hear
> they're up to an average of ~6k homes per OLT.

	Except they state "In practice it would also be important to deploy AQM in the residential gateway, but to minimise side-effects we kept upstream traffic below capacity.", meaning that in addition to the OLT/BNG/whatever shaper they also envision a shaper on the CPE. And I believe there is ample evidence (in OpenWrt with sqm-scripts) that in that case the downstream shaper can also be put on the CPE with reasonable success.

> 
> I don't think the default assumption here should be that they
> missed something obvious, but rather that they're trying to
> solve a hard problem, and something with a classifier has a
> legitimate value.

	I agree, except that ECT(1) clearly is a very approximate "classifier", as it cannot distinguish the L4S-ness of CE-marked packets. That affects both the AQM, which will treat CE-marked non-L4S traffic as L4S (a false positive), and TCP Prague endpoints, which will mistreat CE-marked packets as L4S signals even if the CE mark comes from a TCP-friendly AQM. I note that neither "‘Data Centre to the Home’: Ultra-Low Latency for All" nor "PI2: A Linearized AQM for both Classic and Scalable TCP" seems to discuss these classification errors and their effects on real traffic in sufficient depth.
It is one thing to soak up one of the last few available "codepoints" in the IP headers, but it is another, in my book, to do so and then not be able to reliably extract the encoded information. At least from my layman's perspective, I wonder why this does not seem to bother anybody here?
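
To make the ambiguity concrete, here is a minimal Python sketch. It assumes the RFC 3168 values for the two-bit ECN field and a classifier that, as I understand the L4S proposal, looks only at that field and sends both ECT(1) and CE to the low-latency queue. The function names are made up for illustration and are not taken from any implementation.

```python
from enum import IntEnum


class ECN(IntEnum):
    """Two-bit ECN field values as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def aqm_mark(ecn: ECN) -> ECN:
    """Hypothetical upstream AQM: marking an ECN-capable packet
    overwrites the field with CE, so the original codepoint is lost."""
    return ECN.CE if ecn in (ECN.ECT0, ECN.ECT1) else ecn


def dualq_classify(ecn: ECN) -> str:
    """Hypothetical classifier that only sees the ECN field:
    ECT(1) and CE go to the L4S queue, everything else to classic."""
    return "L4S" if ecn in (ECN.ECT1, ECN.CE) else "classic"


if __name__ == "__main__":
    for sent in (ECN.ECT0, ECN.ECT1):
        seen = aqm_mark(sent)
        print(f"sent {sent.name}: unmarked -> {dualq_classify(sent)}, "
              f"after CE mark -> {dualq_classify(seen)}")
```

A classic ECT(0) packet that picked up a CE mark at an earlier hop ends up in the L4S queue here, and a receiver looking at a CE-marked packet has no way to tell which kind of AQM set the mark.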

> 
> The question to me is about how much it breaks other things to
> extract that value, and how much you get out of it in the end.

	That is basically the core of my question above: how much do you get out in the end?

>  If you need fq and therefore the only viable place for AQM with good
> results is on the home side of the router, that's got some bad
> deployment problems too.

	As I state above, even the L4S project's position seems to be that AQM on the CPE/router is essential, so we are only haggling about how much AQM needs to be done on the router. But from that perspective, I would not be unhappy if my ISP employed a lower-latency AQM solution upstream of my router than they currently do, sort of as a belt-and-suspenders approach to have my router's back in cases of severe packet inrush.

Best Regards
	Sebastian

> 
> Just my 2c.
> 
> -Jake
> 
> On 2019-03-20, 15:56, "Sebastian Moeller" <moeller0 at gmx.de> wrote:
> 
> 
> 
>> On Mar 20, 2019, at 23:31, Jonathan Morton <chromatix99 at gmail.com> wrote:
>> 
>>> On 21 Mar, 2019, at 12:12 am, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>> 
>>> they see 20ms queue delay with a 7ms base link delay @ 40 Mbps
>> 
>> At 40Mbps you might as well be running Cake, and thereby getting 1ms inter-flow induced delay; an order of magnitude better.  And we achieved that on a shoestring budget while they were submarining for a patent application.
>> 
>> If we're supposed to be impressed…
> 
>    Nah, there is this GEM:
> 
> 
>    Comparing Experiments 5, 7 with 6, 8, we can again conclude that our DualQ AQM very much approximates the fq CoDel AQM without the need for flow identification and more complex processing. The main advantage is DualQ’s lower queuing delay for L4S traffic.
> 
>    So for normal traffic it is worse than fq_codel and better for traffic that does behave TCP-friendly, for which it was bespoke made. So at least they should have pimped fq_codel/cake to emit their required CE marking regime and done a test against that, if the goal is to compare apples with apples. I note that they do come into this with a grudge against fq: "Per-flow queuing:  Similarly per-flow queuing is not incompatible with the L4S approach.  However, one queue for every flow can be thought of as overkill compared to the minimum of two queues for all traffic needed for the L4S approach.  The overkill of per-flow queuing has side-effects:" followed by a list of 4 more or less straw-man arguments. Heck, these might actually be reasonable arguments at their core, but the short description in the RFC is fishy.
>    I believe the coupling between the two queues to be clever and elegant, but the whole premise seems odd to me. What they should have done, IMHO, is teach their AQM something like SCE so it can easily react to CE and drops in a standards-compliant TCP-friendly fashion, and only do the clever window/rate adjustments if the AQM signals ECT(1), add fair queueing to separate the different TCP variants' behavior from each other, and bang, no classification bit needed. And no patent (assuming the patent covers the coupling between the two queues)... I am sure I am missing something here, it cannot be that simple.
> 
> 
>    Best Regards
>    	Sebastian
> 
>    P.S.: How did the SCE talk go? Any interesting feedback and discussions?
> 
> 
> 
>> 
>> - Jonathan Morton
>> 
> 