[Ecn-sane] [Bloat] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

Tue Mar 19 04:50:01 EDT 2019


> On Mar 19, 2019, at 05:44, Greg White <g.white at CableLabs.com> wrote:
> If I can boil this down for the people who are jumping into this without reading the drafts:
>  
> 	• Both L4S and SCE are attempting to provide congestion-controlled senders with better congestion signals so that flows can achieve link capacity without buffering delay. 
> 	• Both are proposing to use ECT(1) as part of the mechanism, but to use it in different ways.

	SCE tries to encode quantitative information about the congestion state of the marking AQM into ECT(1), while L4S uses ECT(1) as a general identifier of how a flow promises to behave on receiving CE marks, or rather as an indication that flows marked ECT(1) will not respond to CE marks as described in RFC 3168. Realistically, this means any non-L4S AQM needs to learn quickly to drop ECT(1) packets instead of marking them CE; that seems better controlled than waiting for a fall-back to an RFC 3168-compliant CE response triggered by a heuristic based on RTT variation.
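
To make the two readings of ECT(1) concrete, here is a rough Python sketch of how I understand them. The codepoint values are the ones from RFC 3168; the function names and the returned descriptions are my own shorthand, not taken from either draft, so treat this as an illustration of the semantics rather than as anybody's implementation.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """The two ECN bits in the IP header, as assigned in RFC 3168."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def sce_reading(sent: Ecn, received: Ecn) -> str:
    """Hypothetical SCE-style reading: ECT(1) is a signal set by the AQM."""
    if received == Ecn.CE:
        # CE keeps its RFC 3168 meaning.
        return "congestion: respond as in RFC 3168"
    if sent == Ecn.ECT_0 and received == Ecn.ECT_1:
        # The AQM changed ECT(0) to ECT(1): a milder, earlier signal.
        return "some congestion: back off slightly"
    return "no signal"


def l4s_reading(sent: Ecn, received: Ecn) -> str:
    """Hypothetical L4S-style reading: ECT(1) is an identifier set by the sender."""
    if sent != Ecn.ECT_1:
        return "classic flow: RFC 3168 rules apply"
    if received == Ecn.CE:
        # CE on an ECT(1) flow no longer means an RFC 3168 response.
        return "congestion: small, frequent (non-RFC 3168) response"
    return "no signal"


if __name__ == "__main__":
    print(sce_reading(Ecn.ECT_0, Ecn.ECT_1))
    print(l4s_reading(Ecn.ECT_1, Ecn.CE))
```

The point of the comparison: under the SCE reading a CE mark means the same thing for every flow, while under the L4S reading its meaning depends on what the sender originally set, which the network can no longer see once the packet carries CE.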

> 	• SCE’s usage of ECT(1) potentially allows an automatic fallback to traditional Cubic behavior if the bottleneck link is a single-queue classic-ECN AQM (do any of these exist?), whereas L4S will need to detect such a condition via RTT measurement
> 	• L4S’s usage of ECT(1) allows links to identify new senders and take advantage of new sender features like reordering tolerance that can further drive down latency in many common link technologies.

	But L4S is incapable of _reliably_ classifying L4S flows/packets, as CE-marked packets default to L4S treatment. This indicates to me that ECT(1) is not really suited as a reliable L4S identifier; what am I missing?
This ambiguity raises the question of the side effects of this leaky classification: what about re-ordering of CE-marked packets? I hope that, out of caution, CE-marked packets will not be re-ordered, as these are very much not guaranteed to belong to flows that employ RACK. (And tangentially, how is a link that desires more latitude for re-ordering going to deal with the RACK requirement to keep the re-ordering window <= 1 RTT, given that RTTs over the internet range from a few to dozens of ms? Is there any study showing how RACK and re-ordering actually interact in real life?) And how is it going to help a link with regard to re-ordering at all? It has been argued that links do not differentiate flows at all, and assuming TCP traffic will coexist for a long time with (DC)TCP_Prague traffic, how can a link actually allow more re-ordering than is currently tolerable without severely impacting the TCP flows? If it just transmits all ECT(1) packets in its queue, things will be a bit better than now, but after the egress queue is emptied the link might still be stalled until the re-transmit of ECT(0) and CE-marked packets is finished, no?
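
The classification leak I am worried about can be sketched in a few lines of Python. This assumes, as discussed above, that an L4S classifier sends both ECT(1) and CE packets to the low-latency queue; the helper names are hypothetical, and real AQMs obviously do far more than this.

```python
# ECN codepoints as assigned in RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def classic_aqm_mark(ecn: int) -> int:
    """Hypothetical upstream RFC 3168 AQM under congestion.

    ECN-capable packets are re-marked CE. A real AQM would drop
    Not-ECT packets instead; that is not modelled here.
    """
    return CE if ecn in (ECT_0, ECT_1) else ecn


def l4s_classify(ecn: int) -> str:
    """Assumed L4S-style classifier: ECT(1) and CE go to the low-latency queue."""
    return "l4s" if ecn in (ECT_1, CE) else "classic"


# A classic (non-L4S) flow sends three ECT(0) packets.
sent = [ECT_0, ECT_0, ECT_0]

# An upstream classic AQM marks the second packet CE.
arriving = [sent[0], classic_aqm_mark(sent[1]), sent[2]]

# The downstream L4S classifier can no longer tell where the CE packet came from.
queues = [l4s_classify(ecn) for ecn in arriving]

if __name__ == "__main__":
    print(queues)
```

Under these assumptions the middle packet of a classic flow ends up in a different queue than its neighbours, so any extra re-ordering latitude granted to that queue is also applied to a flow that never promised to tolerate it.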


> 	• SCE will only work if the bottleneck link implements fq.  Some bottleneck network gear will not be able to implement fq or will not implement it due to its undesirable side effects (see section 6 of RFC 8290).
> 	• L4S will work if the bottleneck link implements *either* fq or dual queue.

	The proof ought to be in the pudding ;) Is there data showing a working L4S fq-AQM?

>  
> Beyond that, they are *very, very* similar.  
>  
> But, L4S has been demonstrated in real equipment and in simulation, and leverages an existing congestion controller that is available in Linux and Windows (with some tweaks).  

	As far as I can see, the public git repository for TCP Prague is only a few days old, so how could that be "available in Linux and Windows" right now? One could similarly argue that it will only take a few tweaks to teach cubic how to deal with SCE.


So I have no pony in this race as I am outside of the field, but the L4S RFCs seem to promise more than they 


> SCE leverages a paragraph in a draft that describes a first guess about how a congestion controller might work.
>  
> L4S has defined a congestion feedback mechanism so that these congestion signals can get back to the sender.  SCE offers that “we’ll propose something later”.
>  
> BBR currently does not listen to explicit congestion signals, but it could be updated to do so (for either SCE or L4S).
>  
> -Greg
>  
>  
> From: Bloat <bloat-bounces at lists.bufferbloat.net> on behalf of "David P. Reed" <dpreed at deepplum.com>
> Date: Sunday, March 17, 2019 at 12:07 PM
> To: Vint Cerf <vint at google.com>
> Cc: bloat <bloat at lists.bufferbloat.net>, "ecn-sane at lists.bufferbloat.net" <ecn-sane at lists.bufferbloat.net>
> Subject: Re: [Bloat] [Ecn-sane] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104
>  
> Vint -
>  
> BBR is the end-to-end control logic that adjusts the source rate to match the share of the bottleneck link it should use.
>  
> It depends on getting reliable current congestion information via packet drops and/or ECN.
>  
> So the proposal by these guys (not the cable guys) is an attempt to improve the quality of the congestion signal inserted by the router with the bottleneck outbound link.
>  
> The cable guys are trying to get a "private" field in the IP header for their own use.
>  
>  
> -----Original Message-----
> From: "Vint Cerf" <vint at google.com>
> Sent: Saturday, March 16, 2019 5:57pm
> To: "Holland, Jake" <jholland at akamai.com>
> Cc: "Mikael Abrahamsson" <swmike at swm.pp.se>, "David P. Reed" <dpreed at deepplum.com>, "ecn-sane at lists.bufferbloat.net" <ecn-sane at lists.bufferbloat.net>, "bloat" <bloat at lists.bufferbloat.net>
> Subject: Re: [Ecn-sane] [Bloat] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104
> 
> where does BBR fit into all this?
> v
>  
> On Sat, Mar 16, 2019 at 5:39 PM Holland, Jake <jholland at akamai.com> wrote:
>> On 2019-03-15, 11:37, "Mikael Abrahamsson" <swmike at swm.pp.se> wrote:
>>     L4S has a much better possibility of actually getting deployment into the 
>>     wider Internet packet-moving equipment than anything being talked about 
>>     here. Same with PIE as opposed to FQ_CODEL. I know it might not be as 
>>     good, but it fits better into actual silicon and it's being proposed by 
>>     people who actually have better channels into the people setting hard 
>>     requirements.
>> 
>>     I suggest you consider joining them instead of opposing them.
>> 
>> 
>> Hi Mikael,
>> 
>> I agree it makes sense that fq_anything has issues when you're talking
>> about the OLT/CMTS/BNG/etc., and I believe it when you tell me PIE
>> makes better sense there.
>> 
>> But fq_x makes great sense and provides real value for the uplink in a
>> home, small office, coffee shop, etc. (if you run the final rate limit
>> on the home side of the access link.)  I'm thinking maybe there's a
>> disconnect here driven by the different use cases for where AQMs can go.
>> 
>> The thing is, each of these is the most likely congestion point at
>> different times, and it's worthwhile for each of them to be able to
>> AQM (and mark packets) under congestion.
>> 
>> One of the several things that bothers me with L4S is that I've seen
>> precious little concern over interfering with the ability for another
>> different AQM in-path to mark packets, and because it changes the
>> semantics of CE, you can't have both working at the same time unless
>> they both do L4S.
>> 
>> SCE needs a lot of details filled in, but it's so much cleaner that it
>> seems to me there's reasonably obvious answers to all (or almost all) of
>> those detail questions, and because the semantics are so much cleaner,
>> it's much easier to tell it's non-harmful.
>> 
>> <aside regarding="non-harmful">
>> The point you raised in another thread about reordering is mostly
>> well-taken, and a good counterpoint to the claim "non-harmful relative
>> to L4S".
>> 
>> To me it seems sad and dumb that switches ended up trying to make
>> ordering guarantees at cost of switching performance, because if it's
>> useful to put ordering in the switch, then it must be equally useful to
>> put it in the receiver's NIC or OS.
>> 
>> So why isn't it in all the receivers' NIC or OS (where it would render
>> the switch's ordering efforts moot) instead of in all the switches?
>> 
>> I'm guessing the answer is a competition trap for the switch vendors,
>> plus "with ordering goes faster than without, when you benchmark the
>> switch with typical load and current (non-RACK) receivers".
>> 
>> If that's the case, it seems like the drive for a competitive advantage
>> caused deployment of a packet ordering workaround in the wrong network
>> location(s), out of a pure misalignment of incentives.
>> 
>> RACK rates to fix that in the end, but a lot of damage is already done,
>> and the L4S approach gives switches a flag that can double as proof that
>> RACK is there on the receiver, so they can stop trying to order those
>> packets.
>> 
>> So point granted, I understand and agree there's a cost to abandoning
>> that advantage.
>> </aside>
>> 
>> But as you also said so well in another thread, this is important.  ("The
>> last unicorn", IIRC.)  How much does it matter if there's a feature that
>> has value today, but only until RACK is widely deployed?  If you were
>> convinced RACK would roll out everywhere within 3 years and SCE would
>> produce better results than L4S over the following 15 years, would that
>> change your mind?
>> 
>> It would for me, and that's why I'd like to see SCE explored before
>> making a call.  I think at its core, it provides the same thing L4S does
>> (a high-fidelity explicit congestion signal for the sender), but with
>> much cleaner semantics that can be incrementally added to congestion
>> controls that people are already using.
>> 
>> Granted, it still remains to be seen whether SCE in practice can match
>> the results of L4S, and L4S was here first.  But it seems to me L4S comes
>> with some problems that have not yet been examined, and that are nicely
>> dodged by a SCE-based approach.
>> 
>> If L4S really is as good as they seem to think, I could imagine getting
>> behind it, but I don't think that's proven yet.  I'm not certain, but
>> all the comparative analyses I remember seeing have been from more or
>> less the same team, and I'm not convinced they don't have some
>> misaligned incentives of their own.
>> 
>> I understand a lot of work has gone into L4S, but this move to jump it
>> from interesting experiment to de-facto standard without a more critical
>> review that digs deeper into some of the potential deployment problems
>> has me concerned.
>> 
>> If it really does turn out to be good enough to be permanent, I'm not
>> opposed to it, but I'm just not convinced that it's non-harmful, and my
>> default position is that the cleaner solution is going to be better in
>> the long run, if they can do the same job.
>> 
>> It's not that I want it to be a fight, but I do want to end up with the
>> best solution we can get.  We only have the one internet.
>> 
>> Just my 2c.  
>> 
>> -Jake
>> 
>> 
>> _______________________________________________
>> Ecn-sane mailing list
>> Ecn-sane at lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/ecn-sane
> 
> -- 
> New postal address:
> Google
> 1875 Explorer Street, 10th Floor
> Reston, VA 20190
> _______________________________________________
> Bloat mailing list
> Bloat at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat