[Ecn-sane] [Bloat] [iccrg] Fwd: [tcpPrague] Implementation and experimentation of TCP Prague/L4S hackaton at IETF104

Sat Mar 16 18:09:35 EDT 2019

> On Mar 16, 2019, at 22:38, Holland, Jake <jholland at akamai.com> wrote:
> 
> On 2019-03-15, 11:37, "Mikael Abrahamsson" <swmike at swm.pp.se> wrote:
>    L4S has a much better possibility of actually getting deployment into the 
>    wider Internet packet-moving equipment than anything being talked about 
>    here. Same with PIE as opposed to FQ_CODEL. I know it might not be as 
>    good, but it fits better into actual silicon and it's being proposed by 
>    people who actually have better channels into the people setting hard 
>    requirements.
> 
>    I suggest you consider joining them instead of opposing them.
> 
> 
> Hi Mikael,
> 
> I agree it makes sense that fq_anything has issues when you're talking
> about the OLT/CMTS/BNG/etc., and I believe it when you tell me PIE
> makes better sense there.

	Except PIE is not mandatory there: DOCSIS 3.1 made PIE mandatory on the CPE (the customer modems), while AQM on the CMTS was, I believe, only recommended.

> 
> But fq_x makes great sense and provides real value for the uplink in a
> home, small office, coffee shop, etc. (if you run the final rate limit
> on the home side of the access link.)  I'm thinking maybe there's a
> disconnect here driven by the different use cases for where AQMs can go.
> 
> The thing is, each of these is the most likely congestion point at
> different times, and it's worthwhile for each of them to be able to
> AQM (and mark packets) under congestion.
> 
> One of the several things that bothers me with L4S is that I've seen
> precious little concern over interfering with the ability for another
> different AQM in-path to mark packets, and because it changes the
> semantics of CE, you can't have both working at the same time unless
> they both do L4S.

The relevant section from https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-06:

"A.1.4.  Fall back to Reno-friendly congestion control on classic ECN
        bottlenecks

   Description: A scalable congestion control needs to react to ECN
   marking from a non-L4S but ECN-capable bottleneck in a way that will
   coexist with a TCP Reno congestion control [RFC5681].

   Motivation: Similarly to the requirement in Appendix A.1.3, this
   requirement is a safety condition to ensure a scalable congestion
   control behaves properly when it builds a queue at a network
   bottleneck that has not been upgraded to support L4S.  On detecting
   classic ECN marking (see below), a scalable congestion control will
   need to fall back to classic congestion control behaviour.  If it
   does not comply with this requirement it could starve classic
   traffic.

   It would take time for endpoints to distinguish classic and L4S ECN
   marking.  An increase in queuing delay or in delay variation would be
   a tell-tale sign, but it is not yet clear where a line would be drawn
   between the two behaviours.  It might be possible to cache what was
   learned about the path to help subsequent attempts to detect the type
   of marking."

In short, L4S does not seem to have solved this problem yet, beyond identifying it.
IMHO this is a clear reason not to re-use ECT(1) outside of ECN signaling.
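
To make the ambiguity concrete, here is a minimal sketch of the two-bit ECN field as defined in RFC 3168. The helper names (ecn_name, mark_ce) are mine and purely illustrative, not taken from any draft. The point it shows: once a bottleneck sets CE, the codepoint itself no longer says whether the packet started out as ECT(0) or ECT(1), nor which kind of AQM applied the mark.

```python
# ECN field: the low two bits of the IPv4 TOS / IPv6 Traffic Class byte (RFC 3168).
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # Congestion Experienced

ECN_MASK = 0b11
NAMES = {NOT_ECT: "Not-ECT", ECT_1: "ECT(1)", ECT_0: "ECT(0)", CE: "CE"}


def ecn_name(tos_byte: int) -> str:
    """Return the name of the ECN codepoint carried in a TOS byte."""
    return NAMES[tos_byte & ECN_MASK]


def mark_ce(tos_byte: int) -> int:
    """Set CE the way an AQM would, leaving the six DSCP bits untouched.

    Hypothetical helper for illustration only.
    """
    if tos_byte & ECN_MASK == NOT_ECT:
        raise ValueError("Not-ECT packets have to be dropped, not marked")
    return tos_byte | CE


if __name__ == "__main__":
    for start in (ECT_0, ECT_1):
        marked = mark_ce(start)
        print(f"{ecn_name(start)} -> {ecn_name(marked)} (0b{marked:02b})")
```

Both starting codepoints end up as the same CE value, so whatever meaning CE is supposed to carry has to be agreed on by every marking node on the path.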

> 
> SCE needs a lot of details filled in, but it's so much cleaner that it
> seems to me there's reasonably obvious answers to all (or almost all) of
> those detail questions, and because the semantics are so much cleaner,
> it's much easier to tell it's non-harmful.

	IMHO the beauty of the simple SCE proposal is that it simply supplies information that a rational flow could/should react to purely out of self-interest, while ignoring it should do no harm, provided the assumption holds that ECT(1) safely traverses the internet.
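
	A rough sketch of what I mean, with loud caveats: the SCE details are not filled in yet, so everything below is an assumption for illustration. The function name on_ack, the sce_backoff value, and the idea that the sender learns about SCE marks per ACK are all hypothetical. CE keeps its classic meaning (a Reno-style halving, simplified here to act on every marked ACK instead of once per RTT), while an SCE mark only asks for a small reduction.

```python
MIN_CWND = 2.0  # packets; hypothetical floor for this sketch


def on_ack(cwnd: float, ce_marked: bool, sce_marked: bool,
           sce_aware: bool = True, sce_backoff: float = 1.0 / 64) -> float:
    """Return the new congestion window (in packets) after one ACK.

    Illustrative only: not the behaviour of any real TCP stack or draft.
    """
    if ce_marked:
        # Classic ECN semantics: treat CE like a loss and halve the window.
        return max(cwnd / 2.0, MIN_CWND)
    if sce_marked and sce_aware:
        # Assumed SCE response: a small multiplicative reduction per mark.
        return max(cwnd * (1.0 - sce_backoff), MIN_CWND)
    # No usable signal: Reno-style congestion avoidance, +1/cwnd per ACK.
    return cwnd + 1.0 / cwnd


if __name__ == "__main__":
    print("SCE-aware sender:  ", on_ack(10.0, False, True))
    print("SCE-unaware sender:", on_ack(10.0, False, True, sce_aware=False))
    print("CE on either:      ", on_ack(10.0, True, False))
```

	A sender that ignores the SCE mark behaves exactly as it does today and still backs off on CE, which is why ignoring the signal should do no harm.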

> 
> <aside regarding="non-harmful">
> The point you raised in another thread about reordering is mostly
> well-taken, and a good counterpoint to the claim "non-harmful relative
> to L4S".

	Would this not be better handled by a dedicated signal instead of assuming all L4S traffic is re-ordering tolerant (which, as seen from my vantage point, runs counter to the L4S goal of ultra-low latency)?

> 
> To me it seems sad and dumb that switches ended up trying to make
> ordering guarantees at cost of switching performance, because if it's
> useful to put ordering in the switch, then it must be equally useful to
> put it in the receiver's NIC or OS.

	The issue I see is that re-ordering with fast ARQ cycles on a fast link will be faster than pushing the un-ordered packets over the bottleneck access link; when data stretches over multiple packets, the user might need all of them before the data can actually be used.

> 
> So why isn't it in all the receivers' NIC or OS (where it would render
> the switch's ordering efforts moot) instead of in all the switches?
> 
> I'm guessing the answer is a competition trap for the switch vendors,
> plus "with ordering goes faster than without, when you benchmark the
> switch with typical load and current (non-RACK) receivers".
> 
> If that's the case, it seems like the drive for a competitive advantage
> caused deployment of a packet ordering workaround in the wrong network
> location(s), out of a pure misalignment of incentives.
> 
> RACK rates to fix that in the end, but a lot of damage is already done,
> and the L4S approach gives switches a flag that can double as proof that
> RACK is there on the receiver, so they can stop trying to order those
> packets.
> 
> So point granted, I understand and agree there's a cost to abandoning
> that advantage.
> </aside>
> 
> But as you also said so well in another thread, this is important.  ("The
> last unicorn", IIRC.)  How much does it matter if there's a feature that
> has value today, but only until RACK is widely deployed?  If you were
> convinced RACK would roll out everywhere within 3 years and SCE would
> produce better results than L4S over the following 15 years, would that
> change your mind?
> 
> It would for me, and that's why I'd like to see SCE explored before
> making a call.  I think at its core, it provides the same thing L4S does
> (a high-fidelity explicit congestion signal for the sender), but with
> much cleaner semantics that can be incrementally added to congestion
> controls that people are already using.
> 
> Granted, it still remains to be seen whether SCE in practice can match
> the results of L4S, and L4S was here first.  But it seems to me L4S comes
> with some problems that have not yet been examined, and that are nicely
> dodged by a SCE-based approach.
> 
> If L4S really is as good as they seem to think, I could imagine getting
> behind it, but I don't think that's proven yet.  I'm not certain, but
> all the comparative analyses I remember seeing have been from more or
> less the same team, and I'm not convinced they don't have some
> misaligned incentives of their own.
> 
> I understand a lot of work has gone into L4S, but this move to jump it
> from interesting experiment to de-facto standard without a more critical
> review that digs deeper into some of the potential deployment problems
> has me concerned.
> 
> If it really does turn out to be good enough to be permanent, I'm not
> opposed to it, but I'm just not convinced that it's non-harmful, and my
> default position is that the cleaner solution is going to be better in
> the long run, if they can do the same job.
> 
> It's not that I want it to be a fight, but I do want to end up with the
> best solution we can get.  We only have the one internet.
> 
> Just my 2c.  
> 
> -Jake