[Ecn-sane] IETF 110 quick summary

Pete Heist pete at heistp.net
Tue Mar 9 04:57:31 EST 2021


On Mon, 2021-03-08 at 23:06 -0500, Steven Blake wrote:
> If I'm a random network operator, not participating in any L4S
> experiments, and L4S traffic traversing my network hits a bottleneck,
> what happens? Consider all of the cases (no AQM tail-drop, AQM-drop,
> AQM-classic ECN).
> 
> My understanding was that TCP-Prague's classic bottleneck detection
> code wasn't fully baked.

Hi Steven, I'll take a crack at this as I see it anyway:

*No AQM tail-drop & AQM-drop*

Both _should_ be OK, since L4S transports, at least Prague, respond to
drop with a 50% multiplicative decrease (MD), barring one bug which has
been fixed. We have tested with plain tail-drop FIFOs and drop-based
AQMs, and afaik it has been safe so far, even if performance wasn't
ideal in all cases.

*AQM-classic ECN, single queue*

Severity:

L4S flows drive competing flows, ECN-capable or not, down to somewhere
around minimum cwnd. Flow completion time (FCT) for shorter flows is
also harmed, though some flows can do better if they complete before
leaving slow start (SS).
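
To make the severity concrete, here's a toy per-RTT model. It is not a
measurement and not Prague's actual code. It assumes a single RFC3168
queue that puts one CE mark on each flow in any round where the
combined cwnd exceeds a fixed capacity, a classic sender that halves on
a mark, and a DCTCP-style sender (alpha EWMA with g=1/16, as in RFC
8257) standing in for an L4S transport. The function name, the capacity
of 100 packets and the starting windows are all made up for
illustration.

```python
def simulate(rounds=500, capacity=100.0, g=1 / 16, min_cwnd=2.0):
    """Toy model: one classic and one DCTCP-style flow in a shared queue.

    Assumption (hypothetical AQM behavior): whenever the combined cwnd
    exceeds `capacity`, exactly one packet of each flow is CE-marked in
    that round. Otherwise both flows grow by one packet per round.
    """
    classic = 10.0   # cwnd of the classic (halve-on-mark) flow, in packets
    scalable = 10.0  # cwnd of the DCTCP-style flow, in packets
    alpha = 0.0      # DCTCP-style EWMA of the marked fraction

    for _ in range(rounds):
        if classic + scalable > capacity:
            # One mark out of `scalable` packets is a tiny marked fraction.
            marked_fraction = 1.0 / scalable
            alpha = (1 - g) * alpha + g * marked_fraction
            # Classic response: halve cwnd on any CE mark.
            classic = max(min_cwnd, classic / 2)
            # DCTCP-style response: reduce in proportion to alpha.
            scalable = max(min_cwnd, scalable * (1 - alpha / 2))
        else:
            # No marks this round: alpha decays, both flows add one packet.
            alpha = (1 - g) * alpha
            classic += 1
            scalable += 1
    return classic, scalable


if __name__ == "__main__":
    classic, scalable = simulate()
    print(f"classic cwnd: {classic:.1f}, scalable cwnd: {scalable:.1f}")
```

Under these assumptions the classic flow ends up pinned near min_cwnd
while the DCTCP-style flow holds most of the capacity, which is the
qualitative behavior described above. Real AQMs mark differently, so
the numbers mean nothing beyond that.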

Prevalence:

We're not sure how many single queue AQMs are enabled, so it's unclear
how often this would be a problem. Maybe rarely, but it's hard to
believe that there are zero single queue 3168 AQMs enabled out there.

*AQM-classic ECN, FQ*

Severity:

Same as AQM-classic ECN single queue, _when there is a problem_.

Prevalence:

FQ protects competing flows, unless L4S and non-L4S traffic ends up in
the same queue. This can happen with a hash collision or, maybe more
commonly, with tunneled traffic in tunnels that copy the ECN bits from
the inner header to the outer. If anyone thinks of any other reasons
we haven't considered why competing flows would share the same 5-tuple,
and thus the same queue, do mention it. :)

We've tried to get a handle on the percentage of random paths with
fq_codel deployed. In one environment we measured around 10%, but the
study was relatively small, so for the general Internet that number
could still be off by an order of magnitude either way
(https://tools.ietf.org/html/draft-heist-tsvwg-ecn-deployment-observations-02#section-3.2).
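
For a rough feel of the hash collision case only (tunnels are a
separate matter), here's a back-of-the-envelope sketch. It assumes
flows hash uniformly and independently into the queues, and it uses
1024 queues, which I believe is the Linux fq_codel default for `flows`.
The function name is mine.

```python
def collision_probability(other_flows, queues=1024):
    """Chance that one given flow shares its queue with at least one of
    `other_flows` other flows, assuming a uniform, independent hash."""
    if other_flows < 0 or queues < 1:
        raise ValueError("other_flows must be >= 0 and queues >= 1")
    # Each other flow misses our queue with probability (1 - 1/queues).
    return 1.0 - (1.0 - 1.0 / queues) ** other_flows


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(f"{n:5d} other flows: {collision_probability(n):.4f}")
```

The result is the chance that a given flow shares its queue with any
other flow at all. The chance that an L4S flow and a non-L4S flow
specifically collide also depends on the traffic mix, which this does
not model.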

Lastly, there is a performance problem, though not a safety problem:
when L4S flows traverse ANY fq_codel bottleneck they impose delays on
themselves, since they don't respond to CE in the way the AQM expects.
That leads to intra-flow latency spikes, explained here:
https://github.com/heistp/l4s-tests/#intra-flow-latency-spikes
So, this will happen on whatever percentage of paths have fq_codel, or
any other RFC3168 AQM, deployed. Delay spikes after rate reductions
can be higher in CoDel due to how the algorithm works.
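
As a sketch of why the signal looks sparse from the sender's point of
view: in its dropping (marking) state, CoDel schedules the next mark at
interval/sqrt(count) after the previous one, per the control law in RFC
8289. The code below just lists those mark times, assuming the default
100 ms interval and a queue that stays above target the whole time. The
function name is hypothetical and this is not the fq_codel source.

```python
import math


def codel_mark_times(duration, interval=0.100):
    """Times (in seconds) of successive CoDel marks within `duration`.

    Assumption: the queue stays above target for the whole period, so
    CoDel never leaves its marking state. The first mark is at t = 0.
    """
    times = []
    t = 0.0
    count = 1
    while t <= duration:
        times.append(t)
        # Control law: the gap to the next mark shrinks as 1/sqrt(count).
        t += interval / math.sqrt(count)
        count += 1
    return times


if __name__ == "__main__":
    marks = codel_mark_times(1.0)
    gaps = [b - a for a, b in zip(marks, marks[1:])]
    print(f"{len(marks)} marks in the first second")
    print("first gaps (ms):", [round(g * 1000, 1) for g in gaps[:5]])
```

The gaps shrink only with the square root of the mark count, a pace
chosen for senders that halve cwnd on each mark. A sender that reduces
in proportion to the fraction of marked packets sees relatively few
marks early on, so the queue and the delay persist while it waits for
more.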

> On Tue, 2021-03-09 at 02:13 +0000, Holland, Jake wrote:
> > The presentations were pretty great, but they were really short
> > on time.  In the chat a person or 2 was surprised about the way
> > L4S will impact NECT competing traffic when competing in a queue.
> > I agree some of the people who have tuned out the discussion are
> > learning things from these presentations, and I thought Jonathan's
> > slot was a good framing of the real question, and Pete's study was
> > also very helpful.
> > 
> > I seem to recall a thread in the wake of Apple's ECN enabling about
> > one of the Linux distros considering turning ECN on by default for
> > outbound connections, in which one of them found that it completely
> > wrecked his throughput, and so it got tabled with unfortunately
> > no pcap posted.
> > 
> > Any recollection of where that was?  I was guessing it might be
> > one of the misbehaviors from the network that Apple encountered.
> > 
> > I also thought Apple had a sysctl to disable the hold-downs and
> > always use ECN in spite of the heuristics, did that not work?
> > 
> > -Jake
> > 
> > On 3/8/21, 3:57 PM, "Dave Taht" <dave.taht at gmail.com> wrote:
> > 
> > Thx very much for the update. I wanted to note that
> > preseem does a lot of work with wisps and I wish they'd share more
> > data on it, as well as our ever present mention of free.fr.
> > 
> > Another data point is that apple's early rollout of ecn was kind of
> > a failure, and there are now so many workarounds in the os for it as
> > to make coherent testing impossible.
> > 
> > I do wish there was more work on ecn enabling bbr, as presently
> > it does negotiate ecn often and then completely ignores it. You can
> > see this in traces from dropbox in particular.
> > 
> > 
> > 
> > On Mon, Mar 8, 2021 at 3:47 PM Pete Heist <pete at heistp.net> wrote:
> > > > Just responding to Dave's ask for a quick IETF 110 summary on
> > > > ecn-sane, after one day. We presented the data on ECN at MAPRG
> > > > (https://datatracker.ietf.org/doc/draft-heist-tsvwg-ecn-deployment-observations/).
> > > > It basically just showed that ECN is in use by endpoints (more
> > > > as a proportion across paths than a proportion of flows), that
> > > > RFC3168 AQMs do exist out there and are signaling, and that the
> > > > ECN field can be misused. There weren't any questions, maybe
> > > > because we were the last to present and were already short on
> > > > time.
> > > 
> > > > We also applied that to L4S by first explaining that risk is the
> > > > product of severity and prevalence, and tried to increase the
> > > > awareness about the flow domination problem when L4S flows meet
> > > > non-L4S flows (ECN or not) in a 3168 queue. Spreading this
> > > > information seems to go slowly, as we're still hearing "oh
> > > > really?", which leads me to believe 1) that people are tuning
> > > > this debate out, and 2) it just takes a long time to comprehend,
> > > > and to believe. It's still our stance that L4S can't be deployed
> > > > due to its signalling design, or if it is, the end result is
> > > > likely to be more bleaching and confusion with the DS field.
> > > 
> > > > There was a question I'd already heard before about why fq_codel
> > > > is being deployed at an ISP, so I tried to cover that over in
> > > > tsvwg. Basically, fq_codel is not ideal for this purpose, lacking
> > > > host and subscriber fairness, but it's available and effective,
> > > > so it's a good start.
> > > 
> > > Wednesday's TSVWG session will be entirely devoted to L4S drafts.
> 
> 
> Regards,
> 
> // Steve
> 
> 
> 
> 



