[Ecn-sane] [tsvwg] Comments on L4S drafts

Dave Taht dave.taht at gmail.com
Fri Jul 19 16:03:14 EDT 2019


On Fri, Jul 19, 2019 at 11:33 AM Wesley Eddy <wes at mti-systems.com> wrote:
>
> On 7/19/2019 11:37 AM, Dave Taht wrote:
> > It's the common-q with AQM **+ ECN** that's the sticking point. I'm
> > perfectly satisfied with the behavior of every ietf approved single
> > queued AQM without ecn enabled. Let's deploy more of those!
>
> Hi Dave, I'm just trying to make sure I'm reading into your message
> correctly ... if I'm understanding it, then you're not in favor of
> either SCE or L4S at all?

I am not in favor of internet scale deployment of *ECN* at this time.
For controlled networks it can make sense. I have, indeed, done so.

Of the two proposals for making ECN safer and more useful, SCE struck
me as superior when it appeared. L4S has seemed totally undeployable,
for a half dozen reasons, since it appeared, and it has looked
perpetually worse as more and more details and flaws fell out of the
architecture documents and were 'documented' rather than treated as
the showstoppers they were.

>  With small queues and without ECN, loss
> becomes the only congestion signal

RTT... BBR...

>, which is not desirable,

packet loss we know to work with all protocols we have on the internet
other than tcp, and thus is the most important congestion indicator we
have. Until fq_codel's essentially accidental deployment of
ecn-enablement, and apple then turning it on universally, we had
essentially no field data aside from those crazies (like me) that
fully deployed it on their corporate networks.

I do rather like SCE's addition of two new congestion signals and
retention of CE as a very strong one. I'd *really* like it,
additionally, if treating "drop and mark" as an even stronger
congestion indicator also became a thing.

And I'd like it if we did more transport level work (as is finally
happening) on just about everything and *dogfooded* the results on
real home and small business networks (as I do), and ran real
benchmarks with real loads concurrent, before unleashing such a change
to the internet.

> IMHO, or am
> I totally misunderstanding something?

Has it not been clear all these years that I don't care much for ECN
in the first place? Nor do the designers of codel? Nor everyone burned
by it the first time? That ecn advocacy is limited to a very small
passionate number of folk in the ietf?

Do any of the "ecn side" actually dogfood any of their ecn stuff, day
in and day out? I encouraged y'all years ago to convince one uni, one
lab, one reasonably large scale enterprise to go all-in on ecn, and
that has not happened? still?

Look at how much of that sort of testing went into ipv6 before it
started to deploy...

Every time I give a talk to the more general networking public -
people that should know what I'm talking about - I have to go explain
ecn, in enormous detail.

One of the most basic side-effects of ecn enablement is that I also
had to ecn-enable the babel protocol so it doesn't get starved out on
slower links. This points to bad side effects on every non-tcp-enabled
routing protocol.

>
> > If we could somehow create a neutral poll in the general networking
> > community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
> > ISPs small and large) , and do it much like your classic "vote for a
> > political measure" thing, with a single point/counterpoint section,
> > maybe we'd get somewhere.
>
> While I agree that would be really useful, it's kind of an "I want a
> pony" statement.  As a TSVWG chair where we're doing this work, we've
> been getting inputs from people that have a foot in many of the
> communities you mention, but always looking for more.

Speaking as someone very fed up with the ietf, that did try to leave a
few months back - there is one sadly optional ietf process I like -
"running code, & two interoperable implementations", that I wish had
been applied to the entire l4s process long before it got to this
point.

Public ns2 and ns3 models of pie and codel were required in the aqm
group. So was independent testing.

In the L4S process we'd also made the strong suggestion that the L4S
team go the openwrt route, just as we did for fq_codel, to be able to
look at real world problems we encountered there like TSO/GRO batching
and non-tcp applications. We still don't have anything even close to
that. L4S is essentially at a pre-2011 state in terms of the real
effects on real networks and legacy applications.

Wanting that basic stuff *running* long before it is standardized is
not "I want a pony", it's "you want a unicorn".

"Doing the work" includes doing basic stuff like that. To me it's
utterly required to have done that work before inflicting it on even
the tiniest portion of the internet. I have no idea why some ietfers
don't seem to get this.

Anyway, I'm on the verge of losing my temper again, and I really
should just stay clear of these discussions, and steer clear of the
meetings, and try to just read summary reports and code. I rather
liked the early SCE results that went by on some thread here or
another in the past week or two; even the single queue ones looked
promising, and the FQ one was to die for.....

I'm looking forward, as I've always said throughout these processes,
to *RUNNING CODE* and a chance to independently evaluate the various
new ideas on real gear. My personal and principal goal is to make wifi
(and other wireless internet tech) work better, or at least not work
worse, than what has already been deployed in 10s of millions in the
fq_codel for wifi work.

I would like it very much if the tsvwg chairs decided to enforce the
"running code, two interoperable implementations, and independent
testability" requirements that I have - and that the old ietf that I
used to like used to have - on both L4S and SCE, and on the transport
mods under test - and even then the ect(1) dispute needs to be
resolved soon.

Is there any chance we'll see my conception of the good ietf process
enforced on the L4S and SCE processes by the chairs?

I'd sleep better to then focus on what I do best, which is blowing up
ideas in the real world and making them good enough to use across the
internet.

>
>
> > In particular conflating "low latency" really confounds the subject
> > matter, and has for years. FQ gives "low latency" for the vast
> > majority of flows running below their fair share. L4S promises "low
> > latency" for a rigidly defined set of congestion controls in a
> > specialized queue, and otherwise tosses all flows into a higher latency
> > queue when one flow is greedy.
>
> I don't think this is a correct statement.  Packets have to be from a
> "scalable congestion control" to get access to the L4S queue.  There are

No, they just have to mark the right bit.

No requirement to be from a scalable congestion control is *enforceable*.

So I'd never say "packets have to be from a scalable congestion
control", but "they have to set the right bit"
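
A minimal sketch of why the bit alone decides the queue, assuming the
classifier described in the dualq drafts (ECT(1) and CE go to the low
latency queue, everything else to the classic queue). The names
`classify`, `QUEUE_L4S` and `QUEUE_CLASSIC` are made up for
illustration and are not taken from any implementation:

```c
#include <assert.h>
#include <stdint.h>

/* ECN codepoints: the low two bits of the TOS / traffic class byte (RFC 3168). */
enum ecn_codepoint { NOT_ECT = 0x0, ECT_1 = 0x1, ECT_0 = 0x2, CE = 0x3 };

/* Hypothetical queue identifiers for a two-queue AQM. */
enum queue_id { QUEUE_CLASSIC, QUEUE_L4S };

/*
 * Assumed classifier: the decision uses only the ECN bits of the packet.
 * Nothing here can tell which congestion control sent the packet.
 */
enum queue_id classify(uint8_t traffic_class)
{
    switch (traffic_class & 0x3) {
    case ECT_1:
    case CE:
        return QUEUE_L4S;
    default:            /* NOT_ECT and ECT_0 */
        return QUEUE_CLASSIC;
    }
}
```

Under these assumptions `classify(0x01)` returns `QUEUE_L4S` no matter
who sent the packet, which is the whole point.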

as for the other part, I'd re-say:

"and otherwise toss all "normal" (classic) flows into a higher
latency classic queue when one normal flow is greedy."

I don't think "have to be from a scalable congestion control" is a
correct statement. What part about how any application can, from
userspace, set:

    const int ds = 0x01;        /* Yea! let's abuse L4S! */
    rc = setsockopt(s, IPPROTO_IPV6, IPV6_TCLASS, &ds, sizeof(ds));

is unclear?
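
A slightly fuller sketch of the same trick, one that leaves any DSCP
bits alone and rewrites only the two ECN bits. `tclass_with_ecn` and
`set_ecn` are hypothetical helper names; `IPV6_TCLASS` is the socket
option from RFC 3542, and an IPv6 socket is assumed:

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* ECN codepoints occupy the low two bits of the traffic class (RFC 3168). */
enum { ECN_NOT_ECT = 0x0, ECN_ECT1 = 0x1, ECN_ECT0 = 0x2, ECN_CE = 0x3,
       ECN_MASK = 0x3 };

/* Keep the upper six (DSCP) bits and replace only the ECN bits. */
int tclass_with_ecn(int tclass, int ecn)
{
    return (tclass & ~ECN_MASK & 0xff) | (ecn & ECN_MASK);
}

/*
 * Hypothetical helper: set the ECN codepoint on an IPv6 socket.
 * Returns 0 on success and -1 on error, like setsockopt() itself.
 */
int set_ecn(int fd, int ecn)
{
    int tclass = 0;
    socklen_t len = sizeof(tclass);

    if (getsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tclass, &len) < 0)
        return -1;
    if (tclass < 0)     /* -1 means "kernel default" in RFC 3542 */
        tclass = 0;

    tclass = tclass_with_ecn(tclass, ecn);
    return setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tclass, sizeof(tclass));
}
```

With no DSCP set, `set_ecn(s, ECN_ECT1)` asks for the same traffic
class value, 0x01, as the two lines above. No privilege is involved in
either case.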

> some draft requirements for using the L4S ID, but they seem pretty
> flexible to me.  Mostly, they're things that an end-host algorithm needs
> to do in order to behave nicely, that might be good things anyways
> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> work well w/ small RTT, be robust to reordering).  I am curious which
> ones you think are too rigid ... maybe they can be loosened?

No, I don't think they are rigid enough to actually work against
mixed, real workloads!

> Also, I don't think the "tosses all flows into a higher latency queue
> when one flow is greedy" characterization is correct.  The other queue
> is for classic/non-scalable traffic, and not necessarily higher latency

"Classic" is *normal* traffic. Roughly 100% of the traffic that exists
today falls into that queue.

So I should have said - "tosses all normal ("classic") flows into a
single and higher latency queue when a greedy normal flow is present"
... "in the dualpi" case? I know it's possible to hang a different
queue algo on the "normal" queue, but to this day I don't see the need
for the l4s "fast lane" in the first place, nor a cpu efficient way of
doing the right things with the dualpi or curvyred code. What I see is
that, long term, that special bit just becomes a "fast" lane for any
sort of admission controlled traffic the ISP wants to put there,
because the dualpi idea fails on real traffic.

In my future public statements on this I'm going to give up entirely
on the newspeak.

> for a given flow, nor is winding up there related to whether another
> flow is greedy.

I'm not sure if we were talking about the same thing, but I agree that
what I wrote above was originally unclear, especially if you're mated
to the dualq concept.

>
> > So to me, it goes back to slamming the door shut, or not, on L4S's usage
> > of ect(1) as a too easily gamed e2e identifier. As I don't think it and
> > all the dependent code and algorithms can possibly scale past a single
> > physical layer tech, I'd like to see it move to a DSCP codepoint, worst
> > case... and certainly remain "experimental" in scope until anyone
> > independent can attempt to evaluate it.
>
> That seems good to discuss in regard to the L4S ID draft.  There is a
> section (5.2) there already discussing DSCP, and why it alone isn't
> feasible.  There's also more detailed description of the relation and
> interworking in
> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02

It's kind of a showstopping problem, I think, for anything but a well
controlled network.

Ship some code, do some tests, let some other people at it, get some
real results, starting with flent's rrul tests.

>
>
> > I'd really like all the tcp-go-fast-at-any-cost people to take a year off to
> > dogfood their designs, and go live somewhere with a congested network to
> > deal with daily, like a railway or airport, or on 3G network on a
> > sailboat or beach somewhere. It's not a bad life... REALLY.
> >
> Fortunately, at least in the IETF, I don't think there have been
> initiatives in the direction of going fast at any cost in recent
> history, and they would be unlikely to be well accepted if there were!
> That is at least one place that there seems to be strong consensus.

Well if the various WGs would exit that nice hotel, and form a
diaspora over the city in coffee shops and other public spaces, and do
some tests of your latest and greatest stuff, y'all might get a more
accurate viewpoint of what you are actually accomplishing. Take a look
at what BBR does, take a look at what IW10 does, take a look at what
browsers currently do.

IETF design and testing is driven too much by overly simple tests, and
not enough by real world traffic effects.

I'm not coming to this meeting and I'm not on the tsvwg list.

I'd wanted the ecn-sane list to be a nice quiet spot to be able to
think clearly about how to fix the enormous fq_codel deployment -
particularly on wifi - if we had to - far more than I'd wanted to get
embroiled in the l4s debate.

Is there any chance we'll see my conception of the good ietf process
enforced on both the L4S and SCE processes by the chairs?


>
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

