[Ecn-sane] [tsvwg] Comments on L4S drafts

Dave Taht dave.taht at gmail.com
Fri Jul 19 19:42:54 EDT 2019


On Fri, Jul 19, 2019 at 3:09 PM Wesley Eddy <wes at mti-systems.com> wrote:
>
> Hi Dave, thanks for clarifying, and sorry if you're getting upset.

There have been a few other disappointments this ietf. I'd hoped bbrv2
would land for independent testing. Didn't.

https://github.com/google/bbr

I have some "interesting" patches for bbrv1, but felt it would be saner
to wait for the most current version (or for the bbrv2 authors to test
the small rfc3168 baseline patch I'd requested, rather than me) before
bothering to redo that series of tests and publish.

I'd asked if the dctcp and dualpi code on github was stable enough to
be independently tested. No reply.

The SCE folk did freeze and document a release worth testing.

I did some testing on wifi at battlemesh, but it was too noisy (though
the sources of "noise" were important), and it was too obvious that
"ecn is not the wifi problem".

I didn't know there was an "add a delay based option to cubic patch"
until last week.

So anyway, I do retain hope, maybe after this coming week and some
more hackathoning, it might be possible to start getting reproducible
and repeatable results from more participants in this controversy.
Having to sit through another half-dozen presentations with
irreproducible results is not something I look forward to, and I'm
glad I don't have to.

> When we're talking about keeping very small queues, then RTT is lost as
> a congestion indicator (since there is no queue depth to modulate as a
> congestion signal into the RTT).  We have indicators that include drop,
> RTT, and ECN (when available).  Using rate of marks rather than just
> binary presence of marking gives a finer-grained signal.  SCE is also
> providing a multi-level indication, so that's another way to get more
> "ENOB" into the samples of congestion being fed to the controllers.

While this is extremely well said, RTT is NOT lost as a congestion
indicator, it just becomes finer grained.
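
To make "finer grained" concrete, here is a minimal sketch, assuming
the sender can timestamp RTT samples at microsecond resolution and
treats the smallest RTT seen so far as the unloaded path delay. The
function name and the sample values are hypothetical, chosen only for
illustration; this is not what any particular stack does.

```python
def queue_delay_estimates(rtt_samples_us):
    """Return (sample - running minimum RTT) for each RTT sample.

    Assumption: the running minimum approximates the base (unloaded)
    RTT, so the remainder is a rough estimate of queueing delay.
    All values are in microseconds.
    """
    base_rtt = None
    estimates = []
    for rtt in rtt_samples_us:
        # Track the smallest RTT observed so far as the base RTT.
        base_rtt = rtt if base_rtt is None else min(base_rtt, rtt)
        estimates.append(rtt - base_rtt)
    return estimates


if __name__ == "__main__":
    # Hypothetical samples from a ~10 ms path with a very short queue.
    samples = [10000, 10050, 10200, 10020]
    print(queue_delay_estimates(samples))
```

With a short queue the estimates are tens to hundreds of microseconds
rather than milliseconds, so the signal is still there, but only if the
timestamps and the path are quiet enough to resolve it.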

While I'm reading tea-leaves... there's been a lot of stuff landing in
the linux kernel from google around edf scheduling for tcp and the
hardware enabled pacing qdiscs. So I figure they are now in the nsec
category on their stuff but not ready to be talking.

> Marking (whether classic ECN, mark-rate, or multi-level marking) is
> needed since with small queues there's lack of congestion information in
> the RTT.

small queues *and isochronous, high speed, wired connections*.

What will it take to get the ecn and especially l4s crowd to take a
hard look at actual wireless or wifi packet captures? I mean, y'all
are sitting staring into your laptops for a week, doing wifi. Would it
hurt to test more actual transports during that time?

How many ISPs would still be in business if wifi didn't exist, only {X}G?

the wifi at the last ietf sucked...

Can't even get close to 5ms latencies on any form of wireless/wifi.

Anyway, I long ago agreed that multiple marks (of some sort) per rtt
made sense (see my position statements on ecn-sane), but of late I've
been leaning more towards really good pacing, rtt, and chirping, with
minimal marking required on "small queues *and isochronous, high
speed, wired connections*".
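
For readers who want the "multiple marks per rtt" idea in concrete
form, here is a minimal sketch contrasting a classic rfc3168-style
response (any mark in an rtt halves the window) with a DCTCP-style
response that scales with the fraction of marked packets, using the
update formulas from RFC 8257. The function names and the minimum
window of 2 packets are assumptions made for illustration; this is not
the L4S or SCE code.

```python
def classic_ecn_response(cwnd, any_mark_this_rtt):
    """rfc3168-style: treat any mark in the rtt like a loss, halve cwnd."""
    if any_mark_this_rtt:
        return max(cwnd / 2.0, 2.0)  # floor of 2 packets is an assumption
    return cwnd


def update_alpha(alpha, marked, acked, g=1.0 / 16):
    """RFC 8257: alpha = (1 - g) * alpha + g * F, F = marked fraction."""
    fraction = marked / acked if acked else 0.0
    return (1.0 - g) * alpha + g * fraction


def dctcp_response(cwnd, alpha):
    """RFC 8257: cwnd = cwnd * (1 - alpha / 2), once per rtt with marks."""
    return max(cwnd * (1.0 - alpha / 2.0), 2.0)


if __name__ == "__main__":
    cwnd = 100.0
    # One rtt in which every packet was marked, starting from alpha = 0.
    alpha = update_alpha(0.0, marked=10, acked=10)
    print("classic:", classic_ecn_response(cwnd, True))
    print("dctcp-style:", dctcp_response(cwnd, alpha))
```

The binary response carries one bit per rtt; the mark-fraction
response carries a smoothed estimate, which is the finer-grained signal
being argued about here.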

>
> To address one question you repeated a couple times:
>
> > Is there any chance we'll see my conception of the good ietf process
> > enforced on the L4S and SCE processes by the chairs?
>
> We look for working group consensus.  So far, we saw consensus to adopt
> as a WG item for experimental track, and have been following the process
> for that.

Well, given the announcement of docsis low latency, and the size of
the fq_codel deployment, and the l4s/sce drafts, we are light-years
beyond anything I'd consider to be "experimental" in the real world.

Would recognizing this reality and somehow converting this to a
standards track debate within the ietf help anything?

Would getting this out of tsvwg and restarting aqmwg help any?

I was, up until all this blew up in December, planning on starting the
process for an rfc8289bis and rfc8290bis on the standards track.

>
> On the topic of gaming the system by falsely setting the L4S ID, that
> might need to be discussed a little bit more, since now that you mention
> it, the docs don't seem to very directly address it yet.

To me this has always been a game theory deal killer for l4s (and
diffserv, intserv, etc). You cannot ask for more priority, only less.
While I've been recommending books from kleinrock lately, another one
that I think everyone in this field should have is:

https://www.amazon.com/Theory-Games-Economic-Behavior-Commemorative-ebook/dp/B00AMAZL4I/ref=sr_1_1?keywords=theory+of+games+and+economic+behavior&qid=1563579161&s=gateway&sr=8-1

I've read it countless times (and can't claim to have understood more
than a tiny percentage of it). I wasn't aware until this moment there
was a kindle edition.

> I can only
> speak for myself, but assumed a couple things internally, such as (1)
> this is getting enabled in specific environments, (2) in less controlled
> environments, an operator enabling it has protections in place for
> getting admission or dealing with bad behavior, (3) there could be
> further development of audit capabilities such as in CONEX, etc.  I
> guess it could be good to hear more about what others were thinking on this.

I think there was "yet another queue" suggested for detected bad behavior.

>
> > So I should have said - "tosses all normal ("classic") flows into a
> > single and higher latency queue when a greedy normal flow is present"
> > ... "in the dualpi" case? I know it's possible to hang a different
> > queue algo on the "normal" queue, but
> > to this day I don't see the need for the l4s "fast lane" in the first
> > place, nor a cpu efficient way of doing the right things with the
> > dualpi or curvyred code. What I see, is, long term, that special bit
> > just becomes a "fast" lane for any sort of admission controlled
> > traffic the ISP wants to put there, because the dualpi idea fails on
> > real traffic.
>
> Thanks; this was helpful for me to understand your position.

Groovy.

I recently ripped ecn support out of fq_codel entirely, in the
fq_codel_fast tree. It saved some cpu; still measuring (my real
objective is to make that code multicore).

Another branch also has the basic sce support, and will have more
after jon settles on a ramp and single queue fallbacks in
sch_cake. Btw, if anyone cares, there are more than a few flent test
servers scattered around the internet now that do some variant of sce
for others to play with....

>
>
> > Well if the various WGs would exit that nice hotel, and form a
> > diaspora over the city in coffee shops and other public spaces, and do
> > some tests of your latest and greatest stuff, y'all might get a more
> > accurate viewpoint of what you are actually accomplishing. Take a look
> > at what BBR does, take a look at what IW10 does, take a look at what
> > browsers currently do.
>
> All of those things come up in the meetings, and frequently there is
> measurement data shown and discussed.  It's always welcome when people
> bring measurements, data, and experience.  The drafts and other
> contributions are here so that anyone interested can independently
> implement and do the testing you advocate and share results.  We're all
> on the same team trying to make the Internet better.

Skip a meeting. Try the internet in Bali. Or Africa. Or South America.
Or on a boat. Or do an interim in places like that.

>
>


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

