[Bloat] when does the CoDel part of fq_codel help in the real world?

Dave Taht dave.taht at gmail.com
Tue Nov 27 15:10:22 EST 2018


On Mon, Nov 26, 2018 at 1:56 PM Michael Welzl <michawe at ifi.uio.no> wrote:
>
> Hi folks,
>
> That “Michael” dude was me  :)
>
> About the stuff below, a few comments. First, an impressive effort to dig all of this up - I also thought that this was an interesting conversation to have!
>
> However, I would like to point out that thesis defense conversations are meant to be provocative, by design - when I said that CoDel doesn’t usually help and long queues would be the right thing for all applications, I certainly didn’t REALLY REALLY mean that.  The idea was just to be thought provoking - and indeed I found this interesting: e.g., if you think about a short HTTP/1 connection, a large buffer just gives it a greater chance to get all packets across, and the perceived latency from the reduced round-trips after not dropping anything may in fact be less than with a smaller (or CoDel’ed) buffer.

I really did want Toke to have a hard time. Thanks for putting his
back against the wall!

And I'd rather this be a discussion of Toke's views... I do tend to
think he thinks FQ solves more than it does.... and I wish we had a
sound analysis as to why 1024 queues work so much better for us than
64 or less on the workloads we have. I tend to think in part it's
because that acts as a 1000x1 rate-shifter - but should it scale up?
Or down? Is what we did with cake (1024 set-associative) useful? or
excessive? I'm regularly seeing 64,000 queues on 10Gig and up hardware
due to 64 hardware queues and fq_codel on each, on that sort of gear.
I think that's too much and renders the aqm ineffective, but lack
data...
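
One rough way to look at the queue-count question is hash collisions.
The sketch below is not an analysis of fq_codel or cake themselves: it
assumes flows are hashed uniformly and independently into queues (no
set-associative handling), and the flow count of 78 is only an
illustrative load. The function name is hypothetical.

```python
def collision_probability(flows: int, queues: int) -> float:
    """Chance that one given flow shares its queue with at least one
    of the other flows, assuming a uniform, independent hash."""
    if flows < 1 or queues < 1:
        raise ValueError("flows and queues must be positive")
    return 1.0 - (1.0 - 1.0 / queues) ** (flows - 1)


if __name__ == "__main__":
    flows = 78  # assumed load, for illustration only
    for queues in (64, 1024):
        p = collision_probability(flows, queues)
        print(f"{queues:5d} queues: {p:.1%} chance of sharing a queue")
    # 64 hardware queues, each with its own 1024-queue fq_codel instance
    print("total queues:", 64 * 1024)
```

Under these assumptions a flow is far more likely to share a queue
with 64 queues than with 1024, which is one possible part of the
explanation, not a complete one.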

but, to rant a bit...

I tend to believe FQ solves 97% of the problem, AQM 2.9% and ECN .09%.

BUT: Amdahl's law says once you reduce one part of the problem to 0,
everything else takes 100%. :)
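
The arithmetic behind that quip can be sketched as follows. The
percentages are the guesses quoted above, not measurements, and the
function name is hypothetical.

```python
def remaining_shares(shares: dict, solved: str) -> dict:
    """Drop the solved part and rescale the rest so it sums to 100%."""
    rest = {name: value for name, value in shares.items() if name != solved}
    total = sum(rest.values())
    return {name: 100.0 * value / total for name, value in rest.items()}


if __name__ == "__main__":
    guesses = {"FQ": 97.0, "AQM": 2.9, "ECN": 0.09}  # figures from the text
    for name, share in remaining_shares(guesses, "FQ").items():
        print(f"{name}: {share:.1f}% of what is left")
```

With FQ taken out of the picture, AQM becomes nearly all of the
remaining problem.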

It often seems like I, being the sole and very lonely FQ advocate
here in 2011, have reversed the situation (in this group!), and I'm
oft the AQM advocate *here* now.

It's sort of like all the people quoting the e2e argument still, back
at me when dave reed (at least, and perhaps the other co-authors now)
have bought into this level of network interference between the
endpoints, and had no religion - or the red in a different light paper
being rejected because it attempted to overturn other religion - and
I'll be damned if I'll let fq_codel, sch_fq, pie, l4s, scream, nada,

I admit to getting kind of crusty and set in my ways, but so long as
people put code in front of me along with the paper, I still think,
when the facts change, so do my opinions.

Pacing is *really impressive* and I'd like to see that enter
everything, not just in packet processing - I've been thinking hard
about the impact of cpu bursts (like resizing a hash table), and other
forms of work that we currently do on computers that have a
"dragster-like" peak performance, and a great average, but horrible
pathologies - and I think the world would be better off if we built
more

Anyway...

Once you have FQ and a sound outer limit on buffer size (100ms),
depredations like comcast's 680ms buffers no longer matter. There's
still plenty of room to innovate. BBR works brilliantly vs fq_codel
(and you can even turn ECN on which it doesn't respect and still get a
great result). LoLa would probably work well also 'cept that the git
tree was busted when I last tried it and it hasn't been tested much in
the 1Mbit-1Gbit range.
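
For scale, a time-based buffer limit translates into bytes as rate
times delay. The sketch below assumes the limit is applied at the
bottleneck rate; the function names are hypothetical and the rates are
just the two ends of the 1Mbit-1Gbit range mentioned above.

```python
def buffer_limit_bytes(rate_bps: int, delay_ms: int) -> int:
    """Bytes that drain in delay_ms at rate_bps (8 bits per byte)."""
    return rate_bps * delay_ms // 8000


def drain_time_ms(buffer_bytes: int, rate_bps: int) -> float:
    """Worst-case queueing delay of a full buffer draining at rate_bps."""
    return buffer_bytes * 8000 / rate_bps


if __name__ == "__main__":
    for rate in (1_000_000, 1_000_000_000):
        limit = buffer_limit_bytes(rate, 100)
        print(f"{rate} bit/s: a 100 ms limit is {limit} bytes")
```

The same byte count means very different delays at different rates,
which is why the limit is stated in time rather than in packets.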

>
> But corner cases aside, in fact I very much agree with the answers to my question Pete gives below, and also with the points others have made in answering this thread. Jonathan Morton even mentioned ECN - after Dave’s recent over-reaction to ECN I made a point of not bringing up ECN *yet* again

Not going to go into it (much) today! We ended up starting another
project on ECN that operates under my core ground rule - "show me
the code" - and life over there and on that mailing list has been
pleasantly quiet. https://www.bufferbloat.net/projects/ecn-sane/wiki/
.

I did get back on the tsvwg mailing list recently because of some
ludicrously inaccurate misstatements about fq_codel. I also made a
strong appeal to the l4s people, to, in general, "stop thanking me" in
their documents. To me that reads as an endorsement, where all I did
was participate in the process until I gave up and hit my "show me the
code" moment - which was about 5 years ago and hasn't moved the
needle since except in mutating standards documents.

The other document I didn't like was an arbitrary attempt to just set
the ecn backoff figure to .8 when the sanest thing, given the
deployment, and pacing... was to aim for a right number - anyway.....
in that case I just wanted off the "thank you" list.

I like to think the more or less rfc3168 compliant deployment of ecn
is thus far going brilliantly, but lack data. Certainly would like a
hostile reviewer's evaluation of cake's ecn method and for that matter,
pie's, honestly - from real traffic! There's an RFC-compliant version
of Pie being pushed into the kernel after it gets through some of
Stephen's nits.

And I'd really prefer all future discussions of "ecn benefits" to come
with code and data and be discussed over on the ecn-sane mailing list,
or *not discussed here* if no code is available.

>, but… yes indeed, being able to use ECN to tell an application to back off instead of requiring to drop a packet is also one of the benefits.

One thus far misunderstood and under-analyzed aspect of our work is
the switch to head dropping.

To me the switch to head dropping essentially killed the tail loss RTO
problem and eliminated most of the need for ecn. Forward progress and
prompt signalling always happens. That otherwise wonderful piece
stuart cheshire did at apple elided the actual dropping mode version
of fq_codel, which as best as I recall was about 12? 15ms? long and
totally invisible to the application.
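
A toy model of the difference, as a sketch only: it drops on queue
overflow, whereas CoDel decides drops at dequeue time from packet
sojourn time, so this shows nothing more than which end of the queue
loses. The function names and the burst and limit sizes are
hypothetical.

```python
from collections import deque


def offer(queue: deque, packet: int, limit: int, head_drop: bool):
    """Enqueue packet; return the packet dropped to make room, if any."""
    if len(queue) < limit:
        queue.append(packet)
        return None
    if head_drop:
        oldest = queue.popleft()  # drop the oldest packet, keep the new one
        queue.append(packet)
        return oldest
    return packet  # tail drop: the arriving packet is discarded


def run_burst(packets: int, limit: int, head_drop: bool):
    """Feed a burst into an idle queue; return (queued, dropped)."""
    queue, dropped = deque(), []
    for seq in range(packets):
        victim = offer(queue, seq, limit, head_drop)
        if victim is not None:
            dropped.append(victim)
    return list(queue), dropped


if __name__ == "__main__":
    for mode in (False, True):
        queued, dropped = run_burst(packets=6, limit=4, head_drop=mode)
        label = "head drop" if mode else "tail drop"
        print(f"{label}: queued={queued} dropped={dropped}")
```

With tail drop the end of the burst is lost and nothing follows it to
reveal the gap, so the sender may have to wait for a retransmission
timeout. With head drop the later packets still arrive behind the gap,
so the loss is signalled promptly.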

> (I think people easily miss the latency benefit of not dropping a packet, and thereby eliminating head-of-line blocking - packet drops require an extra RTT for retransmission, which can be quite a long time. This is about measuring latency at the right layer...)

see above. And yea, perversely, I agree with your last statement. A
slashdot web page download takes 78 separate flows and 2.2 seconds to
complete. Worst case completion time - if you had *tail* loss - would
be about 80ms longer than that, on a tiny fraction of loads. The rest
of it is absorbed into those 2.2 seconds.

EVEN with http 2.0, I would be extremely surprised to learn that many
websites fit it all into one tcp transaction.

There are very few other examples of TCP traffic requiring a low
latency response. I happen to be very happy with the ecn support in
mosh btw, not that anybody's ever looked at it since we did it.

> BTW, Anna Brunstrom was also very quick to also give me the HTTP/2.0 example in the break after the defense. Also, TCP will generally not work very well when queues get very long… the RTT estimate gets way off.

I like to think that the syn/ack and ssl negotiation handshake under
fq_codel gives a much more accurate estimate of actual RTT than we
ever had before.

> All in all, I think this is a fun thought to consider for a bit, but not really something worth spending people’s time on, IMO: big buffers are bad, period. All else are corner cases.

I've said it elsewhere, and perhaps we should resume, but an RFC
merely stating the obvious about maximal buffer limits and getting
ISPs to do that would be a boon.

> I’ll use the opportunity to tell folks that I was also pretty impressed with Toke’s thesis as well as his performance at the defense. Among the many cool things he’s developed (or contributed to), my personal favorite is the airtime fairness scheduler. But, there were many more. Really good stuff.

I so wish the world had about 1000 more Tokes in training. How can we
make that happen?

>
> With that, I wish all the best to all you bloaters out there - thanks for reducing our queues!
>
> Cheers,
> Michael
>
>
> On Nov 26, 2018, at 8:08 PM, Pete Heist <pete at heistp.net> wrote:
>
> In Toke’s thesis defense, there was an interesting exchange with examination committee member Michael (apologies for not catching the last name) regarding how the CoDel part of fq_codel helps in the real world:
>
> https://www.youtube.com/watch?v=upvx6rpSLSw&t=2h16m20s
>
> My attempt at a transcript is at the end of this message. (I probably won’t attempt a full defense transcript, but if someone wants more of a particular section I can try. :)
>
> So I just thought to continue the discussion- when does the CoDel part of fq_codel actually help in the real world? I’ll speculate with a few possibilities:
>
> 1) Multiplexed HTTP/2.0 requests containing both a saturating stream and interactive traffic. For example, a game that uses HTTP/2.0 to download new map data while position updates or chat happen at the same time. Standalone programs could use HTTP/2.0 this way, or for web apps, the browser may multiplex concurrent uses of XHR over a single TCP connection. I don’t know of any examples.
>
> 2) SSH with port forwarding while using an interactive terminal together with a bulk transfer?
>
> 3) Does CoDel help the TCP protocol itself somehow? For example, does it speed up the round-trip time when acknowledging data segments, improving behavior on lossy links? Similarly, does it speed up the TCP close sequence for saturating flows?
>
> Pete
>
> ---
>
> M: In fq_codel what is really the point of CoDel?
> T: Yeah, uh, a bit better intra-flow latency...
> M: Right, who cares about that?
> T: Apparently some people do.
> M: No I mean specifically, what types of flows care about that?
> T: Yeah, so, um, flows that are TCP based or have some kind of- like, elastic flows that still want low latency.
> M: Elastic flows that are TCP based that want low latency...
> T: Things where you want to discover the- like, you want to utilize the full link and sort of probe the bandwidth, but you still want low latency.
> M: Can you be more concrete what kind of application is that?
> T: I, yeah, I…
> M: Give me any application example that’s gonna benefit from the CoDel part- CoDel bits in fq_codel? Because I have problems with this.
> T: I, I do too... So like, you can implement things this way but equivalently if you have something like fq_codel you could, like, if you have a video streaming application that interleaves control…
> M: <inaudible> that runs on UDP often.
> T: Yeah, but I, Netflix…
> M: Ok that’s a long way… <inaudible>
> T: No, I tend to agree with you that, um…
> M: Because the biggest issue in my opinion is, is web traffic- for web traffic, just giving it a huge queue makes the chance bigger that uh, <inaudible, ed: because of the slow start> so you may end up with a (higher) faster completion time by buffering a lot. Uh, you’re not benefitting at all by keeping the queue very small, you are simply <inaudible> Right, you’re benefitting altogether by just <inaudible> which is what the queue does with this nice sparse flow, uh… <inaudible>
> T: You have the infinite buffers in the <inaudible> for that to work, right. One benefit you get from CoDel is that - you screw with things like - you have to drop eventually.
> M: You should at some point. The chances are bigger that the small flow succeeds (if given a huge queue). And, in web surfing, why does that, uh(?)
> T: Yeah, mmm...
> M: Because that would be an example of something where I care about latency but I care about low completion. Other things where I care about latency they often don’t send very much. <inaudible...> bursts, you have to accommodate them basically. Or you have interactive traffic which is UDP and tries to, often react from queueing delay <inaudible>. I’m beginning to suspect that fq minus CoDel is really the best <inaudible> out there.
> T: But if, yeah, if you have enough buffer.
> M: Well, the more the better.
> T: Yeah, well.
> M: Haha, I got you to say yes. [laughter :] That goes in history. I said the more the better and you said yeah.
> T: No but like, it goes back to good-queue bad-queue, like, buffering in itself has value, you just need to manage it.
> M: Ok.
> T: Which is also the reason why just having a small queue doesn’t help in itself.
> M: Right yeah. Uh, I have a silly question about fq_codel, a very silly one and there may be something I missed in the papers, probably I did, but I'm I was just wondering I mean first of all this is also a bit silly in that <inaudible> it’s a security thing, and I think that’s kind of a package by itself silly because fq_codel often probably <inaudible> just in principle, is that something I could easily attack by creating new flows for every packet?
> T: No because, they, you will…
> M: With the sparse flows, and it’s gonna…
> T: Yeah, but at some point you’re going to go over the threshold, I, you could, there there’s this thing where the flow goes in, it’s sparse, it empties out and then you put it on the normal round robin implementation before you queue <inaudible> And if you don’t do that then you can have, you could time packets so that they get priority just at the right time and you could have lockout.
> M: Yes.
> T: But now you will just fall back to fq.
> M: Ok, it was just a curiosity, it’s probably in the paper. <inaudible>
> T: I think we added that in the RFC, um, you really need to, like, this part is important.
>
> _______________________________________________
> Bloat mailing list
> Bloat at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
