From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail-bk0-f43.google.com (mail-bk0-f43.google.com [209.85.214.43])
 (using TLSv1 with cipher RC4-SHA (128/128 bits))
 (Client CN "smtp.gmail.com", Issuer "Google Internet Authority" (verified OK))
 by huchra.bufferbloat.net (Postfix) with ESMTPS id 5CC1421F15E;
 Wed, 28 Nov 2012 08:16:11 -0800 (PST)
Received: by mail-bk0-f43.google.com with SMTP id jf20so6593655bkc.16 for ;
 Wed, 28 Nov 2012 08:16:09 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=9upvNeC25q/gtfSaBTkpkYXdbqZNijZOU+1risAdbwk=;
 b=jhdzFTt1ECml+lPg2XZmoUEXVOjwPlX5I3aq4g/xSxki9WTm8/n23JRkD/QC2zupy4
 C0lmdrEoxoNKPw1WXcnLCFD2plwfdI2miw2Kj1KxRI+0A+uJAO/MfBJfOsNWoJATbO3n
 HwTVoEQl2YWaUaQYWAFnyuAB+F9yugeMUEOQV2bxCUXFFkr+ZWzMxvK4qlN2qPbtxEGZ
 PRhvM8R8ERvqSDWlPR0y7R2F87/vAH6kW2eFr+FFWvEl/uKLsu66g0RF7YN79jJWyKBs
 yVT9ZEEWPa134VjL1UDK/F+AMOtL1s7+la/qXycBmRLyMLY8SwEWs+bnbIR1OG4WMK0H
 mUaA==
MIME-Version: 1.0
Received: by 10.204.146.92 with SMTP id g28mr5783034bkv.127.1354119368741;
 Wed, 28 Nov 2012 08:16:08 -0800 (PST)
Received: by 10.204.41.196 with HTTP; Wed, 28 Nov 2012 08:16:08 -0800 (PST)
In-Reply-To: <20121128160133.GA16995@linux.vnet.ibm.com>
References: <20121127224915.GM2474@linux.vnet.ibm.com>
 <20121128002710.GS2474@linux.vnet.ibm.com>
 <50B5887C.7010605@pollere.com>
 <20121128043838.GX2474@linux.vnet.ibm.com>
 <20121128160133.GA16995@linux.vnet.ibm.com>
Date: Wed, 28 Nov 2012 18:16:08 +0200
From: Jonathan Morton
To: paulmck@linux.vnet.ibm.com
Content-Type: multipart/alternative; boundary=0015174a0456f5142304cf907b3b
Cc: Paolo Valente, Toke Høiland-Jørgensen, Eric Raymond,
 "codel@lists.bufferbloat.net", "cerowrt-devel@lists.bufferbloat.net",
 bloat, John Crispin
Subject: Re: [Codel] [Cerowrt-devel] FQ_Codel lwn draft article review
X-BeenThere: codel@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: CoDel AQM discussions
X-List-Received-Date: Wed, 28 Nov 2012 16:16:12 -0000

--0015174a0456f5142304cf907b3b
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

It may be worth noting that fq-codel is not stochastic in its fairness mechanism. SFQ suffers from the birthday effect because it hashes packets into buffers, which is what makes it stochastic.
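
As a rough illustration of the birthday effect in hash-based queueing, the sketch below computes the chance that at least two of n active flows share a bucket when each flow is hashed uniformly at random. The function name and the 1024-bucket figure are assumptions chosen for illustration, not values taken from any particular SFQ configuration.

```python
def collision_probability(flows: int, buckets: int) -> float:
    """Probability that at least two of `flows` land in the same bucket,
    assuming each flow is hashed independently and uniformly."""
    if flows > buckets:
        return 1.0  # pigeonhole: a collision is certain
    p_no_collision = 1.0
    for already_placed in range(flows):
        # The next flow must avoid every bucket that is already occupied.
        p_no_collision *= (buckets - already_placed) / buckets
    return 1.0 - p_no_collision


if __name__ == "__main__":
    # 1024 buckets is an assumed, illustrative table size.
    for n in (10, 40, 100):
        print(n, "flows:", round(collision_probability(n, 1024), 3))
```

The probability grows roughly with the square of the flow count, which is why collisions appear long before the number of flows approaches the number of buckets.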

- Jonathan Morton

On Nov 28, 2012 6:02 PM, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
Dave gave me back the pen, so I looked to see what I had expanded
FQ-CoDel to. The answer was... Nothing. Nothing at all.

So I added a Quick Quiz as follows:

        Quick Quiz 2: What does the FQ-CoDel acronym expand to?

        Answer: There are some differences of opinion on this. The
                comment header in net/sched/sch_fq_codel.c says
                “Fair Queue CoDel” (presumably by analogy to SFQ's
                expansion of “Stochastic Fairness Queueing”), and
                “CoDel” is generally agreed to expand to “controlled
                delay”. However, some prefer “Flow Queue Controlled
                Delay” and still others prefer to prepend a silent and
                invisible "S", expanding to “Stochastic Flow Queue
                Controlled Delay” or “Smart Flow Queue Controlled
                Delay”. No doubt additional expansions will appear in
                the fullness of time.

                In the meantime, this article focuses on the concepts,
                implementation, and performance, leaving naming debates
                to others.

This level of snarkiness would go over reasonably well in an LWN article.
I would -not- suggest this approach in an academic paper, just in case
you were wondering. But if there is too much discomfort with snarking,
I just might be convinced to take another approach.

                                                        Thanx, Paul

On Tue, Nov 27, 2012 at 08:38:38PM -0800, Paul E. McKenney wrote:
> I guess I just have to be grateful that people mostly agree on the acronym,
> regardless of the expansion.
>
>                                                       Thanx, Paul
>
> On Tue, Nov 27, 2012 at 07:43:56PM -0800, Kathleen Nichols wrote:
> >
> > It would be me that tries to say "stochastic flow queuing with CoDel"
> > as I like to be accurate. But I think FQ-Codel is Flow queuing with CoDel.
> > JimG suggests "smart flow queuing" because he is ever mindful of the
> > big audience.
> >
> > On 11/27/12 4:27 PM, Paul E. McKenney wrote:
> > > On Tue, Nov 27, 2012 at 04:53:34PM -0700, Greg White wrote:
> > >> BTW, I've heard some use the term "stochastic flow queueing" as a
> > >> replacement to avoid the term "fair". Seems like a more apt term anyway.
> > >
> > > Would that mean that FQ-CoDel is Flow Queue Controlled Delay? ;-)
> > >
> > >                                                   Thanx, Paul
> > >
> > >> -Greg
> > >>
> > >>
> > >> On 11/27/12 3:49 PM, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >>
> > >>> Thank you for the review and comments, Jim! I will apply them when
> > >>> I get the pen back from Dave. And yes, that is the thing about
> > >>> "fairness" -- there are a great many definitions, many of the most
> > >>> useful of which appear to many to be patently unfair. ;-)
> > >>>
> > >>> As you suggest, it might well be best to drop discussion of fairness,
> > >>> or to at the least supply the corresponding definition.
> > >>>
> > >>>                                                 Thanx, Paul
> > >>>
> > >>> On Tue, Nov 27, 2012 at 05:03:02PM -0500, Jim Gettys wrote:
> > >>>> Some points worth making:
> > >>>>
> > >>>> 1) It is important to point out that (and how) fq_codel avoids
> > >>>> starvation: unpleasant as elephant flows are, it would be very
> > >>>> unfriendly to never service them at all until they time out.
> > >>>>
> > >>>> 2) "fairness" is not necessarily what we ultimately want at all; you'd
> > >>>> really like to penalize those who induce congestion the most. But we
> > >>>> don't currently have a solution (though Bob Briscoe at BT thinks he
> > >>>> does, and is seeing if he can get it out from under a BT patent), so
> > >>>> the current fq_codel round robins ultimately until/unless we can do
> > >>>> something like Bob's idea. This is a local-information-only subset of
> > >>>> the ideas he's been working on in the congestion exposure (conex)
> > >>>> group at the IETF.
> > >>>>
> > >>>> 3) "fairness" is always in the eyes of the beholder (and should be left
> > >>>> to the beholder to determine). "fairness" depends on where in the
> > >>>> network you are. While being "fair" among TCP flows is a sensible
> > >>>> default policy for a host, elsewhere in the network it may not
> > >>>> be/usually isn't.
> > >>>>
> > >>>> Two examples:
> > >>>> o at a home router, you probably want to be "fair" according to transmit
> > >>>> opportunities. We really don't want a single system remote from the
> > >>>> router to be able to starve the network so that devices near the router
> > >>>> get much less bandwidth than you might hope/expect.
> > >>>>
> > >>>> What is more, you probably want to account for a single host using many
> > >>>> flows, and regulate that they not be able to "hog" bandwidth in the home
> > >>>> environment, but only use their "fair" share.
> > >>>>
> > >>>> o at an ISP, you must be "fair" between customers; it is best to leave
> > >>>> the judgement of "fairness" at finer granularity (e.g. host and TCP
> > >>>> flows) to the points closer to the customer's systems, so that they can
> > >>>> enforce whatever definition of "fair" they need to, themselves.
> > >>>>
> > >>>>
> > >>>> Algorithms like fq_codel can be/should be adjusted to the circumstances.
> > >>>>
> > >>>> And therefore exactly what you choose to hash against to form the
> > >>>> buckets will vary depending on where you are. That at least one step
> > >>>> (at the user's device) of this be TCP flow "fair" does have the great
> > >>>> advantage of helping the RTT unfairness problem that violates the
> > >>>> principle of "least surprise", such as that routinely seen in places
> > >>>> like New Zealand.
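
To make the point about choosing what to hash against concrete, here is a minimal sketch in which the same bucketing routine yields per-flow or per-host queues depending only on which packet fields form the key. The field names, the key tuples, and the use of CRC32 are all assumptions for illustration; this is not the hash that fq_codel itself uses in the kernel.

```python
import zlib

# Hypothetical key choices: which packet fields decide the bucket.
FLOW_KEY = ("src", "dst", "proto", "sport", "dport")  # per-flow buckets
HOST_KEY = ("src",)                                   # per-source-host buckets


def bucket_for(packet: dict, key_fields: tuple, buckets: int = 1024) -> int:
    """Map a packet to a bucket index using only the chosen key fields."""
    key = "|".join(str(packet[field]) for field in key_fields).encode()
    # CRC32 stands in for whatever hash a real implementation would use.
    return zlib.crc32(key) % buckets


if __name__ == "__main__":
    # Two flows from the same (example) host, differing only in source port.
    a = {"src": "192.0.2.10", "dst": "198.51.100.7", "proto": 6,
         "sport": 40000, "dport": 80}
    b = dict(a, sport=40001)
    print("per-flow buckets:", bucket_for(a, FLOW_KEY), bucket_for(b, FLOW_KEY))
    print("per-host buckets:", bucket_for(a, HOST_KEY), bucket_for(b, HOST_KEY))
```

With the host key, every flow from one machine shares a single queue, so opening many flows no longer buys a larger share; with the flow key, each flow is scheduled on its own.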
> > >>>>
> > >>>> This is why I have so many problems using the word "fair" near this
> > >>>> algorithm. "fair" is impossible to define, overloaded in people's mind
> > >>>> with TCP fair queuing, not even desirable much of the time, and by
> > >>>> definition and design, even today's fq_codel isn't fair to lots of
> > >>>> things, and the same basic algorithm can/should be tweaked in lots of
> > >>>> directions depending on what we need to do. Calling this "smart"
> > >>>> queuing or some such would be better.
> > >>>>
> > >>>> When you've done another round on the document, I'll do a more detailed
> > >>>> read.
> > >>>>                               - Jim
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Fri, Nov 23, 2012 at 5:18 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > >>>>
> > >>>>> On Fri, Nov 23, 2012 at 09:57:34AM +0100, Dave Taht wrote:
> > >>>>>> David Woodhouse and I fiddled a lot with adsl and openwrt and a
> > >>>>>> variety of drivers and network layers in a typical bonded adsl stack
> > >>>>>> yesterday. The complexity of it all makes my head hurt. I'm happy that
> > >>>>>> a newly BQL'd ethernet driver (for the geos and qemu) emerged from it,
> > >>>>>> which he submitted to netdev...
> > >>>>>
> > >>>>> Cool!!! ;-)
> > >>>>>
> > >>>>>> I made a recording of us last night discussing the layers, which I
> > >>>>>> will produce and distribute later...
> > >>>>>>
> > >>>>>> Anyway, along the way, we fiddled a lot with trying to analyze where
> > >>>>>> the 350ms or so of added latency was coming from in the traverse geo's
> > >>>>>> adsl implementation and overlying stack....
> > >>>>>>
> > >>>>>> Plots: http://david.woodhou.se/dwmw2-netperf-plots.tar.gz
> > >>>>>>
> > >>>>>> Note 1:
> > >>>>>>
> > >>>>>> The netperf sample rate on the rrul test needs to be higher than
> > >>>>>> 100ms in order to get a decent result at sub 10Mbit speeds.
> > >>>>>>
> > >>>>>> Note 2:
> > >>>>>>
> > >>>>>> The two nicest graphs here are nofq.svg vs fq.svg, which were taken on
> > >>>>>> a gigE link from a Mac running Linux to another gigE link. (in other
> > >>>>>> words, NOT on the friggin adsl link) (firefox can display svg, I don't
> > >>>>>> know what else) I find the T+10 delay before stream start in the
> > >>>>>> fq.svg graph suspicious and think the "throw out the outlier" code in
> > >>>>>> the netperf-wrapper code is at fault. Prior to that, codel is merely
> > >>>>>> buffering up things madly, which can also be seen in the pfifo_fast
> > >>>>>> behavior, with 1000pkts its default.
> > >>>>>
> > >>>>> I am using these two in a new "Effectiveness of FQ-CoDel" section.
> > >>>>> Chrome can display .svg, and if it becomes a problem, I am sure that
> > >>>>> they can be converted. Please let me know if some other data would
> > >>>>> make the point better.
> > >>>>>
> > >>>>> I am assuming that the colored throughput spikes are due to occasional
> > >>>>> packet losses. Please let me know if this interpretation is overly naive.
> > >>>>>
> > >>>>> Also, I know what ICMP is, but the UDP variants are new to me. Could
> > >>>>> you please expand the "EF", "BK", "BE", and "CSS" acronyms?
> > >>>>>
> > >>>>>> (Arguably, the default queue length in codel can be reduced from 10k
> > >>>>>> packets to something more reasonable at GigE speeds)
> > >>>>>>
> > >>>>>> (the indicator that it's the graph, not the reality, is that the
> > >>>>>> fq.svg pings and udp start at T+5 and grow minimally, as is usual with
> > >>>>>> fq_codel.)
> > >>>>>
> > >>>>> All sessions were started at T+5, then?
> > >>>>>
> > >>>>>> As for the *.ps graphs, well, they would take david's network topology
> > >>>>>> to explain, and were conducted over a variety of circumstances,
> > >>>>>> including wifi, with more variables in play than I care to think
> > >>>>>> about.
> > >>>>>>
> > >>>>>> We didn't really get anywhere on digging deeper. As we got to purer
> > >>>>>> tests - with a minimal number of boxes, running pure ethernet,
> > >>>>>> switched over a couple of switches, even in the simplest two box case,
> > >>>>>> my HTB based "ceroshaper" implementation had multiple problems in
> > >>>>>> cutting median latencies below 100ms, on this very slow ADSL link.
> > >>>>>> David suspects problems on the path along the carrier backbone as a
> > >>>>>> potential issue, and the only way to measure that is with two one way
> > >>>>>> trip time measurements (rather than rtt), time synced via ntp... I
> > >>>>>> keep hoping to find a rtp test, but I'm open to just about any option
> > >>>>>> at this point. anyone?
> > >>>>>>
> > >>>>>> We also found a probable bug in mtr in that multiple mtrs on the same
> > >>>>>> box don't co-exist.
> > >>>>>
> > >>>>> I must confess that I am not seeing all that clear a difference between
> > >>>>> the behaviors of ceroshaper and FQ-CoDel. Maybe somewhat better latencies
> > >>>>> for FQ-CoDel, but not unambiguously so.
> > >>>>>
> > >>>>>> Moving back to more scientific clarity and simpler tests...
> > >>>>>>
> > >>>>>> The two graphs, taken a few weeks back, on pages 5 and 6 of this:
> > >>>>>>
> > >>>>>> http://www.teklibre.com/~d/bloat/Not_every_packet_is_sacred-Battling_Bufferbloat_on_wifi.pdf
> > >>>>>>
> > >>>>>> appear to show the advantage of fq_codel fq + codel + head drop over
> > >>>>>> tail drop during the slow start period on a 10Mbit link - (see how
> > >>>>>> squiggly slow start is on pfifo fast?) as well as the marvelous
> > >>>>>> interstream latency that can be achieved with BQL=3000 (on a 10 mbit
> > >>>>>> link.) Even that latency can be halved by reducing BQL to 1500, which
> > >>>>>> is just fine on a 10mbit. Below those rates I'd like to be rid of BQL
> > >>>>>> entirely, and just have a single packet outstanding... in everything
> > >>>>>> from adsl to cable...
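
A back-of-the-envelope check of the halving claim, assuming the BQL figures above are byte limits and counting only the time needed to serialize that many bytes onto the wire (any queueing above the driver is ignored). The function name is illustrative.

```python
def serialization_delay_ms(bytes_queued: int, link_bits_per_second: float) -> float:
    """Time, in milliseconds, to clock `bytes_queued` bytes onto a link."""
    return bytes_queued * 8 / link_bits_per_second * 1000.0


if __name__ == "__main__":
    link = 10e6  # the 10 Mbit/s link from the test described above
    for limit in (3000, 1500):
        delay = serialization_delay_ms(limit, link)
        print(f"BQL limit {limit} bytes: about {delay:.1f} ms on the wire")
```

Under these assumptions the driver-level delay scales linearly with the byte limit, so going from 3000 to 1500 bytes halves it.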
> > >>>>>>
> > >>>>>> That said, I'd welcome other explanations of the squiggly slowstart
> > >>>>>> pfifo_fast behavior before I put that explanation on the slide.... ECN
> > >>>>>> was in play here, too. I can redo this test easily, it's basically
> > >>>>>> running a netperf TCP_RR for 70 seconds, and starting up a TCP_MAERTS
> > >>>>>> and TCP_STREAM for 60 seconds at T+5, after hammering down on BQL's
> > >>>>>> limit and the link speeds on two sides of a directly connected laptop
> > >>>>>> connection.
> > >>>>>
> > >>>>> I must defer to others on this one. I do note the much lower latencies
> > >>>>> on slide 6 compared to slide 5, though.
> > >>>>>
> > >>>>> Please see attached for update including .git directory.
> > >>>>>
> > >>>>>                                               Thanx, Paul
> > >>>>>
> > >>>>>> ethtool -s eth0 advertise 0x002 # 10 Mbit
> > >>>>>>
> > >>>>>
> > >>>>> _______________________________________________
> > >>>>> Cerowrt-devel mailing list
> > >>>>> Cerowrt-devel@lists.bufferbloat.net
> > >>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> > >>>>>
> > >>>>>
> > >>>
> > >>> _______________________________________________
> > >>> Codel mailing list
> > >>> Codel@lists.bufferbloat.net
> > >>> https://lists.bufferbloat.net/listinfo/codel
> > >>
> > >
> > > _______________________________________________
> > > Codel mailing list
> > > Codel@lists.bufferbloat.net
> > > https://lists.bufferbloat.net/listinfo/codel
> > >
> >

_______________________________________________
Codel mailing list
Codel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/codel
--0015174a0456f5142304cf907b3b--