From: Kathleen Nichols
Date: Tue, 27 Nov 2012 19:43:56 -0800
To: paulmck@linux.vnet.ibm.com
Cc: Paolo Valente, Toke Høiland-Jørgensen, codel@lists.bufferbloat.net, cerowrt-devel@lists.bufferbloat.net, bloat, John Crispin
Subject: Re: [Bloat] [Codel] [Cerowrt-devel] FQ_Codel lwn draft article review
Message-ID: <50B5887C.7010605@pollere.com>
In-Reply-To: <20121128002710.GS2474@linux.vnet.ibm.com>

It would be me that tries to say "stochastic flow queuing with CoDel", as I like to be accurate. But I think FQ-CoDel is flow queuing with CoDel.
JimG suggests "smart flow queuing" because he is ever mindful of the big audience.

On 11/27/12 4:27 PM, Paul E. McKenney wrote:
> On Tue, Nov 27, 2012 at 04:53:34PM -0700, Greg White wrote:
>> BTW, I've heard some use the term "stochastic flow queueing" as a
>> replacement to avoid the term "fair". Seems like a more apt term anyway.
>
> Would that mean that FQ-CoDel is Flow Queue Controlled Delay? ;-)
>
> 							Thanx, Paul
>
>> -Greg
>>
>> On 11/27/12 3:49 PM, "Paul E. McKenney" wrote:
>>
>>> Thank you for the review and comments, Jim! I will apply them when
>>> I get the pen back from Dave. And yes, that is the thing about
>>> "fairness" -- there are a great many definitions, many of the most
>>> useful of which appear to many to be patently unfair. ;-)
>>>
>>> As you suggest, it might well be best to drop discussion of fairness,
>>> or at the least supply the corresponding definition.
>>>
>>> 							Thanx, Paul
>>>
>>> On Tue, Nov 27, 2012 at 05:03:02PM -0500, Jim Gettys wrote:
>>>> Some points worth making:
>>>>
>>>> 1) It is important to point out that (and how) fq_codel avoids
>>>> starvation: unpleasant as elephant flows are, it would be very
>>>> unfriendly to never service them at all until they time out.
>>>>
>>>> 2) "fairness" is not necessarily what we ultimately want at all; you'd
>>>> really like to penalize those who induce congestion the most. But we
>>>> don't currently have a solution (though Bob Briscoe at BT thinks he
>>>> does, and is seeing if he can get it out from under a BT patent), so
>>>> the current fq_codel round-robins until/unless we can do something
>>>> like Bob's idea. This is a local-information-only subset of the ideas
>>>> he has been working on in the congestion exposure (conex) group at
>>>> the IETF.
>>>>
>>>> 3) "fairness" is always in the eyes of the beholder (and should be
>>>> left to the beholder to determine). "fairness" depends on where in
>>>> the network you are.
>>>> While being "fair" among TCP flows is a sensible default policy for a
>>>> host, elsewhere in the network it may not be/usually isn't.
>>>>
>>>> Two examples:
>>>>
>>>> o At a home router, you probably want to be "fair" according to
>>>> transmit opportunities. We really don't want a single system remote
>>>> from the router to be able to starve the network so that devices near
>>>> the router get much less bandwidth than you might hope/expect.
>>>>
>>>> What is more, you probably want to account for a single host using
>>>> many flows, and regulate that they not be able to "hog" bandwidth in
>>>> the home environment, but only use their "fair" share.
>>>>
>>>> o At an ISP, you must be "fair" between customers; it is best to
>>>> leave the judgement of "fairness" at finer granularity (e.g. host and
>>>> TCP flows) to the points closer to the customer's systems, so that
>>>> they can enforce whatever definition of "fair" they need themselves.
>>>>
>>>> Algorithms like fq_codel can be/should be adjusted to the
>>>> circumstances, and therefore exactly what you choose to hash against
>>>> to form the buckets will vary depending on where you are. That at
>>>> least one step (at the user's device) be TCP-flow "fair" does have
>>>> the great advantage of helping the RTT unfairness problem that
>>>> violates the principle of "least surprise", such as that routinely
>>>> seen in places like New Zealand.
>>>>
>>>> This is why I have so many problems using the word "fair" near this
>>>> algorithm. "fair" is impossible to define, overloaded in people's
>>>> minds with TCP fair queuing, not even desirable much of the time,
>>>> and by definition and design even today's fq_codel isn't fair to
>>>> lots of things, and the same basic algorithm can/should be tweaked
>>>> in lots of directions depending on what we need to do. Calling this
>>>> "smart" queuing or some such would be better.
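[Editor's note: the "what you hash against to form the buckets" idea above can be sketched as follows. This is a hypothetical illustration, not the actual Linux fq_codel code (which uses a Jenkins hash over the connection tuple with a random perturbation); the function and field names are invented for the example. The point is that the choice of key fields, not the hash itself, encodes the local notion of "fairness".]

```python
import hashlib

def bucket_for(packet, key_fields, num_buckets=1024, perturbation=b"boot-seed"):
    """Hash selected header fields to pick a queue bucket, SFQ-style.

    key_fields decides the granularity of "fairness" at this point in
    the network: which header fields identify a "flow" here.
    """
    key = b"|".join(str(packet[f]).encode() for f in key_fields) + perturbation
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % num_buckets

pkt = {"src_ip": "10.0.0.2", "dst_ip": "93.184.216.34",
       "proto": 6, "src_port": 40000, "dst_port": 443}

# At a host or home router: per-TCP-flow buckets (5-tuple).
flow_bucket = bucket_for(
    pkt, ("src_ip", "dst_ip", "proto", "src_port", "dst_port"))

# At an ISP aggregation point: per-customer buckets (source address only),
# leaving finer-grained "fairness" to boxes closer to the customer.
customer_bucket = bucket_for(pkt, ("src_ip",))
```

Two packets from the same customer but different TCP connections land in the same bucket under the second key and (almost always) different buckets under the first.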
>>>>
>>>> When you've done another round on the document, I'll do a more
>>>> detailed read.
>>>>     - Jim
>>>>
>>>> On Fri, Nov 23, 2012 at 5:18 PM, Paul E. McKenney <
>>>> paulmck@linux.vnet.ibm.com> wrote:
>>>>
>>>>> On Fri, Nov 23, 2012 at 09:57:34AM +0100, Dave Taht wrote:
>>>>>> David Woodhouse and I fiddled a lot with adsl and openwrt and a
>>>>>> variety of drivers and network layers in a typical bonded adsl stack
>>>>>> yesterday. The complexity of it all makes my head hurt. I'm happy
>>>>>> that a newly BQL'd ethernet driver (for the geos and qemu) emerged
>>>>>> from it, which he submitted to netdev...
>>>>>
>>>>> Cool!!! ;-)
>>>>>
>>>>>> I made a recording of us last night discussing the layers, which I
>>>>>> will produce and distribute later...
>>>>>>
>>>>>> Anyway, along the way, we fiddled a lot with trying to analyze where
>>>>>> the 350ms or so of added latency was coming from in the traverse
>>>>>> geo's adsl implementation and overlying stack....
>>>>>>
>>>>>> Plots: http://david.woodhou.se/dwmw2-netperf-plots.tar.gz
>>>>>>
>>>>>> Note 1:
>>>>>>
>>>>>> The netperf sample interval on the rrul test needs to be larger than
>>>>>> 100ms in order to get a decent result at sub-10Mbit speeds.
>>>>>>
>>>>>> Note 2:
>>>>>>
>>>>>> The two nicest graphs here are nofq.svg vs fq.svg, which were taken
>>>>>> on a gigE link from a Mac running Linux to another gigE link. (In
>>>>>> other words, NOT on the friggin adsl link.) (Firefox can display
>>>>>> svg; I don't know what else can.) I find the T+10 delay before
>>>>>> stream start in the fq.svg graph suspicious and think the "throw
>>>>>> out the outlier" code in netperf-wrapper is at fault. Prior to
>>>>>> that, codel is merely buffering up things madly, which can also be
>>>>>> seen in the pfifo_fast behavior, with its default of 1000 packets.
>>>>>
>>>>> I am using these two in a new "Effectiveness of FQ-CoDel" section.
>>>>> Chrome can display .svg, and if it becomes a problem, I am sure that
>>>>> they can be converted. Please let me know if some other data would
>>>>> make the point better.
>>>>>
>>>>> I am assuming that the colored throughput spikes are due to
>>>>> occasional packet losses. Please let me know if this interpretation
>>>>> is overly naive.
>>>>>
>>>>> Also, I know what ICMP is, but the UDP variants are new to me. Could
>>>>> you please expand the "EF", "BK", "BE", and "CS5" acronyms?
>>>>>
>>>>>> (Arguably, the default queue length in codel can be reduced from
>>>>>> 10k packets to something more reasonable at GigE speeds.)
>>>>>>
>>>>>> (The indicator that it's the graph, not the reality, is that the
>>>>>> fq.svg pings and udp start at T+5 and grow minimally, as is usual
>>>>>> with fq_codel.)
>>>>>
>>>>> All sessions were started at T+5, then?
>>>>>
>>>>>> As for the *.ps graphs, well, they would take david's network
>>>>>> topology to explain, and were conducted over a variety of
>>>>>> circumstances, including wifi, with more variables in play than I
>>>>>> care to think about.
>>>>>>
>>>>>> We didn't really get anywhere on digging deeper. As we got to purer
>>>>>> tests - with a minimal number of boxes, running pure ethernet,
>>>>>> switched over a couple of switches, even in the simplest two-box
>>>>>> case - my HTB-based "ceroshaper" implementation had multiple
>>>>>> problems in cutting median latencies below 100ms on this very slow
>>>>>> ADSL link. David suspects problems on the path along the carrier
>>>>>> backbone as a potential issue, and the only way to measure that is
>>>>>> with two one-way trip time measurements (rather than rtt),
>>>>>> time-synced via ntp... I keep hoping to find an rtp test, but I'm
>>>>>> open to just about any option at this point. Anyone?
>>>>>>
>>>>>> We also found a probable bug in mtr in that multiple mtrs on the
>>>>>> same box don't co-exist.
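[Editor's note: a minimal sketch of the kind of one-way trip time measurement Dave mentions above. Everything here is hypothetical illustration (names, port number); it assumes the two hosts' clocks are already synchronized via NTP - without that, the clock offset is silently folded into the result, which is exactly why rtt is usually measured instead.]

```python
import socket
import struct
import time

PORT = 9000  # arbitrary port chosen for this sketch


def send_probe(dst_ip, port=PORT):
    """Send one UDP probe stamped with the local send time."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.sendto(struct.pack("!d", time.time()), (dst_ip, port))
    s.close()


def receive_probe(sock):
    """Return the one-way delay of one probe, in seconds.

    Only meaningful if sender and receiver clocks are synchronized
    (e.g. via NTP); the receiver subtracts the sender's timestamp
    from its own clock.
    """
    data, _ = sock.recvfrom(64)
    (sent,) = struct.unpack("!d", data)
    return time.time() - sent
```

Run the receiver on each end and send probes in both directions to see whether the upstream and downstream halves of the path contribute asymmetrically to the total rtt.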
>>>>>
>>>>> I must confess that I am not seeing all that clear a difference
>>>>> between the behaviors of ceroshaper and FQ-CoDel. Maybe somewhat
>>>>> better latencies for FQ-CoDel, but not unambiguously so.
>>>>>
>>>>>> Moving back to more scientific clarity and simpler tests...
>>>>>>
>>>>>> The two graphs, taken a few weeks back, on pages 5 and 6 of
>>>>>> http://www.teklibre.com/~d/bloat/Not_every_packet_is_sacred-Battling_Bufferbloat_on_wifi.pdf
>>>>>> appear to show the advantage of fq_codel (fq + codel + head drop)
>>>>>> over tail drop during the slow-start period on a 10Mbit link (see
>>>>>> how squiggly slow start is on pfifo_fast?), as well as the marvelous
>>>>>> interstream latency that can be achieved with BQL=3000 (on a 10mbit
>>>>>> link). Even that latency can be halved by reducing BQL to 1500,
>>>>>> which is just fine at 10mbit. Below those rates I'd like to be rid
>>>>>> of BQL entirely, and just have a single packet outstanding... in
>>>>>> everything from adsl to cable...
>>>>>>
>>>>>> That said, I'd welcome other explanations of the squiggly slow-start
>>>>>> pfifo_fast behavior before I put that explanation on the slide...
>>>>>> ECN was in play here, too. I can redo this test easily; it's
>>>>>> basically running a netperf TCP_RR for 70 seconds, and starting up
>>>>>> a TCP_MAERTS and TCP_STREAM for 60 seconds at T+5, after hammering
>>>>>> down on BQL's limit and the link speeds on two sides of a directly
>>>>>> connected laptop connection.
>>>>>
>>>>> I must defer to others on this one. I do note the much lower
>>>>> latencies on slide 6 compared to slide 5, though.
>>>>>
>>>>> Please see attached for update including .git directory.
>>>>>
>>>>> 							Thanx, Paul
>>>>>
>>>>>> ethtool -s eth0 advertise 0x002   # 10 Mbit
>>>>>
>>>>> _______________________________________________
>>>>> Cerowrt-devel mailing list
>>>>> Cerowrt-devel@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel

_______________________________________________
Codel mailing list
Codel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/codel