From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Kathleen Nichols <nichols@pollere.com>
Cc: "Paolo Valente" <paolo.valente@unimore.it>,
"Toke Høiland-Jørgensen" <toke@toke.dk>,
"Eric Raymond" <esr@thyrsus.com>,
"codel@lists.bufferbloat.net" <codel@lists.bufferbloat.net>,
"cerowrt-devel@lists.bufferbloat.net"
<cerowrt-devel@lists.bufferbloat.net>,
bloat <bloat@lists.bufferbloat.net>,
"John Crispin" <blogic@openwrt.org>
Subject: Re: [Codel] [Cerowrt-devel] FQ_Codel lwn draft article review
Date: Tue, 27 Nov 2012 20:38:38 -0800
Message-ID: <20121128043838.GX2474@linux.vnet.ibm.com>
In-Reply-To: <50B5887C.7010605@pollere.com>

I guess I just have to be grateful that people mostly agree on the acronym,
regardless of the expansion.

Thanx, Paul

On Tue, Nov 27, 2012 at 07:43:56PM -0800, Kathleen Nichols wrote:
>
> It would be me that tries to say "stochastic flow queuing with CoDel"
> as I like to be accurate. But I think FQ-Codel is Flow queuing with CoDel.
> JimG suggests "smart flow queuing" because he is ever mindful of the
> big audience.
>
> On 11/27/12 4:27 PM, Paul E. McKenney wrote:
> > On Tue, Nov 27, 2012 at 04:53:34PM -0700, Greg White wrote:
> >> BTW, I've heard some use the term "stochastic flow queueing" as a
> >> replacement to avoid the term "fair". Seems like a more apt term anyway.
> >
> > Would that mean that FQ-CoDel is Flow Queue Controlled Delay? ;-)
> >
> > Thanx, Paul
> >
> >> -Greg
> >>
> >>
> >> On 11/27/12 3:49 PM, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >>
> >>> Thank you for the review and comments, Jim! I will apply them when
> >>> I get the pen back from Dave. And yes, that is the thing about
> >>> "fairness" -- there are a great many definitions, many of the most
> >>> useful of which appear to many to be patently unfair. ;-)
> >>>
> >>> As you suggest, it might well be best to drop discussion of fairness,
> >>> or to at the least supply the corresponding definition.
> >>>
> >>> Thanx, Paul
> >>>
> >>> On Tue, Nov 27, 2012 at 05:03:02PM -0500, Jim Gettys wrote:
> >>>> Some points worth making:
> >>>>
> >>>> 1) It is important to point out that (and how) fq_codel avoids starvation:
> >>>> unpleasant as elephant flows are, it would be very unfriendly to never
> >>>> service them at all until they time out.
> >>>>
> >>>> 2) "fairness" is not necessarily what we ultimately want at all; you'd
> >>>> really like to penalize those who induce congestion the most. But we don't
> >>>> currently have a solution (though Bob Briscoe at BT thinks he does, and is
> >>>> seeing if he can get it out from under a BT patent), so the current
> >>>> fq_codel round robins ultimately until/unless we can do something like
> >>>> Bob's idea. This is a local information only subset of the ideas he's been
> >>>> working on in the congestion exposure (conex) group at the IETF.
> >>>>
> >>>> 3) "fairness" is always in the eyes of the beholder (and should be left to
> >>>> the beholder to determine). "fairness" depends on where in the network you
> >>>> are. While being "fair" among TCP flows is a sensible default policy for a
> >>>> host, elsewhere in the network it may not be/usually isn't.
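Points 1 and 2 above (no starvation, round robin between buckets) can be illustrated with a minimal sketch. This is plain per-packet round robin only, written for illustration: the name `round_robin` and the packet labels are hypothetical, and the real fq_codel uses byte-based deficit round robin with separate new-flow and old-flow lists and runs CoDel on each queue, none of which is modeled here.

```python
from collections import deque


def round_robin(queues):
    """Yield packets by visiting each non-empty queue in turn.

    `queues` is an iterable of deques, one per flow bucket (hypothetical
    layout). Every backlogged queue is serviced once per round, so a long
    "elephant" queue cannot starve a short one, and is never starved itself.
    """
    active = deque(q for q in queues if q)
    while active:
        q = active.popleft()
        yield q.popleft()      # send one packet from this flow
        if q:                  # still backlogged: go to the back of the line
            active.append(q)


if __name__ == "__main__":
    elephant = deque(["E1", "E2", "E3", "E4"])
    mouse = deque(["m1"])
    print(list(round_robin([elephant, mouse])))
```

The short flow's single packet goes out after only one elephant packet, and the elephant still drains completely afterwards.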
> >>>>
> >>>> Two examples:
> >>>> o at a home router, you probably want to be "fair" according to transmit
> >>>> opportunities. We really don't want a single system remote from the router
> >>>> to be able to starve the network so that devices near the router get much
> >>>> less bandwidth than you might hope/expect.
> >>>>
> >>>> What is more, you probably want to account for a single host using many
> >>>> flows, and regulate that they not be able to "hog" bandwidth in the home
> >>>> environment, but only use their "fair" share.
> >>>>
> >>>> o at an ISP, you must be "fair" between customers; it is best to leave
> >>>> the judgement of "fairness" at finer granularity (e.g. host and TCP flows)
> >>>> to the points closer to the customer's systems, so that they can enforce
> >>>> whatever definition of "fair" they need to themselves.
> >>>>
> >>>>
> >>>> Algorithms like fq_codel can be/should be adjusted to the circumstances.
> >>>>
> >>>> And therefore exactly what you choose to hash against to form the buckets
> >>>> will vary depending on where you are. That at least one step (at the
> >>>> user's device) of this be TCP flow "fair" does have the great advantage of
> >>>> helping the RTT unfairness problem that violates the principle of "least
> >>>> surprise", such as that routinely seen in places like New Zealand.
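The point about choosing what to hash against can be sketched as follows. This is an illustration under stated assumptions, not the kernel's code: `Packet`, `flow_key`, `host_key`, and `bucket` are hypothetical names, SHA-256 stands in for whatever flow hash an implementation actually uses, and the bucket count of 1024 is just an example value.

```python
import hashlib
from collections import namedtuple

# Hypothetical packet header summary, for illustration only.
Packet = namedtuple("Packet", "src_ip dst_ip proto src_port dst_port")


def flow_key(pkt):
    """Per-flow key (5-tuple): each TCP/UDP flow gets its own bucket."""
    return (pkt.src_ip, pkt.dst_ip, pkt.proto, pkt.src_port, pkt.dst_port)


def host_key(pkt):
    """Per-host key: all flows from one source address share a bucket."""
    return (pkt.src_ip,)


def bucket(pkt, key_fn, n_buckets=1024):
    """Map a packet to a bucket index using the chosen key function."""
    digest = hashlib.sha256(repr(key_fn(pkt)).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_buckets


if __name__ == "__main__":
    a = Packet("192.0.2.1", "198.51.100.7", 6, 40000, 80)
    b = Packet("192.0.2.1", "198.51.100.7", 6, 40001, 80)
    # Same host, two flows: one shared bucket under host_key.
    print(bucket(a, host_key) == bucket(b, host_key))
```

Swapping `key_fn` is the only change needed to move from per-flow buckets (a host's default) to per-host buckets (a home router); an ISP would use a key derived from a customer identifier instead.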
> >>>>
> >>>> This is why I have so many problems using the word "fair" near this
> >>>> algorithm. "fair" is impossible to define, overloaded in people's minds
> >>>> with TCP fair queuing, not even desirable much of the time, and by
> >>>> definition and design, even today's fq_codel isn't fair to lots of things,
> >>>> and the same basic algorithm can/should be tweaked in lots of directions
> >>>> depending on what we need to do. Calling this "smart" queuing or some such
> >>>> would be better.
> >>>>
> >>>> When you've done another round on the document, I'll do a more detailed read.
> >>>> - Jim
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Nov 23, 2012 at 5:18 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> >>>>
> >>>>> On Fri, Nov 23, 2012 at 09:57:34AM +0100, Dave Taht wrote:
> >>>>>> David Woodhouse and I fiddled a lot with adsl and openwrt and a
> >>>>>> variety of drivers and network layers in a typical bonded adsl stack
> >>>>>> yesterday. The complexity of it all makes my head hurt. I'm happy that
> >>>>>> a newly BQL'd ethernet driver (for the geos and qemu) emerged from it,
> >>>>>> which he submitted to netdev...
> >>>>>
> >>>>> Cool!!! ;-)
> >>>>>
> >>>>>> I made a recording of us last night discussing the layers, which I
> >>>>>> will produce and distribute later...
> >>>>>>
> >>>>>> Anyway, along the way, we fiddled a lot with trying to analyze where
> >>>>>> the 350ms or so of added latency was coming from in the traverse geo's
> >>>>>> adsl implementation and overlying stack....
> >>>>>>
> >>>>>> Plots: http://david.woodhou.se/dwmw2-netperf-plots.tar.gz
> >>>>>>
> >>>>>> Note 1:
> >>>>>>
> >>>>>> The netperf sample rate on the rrul test needs to be higher than
> >>>>>> 100ms in order to get a decent result at sub 10Mbit speeds.
> >>>>>>
> >>>>>> Note 2:
> >>>>>>
> >>>>>> The two nicest graphs here are nofq.svg vs fq.svg, which were taken on
> >>>>>> a gigE link from a Mac running Linux to another gigE link. (in other
> >>>>>> words, NOT on the friggin adsl link) (firefox can display svg, I don't
> >>>>>> know what else) I find the T+10 delay before stream start in the
> >>>>>> fq.svg graph suspicious and think the "throw out the outlier" code in
> >>>>>> the netperf-wrapper code is at fault. Prior to that, codel is merely
> >>>>>> buffering up things madly, which can also be seen in the pfifo_fast
> >>>>>> behavior, with its default of 1000 packets.
> >>>>>
> >>>>> I am using these two in a new "Effectiveness of FQ-CoDel" section.
> >>>>> Chrome can display .svg, and if it becomes a problem, I am sure that
> >>>>> they can be converted. Please let me know if some other data would
> >>>>> make the point better.
> >>>>>
> >>>>> I am assuming that the colored throughput spikes are due to occasional
> >>>>> packet losses. Please let me know if this interpretation is overly naive.
> >>>>>
> >>>>> Also, I know what ICMP is, but the UDP variants are new to me. Could
> >>>>> you please expand the "EF", "BK", "BE", and "CSS" acronyms?
> >>>>>
> >>>>>> (Arguably, the default queue length in codel can be reduced from 10k
> >>>>>> packets to something more reasonable at GigE speeds)
> >>>>>>
> >>>>>> (the indicator that it's the graph, not the reality, is that the
> >>>>>> fq.svg pings and udp start at T+5 and grow minimally, as is usual with
> >>>>>> fq_codel.)
> >>>>>
> >>>>> All sessions were started at T+5, then?
> >>>>>
> >>>>>> As for the *.ps graphs, well, they would take david's network topology
> >>>>>> to explain, and were conducted over a variety of circumstances,
> >>>>>> including wifi, with more variables in play than I care to think
> >>>>>> about.
> >>>>>>
> >>>>>> We didn't really get anywhere on digging deeper. As we got to purer
> >>>>>> tests - with a minimal number of boxes, running pure ethernet,
> >>>>>> switched over a couple of switches, even in the simplest two box case,
> >>>>>> my HTB based "ceroshaper" implementation had multiple problems in
> >>>>>> cutting median latencies below 100ms, on this very slow ADSL link.
> >>>>>> David suspects problems on the path along the carrier backbone as a
> >>>>>> potential issue, and the only way to measure that is with two one way
> >>>>>> trip time measurements (rather than rtt), time synced via ntp... I
> >>>>>> keep hoping to find a rtp test, but I'm open to just about any option
> >>>>>> at this point. anyone?
> >>>>>>
> >>>>>> We also found a probable bug in mtr in that multiple mtrs on the same
> >>>>>> box don't co-exist.
> >>>>>
> >>>>> I must confess that I am not seeing all that clear a difference between
> >>>>> the behaviors of ceroshaper and FQ-CoDel. Maybe somewhat better latencies
> >>>>> for FQ-CoDel, but not unambiguously so.
> >>>>>
> >>>>>> Moving back to more scientific clarity and simpler tests...
> >>>>>>
> >>>>>> The two graphs, taken a few weeks back, on pages 5 and 6 of this:
> >>>>>>
> >>>>>>
> >>>>>> http://www.teklibre.com/~d/bloat/Not_every_packet_is_sacred-Battling_Bufferbloat_on_wifi.pdf
> >>>>>>
> >>>>>> appear to show the advantage of fq_codel fq + codel + head drop over
> >>>>>> tail drop during the slow start period on a 10Mbit link - (see how
> >>>>>> squiggly slow start is on pfifo fast?) as well as the marvelous
> >>>>>> interstream latency that can be achieved with BQL=3000 (on a 10 mbit
> >>>>>> link.) Even that latency can be halved by reducing BQL to 1500, which
> >>>>>> is just fine on a 10mbit. Below those rates I'd like to be rid of BQL
> >>>>>> entirely, and just have a single packet outstanding... in everything
> >>>>>> from adsl to cable...
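The halving claim above follows from simple serialization arithmetic, sketched here. The sketch assumes the BQL values are byte limits and counts only the time to clock those bytes onto the wire; framing overhead and any queueing elsewhere in the stack are ignored, and `serialization_delay_ms` is a hypothetical helper name.

```python
def serialization_delay_ms(queued_bytes, link_bits_per_second):
    """Time in milliseconds to transmit `queued_bytes` at the given link rate."""
    return queued_bytes * 8 * 1000 / link_bits_per_second


if __name__ == "__main__":
    ten_mbit = 10_000_000
    for limit in (3000, 1500):
        delay = serialization_delay_ms(limit, ten_mbit)
        print(f"BQL limit {limit} bytes at 10 Mbit/s: {delay:.1f} ms")
```

At 10 Mbit/s a 3000-byte limit corresponds to about 2.4 ms of bytes queued below the qdisc and a 1500-byte limit to about 1.2 ms. The same 1500 bytes at 1 Mbit/s would be about 12 ms, which is consistent with wanting no more than a single packet outstanding at lower rates.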
> >>>>>>
> >>>>>> That said, I'd welcome other explanations of the squiggly slowstart
> >>>>>> pfifo_fast behavior before I put that explanation on the slide.... ECN
> >>>>>> was in play here, too. I can redo this test easily, it's basically
> >>>>>> running a netperf TCP_RR for 70 seconds, and starting up a TCP_MAERTS
> >>>>>> and TCP_STREAM for 60 seconds at T+5, after hammering down on BQL's
> >>>>>> limit and the link speeds on two sides of a directly connected laptop
> >>>>>> connection.
> >>>>>
> >>>>> I must defer to others on this one. I do note the much lower latencies
> >>>>> on slide 6 compared to slide 5, though.
> >>>>>
> >>>>> Please see attached for update including .git directory.
> >>>>>
> >>>>> Thanx, Paul
> >>>>>
> >>>>>> ethtool -s eth0 advertise 0x002 # 10 Mbit
> >>>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Cerowrt-devel mailing list
> >>>>> Cerowrt-devel@lists.bufferbloat.net
> >>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> >>>>>
> >>>>>
> >>>
> >>> _______________________________________________
> >>> Codel mailing list
> >>> Codel@lists.bufferbloat.net
> >>> https://lists.bufferbloat.net/listinfo/codel
> >>
> >
> >
>