CoDel AQM discussions
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Greg White <g.white@cablelabs.com>
Cc: "Paolo Valente" <paolo.valente@unimore.it>,
	"Toke Høiland-Jørgensen" <toke@toke.dk>,
	"Eric Raymond" <esr@thyrsus.com>,
	"codel@lists.bufferbloat.net" <codel@lists.bufferbloat.net>,
	"cerowrt-devel@lists.bufferbloat.net"
	<cerowrt-devel@lists.bufferbloat.net>,
	bloat <bloat@lists.bufferbloat.net>,
	"John Crispin" <blogic@openwrt.org>
Subject: Re: [Codel] [Cerowrt-devel] FQ_Codel lwn draft article review
Date: Tue, 27 Nov 2012 16:27:10 -0800
Message-ID: <20121128002710.GS2474@linux.vnet.ibm.com>
In-Reply-To: <CCDA9A91.16B11%g.white@cablelabs.com>

On Tue, Nov 27, 2012 at 04:53:34PM -0700, Greg White wrote:
> BTW, I've heard some use the term "stochastic flow queueing" as a
> replacement to avoid the term "fair".  Seems like a more apt term anyway.

Would that mean that FQ-CoDel is Flow Queue Controlled Delay?  ;-)

							Thanx, Paul

> -Greg
> 
> 
> On 11/27/12 3:49 PM, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> >Thank you for the review and comments, Jim!  I will apply them when
> >I get the pen back from Dave.  And yes, that is the thing about
> >"fairness" -- there are a great many definitions, many of the most
> >useful of which appear to many to be patently unfair.  ;-)
> >
> >As you suggest, it might well be best to drop discussion of fairness,
> >or to at the least supply the corresponding definition.
> >
> >							Thanx, Paul
> >
> >On Tue, Nov 27, 2012 at 05:03:02PM -0500, Jim Gettys wrote:
> >> Some points worth making:
> >> 
> >> 1) It is important to point out that (and how) fq_codel avoids starvation:
> >> unpleasant as elephant flows are, it would be very unfriendly to never
> >> service them at all until they time out.
> >> 
> >> 2) "fairness" is not necessarily what we ultimately want at all; you'd
> >> really like to penalize those who induce congestion the most.  But we don't
> >> currently have a solution (though Bob Briscoe at BT thinks he does, and is
> >> seeing if he can get it out from under a BT patent), so the current
> >> fq_codel round robins ultimately until/unless we can do something like
> >> Bob's idea.  This is a local information only subset of the ideas he's been
> >> working on in the congestion exposure (conex) group at the IETF.
> >> 
> >> 3) "fairness" is always in the eyes of the beholder (and should be left to
> >> the beholder to determine). "fairness" depends on where in the network you
> >> are.  While being "fair" among TCP flows is a sensible default policy for a
> >> host, elsewhere in the network it may not be/usually isn't.
> >> 
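Points 1 and 2 above can be illustrated with a toy deficit round robin
scheduler. This is a minimal sketch only, not the Linux fq_codel code: the
class name ToyDRR, the quantum value, and the flow labels are assumptions
made for illustration, and the CoDel drop logic and the separate new/old
flow lists are left out. It only shows that a deeply backlogged elephant
flow keeps being serviced while a sparse flow is not stuck behind the whole
backlog.

```python
from collections import deque


class ToyDRR:
    """Toy deficit round robin over per-flow queues (illustrative only)."""

    def __init__(self, quantum=1514):
        self.quantum = quantum   # byte credit granted per round (assumed value)
        self.queues = {}         # flow id -> deque of packet sizes in bytes
        self.deficit = {}        # flow id -> remaining byte credit
        self.active = deque()    # backlogged flows, head is served next

    def enqueue(self, flow, size):
        q = self.queues.setdefault(flow, deque())
        if flow not in self.active:
            # A flow that becomes backlogged joins the rotation with fresh credit.
            self.active.append(flow)
            self.deficit[flow] = self.quantum
        q.append(size)

    def dequeue(self):
        """Return (flow, size) for the next packet, or None when idle."""
        while self.active:
            flow = self.active[0]
            if self.deficit[flow] <= 0:
                # Credit used up: refill and send the flow to the back.
                self.deficit[flow] += self.quantum
                self.active.rotate(-1)
                continue
            q = self.queues[flow]
            size = q.popleft()
            self.deficit[flow] -= size
            if not q:
                # Flow drained: drop it from the rotation.
                self.active.popleft()
            return flow, size
        return None


sched = ToyDRR()
for _ in range(10):
    sched.enqueue("elephant", 1500)   # bulk flow with a deep backlog
sched.enqueue("mouse", 100)           # sparse flow with one small packet

order = []
while (pkt := sched.dequeue()) is not None:
    order.append(pkt[0])
print(order)
```

In this toy run the mouse packet goes out after the elephant has spent its
first quantum, and every elephant packet is still delivered afterwards, so
neither flow is starved.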
> >> Two examples:
> >> o at a home router, you probably want to be "fair" according to transmit
> >> opportunities.  We really don't want a single system remote from the router
> >> to be able to starve the network so that devices near the router get much
> >> less bandwidth than you might hope/expect.
> >> 
> >> What is more, you probably want to account for a single host using many
> >> flows, and regulate that they not be able to "hog" bandwidth in the home
> >> environment, but only use their "fair" share.
> >> 
> >> o at an ISP, you must be "fair" between customers; it is best to leave
> >> the judgement of "fairness" at finer granularity (e.g. host and TCP flows)
> >> to the points closer to the customer's systems, so that they can enforce
> >> whatever definition of "fair" they need to themselves.
> >> 
> >> 
> >> Algorithms like fq_codel can be/should be adjusted to the circumstances.
> >> 
> >> And therefore exactly what you choose to hash against to form the buckets
> >> will vary depending on where you are.  That at least one step (at the
> >> user's device) of this be TCP flow "fair" does have the great advantage of
> >> helping the RTT unfairness problem that violates the principle of "least
> >> surprise", such as that routinely seen in places like New Zealand.
> >> 
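The point about choosing what to hash against can be sketched as follows.
Everything here is hypothetical: the bucket() helper, the key tuples, the
packet dictionary fields (including "customer"), and the bucket count are
assumptions for illustration, and zlib.crc32 merely stands in for whatever
hash a real implementation uses. The only idea shown is that the same
queueing machinery yields per-flow, per-host, or per-customer sharing
depending on which fields feed the hash.

```python
import zlib

# Hypothetical key choices for different places in the network.
FLOW_KEY = ("src", "dst", "proto", "sport", "dport")  # per-flow, e.g. on a host
HOST_KEY = ("src",)                                    # per-host, e.g. home router
CUSTOMER_KEY = ("customer",)                           # per-customer, e.g. at an ISP


def bucket(packet, key_fields, n_buckets=1024):
    """Map a packet to a queue index by hashing only the chosen fields."""
    key = "|".join(str(packet[f]) for f in key_fields).encode()
    return zlib.crc32(key) % n_buckets


# Two flows from the same host (addresses are documentation examples).
a = {"src": "192.0.2.10", "dst": "198.51.100.1", "proto": 6,
     "sport": 40000, "dport": 80, "customer": "cust-A"}
b = dict(a, sport=40001)

print(bucket(a, FLOW_KEY), bucket(b, FLOW_KEY))   # usually different buckets
print(bucket(a, HOST_KEY), bucket(b, HOST_KEY))   # always the same bucket
```

With HOST_KEY, a host that opens many flows still lands in a single bucket,
which is one way to keep it to its own share at a home router.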
> >> This is why I have so many problems using the word "fair" near this
> >> algorithm.  "fair" is impossible to define, overloaded in people's minds
> >> with TCP fair queuing, not even desirable much of the time, and by
> >> definition and design, even today's fq_codel isn't fair to lots of things,
> >> and the same basic algorithm can/should be tweaked in lots of directions
> >> depending on what we need to do.  Calling this "smart" queuing or some such
> >> would be better.
> >> 
> >> When you've done another round on the document, I'll do a more detailed
> >> read.
> >>                              - Jim
> >> 
> >> 
> >> 
> >> 
> >> On Fri, Nov 23, 2012 at 5:18 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> >> 
> >> > On Fri, Nov 23, 2012 at 09:57:34AM +0100, Dave Taht wrote:
> >> > > David Woodhouse and I fiddled a lot with adsl and openwrt and a
> >> > > variety of drivers and network layers in a typical bonded adsl stack
> >> > > yesterday. The complexity of it all makes my head hurt. I'm happy that
> >> > > a newly BQL'd ethernet driver (for the geos and qemu) emerged from it,
> >> > > which he submitted to netdev...
> >> >
> >> > Cool!!!  ;-)
> >> >
> >> > > I made a recording of us last night discussing the layers, which I
> >> > > will produce and distribute later...
> >> > >
> >> > > Anyway, along the way, we fiddled a lot with trying to analyze where
> >> > > the 350ms or so of added latency was coming from in the traverse geo's
> >> > > adsl implementation and overlying stack....
> >> > >
> >> > > Plots: http://david.woodhou.se/dwmw2-netperf-plots.tar.gz
> >> > >
> >> > > Note 1:
> >> > >
> >> > > The netperf sample rate on the rrul test needs to be higher than
> >> > > 100ms in order to get a decent result at sub 10Mbit speeds.
> >> > >
> >> > > Note 2:
> >> > >
> >> > > The two nicest graphs here are nofq.svg vs fq.svg, which were taken on
> >> > > a gigE link from a Mac running Linux to another gigE link. (in other
> >> > > words, NOT on the friggin adsl link) (firefox can display svg, I don't
> >> > > know what else) I find the T+10 delay before stream start in the
> >> > > fq.svg graph suspicious and think the "throw out the outlier" code in
> >> > > the netperf-wrapper code is at fault. Prior to that, codel is merely
> >> > > buffering up things madly, which can also be seen in the pfifo_fast
> >> > > behavior, with 1000pkts its default.
> >> >
> >> > I am using these two in a new "Effectiveness of FQ-CoDel" section.
> >> > Chrome can display .svg, and if it becomes a problem, I am sure that
> >> > they can be converted.  Please let me know if some other data would
> >> > make the point better.
> >> >
> >> > I am assuming that the colored throughput spikes are due to occasional
> >> > packet losses.  Please let me know if this interpretation is overly naive.
> >> >
> >> > Also, I know what ICMP is, but the UDP variants are new to me.  Could
> >> > you please expand the "EF", "BK", "BE", and "CSS" acronyms?
> >> >
> >> > > (Arguably, the default queue length in codel can be reduced from 10k
> >> > > packets to something more reasonable at GigE speeds)
> >> > >
> >> > > (the indicator that it's the graph, not the reality, is that the
> >> > > fq.svg pings and udp start at T+5 and grow minimally, as is usual with
> >> > > fq_codel.)
> >> >
> >> > All sessions were started at T+5, then?
> >> >
> >> > > As for the *.ps graphs, well, they would take david's network topology
> >> > > to explain, and were conducted over a variety of circumstances,
> >> > > including wifi, with more variables in play than I care to think
> >> > > about.
> >> > >
> >> > > We didn't really get anywhere on digging deeper. As we got to purer
> >> > > tests - with a minimal number of boxes, running pure ethernet,
> >> > > switched over a couple of switches, even in the simplest two box case,
> >> > > my HTB based "ceroshaper" implementation had multiple problems in
> >> > > cutting median latencies below 100ms, on this very slow ADSL link.
> >> > > David suspects problems on the path along the carrier backbone as a
> >> > > potential issue, and the only way to measure that is with two one way
> >> > > trip time measurements (rather than rtt), time synced via ntp... I
> >> > > keep hoping to find a rtp test, but I'm open to just about any option
> >> > > at this point. anyone?
> >> > >
> >> > > We also found a probable bug in mtr in that multiple mtrs on the same
> >> > > box don't co-exist.
> >> >
> >> > I must confess that I am not seeing all that clear a difference between
> >> > the behaviors of ceroshaper and FQ-CoDel.  Maybe somewhat better latencies
> >> > for FQ-CoDel, but not unambiguously so.
> >> >
> >> > > Moving back to more scientific clarity and simpler tests...
> >> > >
> >> > > The two graphs, taken a few weeks back, on pages 5 and 6 of this:
> >> > >
> >> > >
> >> > > http://www.teklibre.com/~d/bloat/Not_every_packet_is_sacred-Battling_Bufferbloat_on_wifi.pdf
> >> > >
> >> > > appear to show the advantage of fq_codel fq + codel + head drop over
> >> > > tail drop during the slow start period on a 10Mbit link - (see how
> >> > > squiggly slow start is on pfifo fast?) as well as the marvelous
> >> > > interstream latency that can be achieved with BQL=3000 (on a 10 mbit
> >> > > link.)  Even that latency can be halved by reducing BQL to 1500, which
> >> > > is just fine on a 10mbit. Below those rates I'd like to be rid of BQL
> >> > > entirely, and just have a single packet outstanding... in everything
> >> > > from adsl to cable...
> >> > >
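The "halved" remark can be checked with back-of-the-envelope arithmetic.
This is a sketch under stated assumptions: it treats the BQL limit as the
number of bytes that may sit in the driver ahead of a newly arrived packet,
takes "10 Mbit" as exactly 10,000,000 bits per second, and ignores Ethernet
framing overhead. The helper name is made up for this example.

```python
def serialization_delay_ms(nbytes, link_bps):
    """Time in milliseconds to put nbytes onto a link of link_bps bits/s."""
    return nbytes * 8 / link_bps * 1000.0


LINK_BPS = 10_000_000  # assumed: 10 Mbit/s, no framing overhead

d3000 = serialization_delay_ms(3000, LINK_BPS)  # BQL limit of 3000 bytes
d1500 = serialization_delay_ms(1500, LINK_BPS)  # BQL limit of 1500 bytes

print(f"BQL=3000: about {d3000:.1f} ms queued ahead of a new packet")
print(f"BQL=1500: about {d1500:.1f} ms queued ahead of a new packet")
```

Under these assumptions the driver-level queue contributes roughly 2.4 ms
at 3000 bytes and roughly 1.2 ms at 1500 bytes, which matches the factor of
two described above.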
> >> > > That said, I'd welcome other explanations of the squiggly slowstart
> >> > > pfifo_fast behavior before I put that explanation on the slide.... ECN
> >> > > was in play here, too. I can redo this test easily, it's basically
> >> > > running a netperf TCP_RR for 70 seconds, and starting up a TCP_MAERTS
> >> > > and TCP_STREAM for 60 seconds at T+5, after hammering down on BQL's
> >> > > limit and the link speeds on two sides of a directly connected laptop
> >> > > connection.
> >> >
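The test described above could be scripted roughly as follows. This is a
hedged sketch, not the script that was actually used: the function names
and the target address are placeholders, a netserver is assumed to be
running on the far end, and only the netperf options -H (remote host), -t
(test name) and -l (test length in seconds) are relied on. The BQL and link
speed setup is not covered.

```python
import subprocess
import time


def netperf_schedule(host):
    """Return (start offset in seconds, argv) pairs, sorted by offset."""
    return [
        (0, ["netperf", "-H", host, "-t", "TCP_RR", "-l", "70"]),
        (5, ["netperf", "-H", host, "-t", "TCP_MAERTS", "-l", "60"]),
        (5, ["netperf", "-H", host, "-t", "TCP_STREAM", "-l", "60"]),
    ]


def run(schedule):
    """Start each command at its offset, then wait for all of them."""
    start = time.monotonic()
    procs = []
    for offset, argv in schedule:
        delay = offset - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
        procs.append(subprocess.Popen(argv))
    for p in procs:
        p.wait()


# Placeholder target address; call run(plan) on a machine with netperf installed.
plan = netperf_schedule("192.0.2.1")
for offset, argv in plan:
    print(f"T+{offset}: {' '.join(argv)}")
```

The latency probe starts first and outlasts both bulk transfers by five
seconds at each end, so the idle latency before and after the load shows up
in the same trace.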
> >> > I must defer to others on this one.  I do note the much lower latencies
> >> > on slide 6 compared to slide 5, though.
> >> >
> >> > Please see attached for update including .git directory.
> >> >
> >> >                                                         Thanx, Paul
> >> >
> >> > > ethtool -s eth0 advertise 0x002 # 10 Mbit
> >> > >
> >
> 



Thread overview: 56+ messages
     [not found] <CAA93jw5yFvrOyXu2s2DY3oK_0v3OaNfnL+1zTteJodfxtAAzcQ@mail.gmail.com>
2012-11-23  8:57 ` [Codel] " Dave Taht
2012-11-23 22:18   ` Paul E. McKenney
2012-11-24  0:07     ` Toke Høiland-Jørgensen
2012-11-24 16:19       ` Dave Taht
2012-11-24 16:36         ` [Codel] [Cerowrt-devel] " dpreed
2012-11-24 19:57         ` [Codel] " Andrew McGregor
2012-11-26 21:13         ` Rick Jones
2012-11-26 21:19           ` Dave Taht
2012-11-26 22:16         ` Toke Høiland-Jørgensen
2012-11-26 23:21           ` Toke Høiland-Jørgensen
2012-11-26 23:39             ` [Codel] [Cerowrt-devel] " dpreed
2012-11-26 23:58               ` Toke Høiland-Jørgensen
2012-11-26 17:20       ` [Codel] " Paul E. McKenney
2012-11-26 21:05       ` Rick Jones
2012-11-26 23:18         ` [Codel] [Bloat] " Rick Jones
2012-11-27 22:03     ` [Codel] [Cerowrt-devel] " Jim Gettys
2012-11-27 22:31       ` [Codel] [Bloat] " David Lang
2012-11-27 22:54         ` Paul E. McKenney
2012-11-27 23:15           ` Andrew McGregor
2012-11-28  0:51             ` Paul E. McKenney
2012-11-28 17:36             ` Paul E. McKenney
2012-11-28 14:06         ` [Codel] [Cerowrt-devel] [Bloat] " Michael Richardson
2012-11-27 22:49       ` [Codel] [Cerowrt-devel] " Paul E. McKenney
2012-11-27 23:53         ` Greg White
2012-11-28  0:27           ` Paul E. McKenney [this message]
2012-11-28  3:43             ` Kathleen Nichols
2012-11-28  4:38               ` Paul E. McKenney
2012-11-28 16:01                 ` Paul E. McKenney
2012-11-28 16:16                   ` Jonathan Morton
2012-11-28 17:44                     ` Paul E. McKenney
2012-11-28 18:37                       ` [Codel] [Bloat] " Michael Richardson
2012-11-28 18:51                         ` Eric Dumazet
2012-11-28 21:44                           ` Michael Richardson
2012-11-28 19:00                       ` Eric Dumazet
2012-12-02 21:37                         ` Toke Høiland-Jørgensen
2012-12-02 21:47                           ` Andrew McGregor
2012-12-03  8:04                             ` Dave Taht
2012-12-02 22:07                           ` Eric Dumazet
2012-12-02 22:15                             ` Toke Høiland-Jørgensen
2012-12-02 22:30                               ` Eric Dumazet
2012-12-02 22:51                                 ` Toke Høiland-Jørgensen
2012-11-28 17:20       ` [Codel] " Paul E. McKenney
2012-12-02 23:06         ` Paul E. McKenney
2012-12-03 11:24           ` Toke Høiland-Jørgensen
2012-12-03 11:31             ` Dave Taht
2012-12-03 12:54               ` Toke Høiland-Jørgensen
2012-12-03 14:58                 ` Paul E. McKenney
2012-12-03 15:19                   ` Toke Høiland-Jørgensen
2012-12-03 15:49                   ` Eric Dumazet
2012-12-03 15:03               ` Paul E. McKenney
2012-12-03 15:58               ` David Woodhouse
2012-12-04  3:13                 ` Dan Siemon
2012-12-05  0:01                   ` Sebastian Moeller
     [not found]                   ` <1354613026.72238.YahooMailNeo@web126202.mail.ne1.yahoo.com>
2012-12-05  3:41                     ` [Codel] [Bloat] " Dan Siemon
     [not found]                       ` <1354739624.4431.YahooMailNeo@web126205.mail.ne1.yahoo.com>
2012-12-06  4:12                         ` Dan Siemon
2012-11-30  1:09       ` Dan Siemon
