[Cerowrt-devel] [Codel] [Bloat] FQ_Codel lwn draft article review
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Wed Nov 28 12:36:21 EST 2012
On Wed, Nov 28, 2012 at 12:15:35PM +1300, Andrew McGregor wrote:
> On 28/11/2012, at 11:54 AM, "Paul E. McKenney" <paulmck at linux.vnet.ibm.com> wrote:
> > On Tue, Nov 27, 2012 at 02:31:53PM -0800, David Lang wrote:
> >> On Tue, 27 Nov 2012, Jim Gettys wrote:
> >>> 2) "fairness" is not necessarily what we ultimately want at all; you'd
> >>> really like to penalize those who induce congestion the most. But we don't
> >>> currently have a solution (though Bob Briscoe at BT thinks he does, and is
> >>> seeing if he can get it out from under a BT patent), so the current
> >>> fq_codel round robins ultimately until/unless we can do something like
> >>> Bob's idea. This is a local information only subset of the ideas he's been
> >>> working on in the congestion exposure (conex) group at the IETF.
> >> Even more than this, we _know_ that we don't want to be fair in
> >> terms of the raw packet priority.
> >> For example, we know that we want to prioritize DNS traffic over TCP
> >> streams (due to the fact that the TCP traffic usually can't even
> >> start until DNS resolution finishes)
> >> We strongly suspect that we want to prioritize short-lived
> >> connections over long lived connections. We don't know a good way to
> >> do this, but one good starting point would be to prioritize SYN
> >> packets so that the initialization of the connection happens as fast
> >> as possible.
> >> Ideally we'd probably like to prioritize the first couple of packets
> >> of a connection so that very short lived connections finish quickly
> fq_codel does all of this, although it isn't explicit about it so it is hard to see how it happens.
> >> It may make sense to prioritize FIN packets so that connection
> >> teardown (and the resulting release of resources and connection
> >> tracking) happens as fast as possible
> >> All of these are horribly unfair when you are looking at the raw
> >> packet flow, but they significantly help the user's perceived
> >> response time without making much difference on the large download
> >> cases.
> > In all cases, to Jim's point, as long as we avoid starvation. And there
> > will likely be more corner cases that show up under extreme overload.
> > Thanx, Paul
> So, fq_codel exhibits a new kind of fairness: it is jitter fair, or in other words, each flow gets the same bound on how much jitter it can induce in the whole ensemble of flows. Exceed that bound, and flows get deprioritised. This achieves thin-flow and DNS prioritisation, while allowing TCP flows to build more buffer if required. The sub-flow CoDel queues then allow short flows to use a reasonably large buffer, while draining standing buffers for long TCP flows.
> The really interesting part of the jitter-fair behaviour is that jitter-sensitive traffic is protected as much as it can be, provided its own sending rate control does something sensible. Good news for interactive video, in other words.
> The actual jitter bound is the transmission time of max(mtu, quantum) * n_thin_flows bytes, where a thin flow is one that has not exceeded its own jitter allowance since the last time its queue drained. While it is possible that there might instantaneously be a fairly large number of thin flows, in practice on a home network link there are normally only a very few of these at any one moment, and so the jitter experienced is pretty good.
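As a rough sanity check on the bound quoted above, here is a minimal Python sketch that turns "the transmission time of max(mtu, quantum) * n_thin_flows bytes" into seconds. The function name, the parameter names, and the example link rate are assumptions made for illustration; none of them come from the fq_codel source.

```python
def jitter_bound_seconds(mtu_bytes, quantum_bytes, n_thin_flows, link_rate_bps):
    """Time to transmit max(mtu, quantum) * n_thin_flows bytes on the link.

    All names here are hypothetical; link_rate_bps is the link rate in
    bits per second.
    """
    burst_bytes = max(mtu_bytes, quantum_bytes) * n_thin_flows
    return burst_bytes * 8 / link_rate_bps


if __name__ == "__main__":
    # Assumed example: 1500-byte MTU, 1514-byte quantum, 4 thin flows,
    # 1 Mbit/s uplink.
    bound = jitter_bound_seconds(1500, 1514, 4, 1_000_000)
    print(f"jitter bound: {bound * 1000:.1f} ms")
```

Under those assumed numbers the bound comes out a little under 50 ms, and it scales linearly with the number of thin flows, which matches the remark that it stays small when only a few thin flows are active at once.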
OK, let me see if I can restate this in terms of the code.
Each flow gets to induce one quantum q of jitter. If there are n thin
flows and m thick flows, then each thin flow will see at most (n+m)*q
jitter, which is the case when a new packet arrives just after the last
packet was transmitted, so that the flow has been placed at the end
of the thick-flows list. Thick flows are allowed q+interval before
dropping (where "q" is the "target" parameter in the code), so they see at
most (n*q+m*(q+interval)) -- any attempt to exceed this will result in
packets being dropped.
Seem reasonable or am I confused?
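
To make the arithmetic in that restatement concrete, here is a small Python sketch of the two worst-case figures exactly as written above. It is only a transcription of the expressions in this message, not of the fq_codel code, and the function and parameter names are made up for illustration. It takes a single q as given, in the same time units as interval, and does not try to settle whether q should be the quantum's transmission time or the target.

```python
def thin_flow_worst_case(n_thin, m_thick, q):
    """Worst-case jitter seen by a thin flow: (n + m) * q."""
    return (n_thin + m_thick) * q


def thick_flow_worst_case(n_thin, m_thick, q, interval):
    """Worst case for a thick flow before drops: n*q + m*(q + interval)."""
    return n_thin * q + m_thick * (q + interval)


if __name__ == "__main__":
    # Assumed example values, all in milliseconds: 3 thin flows,
    # 2 thick flows, q = 12 ms, interval = 100 ms.
    print(thin_flow_worst_case(3, 2, 12))
    print(thick_flow_worst_case(3, 2, 12, 100))
```

With these assumed values the thick-flow figure is dominated by the m * interval term, so the number of thick flows matters far more to that bound than the number of thin ones.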