[Bloat] What is fairness, anyway? was: Re: finally... winning on wired!
Bob Briscoe
bob.briscoe at bt.com
Mon Jan 9 00:38:19 EST 2012
Dave,
At 00:40 08/01/2012, Dave Taht wrote:
> > To address buffer bloat, I advise we "do one thing and do it well": bulk
> > AQM.
>
>If you have an algorithm to suggest, I'd gladly look at it.
I haven't been working on any AQM algos.
I'm merely challenging the use of FQ.
>Without AQM or FQ, we have a situation where one stream from one user
>at a site, can eat more than 100% of the bandwidth.
>
>1/u would be a substantial improvement!
1/u is only a substantial improvement if you need to protect against
some bogeyman that is out to take all your b/w. If this bogeyman is
solely in your imagination, don't add FQ.
If there were an application that ate 100% of the b/w of a
home gateway (or a host), then early adopters would uninstall it and it
would never become popular. E.g. the original Joost app.
FQ forces all flows into a straitjacket of equality with each
other. Usually FQ won't completely stop anything working (though it
could stop an inelastic app working). However, if the apps are trying
to determine their own allocations (e.g. LEDBAT), then FQ
unnecessarily screws that all up.
I'm not saying no-one should write fairness-policing code. I'm saying
it's not appropriate to bundle fairness code into AQM code, when the
case for needing it is based on an imagined bogeyman. Otherwise:
* Either you dump your arbitrary assumptions on the world about what
allocations each flow should get, bundled with your AQM.
* Or you kill the chances of getting your AQM deployed because half
the world thinks your allocation assumptions suck.
>Secondly, as most devices lack both AQM and FQ these days - despite, as
>one of the papers referenced said, "considering that to be a bug" -
>people are doing DoS attacks on themselves whenever they attempt
>to do something that requires high bandwidth AND do something interactively.
You're conflating removal of standing queues with bandwidth
allocation. The former is a problem in HGs and hosts. The latter
isn't a problem in HGs and hosts.
>Thirdly I kind of need to clarify the usage of three terms in this discussion.
>
>To me, a *user* is - mom, dad, son, daughter, and to some extent their ipods,
>ipads, and tivos. The interesting thing about this scenario is that it is
>the device you are in front of that you want the best interactive
>performance from. A second interesting thing is that the most interactive,
>yet bulk, flows - tv streams - have an upper limit on their rate, and also
>tend to adjust fairly well to lower ones.
>
>A *site* is a home, small business, cybercafe, office, etc. As the size of a
>'site' scales up to well beyond what a polyamorous catholic family could
>produce in terms of users, additional techniques do apply.
>
>(while what we're working on should work well on handhelds, desktops,
>many kinds of servers, home routers, etc - I am far from convinced it's
>applicable in the isp->customer side, aside from the FQ portion breaking
>up streams enough for various algorithms to work better.)
>
>Anyway:
>
>I think sometimes 'users' and 'sites' get conflated.
>
>As you touch upon, the policies that a 'site' implements in order to manage
>their internet access do not enter the political arena in any way. Those
>policies are in the interest of family harmony only, and giving dad or
>daughter knobs to turn that actually work is in everyone's best interest.
>
>Thirdly, the definition of a "flow", within a site, is more flexible than
>what a provider can see beyond NAT. The device on the customer's
>site can regulate flows at levels ranging from mere IP address
>(or even the MAC address), to ports, to tuples consisting of the time
>of day and DSCP information...
Agree with all the above definitions.
All I would add is that in the family scenario, all the users have
control over the hosts which are already able to control transfer
rates. So no-one needs or wants to twiddle knobs on home gateways to
improve family harmony. In the first instance, app developers have
the interests of family harmony at heart. They don't want to write
apps that pee off their users. And if there's a chance they will be
peeved, the app developer can add a knob in the app.
>As a demo to myself I got 1/u to work nearly perfectly a while
>back.
Perfect 1/u != desirable.
Perfect 1/u == over-constrained.
>And I left in the ability to manage multicast (to limit it
>*enough* so it could be *used* in wireless, without damaging the
>rest of the network at the *site*). I LIKE multicast, but the
>development of switches and wireless needs it to be managed
>properly.
I'm assuming the problem you mean is unresponsiveness of certain
multicast streaming apps to congestion. And I'm assuming "managing
multicast" means giving it some arbitrary proportion of the link
capacity (irrespective of whether the multicast app works with that
allocation).
Your assumption is that the multicast app isn't managing itself. But
what if it is? For instance, my company operates a multicast app that
manages its bandwidth. It doesn't have equal bandwidth to other apps,
but it prevents other apps being starved while ensuring it has the
min b/w it needs. If your code messes with it and forces it to have
equal b/w to everything else it won't work.
I'm basically quoting the e2e principle at you.
It's fine to signal out to the transport from an AQM, so the
transport can keep standing queues to a minimum.
It's much more tricky to do bandwidth allocation, fairness etc. If
you're only doing fairness as a side-project while you do the AQM,
it's best not to dabble at all.
> >
> > Since 2004, we now understand that fairness has to involve accounting over
> > time. That requires per-user state, which is not something you can do, or
> > that you need to do, within every queue. We should leave fairness to
> > separate code, probably on machines specialising in this at the edge of a
> > provider's network domain - where it knows about users/customers - separate
> > from the AQM code of each specific queue.
>
>You are taking the provider in approach, we're on the device out approach.
>The two approaches meet on the edge gw in the site.
I'm not taking a provider approach. I'm taking an e2e approach. I'm
not talking as a myopic carrier, I'm talking as a comms architect.
Yes, my company operates servers, network and HGs. However my company
recognises that it has to work with app-code and OS on the hosts. And
with HGs that we don't manage. I'm saying please don't put arbitrary b/w
allocation assumptions in your HG code or low down in the OS stack.
It's not the right place for this.
(BTW, we remotely manage a few million HGs - we used to do a lot more
smarts in the HGs but we're reducing that now.)
>Firstly, FQ helps on the user's machine, as does AQM. Not so much
>at gigE speeds, as the instantaneous queue length is rarely long enough,
>but a lot, on wireless, where you always have ~4ms worth of queue.
Again, what evidence do you have that FQ is necessary to reduce the delay,
and AQM alone wouldn't do the job just fine?
>FQ hurts the user herself on bittorrent-like applications, and
>there are multiple ways to manage that. For example, a new way showed
>up in the current tree for putting applications in cgroup containers
>that could set their network 'priority'.
An app is written without knowing what network it will connect to.
How does it know that it needs to set this priority? Or are you
saying the containers set their own priority (in which case, we're
back to the problem of arbitrary assumptions in the network)?
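(I assume the mechanism you mean is the new net_prio cgroup controller.
If so, it's used roughly like this - the group name, interface and
priority value are just examples, not a recommendation:

  # mount the controller and create a group for the app
  mkdir -p /sys/fs/cgroup/net_prio
  mount -t cgroup -o net_prio none /sys/fs/cgroup/net_prio
  mkdir /sys/fs/cgroup/net_prio/bulk-apps
  # traffic from tasks in this group gets priority 1 on eth0
  echo "eth0 1" > /sys/fs/cgroup/net_prio/bulk-apps/net_prio.ifpriomap
  # move the app's process into the group
  echo $APP_PID > /sys/fs/cgroup/net_prio/bulk-apps/tasks

Whoever writes that "1" is making exactly the sort of arbitrary
allocation assumption I'm objecting to putting in the network.)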
>The std way is for bittorrent to have a rate limit.
Noooo. The whole point is to be able to use the full capacity when
no-one else is.
>Another way IS for an AQM or app to be congestion aware much like
>with your proposed Conex stuff.
Don't know what you mean here.
>The way I was considering was playing
>with the TCP-ledbat code to be more datacenter-tcp like...
OK. Sounds interesting.
>Secondly, a combination FQ/AQM is the approach we are taking, more or less.
>
>I started with HTB + QFQ + RED/SFB etc... back in august...
>
>At the moment Eric is scaling up SFQ and adding
>a couple of variants of RED. The net behavior should be similar to what
>you describe. If you care to try it, it's mostly in Linus's kernel at this
>point.
>
>It was possibly unclear to others that we have never thought that
>FQ alone was the answer. In fact we started off saying that
>better AQM was the answer, and then realized that the
>problem had multiple aspects that could be addressed
>independently, and thus started saying FQ + AQM was
>a way forward.
You need to explain this step - this is the nub of the disagreement.
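(So that we're at least arguing about the same thing: the sort of stack
you describe - HTB shaping to just under the link rate with a flow-fair
scheduler underneath - looks roughly like this in tc. The device, rate
and SFQ parameters are purely illustrative:

  tc qdisc add dev eth0 root handle 1: htb default 10
  tc class add dev eth0 parent 1: classid 1:10 htb rate 900kbit ceil 900kbit
  # flow-fair (1/u-ish) scheduling under the shaper
  tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10

It's the step from "this queue needs an AQM" to "this queue needs
per-flow equality" - the sfq line - that I'm asking you to justify.)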
>I would LOVE a 'bulk AQM' solution, and I'll be more
>than glad to start working on solving 2008's problems...
>after we get done with solving 1985's problems...
By bulk I meant all traffic together in a FIFO queue. That was the
problem Van/Sally started addressing with RED in 1993. And AFAIK,
that's the problem this bufferbloat list is focused on. I'm not
introducing anything 2008ish :|
>by applying techniques that were well understood by the late 90s...
>and disremembered by an entire generation of device makers,
>driver writers, OS, network designers, buyers, and benchmarkers.
>
>But first...
>
>1) We needed *any* form of packet scheduler to
>actually work, which we couldn't do until a few months back
>due to overly huge tx rings in all the current ethernet devices.
>
>fixed by BQL
>
>2) We then needed an FQ mechanism to actually behave as designed
>- which both SFQ and QFQ do now - they weren't, for years.
This is the sticking point. I'm saying it is now fairly widely
accepted that the goal of FQ was not useful, so whether it works or
not, we don't want it.
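(On your point 1 I have no quarrel: bounding the driver ring is squarely
the standing-queue problem I said hosts and HGs do have. For anyone who
wants to see what BQL is up to, its state is exposed in sysfs - eth0
below is just an example, and the paths assume a BQL-capable driver:

  # byte limit BQL has currently converged on for tx queue 0
  cat /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit
  # bytes currently outstanding to the NIC
  cat /sys/class/net/eth0/queues/tx-0/byte_queue_limits/inflight
  # cap the limit by hand to experiment
  echo 3000 > /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit_max

Note this embeds no allocation policy at all, which is why it doesn't
raise the objection I'm making about FQ.)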
>3) We also needed AQMs that worked - Still do. I'm looking forward
>to a successor to RED, and there were some implementation bugs
>in RED that made turning its knobs do nothing; those are fixed now
>but need testing,
Agreed (strongly).
>and a couple new twists on it in SFQRED...
>
>4) We needed a way to handle TSO properly, as byte-oriented AQMs
> handled that badly. Sort of fixed now.
Agreed.
>Please feel free to evaluate and critique SFQRED.
If I don't agree with the goal, are you expecting me to critique the
detailed implementation?
>In my case I plan to continue working with HTB, QFQ
>(which, btw, has multiple interesting ways to engage
>weighting mechanisms) and various sub-qdiscs... *after SFQ stabilizes*.
>
>I kind of view QFQ as a qdisc construction set.
Do you have any knowledge (accessible to the code) of what the
weights should be?
The $10M question is: What's the argument against not doing FQ?
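(To make the weights question concrete: with QFQ somebody has to write
numbers down somewhere, e.g. in tc - the classes, weights and the filter
below are purely illustrative:

  tc qdisc add dev eth0 root handle 1: qfq
  tc class add dev eth0 parent 1: classid 1:1 qfq weight 1
  tc class add dev eth0 parent 1: classid 1:2 qfq weight 4
  # steer ssh into the heavier-weighted class; a real setup also needs
  # a default filter so unclassified traffic lands in some class
  tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
      match ip dport 22 0xffff flowid 1:2

The 1 and the 4 are exactly the kind of arbitrary allocation assumption
I'm arguing shouldn't be baked in below the apps.)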
>We're well aware that these may not be the ultimate answer to
>all the known networking problems in the universe, but what
>we have working in the lab is making a big dent in them.
>
>And now that bugfix #1 above exists...
>
>Let a thousand new AQM's bloom!
>
>If there is anything in what we're up to that will damage the
>internet worse than it is damaged already, let us know soonest.
More *FQ* is worse than what there already is.
Bob
________________________________________________________________
Bob Briscoe, BT Innovate & Design