[Cake] [Bloat] active sensing queue management

Fri Jun 12 13:55:05 EDT 2015

On 12/06/15 15:35, Daniel Havey wrote:
> On Fri, Jun 12, 2015 at 6:00 AM, Alan Jenkins
> <alan.christopher.jenkins at gmail.com> wrote:
>> On 12/06/15 02:44, David Lang wrote:
>>> On Thu, 11 Jun 2015, Sebastian Moeller wrote:
>>>
>>>> On Jun 11, 2015, at 03:05 , Alan Jenkins
>>>> <alan.christopher.jenkins at gmail.com> wrote:
>>>>
>>>>> On 10/06/15 21:54, Sebastian Moeller wrote:
>>>>>
>>>>> One solution would be if ISPs made sure upload is 100% provisioned.
>>>>> Could be cheaper than for (the higher rate) download.
>>>>
>>>>      Not going to happen, in my opinion, as it is economically infeasible
>>>> for a publicly traded ISP. I would settle for that approach as long as the
>>>> ISP is willing to fix its provisioning so that oversubscription episodes
>>>> are reasonably rare, though.
>>>
>>> not going to happen on any network, publicly traded or not.
>>
>> Sure, I'm flailing.  Note this was in the context of ASQM as Daniel
>> describes it.  (Possibly misnamed given it only drops.  All the queuing is
>> "underneath" ASQM, "in the MAC layer" as the paper says :).
>>
> Noooooooooo!  I am a huge supporter of ECN.  ECE Everywhere!  I'm sure
> I wrote "mark/drop" in the paper.  I might have dyslexically written
> "drop/mark", but, if I ever gave the impression then I categorically
> deny that right now and till forever.  ECN everywhere :^)

My bad!  Imagine I wrote mark/drop.  It was just a definitional wibble.
Is it AQM if it's not your queue, or is it policing (and does AQM include
policing)?

>> - ASQM isn't distinguishing up/down bloat.  When it detects bloat it has to
>> limit both directions in equal proportion.
>>
>> => if there is upload contention (and your user is uploading), you may hurt
>> apps sensitive to download bandwidth (streaming video), when you don't need
>> to.
>>
>> What would the solutions look like?
>>
>> i) If contention in one direction was negligible, you could limit the other
>> direction only.  Consumer connections are highly asymmetric, and ASQM is
>> only measuring the first IP hop.  So it's more feasible than 100% in both
>> directions.  And this isn't about core networks (with larger statistical
>> universes... whether that helps or not).
>>
>> I'm sure you're right and they're not asymmetric _enough_.
>>
>>
>> ii) Sebastian points out if you implement ASQM in the modem (as the paper
>> claims :p), you may as well BQL the modem drivers and run AQM.  *But that
>> doesn't work on ingress* - ingress requires tbf/htb with a set rate - but
>> the achievable rate is lower in peak hours. So run ASQM on ingress only!
>> Point being that download bloat could be improved without changing the other
>> end (CMTS).
>>
> This is pretty cool.  I had not considered BQL (though Dave and Jim
> were evangelizing about it at the time :).  This solves the
> upload/download problem which I was not able to get past in the paper.
> BQL on the egress and ASQM for the ingress.  BQL will make sure that
> the upload is under control so that ASQM can get a good measurement on
> the download side.  Woot!  Woot!  Uncooperative ISP problem solved!
>
> BTW...Why doesn't BQL work on the ingress?

Both AQM & BQL manage transmit queues.  It's the natural way to work.

If you don't control the transmit queue(s) at the bottleneck, you have 
to insert an artificial bottleneck that you _do_ control.

The problem is we don't have the source to hack any modems, only Linux 
routers like in your paper :).  It's why, to set up a router with our 
SQM, you have to know the line rate (both directions).  And tell SQM to 
use maybe 95% of it.  And assume the full rate is available even during 
peak hours :(.
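
To make the 95% rule concrete, here is a minimal sketch.  It is not SQM's
actual code: the function names and the example line rates are made up for
illustration.  It derives the shaper rates from the line rates, then checks
whether the shaper is still the bottleneck once the achievable rate drops.

```python
def shaper_rates(down_kbit, up_kbit, percent=95):
    """Return (ingress, egress) shaper rates in kbit/s.

    percent is the share of the line rate handed to the shaper; 95 is the
    "maybe 95%" figure from the text, not a tuned value.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Integer arithmetic avoids floating-point rounding surprises.
    return down_kbit * percent // 100, up_kbit * percent // 100


def shaper_is_bottleneck(shaper_kbit, achievable_kbit):
    """True if the queue builds in our shaper rather than upstream."""
    return shaper_kbit < achievable_kbit


if __name__ == "__main__":
    # Hypothetical 20 Mbit/s down, 2 Mbit/s up line.
    down, up = shaper_rates(20000, 2000)
    print("ingress shaper:", down, "kbit/s; egress shaper:", up, "kbit/s")
    # Off-peak: the full 20000 kbit/s is achievable, the shaper is in control.
    print("off-peak:", shaper_is_bottleneck(down, 20000))
    # Peak: if only 15000 kbit/s is achievable, the ISP's queue fills instead.
    print("peak:", shaper_is_bottleneck(down, 15000))
```

The second check is the failure case: a fixed shaper rate stops helping as
soon as the real rate falls below it.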

So we still want co-operative ISPs to solve this, because that's who 
procures the modems.  That said, once they've been _developed_ it's 
going to be easier to buy a good modem regardless of ISP.

We rate-limit with htb, so we build a queue for fq_codel to work on.
(Then Jonathan smushed the two qdiscs together and called the resulting
code "cake").

Regarding ingress rate-limiting, note that

  - it involves creating a virtual network interface (IFB), to get an 
artificial transmit queue to apply AQM to.   (Something like that, I 
don't know the exact syntax).
  - bursts can still pile up at the upstream bottleneck, so you can't
eliminate latency increases altogether.  (E.g. caused by TCP's initial
exponential increase, i.e. slow start).  It can still be a big win though,
because ISP queues would often be allowed to grow many times larger than
needed (expected RTT / ~100ms).
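
A rough sketch of the IFB setup from the first point, written as a Python
helper that only builds the command strings and does not run them.  The
interface names and the rate are placeholders.  This is the commonly used
ip/tc recipe, not necessarily the exact commands SQM issues, so check it
against your own iproute2 before relying on it.

```python
def ingress_shaper_commands(wan_if, ifb_if, rate_kbit):
    """Build ip/tc commands that shape traffic arriving on wan_if.

    Incoming packets are redirected to the IFB device, where they pass
    through an ordinary transmit queue: htb sets the rate and fq_codel
    manages the queue that htb creates.
    """
    return [
        # Create the virtual interface and bring it up.
        f"ip link add {ifb_if} type ifb",
        f"ip link set dev {ifb_if} up",
        # Attach the ingress hook to the real interface.
        f"tc qdisc add dev {wan_if} handle ffff: ingress",
        # Redirect everything that arrives to the IFB device.
        f"tc filter add dev {wan_if} parent ffff: protocol all "
        f"u32 match u32 0 0 action mirred egress redirect dev {ifb_if}",
        # The artificial bottleneck: htb with a single default class.
        f"tc qdisc add dev {ifb_if} root handle 1: htb default 10",
        f"tc class add dev {ifb_if} parent 1: classid 1:10 "
        f"htb rate {rate_kbit}kbit",
        # AQM on the queue that htb builds.
        f"tc qdisc add dev {ifb_if} parent 1:10 fq_codel",
    ]


if __name__ == "__main__":
    # eth0, ifb0 and 19000 kbit/s are illustrative values only.
    for cmd in ingress_shaper_commands("eth0", "ifb0", 19000):
        print(cmd)
```

Printing instead of executing keeps the sketch safe to run anywhere; the
commands themselves need root on a Linux box.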

>>> The question is not "can the theoretical max of all downstream devices
>>> exceed the upstream bandwidth" because that answer is going to be "yes" for
>>> every network built, LAN or WAN, but rather "does the demand in practice of
>>> the combined downstream devices exceed the upstream bandwidth for long
>>> enough to be a problem"
>>>
>>> it's not even a matter of what percentage are they oversubscribed.
>>>
>>> someone with 100 1.5Mb DSL lines downstream and a 50Mb upstream (about 33%
>>> of theoretical requirements) is probably a lot worse than someone with 100
>>> 1G lines downstream and a 10G upstream (10% of theoretical requirements)
>>> because it's far less likely that the users of the 1G lines are actually
>>> going to saturate them (let alone simultaneously for a noticeable timeframe),
>>> while it's very likely that the users of the 1.5M DSL lines are going to
>>> saturate their lines for extended timeframes.
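
The arithmetic behind the two examples, as a small sketch (the function name
is made up for illustration).  It only computes the provisioned fraction; as
the quoted text argues, that number alone does not say which network will be
congested in practice.

```python
def provisioned_fraction(n_lines, line_mbit, uplink_mbit):
    """Uplink capacity as a fraction of the sum of downstream line rates."""
    return uplink_mbit / (n_lines * line_mbit)


if __name__ == "__main__":
    # 100 DSL lines at 1.5 Mbit/s behind a 50 Mbit/s uplink.
    dsl = provisioned_fraction(100, 1.5, 50)
    # 100 lines at 1 Gbit/s behind a 10 Gbit/s uplink.
    gig = provisioned_fraction(100, 1000, 10000)
    print(f"DSL example:  {dsl:.0%} of theoretical demand")
    print(f"1G example:   {gig:.0%} of theoretical demand")
```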
>>>
>>> The problem shows up when either usage changes rapidly, or the network
>>> operator is not keeping up with required upgrades as gradual usage changes
>>> happen (including when they are prevented from upgrading because a peer
>>> won't cooperate)
>>>
>>> As for the "100% provisioning" ideal, think through the theoretical
>>> aggregate and realize that before you get past very many layers, you get to
>>> a bandwidth requirement that it's not technically possible to provide.
>>>
>>> David Lang
>>
> Yuppers!  Dave is right.  The FCC studies (especially the 80/80 study
> out of UNC) from 2010 - 2014 (footnoted in the paper) indicate that
> during peak hours it is quite common for an ISP not to provide 100% of
> the rated throughput.

Note the same technical problem applies if they provide 100%-150%.[1]  
If bloat is a problem, you can probably still hack around it as a user, 
but then you lose the _benefits_ of the oversubscription.

[1] Comcast "Powerboost" http://www.dslreports.com/faq/14520

>    In fact in 2014 it indicates that a full on 50%
> of the ISPs measured provided less than 100%.  The 100% all the time
> goal is unreasonable because it implies too much waste.  Many ISPs get
> to 90% or above even during peak hours.  This is good!  We could live
> with that :)  Providing that last 10% would mean that they would have
> to provide for a lot of excess capacity that goes unused during
> non-peak hours.  Wasteful.  That money should be allocated for more
> important things like providing AQM for all or saving the planet or
> something else. :^)

:)

I grant Dave's caveats.  Couldn't resist looking for the graphs. (Around 
page 40 
http://data.fcc.gov/download/measuring-broadband-america/2014/2014-Fixed-Measuring-Broadband-America-Report.pdf).

It looks like cable ISPs happen to hit an average of 100% during peak
hours (as you might hope from the figure of 50% going below, apparently
50% are going above).

Alan