[Bloat] [Cake] active sensing queue management

Fri Jun 12 12:02:30 EDT 2015

Hi Daniel,

On Jun 12, 2015, at 17:02 , Daniel Havey <dhavey at gmail.com> wrote:
> On Thu, Jun 11, 2015 at 12:27 AM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> [...]
>>        Except that DOCSIS 3.1 pie in the modem does not work that way. As I understand http://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf section 3.2 MAP PROCESSING REQUIREMENTS, cable-modem pie will not stuff data into the lower layers until it has received a grant to actually send that data, hence no uncontrolled sub-IP layer buffer bloat possible (unless there are severe RF issues that take out data during transmission). So at least for the upstream direction docsis 3.1 pie will not suffer from buffer displacement, if I understand the cable labs white paper correctly.
> 
> Hmmm, interesting.  Are you sure?  I'm a CS, not an EE, so the PHY layer
> is like black magic to me.  However, I still think (although I am
> willing to be convinced otherwise by someone with superior knowledge
> :)) that the IP layer puts packets into a MAC layer queue.  Then the
> MAC layer makes a queue depth based request for bandwidth in order to
> serialize and send the data.

	I am not sure, but maybe Greg White (CCd) can help us out here? @Greg, is it right that the docsis3.1 pie implementation will keep a tight lid on how many packets/bytes are queued in lower layers of the stack?

> 
> If somebody really knows how this works, please help!  :^)  Is the
> upload of a docsis 3.1 modem really unbloatable below the IP layer?  I
> just want to know for my own edification :)
> 
>> Your solution still might be a valuable add-on to control the downstream buffer bloat in addition.
> I agree!  If that reading of the cablelabs paper is correct then this
> nicely solves the upload vs. download problem and we don't really need
> BQL either.  If it is not true, then we use BQL on the egress to solve
> the upload bloat problem and ASQM to solve the download bloat problem.
> Perfect solution!  I love it when a plan comes together!  :^)

	AFAIK, BQL so far is only implemented in ethernet drivers, so if your uplink is equal to or slightly higher than 10, 100, 1000, or 10000 Mbps, BQL with fq_codel will not need a shaper on egress and should still hold the buffers at bay. Unfortunately, the actual egress rates are often quite far off these ethernet speed tiers. I believe Dave Taeht is trying to convince ethernet drivers to set their egress at non-traditional rates, I could be wrong though...

> 
>>        I also believe that free in france had a modified DSL driver for their box that made sure sub-IP buffering was bound to a low number of packets as well, so no displaced buffers there either. Now it seems that this solution for DSL was unique so far and has not caught on, but once docsis3.1 modems hit the market upstream PIE in the modems will be reality.
> 
> freefrance?  Dave isn't that your provider?  I thought they were
> running fq_CoDel?
> In any case, just because PIE in the modems is a reality don't be
> tempted to declare the problem solved and go home.  Never
> underestimate the ability of the ISPs to do the wrong thing for very
> good reasons :^)  What happens if they don't turn it on?  This is
> really what I was trying to solve with ASQM.  What if your provider
> won't run CoDel or PIE for whatever incomprehensible reason?  Then you
> run ASQM and be done with it.

	I like your active sensing approach, I just believe that the scenario you set out in the paper is not fully true, so I tried to voice my concerns. Personally I am on a vdsl-link so docsis pie or no docsis pie, my link is still bloated and I am looking for new solutions. I like your magic packet idea, even though my taste in these matters is debatable ;) but I fear that to work this needs to run on the modem, and there are only a few fully opensource modems around (if any) on which to implement your active probe. Plus on a DSL link the congestion typically comes between DSLAM and BRAS (as the DSL link is not shared, unlike the cable situation) and I fear the DSLAM might already return the probe packet…

Best Regards
	Sebastian

> 
>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jun 10, 2015 at 1:54 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>>>> Hi Dave,
>>>> 
>>>> 
>>>> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht at gmail.com> wrote:
>>>> 
>>>>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>>>> 
>>>>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>>>>> a short ttl udp flow.
>>>>> 
>>>>> Doing this sort of measured, then floating the rate control with
>>>>> "cake" would be fairly easy (although it tends to be a bit more
>>>>> compute intensive not being on a fast path)
>>>>> 
>>>>> What is sort of missing here is trying to figure out which side of the
>>>>> bottleneck is the bottleneck (up or down).
>>>> 
>>> 
>>> Yeah, we never did figure out how to separate the up from the
>>> downlink.  However, we just consider the access link as a whole (up +
>>> down) and mark/drop according to ratios of queuing time.
>> 
>>        This is a bit sad; why reduce say the effective uplink bandwidth if only the downstream is contended? Not that I have a good alternative solution that will not require help from outside boxes.
>> 
>>> Overall it
>>> seems to work well, but, we never did a mathematical analysis.  Kind
>>> of like saying it's not a "bug", it is a feature.  And in this case it
>>> is true since both sides can experience bloat.
>> 
>>        Yes, but you only want to throttle traffic on the congested leg of the link, otherwise bandwidth efficiency goes to hell if you look at bi-directional link-saturating traffic.
>> 
>>> 
>>> 
>>>>       Yeah, they rely on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic, so figuring out where the bottleneck was was easy; at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.” At least this is what I get from their short description in glossing over the paper.
>>>>       Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp, which would allow dissecting the RTT into its constituent one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.
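The timestamp idea above can be sketched roughly as follows. This is only an illustration, assuming the reflector stamps its receive and transmit times (as an ICMP timestamp reply does) and that both clocks are synchronized to UTC; the class and function names are made up for this example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ProbeTimestamps:
    """Hypothetical container for the four timestamps of one probe (all in ms)."""
    originate_ms: float  # sender clock: probe leaves the sender
    receive_ms: float    # reflector clock: probe arrives at the reflector
    transmit_ms: float   # reflector clock: reply leaves the reflector
    arrival_ms: float    # sender clock: reply arrives back at the sender


def split_rtt(ts: ProbeTimestamps):
    """Return (upstream_ms, downstream_ms, rtt_ms).

    The one-way values are only meaningful if both clocks are synchronized;
    their sum (the RTT without reflector processing time) does not depend
    on clock synchronization, because any offset cancels out.
    """
    upstream = ts.receive_ms - ts.originate_ms
    downstream = ts.arrival_ms - ts.transmit_ms
    return upstream, downstream, upstream + downstream


# Example with invented numbers: the downstream leg carries most of the delay.
print(split_rtt(ProbeTimestamps(1000.0, 1012.0, 1013.0, 1048.0)))
```

Comparing each one-way delay against its own unloaded baseline would then point at the direction that is currently bloated.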
>>> 
>>> Not exactly.  We thought this through for some time and considered
>>> many angles.  Each method has its advantages and disadvantages.
>>> 
>>> We decided not to use ICMP at all because of the reasons you stated
>>> above.  We also decided not to use a "reflector" although as you said
>>> it would allow us to separate upload queue time from download.  We
>>> decided not to use this because it would be difficult to get ISPs to
>>> do this.
>>> 
>>> Our final choice for the paper was "magic" IP packets.  This consists
>>> of an IP packet header and the timestamp.  The IP packet is "self
>>> addressed" and we trick the iptables to emit the packet on the correct
>>> interface.  This packet will be returned to us as soon as it reaches
>>> another IP layer (typically at the CMTS).
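To make the description above concrete, here is a minimal sketch of what such a self-addressed packet could look like as bytes. It is not the paper's implementation: the function names, the 8-byte microsecond timestamp layout, the TTL and the use of experimental protocol number 253 are all assumptions for illustration, and the iptables part that forces the packet out of the right interface is not shown.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """Standard IPv4 header checksum: one's complement of the one's complement sum."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_probe(own_addr: str, timestamp_us: int, ttl: int = 64) -> bytes:
    """Build a bare IPv4 packet whose source and destination are both own_addr.

    Payload: one 8-byte big-endian timestamp (assumed layout).
    Protocol 253 is reserved for experimentation; the real probe may differ.
    """
    payload = struct.pack("!Q", timestamp_us)
    addr = socket.inet_aton(own_addr)

    def header(checksum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45,               # version 4, header length 5 words (20 bytes)
            0,                  # DSCP/ECN
            20 + len(payload),  # total length
            0,                  # identification
            0,                  # flags and fragment offset
            ttl,
            253,                # protocol (assumption)
            checksum,
            addr,               # source address
            addr,               # destination address: same as source
        )

    return header(ipv4_checksum(header(0))) + payload


probe = build_probe("192.0.2.1", 1234567890)
print(len(probe))  # IP header plus timestamp, before any link-layer framing
```

On return, the receiver would subtract the embedded timestamp from its current clock to get the round-trip time; since both readings come from the same clock, no synchronization is needed.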
>> 
>>        Ah, thanks; I did not get this from reading over your paper (but that is probably caused by me being a layman and having read it very quickly). Question: how large is that packet on-the-wire? IP header plus 8 bytes makes me assume 20+8 = 28, but that is missing the ethernet header, so rather 14+20+8 = 42, but isn’t the shortest ethernet frame 64 bytes?
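The arithmetic in the question above can be checked with a small sketch. It assumes plain untagged Ethernet II framing and an IPv4 header without options; the framing on the actual cable or DSL segment may well differ, and the function name is made up for this example.

```python
ETH_HEADER = 14     # destination MAC + source MAC + EtherType
ETH_FCS = 4         # frame check sequence
ETH_MIN_FRAME = 64  # minimum frame size including header and FCS,
                    # excluding preamble and inter-frame gap
IPV4_HEADER = 20    # IPv4 header without options


def ethernet_frame_bytes(ip_payload_bytes: int) -> int:
    """Frame size for an IPv4 packet with the given payload, padded to the minimum."""
    unpadded = ETH_HEADER + IPV4_HEADER + ip_payload_bytes + ETH_FCS
    return max(unpadded, ETH_MIN_FRAME)


# 14 + 20 + 8 = 42 without the FCS, 46 with it, so the frame is padded to 64.
print(ethernet_frame_bytes(8))
```

So under these assumptions the 28-byte probe would indeed occupy a minimum-size 64-byte frame on an ethernet segment.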
>> 
>>> 
>>> Here's a quick summary:
>>> ICMP -- Simple, but, needs the ISP's cooperation (good luck :)
>>> Reflector  -- Separates upload queue time from download queue time,
>>> but, requires the ISP to cooperate and to build something for us.
>>> (good luck :)
>>> Magic IP packets -- Requires nothing from the ISP (YaY!  We have a
>>> winner!), but, is a little more complex.
>> 
>>        At the cost that you only get RTT instead of two one-way delays as one ideally would like. But as stated above if you combine your method with say docsis3.1 pie which promises to keep the upstream under tight control, then any RTT changes should (mainly) be caused by downstream over-buffering (effectively allowing you to use your method to control the downstream well).
>> 
>>> 
>>> 
>>>>       Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the cable labs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the cs field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
>>> 
>>> Good point!  However, once again it's not quite that simple.  Queues
>>> are necessary to absorb short term variations in packet arrival rate
>>> (or bursts).  The queue required for any flow is given by the
>>> bandwidth delay product.
>> 
>>        Not a CS person, but that does not ring fully true; this basically assumes a physical medium that will dump all packets into the buffer at one time point and send them out a full delay period later; I think in reality packets will be serialized and hence some packets will most likely have left the buffer already before all have arrived, so the BDP is more an estimate of an upper bound… not that there is anything wrong with designing solutions that aim to handle the worst case well.
>> 
>>> Since we don't know the delay we can't
>>> predict the queue size in advance.  What I'm getting at is the
>>> equipment manufacturers aren't putting in humongous queues because
>>> they are stupid, they are putting them there because in some cases you
>>> might really need that large of a queue.
>> 
>>        I thought our current pet hypothesis is that they aim for BDP at their highest rated speeds or so, and all customers running that (high-speed capable) equipment at lower rates are out of luck.
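A rough worked example of that hypothesis, with invented numbers: a buffer sized to the bandwidth delay product at the top rated speed, then drained at a much lower provisioned rate. The function names and the rates are assumptions for illustration only.

```python
def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth delay product in bytes for a rate in bit/s and an RTT in ms."""
    return rate_bps * rtt_ms // 8000


def drain_time_ms(buffer_bytes: int, rate_bps: int) -> int:
    """Worst-case queueing delay of a full buffer drained at rate_bps."""
    return buffer_bytes * 8 * 1000 // rate_bps


# Buffer sized for 100 Mbit/s and a 100 ms RTT ...
buffer_size = bdp_bytes(100_000_000, 100)
print(buffer_size)

# ... holds ten times that delay when the link only runs at 10 Mbit/s.
print(drain_time_ms(buffer_size, 10_000_000))
```

A statically sized buffer that is reasonable at the rated speed can thus add on the order of a second of delay at one tenth of that speed, which is the "out of luck" case described above.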
>> 
>>> 
>>> Statically sizing the queues is not the answer.  Managing the size of
>>> the queue with an algorithm is the answer.  :)
>> 
>>        No disagreement here, we just discuss the how not the why ;)
>> 
>> Best Regards
>>        Sebastian
>> 
>>> 
>>> 
>>> 
>>>> 
>>>> Best Regards
>>>>       Sebastian
>>>> 
>>>>> 
>>>>> --
>>>>> Dave Täht
>>>>> What will it take to vastly improve wifi for everyone?
>>>>> https://plus.google.com/u/0/explore/makewififast
>>>>> _______________________________________________
>>>>> Cake mailing list
>>>>> Cake at lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>> 
>>