[Cake] active sensing queue management

Sebastian Moeller moeller0 at gmx.de
Thu Jun 11 03:27:10 EDT 2015

Hi Daniel,

thanks for the clarifications.

On Jun 11, 2015, at 02:10 , Daniel Havey <dhavey at gmail.com> wrote:

> Hmmm, maybe I can help clarify.  Bufferbloat occurs in the slowest
> queue on the path.  This is because the other queues are faster and
> will drain.  AQM algorithms work only if they are placed where the
> packets pile up (e.g. the slowest queue in the path).  This is
> documented in Kathy and Van's CoDel paper.

	I am with you so far.

> This is usually all well and good because we know where the bottleneck
> (the slowest queue in the path) is.  It is the IP layer in the modem
> where the ISP implements their rate limiter.  That is why algorithms
> such as PIE and CoDel are implemented in the IP layer on the modem.


> Suppose the full committed rate of the token bucket rate limiter is 8
> Mbps.  This means that the queue at the IP layer in the modem is
> capable of emitting packets at 8 Mbps sustained rate.  The problem
> occurs during peak hours when the ISP is not providing the full
> committed rate of 8 Mbps or that some queue in the system (probably in
> the access link) is providing something less than 8 Mbps (say for sake
> of discussion that the number is 7.5 Mbps).
> We know (see Kathy and Van's paper) that AQM algorithms only work
> when they are placed at the slowest queue.  However, the AQM is placed
> at the queue that is capable of providing 8 Mbps and this is not the
> slowest queue.  The AQM algorithm will not work in these conditions.
> This is what is shown in the paper where the CoDel and PIE performance
> goes to hell in a handbasket.  The ASQM algorithm is designed to
> address this problem.

	Except that DOCSIS 3.1 PIE in the modem does not work that way. As I understand http://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf, section 3.2 MAP PROCESSING REQUIREMENTS, cable-modem PIE will not stuff data into the lower layers until it has received a grant to actually send that data, hence no uncontrolled sub-IP-layer bufferbloat is possible (unless there are severe RF issues that take out data during transmission). So at least for the upstream direction DOCSIS 3.1 PIE will not suffer from buffer displacement, if I understand the CableLabs white paper correctly. Your solution might still be a valuable add-on to control downstream bufferbloat as well.
	I also believe that Free in France had a modified DSL driver for their box that made sure sub-IP buffering was bounded to a low number of packets, so no displaced buffers there either. So far this DSL solution seems to have been unique and has not caught on, but once DOCSIS 3.1 modems hit the market, upstream PIE in the modems will be a reality.
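
	As a rough illustration of the displacement argument quoted above, here is a minimal fluid-model sketch (not taken from the paper): traffic arriving at 8 Mbps passes an 8 Mbps rate limiter and then a 7.5 Mbps link. The function name, the time step and the 5 second duration are assumptions chosen for illustration only. The queue at the rate limiter, where the AQM sits, stays empty, while the backlog builds at the slower link behind it.

```python
def simulate(offered_mbps=8.0, shaper_mbps=8.0, link_mbps=7.5,
             seconds=5.0, dt=0.01):
    """Fluid model of two queues in series; returns backlogs in megabits."""
    shaper_q = 0.0  # backlog at the rate limiter (where the AQM lives)
    link_q = 0.0    # backlog at the slower access link behind it
    for _ in range(round(seconds / dt)):
        # Traffic arrives at the rate limiter and is released at its rate.
        shaper_q += offered_mbps * dt
        released = min(shaper_q, shaper_mbps * dt)
        shaper_q -= released
        # Released traffic lands in the link queue, which drains more slowly.
        link_q += released
        link_q -= min(link_q, link_mbps * dt)
    return shaper_q, link_q


shaper_backlog, link_backlog = simulate()
print(f"rate limiter backlog: {shaper_backlog:.2f} Mbit")
print(f"access link backlog:  {link_backlog:.2f} Mbit "
      f"(~{link_backlog / 7.5:.2f} s of queueing delay)")
```

	With these defaults roughly 2.5 Mbit ends up queued at the 7.5 Mbps link, about a third of a second of delay, none of which is visible to an AQM that only watches the first queue.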

> On Wed, Jun 10, 2015 at 1:54 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> Hi Dave,
>> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht at gmail.com> wrote:
>>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>>> a short ttl udp flow.
>>> Doing this sort of measurement, then floating the rate control with
>>> "cake" would be fairly easy (although it tends to be a bit more
>>> compute intensive not being on a fast path)
>>> What is sort of missing here is trying to figure out which side of the
>>> bottleneck is the bottleneck (up or down).
> Yeah, we never did figure out how to separate the up from the
> downlink.  However, we just consider the access link as a whole (up +
> down) and mark/drop according to ratios of queuing time.  

	This is a bit sad; why reduce, say, the effective uplink bandwidth if only the downstream is contended? Not that I have a good alternative solution that will not require help from outside boxes.

> Overall it
> seems to work well, but, we never did a mathematical analysis.  Kind
> of like saying it's not a "bug", it is a feature.  And in this case it
> is true since both sides can experience bloat.

	Yes, but you only want to throttle traffic on the congested leg of the link; otherwise bandwidth efficiency goes to hell if you look at bidirectional link-saturating traffic.

>>        Yeah, they rely on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic, so figuring out where the bottleneck was is easy; at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.” At least this is what I get from their short description after glossing over the paper.
>>        Nice paper, but really not a full solution either, unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess they could eradicate downstream bufferbloat to begin with. Or the ISPs could have the reflector also add its own UTC timestamp, which would allow dissecting the RTT into its constituent one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, though for bufferbloat detection ms resolution would probably be sufficient anyway.) Currently, I hear that ISP equipment will not treat ICMP requests with priority, though.
> Not exactly.  We thought this through for some time and considered
> many angles.  Each method has its advantages and disadvantages.
> We decided not to use ICMP at all because of the reasons you stated
> above.  We also decided not to use a "reflector" although as you said
> it would allow us to separate upload queue time from download.  We
> decided not to use this because it would be difficult to get ISPs to
> do this.
> Our final choice for the paper was "magic" IP packets.  This consists
> of an IP packet header and the timestamp.  The IP packet is "self
> addressed" and we trick the iptables to emit the packet on the correct
> interface.  This packet will be returned to us as soon as it reaches
> another IP layer (typically at the CMTS).

	Ah, thanks; I did not get this from reading over your paper (but that is probably caused by me being a layman and having read it very quickly). Question: how large is that packet on the wire? IP header plus 8 bytes makes me assume 20+8 = 28, but that is missing the Ethernet header, so rather 14+20+8 = 42, but isn’t the shortest Ethernet frame 64 bytes?
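
	The arithmetic behind that question can be sketched as follows, assuming plain Ethernet framing, an IPv4 header without options and the 8-byte timestamp payload guessed at above (the actual probe format in the paper may differ, and DOCSIS or DSL framing on the access link adds its own overhead). The constant and function names are made up for this illustration.

```python
ETH_HEADER = 14     # destination MAC + source MAC + EtherType
ETH_FCS = 4         # frame check sequence (CRC)
ETH_MIN_FRAME = 64  # minimum Ethernet frame size, FCS included
ETH_PREAMBLE = 8    # preamble + start-of-frame delimiter
ETH_IFG = 12        # inter-frame gap, counted in byte times
IPV4_HEADER = 20    # IPv4 header without options


def ethernet_frame_bytes(ip_payload_bytes):
    """Frame size including FCS, padded up to the Ethernet minimum."""
    unpadded = ETH_HEADER + IPV4_HEADER + ip_payload_bytes + ETH_FCS
    return max(unpadded, ETH_MIN_FRAME)


def wire_time_bytes(ip_payload_bytes):
    """Byte times occupied on the wire, adding preamble and inter-frame gap."""
    return ethernet_frame_bytes(ip_payload_bytes) + ETH_PREAMBLE + ETH_IFG


print(ethernet_frame_bytes(8))  # 14 + 20 + 8 + 4 = 46, padded to 64
print(wire_time_bytes(8))       # 64 + 8 + 12 = 84
```

	So under these assumptions the 42 bytes (46 with the FCS) would be padded to a 64-byte frame, which occupies 84 byte times on an Ethernet segment.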

> Here's a quick summary:
> ICMP -- Simple, but, needs the ISP's cooperation (good luck :)
> Reflector  -- Separates upload queue time from download queue time,
> but, requires the ISP to cooperate and to build something for us.
> (good luck :)
> Magic IP packets -- Requires nothing from the ISP (YaY!  We have a
> winner!), but, is a little more complex.

	At the cost that you only get the RTT instead of the two one-way delays one would ideally like. But as stated above, if you combine your method with, say, DOCSIS 3.1 PIE, which promises to keep the upstream under tight control, then any RTT changes should (mainly) be caused by downstream over-buffering (effectively allowing you to use your method to control the downstream well).
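
	To make the difference concrete, here is a small sketch of what a timestamping reflector would add over a plain RTT probe. It assumes the reflector stamps each probe on receipt and on transmission and that both clocks are synchronized; the function name and the millisecond values are invented for illustration and do not come from the paper.

```python
def one_way_delays(t_sent, t_reflector_rx, t_reflector_tx, t_received):
    """Split a round trip into upstream and downstream delay.

    All timestamps are in milliseconds; the two clocks are assumed to be
    synchronized, otherwise the clock offset shows up in both results.
    """
    upstream = t_reflector_rx - t_sent
    downstream = t_received - t_reflector_tx
    rtt = t_received - t_sent  # what a self-addressed probe alone can see
    return upstream, downstream, rtt


up, down, rtt = one_way_delays(1000, 1012, 1013, 1093)
print(f"upstream {up} ms, downstream {down} ms, RTT {rtt} ms")
```

	In this made-up example the RTT alone (93 ms) cannot tell which direction is bloated, while the split shows 12 ms up and 80 ms down.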

>>        Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs, all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the CableLabs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the CS field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
> Good point!  However, once again it's not quite that simple.  Queues
> are necessary to absorb short term variations in packet arrival rate
> (or bursts).  The queue required for any flow is given by the
> bandwidth delay product.  

	Not a CS person, but that does not ring fully true; this basically assumes a physical medium that dumps all packets into the buffer at one point in time and sends them out a full delay period later. I think in reality packets will be serialized, and hence some packets will most likely have left the buffer before all have arrived, so the BDP is more an estimate of an upper bound… not that there is anything wrong with designing solutions that aim to handle the worst case well.
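
	For scale, the bandwidth-delay product under discussion is simply rate times round-trip time. The sketch below uses the 8 Mbps rate from the example earlier in the thread; the 100 ms RTT and the 1500-byte packet size are assumptions picked only to get a number.

```python
def bdp_bytes(rate_mbps, rtt_ms):
    """Bandwidth-delay product in bytes for a rate in Mbps and an RTT in ms."""
    return rate_mbps * 1e6 / 8 * rtt_ms / 1e3


bdp = bdp_bytes(8, 100)
print(f"BDP: {bdp:.0f} bytes, about {bdp / 1500:.0f} full-size packets")
```

	Read as an upper bound, that is about 100 kB of buffering for this combination of rate and delay.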

> Since we don't know the delay we can't
> predict the queue size in advance.  What I'm getting at is the
> equipment manufacturers aren't putting in humongous queues because
> they are stupid, they are putting them there because in some cases you
> might really need that large of a queue.

	I thought our current pet hypothesis is that they aim for the BDP at their highest rated speeds or so, and all customers running that (high-speed-capable) equipment at lower rates are out of luck.
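
	A quick sketch of why that hypothesis would hurt at lower rates: a buffer sized for the BDP at a high rated speed takes much longer to drain when the line actually runs slower. The 100 Mbps rated speed and 100 ms design RTT are purely assumed numbers, not anything a vendor has stated.

```python
def buffer_bytes_for(rated_mbps, design_rtt_ms):
    """Buffer sized to one BDP at the device's highest rated speed."""
    return rated_mbps * 1e6 / 8 * design_rtt_ms / 1e3


def drain_time_s(buffer_bytes, actual_mbps):
    """Worst-case queueing delay of a full buffer at the actual line rate."""
    return buffer_bytes * 8 / (actual_mbps * 1e6)


buf = buffer_bytes_for(100, 100)  # sized for 100 Mbps at 100 ms
print(f"buffer: {buf:.0f} bytes")
print(f"full-buffer delay at 8 Mbps: {drain_time_s(buf, 8):.2f} s")
```

	Under these assumptions a full buffer adds 1.25 s of delay at 8 Mbps, versus 0.1 s at the rated 100 Mbps.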

> Statically sizing the queues is not the answer.  Managing the size of
> the queue with an algorithm is the answer.  :)

	No disagreement here, we just discuss the how not the why ;)

Best Regards

>> Best Regards
>>        Sebastian
>>> --
>>> Dave Täht
>>> What will it take to vastly improve wifi for everyone?
>>> https://plus.google.com/u/0/explore/makewififast
