[Cake] active sensing queue management

Cake - FQ_codel the next generation

* [Cake] active sensing queue management
@ 2015-06-12 20:51 Benjamin Cronce
  0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Cronce @ 2015-06-12 20:51 UTC (permalink / raw)
  To: cake


> On Wed, 10 Jun 2015, Daniel Havey wrote:
>
> > We know that (see Kathy and Van's paper) that AQM algorithms only work
> > when they are placed at the slowest queue.  However, the AQM is placed
> > at the queue that is capable of providing 8 Mbps and this is not the
> > slowest queue.  The AQM algorithm will not work in these conditions.
>
> so the answer is that you don't deploy the AQM algorithm only at the
> perimeter, you deploy it much more widely.
>
> Eventually you get to core devices that have multiple routes they can use
> to get to a destination. Those devices should notice that one route is
> getting congested and start sending the packets through alternate paths.
>
> Now, if the problem is that the aggregate of inbound packets to your
> downstreams where you are the only path becomes higher than the available
> downstream bandwidth, you need to be running an AQM to handle things.
>
> David Lang

Dynamically load balancing routes is a hard problem because you can only
load balance on which routes you send, not on which routes you receive. You
also want to make sure the same flows take the same routes in a stateless
way. This works fine if your routes all have the same bandwidth, but breaks
down as soon as they are not the same.
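
To make the stateless part concrete, here is a minimal sketch of
hash-based route selection. The route names, the integer weights, and
the function name pick_route are all made up for illustration; real
routers do this in hardware and their hash inputs differ.

```python
import hashlib

def pick_route(flow, routes):
    """Map a flow 5-tuple to one route, weighted by route capacity."""
    # Hash the 5-tuple so every packet of a flow gets the same answer
    # without keeping any per-flow state.
    key = "|".join(str(field) for field in flow).encode()
    point = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    total = sum(weight for _, weight in routes)
    slot = point % total
    # Walk the routes; a route with weight w owns w of the slots.
    for name, weight in routes:
        if slot < weight:
            return name
        slot -= weight
    raise AssertionError("unreachable: slot is always below total")

# Hypothetical routes: integer weights stand in for 10 Gb/s and 1 Gb/s.
routes = [("route_a", 10), ("route_b", 1)]
flow = ("192.0.2.1", 40000, "198.51.100.7", 443, "tcp")
print(pick_route(flow, routes))
```

Note that the weights only balance the number of flows per route, not
the bytes. One heavy flow hashed onto the small route can still congest
it, which is one way unequal routes break down.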

Generally you don't want AQMs on core routers. The term "core router" gets
used generally as whichever router is at the core of your network, but
there is an actual class of routers called "core routers" that are meant to
handle the core of the internet. Depending on which group you mean, AQMs
may not work. Core routers that are handling tens of gigabits could use
AQMs, but real core routers with 100Gb/s, 400Gb/s, or soon-to-be
1Tb/s ports cannot do AQMs. Many times they cannot do basic QoS
without taking a huge penalty to routing speed, like losing 50%-80% of
their routing speed. These routers are running up against the laws of
physics and any increase in processing comes directly as a cost against
performance.

For me this topic is kind of a sore spot and why I loathe many ISPs. Transit
providers like Level 3 handle this stuff very well. They deal with lots of
routes and lots of peering. They specialize in making sure you get reliable
bandwidth to/from anywhere. When non-transit ISPs, even big ones like
Comcast, try to handle their own peering and routing, they do a horrible
job. Yeah, they can provide you some bandwidth and slap a CDN on their
network and now YouTube is decent, but when it comes to long haul transit
and peering, they fail horribly.

Level 3 has stated that their rule of thumb for maintaining nearly
non-existent congestion is when a port reaches a 95th percentile above 50%
of the link rate, it's time for an upgrade. Even if that link is only at
50% utilization for 1.5 hours a day and near 0% the other 22.5 hours, that
port should be upgraded.
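
A quick check of that arithmetic, as a sketch only. I am assuming
5-minute utilization samples and a nearest-rank percentile; how any
given provider actually samples and computes it is not stated here.

```python
def percentile_95(samples):
    """Nearest-rank 95th percentile of a list of utilization samples."""
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least 95% of the samples
    # at or below it. ceil(0.95 * n) done with integers.
    rank = -(-95 * len(ordered) // 100)
    return ordered[rank - 1]

# One day of 5-minute samples is 288 samples (assumed interval).
# 1.5 busy hours = 18 samples at 50%, the rest near idle.
busy_day = [0.50] * 18 + [0.01] * 270
print(percentile_95(busy_day))

# With only 1 busy hour (12 samples) the busy period falls inside the
# top 5% and the percentile stays low.
quiet_day = [0.50] * 12 + [0.01] * 276
print(percentile_95(quiet_day))
```

The top 5% of a day is 1.2 hours, so 1.5 hours at 50% is just enough to
pull the 95th percentile up to the 50% mark.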



* [Cake] active sensing queue management
@ 2015-06-10 19:53 Dave Taht
  2015-06-10 20:54 ` Sebastian Moeller
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Taht @ 2015-06-10 19:53 UTC (permalink / raw)
  To: Daniel Havey, bloat, cake, cerowrt-devel

http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf

gargoyle's qos system follows a similar approach, using htb + sfq, and
a short ttl udp flow.

Doing this sort of measurement, then floating the rate control with
"cake" would be fairly easy (although it tends to be a bit more
compute intensive, not being on a fast path)

What is sort of missing here is trying to figure out which side of the
bottleneck is the bottleneck (up or down).
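
A rough sketch of what "floating" the rate from a delay measurement
could look like. This is not what the paper or gargoyle does; the
function name, the thresholds, and the step sizes are all placeholders,
and it only computes a number that something else would have to feed
to the shaper.

```python
def next_rate(rate_kbit, rtt_ms, base_rtt_ms, *, target_ms=15,
              floor_kbit=1000, ceil_kbit=8000):
    """Return the shaper rate (kbit/s) to use for the next interval."""
    # Anything above the unloaded RTT is treated as queueing delay.
    queue_delay_ms = rtt_ms - base_rtt_ms
    if queue_delay_ms > target_ms:
        # The probe saw a standing queue: back off by 10%.
        rate_kbit = rate_kbit - rate_kbit // 10
    else:
        # The path looks clear: creep back up in small steps.
        rate_kbit = rate_kbit + 100
    # Stay inside the configured bounds.
    return max(floor_kbit, min(ceil_kbit, rate_kbit))

print(next_rate(8000, rtt_ms=60, base_rtt_ms=20))   # backs off
print(next_rate(7200, rtt_ms=22, base_rtt_ms=20))   # recovers
```

Since it only sees round-trip time, it cannot tell which direction is
bloated, which is exactly the missing piece mentioned above.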

-- 
Dave Täht
What will it take to vastly improve wifi for everyone?
https://plus.google.com/u/0/explore/makewififast


* Re: [Cake] active sensing queue management
  2015-06-10 19:53 Dave Taht
@ 2015-06-10 20:54 ` Sebastian Moeller
  2015-06-11  0:10   ` Daniel Havey
  2015-06-11  1:05   ` Alan Jenkins
  0 siblings, 2 replies; 13+ messages in thread
From: Sebastian Moeller @ 2015-06-10 20:54 UTC (permalink / raw)
  To: Dave Täht; +Cc: cake, Daniel Havey, cerowrt-devel, bloat

Hi Dave,


On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:

> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
> 
> gargoyle's qos system follows a similar approach, using htb + sfq, and
> a short ttl udp flow.
> 
> Doing this sort of measured, then floating the rate control with
> "cake" would be fairly easy (although it tends to be a bit more
> compute intensive not being on a fast path)
> 
> What is sort of missing here is trying to figure out which side of the
> bottleneck is the bottleneck (up or down).

	Yeah, they rely on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic, so figuring out where the bottleneck was would have been easy; at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.” At least this is what I get from their short description after skimming the paper.
	Nice paper, but really not a full solution either, unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp, which would allow dissecting the RTT into its constituent one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.
	Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs, all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the CableLabs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not from the CS field, and b) only skimmed the paper, most likely I am missing something important that is clearly in the paper...
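
To illustrate the reflector idea, here is a small sketch of how four
timestamps would split an RTT into two one-way delays. The function
and parameter names are mine, and it assumes both clocks are
synchronized; without that only changes against a baseline mean
anything.

```python
def one_way_delays(t_send, t_reflector_rx, t_reflector_tx, t_return):
    """Split one probe exchange into upstream and downstream delay.

    All four values are timestamps in milliseconds. t_send and t_return
    come from the local clock, the other two from the reflector's clock.
    """
    upstream = t_reflector_rx - t_send
    downstream = t_return - t_reflector_tx
    # RTT without the time the probe sat inside the reflector.
    rtt = (t_return - t_send) - (t_reflector_tx - t_reflector_rx)
    return upstream, downstream, rtt

# Made-up numbers: 12 ms up, 1 ms inside the reflector, 82 ms down.
up, down, rtt = one_way_delays(0, 12, 13, 95)
print(up, down, rtt)
```

In this made-up exchange the downstream leg carries most of the delay,
so that is the direction one would want to throttle.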

Best Regards
	Sebastian

> 
> -- 
> Dave Täht
> What will it take to vastly improve wifi for everyone?
> https://plus.google.com/u/0/explore/makewififast
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake



* Re: [Cake] active sensing queue management
  2015-06-10 20:54 ` Sebastian Moeller
@ 2015-06-11  0:10   ` Daniel Havey
  2015-06-11  7:27     ` Sebastian Moeller
  2015-06-12  1:49     ` David Lang
  2015-06-11  1:05   ` Alan Jenkins
  1 sibling, 2 replies; 13+ messages in thread
From: Daniel Havey @ 2015-06-11  0:10 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cake, cerowrt-devel, bloat

Hmmm, maybe I can help clarify.  Bufferbloat occurs in the slowest
queue on the path.  This is because the other queues are faster and
will drain.  AQM algorithms work only if they are placed where the
packets pile up (i.e. the slowest queue in the path).  This is
documented in Kathy and Van's CoDel paper.

This is usually all well and good because we know where the bottleneck
(the slowest queue in the path) is.  It is the IP layer in the modem
where the ISP implements their rate limiter.  That is why algorithms
such as PIE and CoDel are implemented in the IP layer on the modem.

Suppose the full committed rate of the token bucket rate limiter is 8
Mbps.  This means that the queue at the IP layer in the modem is
capable of emitting packets at 8 Mbps sustained rate.  The problem
occurs during peak hours when the ISP is not providing the full
committed rate of 8 Mbps or when some queue in the system (probably in
the access link) is providing something less than 8 Mbps (say for sake
of discussion that the number is 7.5 Mbps).

We know (see Kathy and Van's paper) that AQM algorithms only work
when they are placed at the slowest queue.  However, the AQM is placed
at the queue that is capable of providing 8 Mbps and this is not the
slowest queue.  The AQM algorithm will not work in these conditions.

This is what is shown in the paper where the CoDel and PIE performance
goes to hell in a handbasket.  The ASQM algorithm is designed to
address this problem.
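
For a feel of the numbers, here is a back-of-the-envelope sketch of
the displaced queue. It assumes the sender keeps the 8 Mbps shaper
saturated and that the slower queue never drops; the function name is
just for illustration.

```python
def displaced_backlog(shaper_bps, link_bps, seconds):
    """Backlog (bytes) and queueing delay (s) behind the slower link."""
    # Whatever the shaper emits beyond the real link rate piles up in
    # the queue in front of the link, where the AQM cannot see it.
    excess_bps = max(0, shaper_bps - link_bps)
    backlog_bytes = excess_bps * seconds / 8
    # Time a newly arriving packet waits behind that backlog.
    delay_s = backlog_bytes * 8 / link_bps
    return backlog_bytes, delay_s

backlog, delay = displaced_backlog(8_000_000, 7_500_000, 10)
print(f"{backlog:.0f} bytes queued, {delay:.2f} s of added delay")
```

Under those assumptions, ten seconds of a saturating flow leaves
625000 bytes sitting below the AQM and roughly two thirds of a second
of delay, while the AQM queue above it looks empty.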

On Wed, Jun 10, 2015 at 1:54 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Dave,
>
>
> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:
>
>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>
>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>> a short ttl udp flow.
>>
>> Doing this sort of measured, then floating the rate control with
>> "cake" would be fairly easy (although it tends to be a bit more
>> compute intensive not being on a fast path)
>>
>> What is sort of missing here is trying to figure out which side of the
>> bottleneck is the bottleneck (up or down).
>

Yeah, we never did figure out how to separate the up from the
downlink.  However, we just consider the access link as a whole (up +
down) and mark/drop according to ratios of queuing time.  Overall it
seems to work well, but, we never did a mathematical analysis.  Kind
of like saying it's not a "bug", it is a feature.  And in this case it
is true since both sides can experience bloat.

>         Yeah, they relay on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic so figuring where the bottleneck was easy at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.". At least this is what I get from their short description in glossing over the paper.
>         Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp which would allow to dissect the RTT into its constituting one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.

Not exactly.  We thought this through for some time and considered
many angles.  Each method has its advantages and disadvantages.

We decided not to use ICMP at all because of the reasons you stated
above.  We also decided not to use a "reflector" although as you said
it would allow us to separate upload queue time from download.  We
decided not to use this because it would be difficult to get ISPs to
do this.

Our final choice for the paper was "magic" IP packets.  This consists
of an IP packet header and the timestamp.  The IP packet is "self
addressed" and we trick the iptables to emit the packet on the correct
interface.  This packet will be returned to us as soon as it reaches
another IP layer (typically at the CMTS).
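
Roughly, such a probe could be laid out like this. It is a sketch
only and not the code from the paper: the 8-byte microsecond timestamp,
the protocol number 253 (one of the values reserved for
experimentation), and the helper names are assumptions. It builds and
parses the bytes; actually emitting the packet needs a raw socket plus
the routing trick, which is not shown.

```python
import socket
import struct

def ip_checksum(header: bytes) -> int:
    """Standard 16-bit one's-complement checksum over an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_probe(own_addr: str, now_us: int) -> bytes:
    """IPv4 header with src == dst == own_addr, followed by a timestamp."""
    payload = struct.pack("!Q", now_us)
    addr = socket.inet_aton(own_addr)

    def header(checksum: int) -> bytes:
        # version/IHL, TOS, total length, ID, flags/fragment,
        # TTL, protocol (253 = experimental), checksum, src, dst
        return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload),
                           0, 0, 64, 253, checksum, addr, addr)

    return header(ip_checksum(header(0))) + payload

def rtt_us(probe: bytes, now_us: int) -> int:
    """Round-trip time once the self-addressed probe has come back."""
    (sent_us,) = struct.unpack("!Q", probe[20:28])
    return now_us - sent_us

probe = build_probe("192.0.2.10", now_us=1_000_000)
print(len(probe), rtt_us(probe, now_us=1_001_500))
```

Because source and destination are the same address, the first IP
layer that looks at the packet routes it straight back.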

Here's a quick summary:
ICMP -- Simple, but, needs the ISP's cooperation (good luck :)
Reflector  -- Separates upload queue time from download queue time,
but, requires the ISP to cooperate and to build something for us.
(good luck :)
Magic IP packets -- Requires nothing from the ISP (YaY!  We have a
winner!), but, is a little more complex.

>         Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem pop- ularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CTMS have decent AQMs all they need to do is not stuff their sub-IP layer queuesand be done with it. The way I understood the cable labs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the cs field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...

Good point!  However, once again it's not quite that simple.  Queues
are necessary to absorb short term variations in packet arrival rate
(or bursts).  The queue required for any flow is given by the
bandwidth delay product.  Since we don't know the delay we can't
predict the queue size in advance.  What I'm getting at is the
equipment manufacturers aren't putting in humongous queues because
they are stupid, they are putting them there because in some cases you
might really need that large of a queue.

Statically sizing the queues is not the answer.  Managing the size of
the queue with an algorithm is the answer.  :)
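
As a worked example of why the delay matters, here is the
bandwidth-delay product for one rate and a few RTTs. The RTT values
are arbitrary picks and the function name is mine.

```python
def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product in bytes for a rate and a round-trip time."""
    # bits/s * ms / 1000 gives bits; divide by 8 for bytes.
    return rate_bps * rtt_ms // 8000

# Same 8 Mbps link, three different path delays.
for rtt_ms in (10, 100, 600):
    print(f"{rtt_ms:>4} ms RTT -> {bdp_bytes(8_000_000, rtt_ms)} bytes")
```

The same link needs 10000 bytes of queue for a 10 ms path and 600000
bytes for a 600 ms path, so no single static size fits both.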

>
> Best Regards
>         Sebastian
>
>>
>> --
>> Dave Täht
>> What will it take to vastly improve wifi for everyone?
>> https://plus.google.com/u/0/explore/makewififast
>


* Re: [Cake] active sensing queue management
  2015-06-11  0:10   ` Daniel Havey
@ 2015-06-11  7:27     ` Sebastian Moeller
  2015-06-12 15:02       ` Daniel Havey
  2015-06-12  1:49     ` David Lang
  1 sibling, 1 reply; 13+ messages in thread
From: Sebastian Moeller @ 2015-06-11  7:27 UTC (permalink / raw)
  To: Daniel Havey; +Cc: cake, cerowrt-devel, bloat

Hi Daniel,

thanks for the clarifications.

On Jun 11, 2015, at 02:10 , Daniel Havey <dhavey@gmail.com> wrote:

> Hmmm, maybe I can help clarify.  Bufferbloat occurs in the slowest
> queue on the path.  This is because the other queues are faster and
> will drain.  AQM algorithms work only if they are placed where the
> packets pile up (e.g. the slowest queue in the path).  This is
> documented in Kathy and Van's CoDel paper.

	I am with you so far.

> 
> This is usually all well and good because we know where the bottleneck
> (the slowest queue in the path) is.  It is the IP layer in the modem
> where the ISP implements their rate limiter.  That is why algorithms
> such as PIE and CoDel are implemented in the IP layer on the modem.

	Okay.

> 
> Suppose the full committed rate of the token bucket rate limiter is 8
> Mbps.  This means that the queue at the IP layer in the modem is
> capable of emitting packets at 8 Mbps sustained rate.  The problem
> occurs during peak hours when the ISP is not providing the full
> committed rate of 8 Mbps or that some queue in the system (probably in
> the access link) is providing something less than 8 Mbps (say for sake
> of discussion that the number is 7.5 Mbps).
> 
> We know that (see Kathy and Van's paper) that AQM algorithms only work
> when they are placed at the slowest queue.  However, the AQM is placed
> at the queue that is capable of providing 8 Mbps and this is not the
> slowest queue.  The AQM algorithm will not work in these conditions.
> 
> This is what is shown in the paper where the CoDel and PIE performance
> goes to hell in a handbasket.  The ASQM algorithm is designed to
> address this problem.

	Except that DOCSIS 3.1 PIE in the modem does not work that way. As I understand http://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf section 3.2 MAP PROCESSING REQUIREMENTS, cable-modem PIE will not stuff data into the lower layers until it has received a grant to actually send that data, hence no uncontrolled sub-IP layer buffer bloat is possible (unless there are severe RF issues that take out data during transmission). So at least for the upstream direction DOCSIS 3.1 PIE will not suffer from buffer displacement, if I understand the CableLabs white paper correctly. Your solution still might be a valuable add-on to control the downstream buffer bloat in addition.
	I also believe that Free in France had a modified DSL driver for their box that made sure sub-IP buffering was bounded to a low number of packets, so no displaced buffers there either. Now it seems that this solution for DSL was unique so far and has not caught on, but once DOCSIS 3.1 modems hit the market, upstream PIE in the modems will be reality.

> 
> 
> 
> 
> 
> On Wed, Jun 10, 2015 at 1:54 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> Hi Dave,
>> 
>> 
>> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>> 
>>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>>> a short ttl udp flow.
>>> 
>>> Doing this sort of measured, then floating the rate control with
>>> "cake" would be fairly easy (although it tends to be a bit more
>>> compute intensive not being on a fast path)
>>> 
>>> What is sort of missing here is trying to figure out which side of the
>>> bottleneck is the bottleneck (up or down).
>> 
> 
> Yeah, we never did figure out how to separate the up from the
> downlink.  However, we just consider the access link as a whole (up +
> down) and mark/drop according to ratios of queuing time.  

	This is a bit sad; why reduce say the effective uplink bandwidth if only the downstream is contended? Not that I have a good alternative solution that will not require help from outside boxes.

> Overall it
> seems to work well, but, we never did a mathematical analysis.  Kind
> of like saying it's not a "bug", it is a feature.  And it this case it
> is true since both sides can experience bloat.

	Yes, but you only want to throttle traffic on the congested leg of the link, otherwise bandwidth efficiency goes to hell if you look at bidirectional link-saturating traffic.

> 
> 
>>        Yeah, they relay on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic so figuring where the bottleneck was easy at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.". At least this is what I get from their short description in glossing over the paper.
>>        Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp which would allow to dissect the RTT into its constituting one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.
> 
> Not exactly.  We thought this through for some time and considered
> many angles.  Each method has its advantages and disadvantages.
> 
> We decided not to use ICMP at all because of the reasons you stated
> above.  We also decided not to use a "reflector" although as you said
> it would allow us to separate upload queue time from download.  We
> decided not to use this because it would be difficult to get ISPs to
> do this.
> 
> Are final choice for the paper was "magic" IP packets.  This consists
> of an IP packet header and the timestamp.  The IP packet is "self
> addressed" and we trick the iptables to emit the packet on the correct
> interface.  This packet will be returned to us as soon as it reaches
> another IP layer (typically at the CMTS).

	Ah, thanks; I did not get this from reading over your paper (but that is probably caused by me being a layman and having read it very quickly). Question: how large is that packet on the wire? IP header plus 8 bytes makes me assume 20+8 = 28, but that is missing the Ethernet header, so rather 14+20+8 = 42, but isn’t the shortest Ethernet frame 64 bytes?
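
Spelling that arithmetic out as a sketch, assuming plain Ethernet
without VLAN tags and an 8-byte timestamp payload (my guess at the
probe size). The framing on the actual DOCSIS or DSL link will differ.

```python
ETH_HEADER = 14      # destination MAC, source MAC, EtherType
ETH_FCS = 4          # frame check sequence
ETH_MIN_FRAME = 64   # header + payload + FCS, preamble not counted

def ethernet_frame_bytes(ip_packet_len):
    """Frame size for an IP packet, including minimum-size padding."""
    unpadded = ETH_HEADER + ip_packet_len + ETH_FCS
    # Short payloads are padded up to the 64-byte minimum frame.
    return max(ETH_MIN_FRAME, unpadded)

probe_ip_len = 20 + 8   # IPv4 header plus assumed 8-byte timestamp
print(ethernet_frame_bytes(probe_ip_len))
print(ethernet_frame_bytes(1500))
```

So the 28-byte probe would be padded to a 64-byte frame on Ethernet,
and a full 1500-byte packet becomes 1518 bytes.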

> 
> Here's a quick summary:
> ICMP -- Simple, but, needs the ISP's cooperation (good luck :)
> Reflector  -- Separates upload queue time from download queue time,
> but, requires the ISP to cooperate and to build something for us.
> (good luck :)
> Magic IP packets -- Requires nothing from the ISP (YaY!  We have a
> winner!), but, is a little more complex.

	At the cost that you only get RTT instead of two one-way delays as one ideally would like. But as stated above, if you combine your method with, say, DOCSIS 3.1 PIE, which promises to keep the upstream under tight control, then any RTT changes should (mainly) be caused by downstream over-buffering (effectively allowing you to use your method to control the downstream well).

> 
> 
>>        Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem pop- ularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CTMS have decent AQMs all they need to do is not stuff their sub-IP layer queuesand be done with it. The way I understood the cable labs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the cs field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
> 
> Good point!  However, once again it's not quite that simple.  Queues
> are necessary to absorb short term variations in packet arrival rate
> (or bursts).  The queue required for any flow is given by the
> bandwidth delay product.  

	Not a CS person, but that does not ring fully true; this basically assumes a physical medium that will dump all packets into the buffer at one point in time and send them out a full delay period later. I think in reality packets will be serialized, and hence some packets will most likely have left the buffer already before all have arrived, so the BDP is more an estimate of an upper bound… not that there is anything wrong with designing solutions that aim to handle the worst case well.

> Since we don't know the delay we can't
> predict the queue size in advance.  What I'm getting at is the
> equipment manufacturers aren't putting in humongous queues because
> they are stupid, they are putting them there because in some cases you
> might really need that large of a queue.

	I thought our current pet hypothesis is that they aim for BDP at their highest rated speeds or so, and all customers running that (high-speed capable) equipment at lower rates are out of luck.
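
A small sketch of that hypothesis with made-up numbers: a buffer sized
for the BDP at a 100 Mbit/s top rate and 100 ms RTT, then run at
lower provisioned rates. Neither the rates nor the function name come
from any vendor documentation.

```python
def worst_case_delay_ms(buffer_bytes, rate_bps):
    """Time to drain a completely full buffer at the given link rate."""
    return buffer_bytes * 8 * 1000 / rate_bps

# Buffer sized as the BDP at 100 Mbit/s and 100 ms (assumed design point).
buffer_bytes = 100_000_000 * 100 // 8000

for rate_bps in (100_000_000, 20_000_000, 8_000_000):
    delay = worst_case_delay_ms(buffer_bytes, rate_bps)
    print(f"{rate_bps // 1_000_000:>3} Mbit/s -> {delay:.0f} ms when full")
```

The same 1250000-byte buffer that adds 100 ms at the rated speed adds
1250 ms for a customer provisioned at 8 Mbit/s.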

> 
> Statically sizing the queues is not the answer.  Managing the size of
> the queue with an algorithm is the answer.  :)

	No disagreement here, we just discuss the how not the why ;)

Best Regards
	Sebastian

> 
> 
> 
>> 
>> Best Regards
>>        Sebastian
>> 
>>> 
>>> --
>>> Dave Täht
>>> What will it take to vastly improve wifi for everyone?
>>> https://plus.google.com/u/0/explore/makewififast
>> 



* Re: [Cake] active sensing queue management
  2015-06-11  7:27     ` Sebastian Moeller
@ 2015-06-12 15:02       ` Daniel Havey
  2015-06-12 16:02         ` Sebastian Moeller
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Havey @ 2015-06-12 15:02 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cake, cerowrt-devel, bloat

On Thu, Jun 11, 2015 at 12:27 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Daniel,
>
> thanks for the clarifications.
>
> On Jun 11, 2015, at 02:10 , Daniel Havey <dhavey@gmail.com> wrote:
>
>> Hmmm, maybe I can help clarify.  Bufferbloat occurs in the slowest
>> queue on the path.  This is because the other queues are faster and
>> will drain.  AQM algorithms work only if they are placed where the
>> packets pile up (e.g. the slowest queue in the path).  This is
>> documented in Kathy and Van's CoDel paper.
>
>         I am with you so far.
>
>>
>> This is usually all well and good because we know where the bottleneck
>> (the slowest queue in the path) is.  It is the IP layer in the modem
>> where the ISP implements their rate limiter.  That is why algorithms
>> such as PIE and CoDel are implemented in the IP layer on the modem.
>
>         Okay.
>
>>
>> Suppose the full committed rate of the token bucket rate limiter is 8
>> Mbps.  This means that the queue at the IP layer in the modem is
>> capable of emitting packets at 8 Mbps sustained rate.  The problem
>> occurs during peak hours when the ISP is not providing the full
>> committed rate of 8 Mbps or that some queue in the system (probably in
>> the access link) is providing something less than 8 Mbps (say for sake
>> of discussion that the number is 7.5 Mbps).
>>
>> We know that (see Kathy and Van's paper) that AQM algorithms only work
>> when they are placed at the slowest queue.  However, the AQM is placed
>> at the queue that is capable of providing 8 Mbps and this is not the
>> slowest queue.  The AQM algorithm will not work in these conditions.
>>
>> This is what is shown in the paper where the CoDel and PIE performance
>> goes to hell in a handbasket.  The ASQM algorithm is designed to
>> address this problem.
>
>         Except that DOCSIS 3.1 pie in the modem does not work that way. As I understand http://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf section 3.2 MAP PROCESSING REQUIREMENTS, cable-modem pie will not stuff data into the lower layers until it has received a grant to actually send that data, hence no uncontrolled sub-IP layer buffer bloat possible (unless there are severe RF issues that take out data during transmission). So at least for the upstream direction docsis 3.1 pie will not suffer from buffer displacement, if I understand the cable labs white paper correctly.

Hmmm, interesting.  Are you sure?  I'm a CS not an EE so the PHY layer
is like black magic to me.  However, I still think (although I am
willing to be convinced otherwise by someone with superior knowledge
:)) that the IP layer puts packets into a MAC layer queue.  Then the
MAC layer makes a queue depth based request for bandwidth in order to
serialize and send the data.

If somebody really knows how this works, please help!  :^)  Is the
upload of a docsis 3.1 modem really unbloatable below the IP layer?  I
just want to know for my own edification :)

> Your solution still might be a valuable add-on to control the downstream buffer bloat in addition.
I agree!  If that reading of the cablelabs paper is correct then this
nicely solves the upload vs. download problem and we don't really need
BQL either.  If it is not true, then we use BQL on the egress to solve
the upload bloat problem and ASQM to solve the download bloat problem.
Perfect solution!  I love it when a plan comes together!  :^)

>         I also believe that free in france had a modified DSL driver for their box that made sure sub-IP buffering was bound to a low number of packets as well, so no displaced buffers there as well. Now it seems that this solution for DSL was unique so far and has not caught on, but once docsis3.1 modems hit the market upstream PIE in the modems will be reality.

freefrance?  Dave isn't that your provider?  I thought they were
running fq_CoDel?
In any case, just because PIE in the modems is a reality don't be
tempted to declare the problem solved and go home.  Never
underestimate the ability of the ISPs to do the wrong thing for very
good reasons :^)  What happens if they don't turn it on?  This is
really what I was trying to solve with ASQM.  What if your provider
won't run CoDel or PIE for whatever incomprehensible reason?  Then you
run ASQM and be done with it.

>
>>
>>
>>
>>
>>
>> On Wed, Jun 10, 2015 at 1:54 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> Hi Dave,
>>>
>>>
>>> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:
>>>
>>>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>>>
>>>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>>>> a short ttl udp flow.
>>>>
>>>> Doing this sort of measured, then floating the rate control with
>>>> "cake" would be fairly easy (although it tends to be a bit more
>>>> compute intensive not being on a fast path)
>>>>
>>>> What is sort of missing here is trying to figure out which side of the
>>>> bottleneck is the bottleneck (up or down).
>>>
>>
>> Yeah, we never did figure out how to separate the up from the
>> downlink.  However, we just consider the access link as a whole (up +
>> down) and mark/drop according to ratios of queuing time.
>
>         This is a bit sad; why reduce say the effective uplink bandwidth if only the downstream is contended? Not that I have a good alternative solution that will not require help from outside boxes.
>
>> Overall it
>> seems to work well, but, we never did a mathematical analysis.  Kind
>> of like saying it's not a "bug", it is a feature.  And it this case it
>> is true since both sides can experience bloat.
>
>         Yes, but you only want to throttle traffic on the congested leg of the link, otherwise bandwidth efficiency goes to hell if you look at bi-direction link-saturating traffic.
>
>>
>>
>>>        Yeah, they relay on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic so figuring where the bottleneck was easy at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.". At least this is what I get from their short description in glossing over the paper.
>>>        Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp which would allow to dissect the RTT into its constituting one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.
>>
>> Not exactly.  We thought this through for some time and considered
>> many angles.  Each method has its advantages and disadvantages.
>>
>> We decided not to use ICMP at all because of the reasons you stated
>> above.  We also decided not to use a "reflector" although as you said
>> it would allow us to separate upload queue time from download.  We
>> decided not to use this because it would be difficult to get ISPs to
>> do this.
>>
>> Our final choice for the paper was "magic" IP packets.  This consists
>> of an IP packet header and the timestamp.  The IP packet is "self
>> addressed" and we trick the iptables to emit the packet on the correct
>> interface.  This packet will be returned to us as soon as it reaches
>> another IP layer (typically at the CMTS).
>
>         Ah, thanks; I did not get this from reading over your paper (but that is probably caused by me being a layman and having read it very quickly). Question: how large is that packet on-the-wire? IP header plus 8 bytes makes me assume 20+8 = 28, but that is missing the ethernet header, so rather 14+20+8 = 42, but isn’t the shortest ethernet frame 64 bytes?
>
>>
>> Here's a quick summary:
>> ICMP -- Simple, but, needs the ISP's cooperation (good luck :)
>> Reflector  -- Separates upload queue time from download queue time,
>> but, requires the ISP to cooperate and to build something for us.
>> (good luck :)
>> Magic IP packets -- Requires nothing from the ISP (YaY!  We have a
>> winner!), but, is a little more complex.
>
>         At the cost that you only get RTT instead of two one-way delays as one ideally would like. But as stated above, if you combine your method with say docsis3.1 pie, which promises to keep the upstream under tight control, then any RTT changes should (mainly) be caused by downstream over-buffering (effectively allowing you to use your method to control the downstream well).
>
>>
>>
>>>        Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the cable labs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the cs field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
>>
>> Good point!  However, once again it's not quite that simple.  Queues
>> are necessary to absorb short term variations in packet arrival rate
>> (or bursts).  The queue required for any flow is given by the
>> bandwidth delay product.
>
>         Not a CS person, but that does not ring fully true; this basically assumes a physical medium that will dump all packets into the buffer at one time point and send them out a full delay period later; I think in reality packets will be serialized and hence some packets will most likely have left the buffer already before all have arrived, so the BDP is more an estimate of an upper bound… not that there is anything wrong with designing solutions that aim to handle the worst case well.
>
>> Since we don't know the delay we can't
>> predict the queue size in advance.  What I'm getting at is the
>> equipment manufacturers aren't putting in humongous queues because
>> they are stupid, they are putting them there because in some cases you
>> might really need that large of a queue.
>
>         I thought our current pet hypothesis is that they aim for BDP at their highest rated speeds or so, and all customers running that (high speed capable) equipment at lower rates are out of luck.
>
>>
>> Statically sizing the queues is not the answer.  Managing the size of
>> the queue with an algorithm is the answer.  :)
>
>         No disagreement here, we just discuss the how not the why ;)
>
> Best Regards
>         Sebastian
>
>>
>>
>>
>>>
>>> Best Regards
>>>        Sebastian
>>>
>>>>
>>>> --
>>>> Dave Täht
>>>> What will it take to vastly improve wifi for everyone?
>>>> https://plus.google.com/u/0/explore/makewififast
>>>> _______________________________________________
>>>> Cake mailing list
>>>> Cake@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/cake
>>>
>


* Re: [Cake] active sensing queue management
  2015-06-12 15:02       ` Daniel Havey
@ 2015-06-12 16:02         ` Sebastian Moeller
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2015-06-12 16:02 UTC (permalink / raw)
  To: Daniel Havey; +Cc: cake, Greg White, cerowrt-devel, bloat

Hi Daniel,

On Jun 12, 2015, at 17:02 , Daniel Havey <dhavey@gmail.com> wrote:
> On Thu, Jun 11, 2015 at 12:27 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> [...]
>>        Except that DOCSIS 3.1 pie in the modem does not work that way. As I understand http://www.cablelabs.com/wp-content/uploads/2014/06/DOCSIS-AQM_May2014.pdf section 3.2 MAP PROCESSING REQUIREMENTS, cable-modem pie will not stuff data into the lower layers until it has received a grant to actually send that data, hence no uncontrolled sub-IP layer buffer bloat possible (unless there are severe RF issues that take out data during transmission). So at least for the upstream direction docsis 3.1 pie will not suffer from buffer displacement, if I understand the cable labs white paper correctly.
> 
> Hmmm, interesting.  Are you sure?  I'm a CS not a EE so the PHY layer
> is like black magic to me.  However, I still think (although I am
> willing to be convinced otherwise by someone with superior knowledge
> :)) that the IP layer puts packets into a MAC layer queue.  Then the
> MAC layer makes a queue depth based request for bandwidth in order to
> serialize and send the data.

	I am not sure, but maybe Greg White (CCd) can help us out here? @Greg, is it right that the docsis3.1 pie implementation will keep a close lid on how many packets/bytes are queued in lower layers of the stack? 


> 
> If somebody really knows how this works, please help!  :^)  Is the
> upload of a docsis 3.1 modem really unbloatable below the IP layer?  I
> just want to know for my own edification :)
> 
>> Your solution still might be a valuable add-on to control the downstream buffer bloat in addition.
> I agree!  If that reading of the cablelabs paper is correct then this
> nicely solves the upload vs. download problem and we don't really need
> BQL either.  If it is not true, then we use BQL on the egress to solve
> the upload bloat problem and ASQM to solve the download bloat problem.
> Perfect solution!  I love it when a plan comes together!  :^)

	AFAIK, BQL so far is only implemented in ethernet drivers, so if your uplink is equal to or slightly higher than 10, 100, 1000, or 10000 Mbps, BQL with fq_codel will not need a shaper on egress and should still hold the buffers at bay. Unfortunately, the actual egress rates are often quite far off these ethernet speed tiers. I believe Dave Taeht is trying to convince ethernet drivers to set their egress at non-traditional rates, I could be wrong though...

> 
>>        I also believe that free in france had a modified DSL driver for their box that made sure sub-IP buffering was bound to a low number of packets as well, so no displaced buffers there as well. Now it seems that this solution for DSL was unique so far and has not caught on, but once docsis3.1 modems hit the market upstream PIE in the modems will be reality.
> 
> freefrance?  Dave isn't that your provider?  I thought they were
> running fq_CoDel?
> In any case, just because PIE in the modems is a reality don't be
> tempted to declare the problem solved and go home.  Never
> underestimate the ability of the ISPs to do the wrong thing for very
> good reasons :^)  What happens if they don't turn it on?  This is
> really what I was trying to solve with ASQM.  What if your provider
> won't run CoDel or PIE for whatever incomprehensible reason?  Then you
> run ASQM and be done with it.

	I like your active sensing approach, I just believe that the scenario you set out in the paper is not fully true, so I tried to voice my concerns. Personally I am on a vdsl-link so docsis pie or no docsis pie, my link is still bloated and I am looking for new solutions. I like your magic packet idea, even though my taste in these matters is debatable ;) but I fear that to work this needs to run on the modem, and there are only few fully opensource modems around (if any) on which to implement your active probe. Plus on a DSL link the congestion typically comes between DSLAM and BRAS (as the DSL link is not shared, unlike the cable situation) and I fear the DSLAM might already return the probe packet…


Best Regards
	Sebastian


> 
>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jun 10, 2015 at 1:54 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> Hi Dave,
>>>> 
>>>> 
>>>> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:
>>>> 
>>>>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>>>> 
>>>>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>>>>> a short ttl udp flow.
>>>>> 
>>>>> Doing this sort of measured, then floating the rate control with
>>>>> "cake" would be fairly easy (although it tends to be a bit more
>>>>> compute intensive not being on a fast path)
>>>>> 
>>>>> What is sort of missing here is trying to figure out which side of the
>>>>> bottleneck is the bottleneck (up or down).
>>>> 
>>> 
>>> Yeah, we never did figure out how to separate the up from the
>>> downlink.  However, we just consider the access link as a whole (up +
>>> down) and mark/drop according to ratios of queuing time.
>> 
>>        This is a bit sad; why reduce say the effective uplink bandwidth if only the downstream is contended? Not that I have a good alternative solution that will not require help from outside boxes.
>> 
>>> Overall it
>>> seems to work well, but, we never did a mathematical analysis.  Kind
>>> of like saying it's not a "bug", it is a feature.  And in this case it
>>> is true since both sides can experience bloat.
>> 
>>        Yes, but you only want to throttle traffic on the congested leg of the link, otherwise bandwidth efficiency goes to hell if you look at bi-direction link-saturating traffic.
>> 
>>> 
>>> 
>>>>       Yeah, they rely on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned. In the paper they used either uplink or downlink traffic, so figuring out where the bottleneck was is easy; at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.” At least this is what I get from their short description in glossing over the paper.
>>>>       Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers. But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp, which would allow dissecting the RTT into its constituent one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.
>>> 
>>> Not exactly.  We thought this through for some time and considered
>>> many angles.  Each method has its advantages and disadvantages.
>>> 
>>> We decided not to use ICMP at all because of the reasons you stated
>>> above.  We also decided not to use a "reflector" although as you said
>>> it would allow us to separate upload queue time from download.  We
>>> decided not to use this because it would be difficult to get ISPs to
>>> do this.
>>> 
>>> Our final choice for the paper was "magic" IP packets.  This consists
>>> of an IP packet header and the timestamp.  The IP packet is "self
>>> addressed" and we trick the iptables to emit the packet on the correct
>>> interface.  This packet will be returned to us as soon as it reaches
>>> another IP layer (typically at the CMTS).
>> 
>>        Ah, thanks; I did not get this from reading over your paper (but that is probably caused by me being a layman and having read it very quickly). Question: how large is that packet on-the-wire? IP header plus 8 bytes makes me assume 20+8 = 28, but that is missing the ethernet header, so rather 14+20+8 = 42, but isn’t the shortest ethernet frame 64 bytes?
>> 
>>> 
>>> Here's a quick summary:
>>> ICMP -- Simple, but, needs the ISP's cooperation (good luck :)
>>> Reflector  -- Separates upload queue time from download queue time,
>>> but, requires the ISP to cooperate and to build something for us.
>>> (good luck :)
>>> Magic IP packets -- Requires nothing from the ISP (YaY!  We have a
>>> winner!), but, is a little more complex.
>> 
>>        At the cost that you only get RTT instead of two one-way delays as one ideally would like. But as stated above, if you combine your method with say docsis3.1 pie, which promises to keep the upstream under tight control, then any RTT changes should (mainly) be caused by downstream over-buffering (effectively allowing you to use your method to control the downstream well).
>> 
>>> 
>>> 
>>>>       Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the cable labs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the cs field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
>>> 
>>> Good point!  However, once again it's not quite that simple.  Queues
>>> are necessary to absorb short term variations in packet arrival rate
>>> (or bursts).  The queue required for any flow is given by the
>>> bandwidth delay product.
>> 
>>        Not a CS person, but that does not ring fully true; this basically assumes a physical medium that will dump all packets into the buffer at one time point and send them out a full delay period later; I think in reality packets will be serialized and hence some packets will most likely have left the buffer already before all have arrived, so the BDP is more an estimate of an upper bound… not that there is anything wrong with designing solutions that aim to handle the worst case well.
>> 
>>> Since we don't know the delay we can't
>>> predict the queue size in advance.  What I'm getting at is the
>>> equipment manufacturers aren't putting in humongous queues because
>>> they are stupid, they are putting them there because in some cases you
>>> might really need that large of a queue.
>> 
>>        I thought our current pet hypothesis is that they aim for BDP at their highest rated speeds or so, and all customers running that (high speed capable) equipment at lower rates are out of luck.
>> 
>>> 
>>> Statically sizing the queues is not the answer.  Managing the size of
>>> the queue with an algorithm is the answer.  :)
>> 
>>        No disagreement here, we just discuss the how not the why ;)
>> 
>> Best Regards
>>        Sebastian
>> 
>>> 
>>> 
>>> 
>>>> 
>>>> Best Regards
>>>>       Sebastian
>>>> 
>>>>> 
>>>>> --
>>>>> Dave Täht
>>>>> What will it take to vastly improve wifi for everyone?
>>>>> https://plus.google.com/u/0/explore/makewififast
>>>>> _______________________________________________
>>>>> Cake mailing list
>>>>> Cake@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>> 
>> 



* Re: [Cake] active sensing queue management
  2015-06-11  0:10   ` Daniel Havey
  2015-06-11  7:27     ` Sebastian Moeller
@ 2015-06-12  1:49     ` David Lang
  2015-06-12 14:44       ` Daniel Havey
  1 sibling, 1 reply; 13+ messages in thread
From: David Lang @ 2015-06-12  1:49 UTC (permalink / raw)
  To: Daniel Havey; +Cc: cake, cerowrt-devel, bloat

On Wed, 10 Jun 2015, Daniel Havey wrote:

> We know that (see Kathy and Van's paper) that AQM algorithms only work
> when they are placed at the slowest queue.  However, the AQM is placed
> at the queue that is capable of providing 8 Mbps and this is not the
> slowest queue.  The AQM algorithm will not work in these conditions.

so the answer is that you don't deploy the AQM algorithm only at the perimeter, 
you deploy it much more widely.

Eventually you get to core devices that have multiple routes they can use to get 
to a destination. Those devices should notice that one route is getting 
congested and start sending the packets through alternate paths.

Now, if the problem is that the aggregate of inbound packets to your downstreams 
where you are the only path becomes higher than the available downstream 
bandwidth, you need to be running an AQM to handle things.

David Lang


* Re: [Cake] active sensing queue management
  2015-06-12  1:49     ` David Lang
@ 2015-06-12 14:44       ` Daniel Havey
  2015-06-13  4:00         ` David Lang
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Havey @ 2015-06-12 14:44 UTC (permalink / raw)
  To: David Lang; +Cc: cake, cerowrt-devel, bloat

On Thu, Jun 11, 2015 at 6:49 PM, David Lang <david@lang.hm> wrote:
> On Wed, 10 Jun 2015, Daniel Havey wrote:
>
>> We know that (see Kathy and Van's paper) that AQM algorithms only work
>> when they are placed at the slowest queue.  However, the AQM is placed
>> at the queue that is capable of providing 8 Mbps and this is not the
>> slowest queue.  The AQM algorithm will not work in these conditions.
>
>
> so the answer is that you don't deploy the AQM algorithm only at the
> perimeter, you deploy it much more widely.
>
> Eventually you get to core devices that have multiple routes they can use to
> get to a destination. Those devices should notice that one route is getting
> congested and start sending the packets through alternate paths.
>
> Now, if the problem is that the aggregate of inbound packets to your
> downstreams where you are the only path becomes higher than the available
> downstream bandwidth, you need to be running an AQM to handle things.
>
> David Lang
>

Hmmmm, that is interesting.  There might be a problem with processing
power at the core though.  It could be difficult to manage all of
those packets flying through the core routers.

David does bring up an interesting point though.  The ASQM algorithm
was originally designed to solve the "Uncooperative ISP" problem.  I
coined the phrase, but, you can fill in your own adjective to fit your
personal favorite ISP :^)

The paper doesn't indicate this because I got roasted by a bunch of
reviewers for it, but, why not use an ASQM like algorithm other places
than the edge.  Suppose you are netflix and your ISP is shaping your
packets?  You can't do anything about the bandwidth reduction, but, you
can at least reduce the queuing...Just food for thought. :^)


* Re: [Cake] active sensing queue management
  2015-06-12 14:44       ` Daniel Havey
@ 2015-06-13  4:00         ` David Lang
  2015-06-13  5:50           ` Benjamin Cronce
  0 siblings, 1 reply; 13+ messages in thread
From: David Lang @ 2015-06-13  4:00 UTC (permalink / raw)
  To: Daniel Havey; +Cc: cake, cerowrt-devel, bloat

On Fri, 12 Jun 2015, Daniel Havey wrote:

> On Thu, Jun 11, 2015 at 6:49 PM, David Lang <david@lang.hm> wrote:
>> On Wed, 10 Jun 2015, Daniel Havey wrote:
>>
>>> We know that (see Kathy and Van's paper) that AQM algorithms only work
>>> when they are placed at the slowest queue.  However, the AQM is placed
>>> at the queue that is capable of providing 8 Mbps and this is not the
>>> slowest queue.  The AQM algorithm will not work in these conditions.
>>
>>
>> so the answer is that you don't deploy the AQM algorithm only at the
>> perimeter, you deploy it much more widely.
>>
>> Eventually you get to core devices that have multiple routes they can use to
>> get to a destination. Those devices should notice that one route is getting
>> congested and start sending the packets through alternate paths.
>>
>> Now, if the problem is that the aggregate of inbound packets to your
>> downstreams where you are the only path becomes higher than the available
>> downstream bandwidth, you need to be running an AQM to handle things.
>>
>> David Lang
>>
>
> Hmmmm, that is interesting.  There might be a problem with processing
> power at the core though.  It could be difficult to manage all of
> those packets flying through the core routers.

And that is the question that people are looking at.

But part of the practical question is at what speeds do you start to run into 
problems?

the core of the Internet is already doing dynamic routing of packets, spreading 
them across multiple parallel paths (peering points have multiple 10G links 
between peers), so this should be more of the same, with possibly a small 
variation to use more expensive paths if the cheap ones are congested.

But as you move out from there towards the edge, the packet handling 
requirements drop rather quickly, and I'll bet that you don't have to get very 
far out before you can start affording to implement AQM algorithms. I'm betting 
that you reach that point before you get to the point in the network where you 
no longer have multiple paths available

> David does bring up an interesting point though.  The ASQM algorithm
> was originally designed to solve the "Uncooperative ISP" problem.  I
> coined the phrase, but, you can fill in your own adjective to fit your
> personal favorite ISP :^)
>
> The paper doesn't indicate this because I got roasted by a bunch of
> reviewers for it, but, why not use an ASQM like algorithm other places
> than the edge.  Suppose you are netflix and your ISP is shaping your
> packets?  You can't do anything about the bandwidth reduction, but, you
> can at least reduce the queuing...Just food for thought. :^)

Unfortunately, if you are trapped by the ISP/netflix peering war, you reducing
the number of packets in flight for yourself isn't going to help any. It would 
have to happen on the netflix side of the bottleneck.

David Lang


* Re: [Cake] active sensing queue management
  2015-06-13  4:00         ` David Lang
@ 2015-06-13  5:50           ` Benjamin Cronce
  0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Cronce @ 2015-06-13  5:50 UTC (permalink / raw)
  To: David Lang; +Cc: cake, Daniel Havey, cerowrt-devel, bloat

[-- Attachment #1: Type: text/plain, Size: 4121 bytes --]

On Fri, 12 Jun 2015, Daniel Havey wrote:
>
> > On Thu, Jun 11, 2015 at 6:49 PM, David Lang <david at lang.hm> wrote:
> >> On Wed, 10 Jun 2015, Daniel Havey wrote:
> >>
> >>> We know that (see Kathy and Van's paper) that AQM algorithms only work
> >>> when they are placed at the slowest queue.  However, the AQM is placed
> >>> at the queue that is capable of providing 8 Mbps and this is not the
> >>> slowest queue.  The AQM algorithm will not work in these conditions.
> >>
> >>
> >> so the answer is that you don't deploy the AQM algorithm only at the
> >> perimeter, you deploy it much more widely.
> >>
> >> Eventually you get to core devices that have multiple routes they can use to
> >> get to a destination. Those devices should notice that one route is getting
> >> congested and start sending the packets through alternate paths.
> >>
> >> Now, if the problem is that the aggregate of inbound packets to your
> >> downstreams where you are the only path becomes higher than the available
> >> downstream bandwidth, you need to be running an AQM to handle things.
> >>
> >> David Lang
> >>
> >
> > Hmmmm, that is interesting.  There might be a problem with processing
> > power at the core though.  It could be difficult to manage all of
> > those packets flying through the core routers.
>
> And that is the question that people are looking at.
>
> But part of the practical question is at what speeds do you start to run into
> problems?
>
> the core of the Internet is already doing dynamic routing of packets, spreading
> them across multiple parallel paths (peering points have multiple 10G links
> between peers), so this should be more of the same, with possibly a small
> variation to use more expensive paths if the cheap ones are congested.

Yes and no. Spreading data across parallel links is mostly done at the MAC
layer and does not show up as separate routes, think teaming ports for
Ethernet. Routing is dynamic, but typically takes a bit for the route
changes to propagate. For the most part, you can only control where you
send data, but not where you receive it. The core of the Internet typically
only has 2-3 routes to choose from with one primary route and the other
only used for fail over. Load balancing asymmetrical routes is a very messy
issue that you really don't want to do. Most of the time, the cheapest
route is also the fastest. If you had to choose between a $5k/month 100Gb
port at a peering location or a $30k 10Gb transit link, I'm sure you won't
be doing any load balancing over the 10Gb link unless your 100Gb failed.
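
A minimal sketch of how teamed ports can keep every packet of a flow on the
same member link without storing per-flow state: hash the flow's 5-tuple and
take the result modulo the number of links. The function name pick_link and
the use of SHA-256 are assumptions for illustration only; real hardware uses
much cheaper hash functions.

```python
import hashlib


def pick_link(src_ip, dst_ip, src_port, dst_port, proto, n_links):
    """Map a flow's 5-tuple to one of n_links member ports, statelessly.

    Hypothetical helper: the same 5-tuple always yields the same index,
    so packets of one flow are never reordered across links.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % n_links


# Two packets of the same flow land on the same link.
first = pick_link("192.0.2.1", "198.51.100.7", 40000, 443, "tcp", 4)
second = pick_link("192.0.2.1", "198.51.100.7", 40000, 443, "tcp", 4)
```

Note that a plain modulo spreads flows evenly across the members, so it only
suits links of equal capacity; it says nothing about how to weight a 100Gb
port against a 10Gb one.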

Routes really don't change that often. You have a default transit route and
a bunch of peering routes. The peering routes take priority because they're
cheaper and the transit route is for when bad things happen or you just
don't have peering for that route. In the case of my ISP, everything is
transit, let Level 3 worry about peering.

> But as you move out from there towards the edge, the packet handling
> requirements drop rather quickly, and I'll bet that you don't have to get very
> far out before you can start affording to implement AQM algorithms. I'm betting
> that you reach that point before you get to the point in the network where you
> no longer have multiple paths available
>
> > David does bring up an interesting point though.  The ASQM algorithm
> > was originally designed to solve the "Uncooperative ISP" problem.  I
> > coined the phrase, but, you can fill in your own adjective to fit your
> > personal favorite ISP :^)
> >
> > The paper doesn't indicate this because I got roasted by a bunch of
> > reviewers for it, but, why not use an ASQM like algorithm other places
> > than the edge.  Suppose you are netflix and your ISP is shaping your
> > packets?  You can't do anything about the bandwidth reduction, but, you
> > can at least reduce the queuing...Just food for thought. :^)
>
> Unfortunately, if you are trapped by the ISP/netflix peering war, you reducing
> the number of packets in flight for yourself isn't going to help any. It would
> have to happen on the netflix side of the bottleneck.
>
> David Lang
>



* Re: [Cake] active sensing queue management
  2015-06-10 20:54 ` Sebastian Moeller
  2015-06-11  0:10   ` Daniel Havey
@ 2015-06-11  1:05   ` Alan Jenkins
  2015-06-11  7:58     ` Sebastian Moeller
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Jenkins @ 2015-06-11  1:05 UTC (permalink / raw)
  To: Sebastian Moeller, Dave Täht; +Cc: cake, cerowrt-devel, bloat

On 10/06/15 21:54, Sebastian Moeller wrote:
> Hi Dave,
>
>
> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:
>
>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>
>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>> a short ttl udp flow.
>>
>> Doing this sort of measured, then floating the rate control with
>> "cake" would be fairly easy (although it tends to be a bit more
>> compute intensive not being on a fast path)
>>
>> What is sort of missing here is trying to figure out which side of the
>> bottleneck is the bottleneck (up or down).
> 	Yeah, they rely on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned.

They copy & frob real IP headers.  They don't _say_ how the reflection 
works, but I guess low TTL -> ICMP TTL exceeded, like traceroute.  Then 
I read Gargoyle also use ICMP TTL exceeded and I thought my guess is 
quite educated 8).

Note the size of the timestamp, a generous 8 bytes.  It "just happens" 
that ICMP responses are required to include the first 8 bytes of the IP 
payload 8).
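
To make the guess concrete, here is a rough sketch of why an 8-byte timestamp
survives the round trip: an ICMP Time Exceeded message (type 11) quotes the
original IP header plus the first 8 bytes of its payload. Everything below is
built in memory and nothing is sent; the function names, addresses, protocol
number and zeroed checksums are placeholders of mine, not what the paper does.

```python
import struct


def build_probe(timestamp_ns):
    """Minimal 20-byte IPv4 header (TTL 1, checksum left at 0) + 8-byte timestamp."""
    ver_ihl = (4 << 4) | 5                  # IPv4, header length 5 * 4 = 20 bytes
    header = struct.pack(
        "!BBHHHBBH4s4s",
        ver_ihl, 0, 20 + 8,                 # version/IHL, TOS, total length
        0, 0,                               # identification, flags/fragment offset
        1, 253, 0,                          # TTL, protocol (placeholder), checksum
        bytes([192, 0, 2, 1]),              # source address (documentation range)
        bytes([192, 0, 2, 1]),              # "self addressed" destination
    )
    return header + struct.pack("!Q", timestamp_ns)


def time_exceeded_reply(original):
    """ICMP type 11, code 0, then the quoted IP header + first 8 payload bytes."""
    ihl = (original[0] & 0x0F) * 4
    icmp_header = struct.pack("!BBHI", 11, 0, 0, 0)   # checksum omitted in this sketch
    return icmp_header + original[:ihl + 8]


def extract_timestamp(icmp):
    """Recover the 8-byte timestamp from the quoted datagram."""
    quoted = icmp[8:]                        # skip the 8-byte ICMP header
    ihl = (quoted[0] & 0x0F) * 4
    (ts,) = struct.unpack("!Q", quoted[ihl:ihl + 8])
    return ts


sent = 1434000000123456789
recovered = extract_timestamp(time_exceeded_reply(build_probe(sent)))
```

The sender would then subtract the recovered value from its current clock to
get the RTT, without keeping any per-probe state.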

>   In the paper they used either uplink or downlink traffic, so figuring out where the bottleneck was is easy; at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.” At least this is what I get from their short description in glossing over the paper.

Ow!  I hadn't noticed that.  You could reduce both rates proportionally 
but the effect is inelegant.  I wonder what Gargoyle does...

2012 gargoyle developer comment says "There are not settings for active 
congestion control on the uplink side. ACC concentrates on the download
side only."

Random blog post points out this is sufficient to fix prioritization 
v.s. bufferbloat.  "In upstream direction this is not a big problem 
because your router can still prioritize which packet should be sent 
first".  (Yay, I get to classify every application I care about /s and 
still get killed by uploads in http).

One solution would be if ISPs made sure upload is 100% provisioned. 
Could be cheaper than for (the higher rate) download.

> 	Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers.

I think that's a valid concern.  Is "TTL Exceeded" rate-limited like 
Echo (because it may be generated outside the highest-speed forwarding 
path?), and would this work as tested if everyone did it?

>   But if the ISPs cooperate, I would guess, they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp, which would allow dissecting the RTT into its constituent one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyways). Currently, I hear that ISP equipment will not treat ICMP requests with priority though.
> 	Also I am confused what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the cable labs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the cs field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
>
> Best Regards
> 	Sebastian
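
To make the quoted one-way-delay idea concrete, here is a rough sketch of how a reflector timestamp could be used to guess the bloated direction. It assumes a single reflector timestamp per probe and an unknown but constant clock offset between CPE and reflector; the class, function, and threshold names are hypothetical and do not come from the paper. Because each direction is compared only against its own observed minimum, the clock offset cancels out.

```python
class OneWayDelayTracker:
    """Tracks queueing delay per direction relative to the minimum seen so far."""

    def __init__(self):
        self.min_up = None
        self.min_down = None

    def update(self, sent_ms, reflected_ms, received_ms):
        # sent_ms and received_ms are on the CPE clock, reflected_ms is on the
        # reflector clock, so both raw values include an unknown clock offset.
        up = reflected_ms - sent_ms
        down = received_ms - reflected_ms
        self.min_up = up if self.min_up is None else min(self.min_up, up)
        self.min_down = down if self.min_down is None else min(self.min_down, down)
        # Excess over the per-direction minimum; the constant offset cancels.
        return up - self.min_up, down - self.min_down


def bloated_direction(up_excess_ms, down_excess_ms, threshold_ms=5):
    """Return "up", "down", or None; threshold_ms is an arbitrary assumption."""
    if max(up_excess_ms, down_excess_ms) < threshold_ms:
        return None
    return "up" if up_excess_ms >= down_excess_ms else "down"
```

Millisecond resolution is used here only because the quoted text suggests it would probably be sufficient; nothing in the sketch depends on the unit.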

I had your reaction about pie on the modem.

We could say there is likely room for improvement in any paper, that 
claims bufferbloat eliminated with a "target" parameter of 100ms :p.  
Results don't look that bad (why?) but I do see 25ms bloat v.s. 
codel/pie.  It may be inevitable but deserves not to be glossed over 
with comparisons to the unrelated 100ms default parameter of codel, 
which in reality is the one called "interval" not "target" :).  Good QM 
on the modem+cmts has got to be the best solution.

Alan

* Re: [Cake] active sensing queue management
  2015-06-11  1:05   ` Alan Jenkins
@ 2015-06-11  7:58     ` Sebastian Moeller
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2015-06-11  7:58 UTC (permalink / raw)
  To: Alan Jenkins; +Cc: cake, Daniel Havey, cerowrt-devel, bloat

Hi Alan,

On Jun 11, 2015, at 03:05 , Alan Jenkins <alan.christopher.jenkins@gmail.com> wrote:

> On 10/06/15 21:54, Sebastian Moeller wrote:
>> Hi Dave,
>> 
>> 
>> On Jun 10, 2015, at 21:53 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> http://dl.ifip.org/db/conf/networking/networking2015/1570064417.pdf
>>> 
>>> gargoyle's qos system follows a similar approach, using htb + sfq, and
>>> a short ttl udp flow.
>>> 
>>> Doing this sort of measured, then floating the rate control with
>>> "cake" would be fairly easy (although it tends to be a bit more
>>> compute intensive not being on a fast path)
>>> 
>>> What is sort of missing here is trying to figure out which side of the
>>> bottleneck is the bottleneck (up or down).
>> 	Yeah, they rely on having a reliable packet reflector upstream of the “bottleneck” so they get their timestamped probe packets returned.
> 
> They copy & frob real IP headers.  They don't _say_ how the reflection works, but I guess low TTL -> ICMP TTL exceeded, like traceroute.  Then I read Gargoyle also use ICMP TTL exceeded and I thought my guess is quite educated 8).

	Daniel elucidated their magic packets: they create self-addressed IP packets at the simulated CPE and inject them into the simulated cable link; the other end passes the data up through its stack, and once the self-addressed packet reaches the IP layer of the simulated CMTS it gets sent back, since that IP layer sees the CPE’s IP address as the destination address.
	@Daniel, this trick can only work if a) the magic packets cross only one IP hop, since the first upstream IP layer will effectively bounce them back (so the injector in the DOCSIS case needs to be the cable modem), and b) the CPE actually has an IP address that can be reached from the outside and that is known to the person setting up the AQM. Is that correct? How does this work if the CPE acts as an Ethernet bridge without an external IP address?
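
	For illustration only, a self-addressed probe of the kind described above might be built roughly as follows. This is a sketch under my own assumptions, not the paper's packet format: the payload is a bare 8-byte timestamp with no transport header, the protocol number is 253 (one of the values reserved for experimentation), and all function names are made up.

```python
import socket
import struct

IPPROTO_EXPERIMENTAL = 253  # assumption: experimental protocol number, not from the paper


def ipv4_checksum(header: bytes) -> int:
    """Standard 16-bit ones'-complement checksum over an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_probe(cpe_ip: str, sent_ns: int, ttl: int = 64) -> bytes:
    """IPv4 packet whose source AND destination are the CPE's own address."""
    payload = struct.pack("!Q", sent_ns)  # 8-byte send timestamp
    addr = socket.inet_aton(cpe_ip)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload),      # version/IHL, TOS, total length
        0, 0,                            # identification, flags/fragment offset
        ttl, IPPROTO_EXPERIMENTAL, 0,    # TTL, protocol, checksum placeholder
        addr, addr,                      # source == destination
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:] + payload


def probe_rtt_ns(packet: bytes, received_ns: int) -> int:
    """RTT of a probe that was bounced back by the first upstream IP layer."""
    ihl = (packet[0] & 0x0F) * 4
    (sent_ns,) = struct.unpack("!Q", packet[ihl:ihl + 8])
    return received_ns - sent_ns
```

	Actually injecting such a packet would need a raw socket or similar privileged interface on the modem, which is left out here.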

> 
> Note the size of the timestamp, a generous 8 bytes.  It "just happens" that ICMP responses are required to include the first 8 bytes of the IP payload 8).
> 
>>  In the paper they used either uplink or downlink traffic, so figuring out where the bottleneck was was easy; at least this is how I interpret “Experiments were performed in the upload (data flowing from the users to the CDNs) as well as in the download direction.". At least this is what I get from their short description in glossing over the paper.
> 
> Ow!  I hadn't noticed that.  You could reduce both rates proportionally but the effect is inelegant.

	I think that is what they do; as long as one only measures uni-directional saturating traffic this approach will work fine, as the bandwidth loss in the opposite direction simply does not materialize.
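
	As a sketch of what "reduce both rates proportionally" could mean in practice (my guess at the shape of it, not the paper's or Gargoyle's actual control law; the threshold, backoff, and recovery factors are invented for illustration):

```python
def scale_rates(up_kbit, down_kbit, rtt_ms, baseline_ms, *,
                threshold_ms=30.0, backoff=0.9, recover=1.02,
                max_up_kbit=None, max_down_kbit=None):
    """Scale both shaper rates by one common factor based on measured RTT.

    All default values are illustrative assumptions, not tuned numbers.
    """
    if rtt_ms - baseline_ms > threshold_ms:
        # RTT has risen well above the unloaded baseline: back off both directions.
        factor = backoff
    else:
        # No sign of bloat: creep back up, limited so neither rate exceeds its cap.
        factor = recover
        if max_up_kbit is not None:
            factor = min(factor, max_up_kbit / up_kbit)
        if max_down_kbit is not None:
            factor = min(factor, max_down_kbit / down_kbit)
    return up_kbit * factor, down_kbit * factor
```

	Since a single factor is applied to both directions, the up/down ratio is preserved, which is exactly why the unloaded direction loses bandwidth it did not need to give up.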

>  I wonder what Gargoyle does...
> 
> 2012 gargoyle developer comment says "There are not settings for active congestion control on the uplink side. ACC concentrats on the download side only."
> 
> Random blog post points out this is sufficient to fix prioritization v.s. bufferbloat.  "In upstream direction this is not a big problem because your router can still prioritize which packet should be sent first".  (Yay, I get to classify every application I care about /s and still get killed by uploads in http).

	I am not convinced that this is sane, as in cable systems the upstream bandwidth can fluctuate significantly depending on how many people are active. Actually, scratch the “cable”: most customer connections cross shared, oversubscribed links somewhere between the CPE and the internet, and those will make static bandwidth shaping misbehave some of the time. A good ISP just manages the oversubscription well enough that this issue only occurs transiently… (I hope).


> 
> One solution would be if ISPs made sure upload is 100% provisioned. Could be cheaper than for (the higher rate) download.

	Not going to happen, in my opinion, as it is economically unfeasible for a publicly traded ISP. I would settle for that approach as long as the ISP is willing to fix its provisioning so that oversubscription episodes are reasonably rare, though.

> 
>> 	Nice paper, but really not a full solution either. Unless the ISPs cooperate in supplying stable reflectors powerful enough to support all downstream customers.
> 
> I think that's a valid concern.  Is "TTL Exceeded" rate-limited like Echo (because it may be generated outside the highest-speed forwarding path?), and would this work as tested if everyone did it?

	I think Daniel agrees, and that is why they came up with the “magic” packet approach (which drags in its own set of challenges, as far as I can see).

> 
>>  But if the ISPs cooperate, I would guess they could eradicate downstream buffer bloat to begin with. Or the ISPs could have the reflector also add its own UTC time stamp, which would allow dissecting the RTT into its constituent one-way delays to detect the currently bloated direction. (Think ICMP type 13/14 message pairs "on steroids", with higher resolution than milliseconds, but for buffer bloat detection ms resolution would probably be sufficient anyway.) Currently, I hear that ISP equipment will not treat ICMP requests with priority, though.
>> 	Also I am confused about what they actually simulated: “The modems and CMTS were equipped with ASQM, CoDel and PIE,” and “However, the problem popularly called bufferbloat can move about among many queues some of which are resistant to traditional AQM such as Layer 2 MAC protocols used in cable/DSL links. We call this problem bufferbloat displacement.” seem to be slightly at odds. If modems and CMTS have decent AQMs, all they need to do is not stuff their sub-IP layer queues and be done with it. The way I understood the CableLabs PIE story, they intended to do exactly that, so at least the “buffer displacement” remedy by ASQM reads a bit like a straw man argument. But as I am a) not of the CS field, and b) only glossed over the paper, most likely I am missing something important that is clearly in the paper...
>> 
>> Best Regards
>> 	Sebastian
> 
> I had your reaction about pie on the modem.
> 
> We could say there is likely room for improvement in any paper, that claims bufferbloat eliminated with a "target" parameter of 100ms :p.  Results don't look that bad (why?) but I do see 25ms bloat v.s. codel/pie.  It may be inevitable but deserves not to be glossed over with comparisons to the unrelated 100ms default parameter of codel, which in reality is the one called "interval" not "target" :).  Good QM on the modem+cmts has got to be the best solution.

	I fully agree. I have a hunch that their method might be used to supplement DOCSIS 3.1 PIE so that the CPEs can also meaningfully measure and control downstream buffer bloat in addition to the upstream without the need to fix the CMTSs. As far as I understand, CableLabs is quite proactive in trying to fix this in CPEs, while I have heard nothing about the CMTS manufacturers’ plans (I think the Arris paper was about CPEs, not CMTSs). Maybe CableLabs could be convinced to try this in addition to upstream PIE as a solution that will require no CMTS involvement… (I simply assume that the CMTS does not need to cooperate, but note that the paper seems to rely totally on simulated data, insofar as Linux PCs were used to model each of the network components. So "no real CMTS was harmed during the making of this paper".)

Best Regards
	Sebastian




> 
> Alan


Thread overview: 13+ messages
2015-06-12 20:51 [Cake] active sensing queue management Benjamin Cronce
  -- strict thread matches above, loose matches on Subject: below --
2015-06-10 19:53 Dave Taht
2015-06-10 20:54 ` Sebastian Moeller
2015-06-11  0:10   ` Daniel Havey
2015-06-11  7:27     ` Sebastian Moeller
2015-06-12 15:02       ` Daniel Havey
2015-06-12 16:02         ` Sebastian Moeller
2015-06-12  1:49     ` David Lang
2015-06-12 14:44       ` Daniel Havey
2015-06-13  4:00         ` David Lang
2015-06-13  5:50           ` Benjamin Cronce
2015-06-11  1:05   ` Alan Jenkins
2015-06-11  7:58     ` Sebastian Moeller
