Re: [Cake] Recomended HW to run cake and fq

Cake - FQ_codel the next generation

* Re: [Cake] Recomended HW to run cake and fq_codel?
       [not found] <mailman.1.1493740801.18318.cake@lists.bufferbloat.net>
@ 2017-05-02 18:44 ` Pete Heist
  2017-05-03  5:59   ` erik.taraldsen
  0 siblings, 1 reply; 16+ messages in thread
From: Pete Heist @ 2017-05-02 18:44 UTC (permalink / raw)
  To: erik.taraldsen; +Cc: cake


> Message: 1
> Date: Tue, 2 May 2017 10:34:45 +0000
> From: <erik.taraldsen@telenor.com>
> To: <me@lochnair.net>
> Cc: <cake@lists.bufferbloat.net>
> Subject: Re: [Cake] Recomended HW to run cake and fq_codel?
> Message-ID: <1493721285271.28909@telenor.com>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> I'm actually most interested in how this works on low-bandwidth access lines.  Typically, what can we as an ISP do to make ADSL and VDSL less sucky for our customers?  So would an Edge Router PoE-5  1) or the X sfp 2) be a good platform for this?  (I don't need the PoE or SFP, but those are the most easily available versions here in Norway.)
> 
> Nils, very good of you to keep such packages precompiled! That will save me a lot of time.

> ------------------------------
> 
> Message: 2
> Date: Tue, 02 May 2017 14:11:20 +0200
> From: Nils Andreas Svee <me@lochnair.net>
> To: erik.taraldsen@telenor.com
> Cc: cake@lists.bufferbloat.net
> Subject: Re: [Cake] Recomended HW to run cake and fq_codel?
> Message-ID:
> 	<1493727080.1510042.962956680.40220FCB@webmail.messagingengine.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Kinda surprising that the plain ER-X isn't readily available. I know
> Dustin used to have them, but they're out of stock. Both of them will do
> just fine, but I'd probably pick the ER-X-SFP for the beefier CPU, if
> only to get some extra headroom. Mind that the ER-X only has 256 MB RAM and
> 256 MB flash, if that matters to you.
> 
> DSL tends to suck pretty (read: very) bad without proper shaping, I
> know. On that note, are you planning to run an AQM on both ends of the
> bottleneck, or shape ingress traffic via an IFB device? CAKE helps a lot
> when running on ingress, but it can't come close to running on both
> ends.
> 
> Best Regards
> Nils

Just sharing some experience/thoughts from a few angles:

- As for low bandwidth, in my experience AQM works great on low bandwidth ADSL. A few years ago I used fq_codel at a campground to shape a 0.5 / 5 Mbps ADSL connection. With up to 130 people in the camp, it was a disaster before fq_codel, where one person saturating either the up or downstream could easily cause 600+ ms of induced latency. fq_codel could keep that to 40-50 ms under load, enough to make it usable for web browsing, at least, and Cake does better.

- I did the ER-X testing referred to in the UBNT forums (pgage). I’ve since learned much more about AQM by testing point-to-point WiFi setups, so I should really repeat those ER-X tests some time to make sure my results were accurate, but afaik ~120 Mbps using Cake is possible with the ER-X. I’m now using Cake in production on the ER-X at rates around 40 Mbps with Lochnair’s builds (on EdgeOS 1.9.1) and it does a great job.

- It’s an interesting question: what can be done as an ISP? Essentially it boils down to the fundamentals of deploying AQM: finding where the queues are forming and placing fq_codel or Cake at the bottleneck links, preferring “hardware” queue management like BQL or, in the case of WiFi, the ath9k driver in LEDE, over soft rate limiting where possible. When soft rate limiting, the rate limiting strategy and chosen rate are the most CPU-intensive and finicky parts of deploying AQM.

- I don't understand why ADSL modem vendors don’t just bake BQL-like functionality right into their devices so they can ship AQM without the need for soft rate limiting. AQM is so effective on ADSL's upstream that it seems it would just make a lot of sense. For that matter, why not on the DSLAM as well to shape the customer’s downstream, if that’s also a bottleneck?

- There seems to be a bit of upheaval now with BBR. If every host had BBR deployed, that would theoretically mitigate the need for AQM, but a) it’s going to be years before that happens and b) I’m not sure all the BBR corner cases have been found yet. There are far more knowledgeable people than me on this and already more detailed discussions about it.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-05-02 18:44 ` [Cake] Recomended HW to run cake and fq_codel? Pete Heist
@ 2017-05-03  5:59   ` erik.taraldsen
  2017-05-03  7:15     ` Pete Heist
  2017-11-27  8:35     ` Dave Taht
  0 siblings, 2 replies; 16+ messages in thread
From: erik.taraldsen @ 2017-05-03  5:59 UTC (permalink / raw)
  To: peteheist; +Cc: cake

> From: Pete Heist <peteheist@gmail.com>
> - As for low bandwidth, in my experience AQM works great on low bandwidth ADSL. A few years ago I
> used fq_codel at a campground to shape a 0.5 / 5 Mbps ADSL connection. With up to 130 people in the
> camp, it was a disaster before fq_codel, where one person saturating either the up or downstream 
> could easily cause 600+ ms of induced latency. fq_codel could keep that to 40-50 ms under load, 
> enough to make it usable for web browsing, at least, and Cake does better.

That's encouraging!

>  It’s an interesting question: what can be done as an ISP? Essentially it boils down to the fundamentals 
> of deploying AQM- finding where the queues are forming and placing fq_codel or Cake at the 
> bottleneck links, preferring “hardware” queue management like BQL or in the case of WiFi the ath9k’s 
> driver in LEDE, over soft rate limiting, where possible. When soft rate limiting, the rate limiting strategy 
> and chosen rate are the most CPU intensive and finicky parts of deploying AQM.

What I see as short-term possibilities for us as ISPs is to push our vendors to include this as part of the feature set.  We could also do better with the marketing.  Let's steal an idea from the video area.  HD is often written as 1080P@60.  Why not do the same for internet speed?  60M@80ms, where the @80ms would be the largest latency in either direction that queue management would introduce.  (This of course introduces the risk of artificially tuning the @xxms too low and ending up with strict policing.)

> - I don't understand why ADSL modem vendors don’t just bake BQL-like functionality right into their 
> devices so they can ship AQM without the need for soft rate limiting. AQM is so effective on ADSL's 
> upstream that it seems it would just make a lot of sense. For that matter, why not on the DSLAM as well 
> to shape the customer’s downstream, if that’s also a bottleneck?

I think most ISPs handle shaping at the BRAS level rather than on the DSLAM, as DSLAMs in general have very limited shaping/QoS capabilities.

Regarding CPEs, to be fair, upcoming devices from Intel (Lantiq) will more or less do away with HW accelerators and do everything in software.  Then the vendors are a lot more free to implement better shaping strategies.

The trade shows and all sales pitches focus mostly on next-gen stuff.  There are comparatively very few resources dedicated to ADSL, where the best schedulers are most needed.

-Erik

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-05-03  5:59   ` erik.taraldsen
@ 2017-05-03  7:15     ` Pete Heist
  2017-05-03 10:03       ` Andy Furniss
  2017-05-03 11:10       ` erik.taraldsen
  2017-11-27  8:35     ` Dave Taht
  1 sibling, 2 replies; 16+ messages in thread
From: Pete Heist @ 2017-05-03  7:15 UTC (permalink / raw)
  To: erik.taraldsen; +Cc: cake

> On May 3, 2017, at 7:59 AM, <erik.taraldsen@telenor.com> <erik.taraldsen@telenor.com> wrote:
> 
>> It’s an interesting question: what can be done as an ISP? Essentially it boils down to the fundamentals 
>> of deploying AQM- finding where the queues are forming and placing fq_codel or Cake at the 
>> bottleneck links, preferring “hardware” queue management like BQL or in the case of WiFi the ath9k’s 
>> driver in LEDE, over soft rate limiting, where possible. When soft rate limiting, the rate limiting strategy 
>> and chosen rate are the most CPU intensive and finicky parts of deploying AQM.
> 
> What I see as short-term possibilities for us as ISPs is to push our vendors to include this as part of the feature set.  We could also do better with the marketing.  Let's steal an idea from the video area.  HD is often written as 1080P@60.  Why not do the same for internet speed?  60M@80ms, where the @80ms would be the largest latency in either direction that queue management would introduce.  (This of course introduces the risk of artificially tuning the @xxms too low and ending up with strict policing.)

True, in the same way max throughputs have been pushed up in various ways, I wouldn’t want to see a latency war where ‘pfifo limit 2’ is being deployed, and yet I like the idea of spreading awareness about the importance of latency. When I hear users ask “is it stable?”, I think latency is a big part of what they’re asking about without realizing it. There’s a certain “latency stress” that comes when clicking a link on the web and not getting an immediate response. I wonder if anyone has studied that.

>> - I don't understand why ADSL modem vendors don’t just bake BQL-like functionality right into their 
>> devices so they can ship AQM without the need for soft rate limiting. AQM is so effective on ADSL's 
>> upstream that it seems it would just make a lot of sense. For that matter, why not on the DSLAM as well 
>> to shape the customer’s downstream, if that’s also a bottleneck?
> 
> I think most ISP's handle shaping on the BRAS level rather than on the DSLAM, as DSLAM's in general have very limited shaping/qos capabilities.

That makes sense. I’ve never worked with provider side ADSL equipment so I lumped it all under the term “DSLAM”, not knowing what a BRAS was before. :) 

Another option for ISPs (failing AQM support in the devices, and instead of deploying devices on the customer side), could be to provide each customer a queue that’s tuned to their link rate. There could be an HTB tree with classes for each customer and Cake at the leafs. Knowing each customer’s link rate (assuming it’s not variable) you’d set HTB’s rate to something less than that. There would be work to do as each customer is added and removed, but at least it would be transparent to them. AQM is best done at egress where packets originate, so I’m not sure how well it would work in practice.
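As a rough illustration of the HTB-plus-Cake idea above, here is a small Python sketch that only generates `tc` commands for such a tree. The interface name, customer addresses, sync rates and the 90% shaping factor are made-up placeholders, and the u32 destination-address filter is just one possible way to steer each customer's traffic into its class; none of this comes from a real deployment.

```python
# Sketch only: emit tc commands for an HTB tree with one class per customer
# and a Cake leaf on each class. All names and numbers are hypothetical.

def htb_cake_commands(dev, customers, percent=90):
    """customers: list of (ip_address, sync_rate_kbit) tuples."""
    cmds = [f"tc qdisc add dev {dev} root handle 1: htb"]
    for minor, (ip, sync_kbit) in enumerate(customers, start=0x10):
        classid = f"1:{minor:x}"              # tc class minor IDs are hexadecimal
        rate = sync_kbit * percent // 100     # shape somewhat below the link rate
        cmds.append(
            f"tc class add dev {dev} parent 1: classid {classid} htb rate {rate}kbit"
        )
        # Cake without a bandwidth parameter leaves the shaping to HTB
        cmds.append(f"tc qdisc add dev {dev} parent {classid} cake")
        cmds.append(
            f"tc filter add dev {dev} parent 1: protocol ip prio 1 "
            f"u32 match ip dst {ip}/32 flowid {classid}"
        )
    return cmds

commands = htb_cake_commands("eth0", [("192.0.2.10", 5000), ("192.0.2.11", 16000)])
print("\n".join(commands))
```

Adding or removing a customer then amounts to adding or deleting one class, one leaf qdisc and one filter, which is the per-customer work mentioned above.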

What’s usually used for an ADSL provider’s backhaul, fiber?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-05-03  7:15     ` Pete Heist
@ 2017-05-03 10:03       ` Andy Furniss
  2017-05-03 11:10       ` erik.taraldsen
  1 sibling, 0 replies; 16+ messages in thread
From: Andy Furniss @ 2017-05-03 10:03 UTC (permalink / raw)
  To: Pete Heist, erik.taraldsen; +Cc: cake

Pete Heist wrote:

> Another option for ISPs (failing AQM support in the devices, and
> instead of deploying devices on the customer side), could be to
> provide each customer a queue that’s tuned to their link rate. There
> could be an HTB tree with classes for each customer and Cake at the
> leafs. Knowing each customer’s link rate (assuming it’s not
> variable)

FWIW, on the narrow point of knowing a variable rate: in the UK, DSLAMs use (I
think) TR-101 to report sync rates to ISPs.
The BRAS has always been set from this, and my ISP also uses it to do
downstream (for me; egress for them) QoS.
Of course they need (expensive?) kit to mark, and DPI/rules to classify
traffic ....
They use Ellacoyas. I don't know if they do the actual shaping or just
the classification and marking.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-05-03  7:15     ` Pete Heist
  2017-05-03 10:03       ` Andy Furniss
@ 2017-05-03 11:10       ` erik.taraldsen
  1 sibling, 0 replies; 16+ messages in thread
From: erik.taraldsen @ 2017-05-03 11:10 UTC (permalink / raw)
  To: peteheist; +Cc: cake

> From: Pete Heist <peteheist@gmail.com>
>
> Another option for ISPs (failing AQM support in the devices, and instead of deploying devices on the customer 
> side), could be to provide each customer a queue that’s tuned to their link rate. There could be an HTB tree 
> with classes for each customer and Cake at the leafs. Knowing each customer’s link rate (assuming it’s not 
> variable) you’d set HTB’s rate to something less than that. There would be work to do as each customer is 
> added and removed, but at least it would be transparent to them. AQM is best done at egress where packets 
> originate, so I’m not sure how well it would work in practice.

We are limited by what the HW can do here.  The Juniper ERXs we use each aggregate tens of thousands of users, so queue handling must be supported in silicon.  We do have a queue per customer, but we can't support HTB.


> What’s usually used for an ADSL provider’s backhaul, fiber?

I'm not sure what our competition uses, but we mainly use fiber.  There are some edge cases using alternative backhaul such as radio.  Norway has challenging topology for infrastructure, so some will have to suffer unfortunately. :)



-Erik

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-05-03  5:59   ` erik.taraldsen
  2017-05-03  7:15     ` Pete Heist
@ 2017-11-27  8:35     ` Dave Taht
  2017-11-27 12:04       ` Jonathan Morton
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Taht @ 2017-11-27  8:35 UTC (permalink / raw)
  To: erik.taraldsen; +Cc: Pete Heist, Cake List

> What I see as short-term possibilities for us as ISPs is to push our vendors to include this as part of the feature set.  We could also do better with the marketing.  Let's steal an idea from the video area.  HD is often written as 1080P@60.  Why not do the same for internet speed?  60M@80ms, where the @80ms would be the largest latency in either direction that queue management would introduce.  (This of course introduces the risk of artificially tuning the @xxms too low and ending up with strict policing.)

I like this.

900M @ 1.2ms. Taking the 99th percentile from:

http://www.drhleny.cz/bufferbloat/cake/round0/tor_rrultor_eg_cake_950mbit/rrul_torrent-ping_cdf.svg

Since this is a measure of flow switching time:

950Mbit @ 1.2 "FQ"

Better, I think, to scale it by a reference to pfifo on the same test and
test conditions
(8ms/1.2ms in this case):

http://www.drhleny.cz/bufferbloat/cake/round0/tor_rrultor_eg_pfifo_950mbit/rrul_torrent-ping_cdf.svg

cake: XMbit @ 6.666 FQ!
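
The arithmetic behind that figure, as a minimal Python sketch. The function name is invented for illustration; the 8 ms and 1.2 ms values are the ones quoted above, read as the pfifo reference latency and the cake latency on the same test.

```python
def fq_score(reference_latency_ms, measured_latency_ms):
    """Ratio of the pfifo reference latency to the latency under test."""
    if measured_latency_ms <= 0:
        raise ValueError("measured latency must be positive")
    return reference_latency_ms / measured_latency_ms

# 8 ms for pfifo and 1.2 ms for cake, both taken from the message above
score = fq_score(8.0, 1.2)
print(f"cake: XMbit @ {score:.3f} FQ")
```

A score above 1 would mean the qdisc under test beats the pfifo reference, and the ratio stays comparable across links because both numbers come from the same test conditions.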




-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-11-27  8:35     ` Dave Taht
@ 2017-11-27 12:04       ` Jonathan Morton
  2017-11-27 12:47         ` Pete Heist
  0 siblings, 1 reply; 16+ messages in thread
From: Jonathan Morton @ 2017-11-27 12:04 UTC (permalink / raw)
  To: Dave Taht; +Cc: erik.taraldsen, Cake List

My pet suggestion here is to represent latency as its inverse,
"responsiveness" with units of Hz.  This has the dual advantages of bigger
numbers being better, and the figures being directly comparable with
framerates.

As you say, the methodology will need to be very carefully specified, so
that we get a meaningful measurement that's hard to game.
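
A small Python sketch of the conversion being suggested, assuming the latency figure is given in milliseconds. The function name is hypothetical, and the two sample values are simply the 80 ms and 1.2 ms figures mentioned earlier in the thread.

```python
def responsiveness_hz(latency_ms):
    """Convert a latency in milliseconds to its inverse in Hz."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms

for latency_ms in (80.0, 1.2):
    print(f"{latency_ms} ms -> {responsiveness_hz(latency_ms):.1f} Hz")
```

For comparison with framerates, a 60 Hz display has a frame period of about 16.7 ms, so a link would need to stay under that latency to rate above 60 Hz.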

- Jonathan Morton

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-11-27 12:04       ` Jonathan Morton
@ 2017-11-27 12:47         ` Pete Heist
  2017-11-27 15:54           ` Sebastian Moeller
  0 siblings, 1 reply; 16+ messages in thread
From: Pete Heist @ 2017-11-27 12:47 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Dave Taht, Cake List

> On Nov 27, 2017, at 1:04 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
> My pet suggestion here is to represent latency as its inverse, "responsiveness" with units of Hz.  This has the dual advantages of bigger numbers being better, and the figures being directly comparable with framerates.
> 
> As you say, the methodology will need to be very carefully specified, so that we get a meaningful measurement that's hard to game.
> 
I like that idea...

Then it’s how to measure it. 1 / latency where latency is what…the maximum value you’ll see considering all traffic as best effort at a fixed number of concurrent flows? Otherwise it would have to be expressed differently for different traffic classes, which is probably already too complicated for most people.

Food for thought, I know this is the opposite direction, but I’ve always liked in Europe how car “mileage” is expressed as consumption (L/100km) instead of efficiency (miles/gallon). Yes, then a lower number is better, but it’s easier to calculate how much gas you’ll use for a given trip.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-11-27 12:47         ` Pete Heist
@ 2017-11-27 15:54           ` Sebastian Moeller
  2017-11-27 16:12             ` Pete Heist
  0 siblings, 1 reply; 16+ messages in thread
From: Sebastian Moeller @ 2017-11-27 15:54 UTC (permalink / raw)
  To: Pete Heist; +Cc: Jonathan Morton, Cake List

Well,

how about keeping it simple and just giving the latency increment under full (bidirectional) link saturation (I guess a catchy acronym might be found)? Yes, this is a number where lower is better, but it also carries immediate information (like: "mmmh, at an added 3 seconds under load, VoIP might suffer a bit if I start heavy torrenting...").

I am not opposed to the inverse per se and I also like the "bigger is better" property, but mental division is hard and the period seems to be more informative than the frequency. But at this point anything that will get some traction will be a winner...



Best Regards
	Sebastian


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-11-27 15:54           ` Sebastian Moeller
@ 2017-11-27 16:12             ` Pete Heist
  2017-11-27 18:28               ` Jonathan Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Pete Heist @ 2017-11-27 16:12 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Jonathan Morton, Cake List

> On Nov 27, 2017, at 4:54 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> how about keeping it simple and just giving the latency increment under full (bidirectional) link saturation (I guess a catchy acronym might be found)? Yes, this is a number where lower is better, but it also carries immediate information (like: "mmmh, at an added 3 seconds under load, VoIP might suffer a bit if I start heavy torrenting...").

Couldn’t the number of flows contributing to the saturation affect the results though, so that it would have to be specified?

I think this gets to the crux of the original thinking behind the RRUL specification. The RRUL “Score” section contains a lot of detail for an “optimum result”, and further admissions that it isn’t easy to assess: https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/

If we could come up with one all-encompassing and reliable metric for measuring the “goodness” of queueing behavior, it would also make testing much easier. I really wish for such a test, and sometimes try to figure out how it would look, but I don’t think it’s an easy problem to solve.

> I am not opposed to the inverse per se and I also like the "bigger is better" property, but mental division is hard and the period seems to be more informative than the frequency. But at this point anything that will get some traction will be a winner...
> 
> Best Regards
> 	Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-11-27 16:12             ` Pete Heist
@ 2017-11-27 18:28               ` Jonathan Morton
  2017-11-27 21:49                 ` Pete Heist
  0 siblings, 1 reply; 16+ messages in thread
From: Jonathan Morton @ 2017-11-27 18:28 UTC (permalink / raw)
  To: Pete Heist; +Cc: Sebastian Moeller, Cake List

An important factor when designing the test is the difference between
intra-flow and inter-flow induced latencies, as well as the baseline
latency.

In general, AQM by itself controls intra-flow induced latency, while flow
isolation (commonly FQ) controls inter-flow induced latency.  I consider
the latter to be more important to measure.

Baseline latency is a factor of the underlying network topology, and is the
type of latency most often measured.  It should be measured in the no-load
condition, but the choice of remote endpoint is critical.  Large ISPs could
gain an unfair advantage if they can provide a qualifying endpoint within
their network, closer to the last mile links than most realistic Internet
services.  Conversely, ISPs are unlikely to endorse a measurement scheme
which places the endpoints too far away from them.

One reasonable possibility is to use DNS lookups to randomly-selected gTLDs
as the benchmark.  There are gTLD DNS servers well-placed in essentially
all regions of interest, and effective DNS caching is a legitimate means
for an ISP to improve their customers' internet performance.  Random
lookups (especially of domains which are known to not exist) should defeat
the effects of such caching.
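
One way the random-lookup idea might be sketched in Python. This is only an approximation: it goes through the system resolver rather than querying gTLD servers directly, so OS-level caching, search domains and resolver retries can all distort the timing, and the helper names are made up for illustration.

```python
import random
import socket
import string
import time

def random_name(tld="com", length=20, rng=random):
    """Build a random label that is very unlikely to be registered."""
    label = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{label}.{tld}"

def time_lookup(name):
    """Return the wall-clock time in ms for one lookup via the system resolver."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        pass  # a failed lookup (e.g. NXDOMAIN) is the expected outcome here
    return (time.monotonic() - start) * 1000.0

# Generating a name needs no network access; timing it does.
print(random_name("org"))
```

Calling `time_lookup(random_name(tld))` repeatedly over a handful of TLDs, once idle and once under load, would give the two sets of samples needed for a baseline and an induced-latency figure.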

Induced latency can then be measured by applying a load and comparing the
new latency measurement to the baseline.  This load can simultaneously be
used to measure available throughput.  The tests on dslreports offer a
decent example of how to do this, but it would be necessary to standardise
the load.
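
A minimal Python sketch of that comparison, assuming the two sets of latency samples have already been collected. Using the median is an arbitrary choice for this example (a maximum or a high percentile would be equally defensible), and the sample values are invented.

```python
import statistics

def induced_latency_ms(baseline_ms, loaded_ms):
    """Median latency under load minus median latency with no load."""
    return statistics.median(loaded_ms) - statistics.median(baseline_ms)

# Hypothetical RTT samples in milliseconds
baseline = [20.1, 19.8, 20.4, 20.0, 19.9]
loaded = [61.0, 58.5, 65.2, 60.3, 59.7]

print(f"induced latency: {induced_latency_ms(baseline, loaded):.1f} ms")
```

Which statistic to subtract, and what load to apply while sampling, is exactly the part that would need to be standardised.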

- Jonathan Morton

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Cake] Recomended HW to run cake and fq_codel?
  2017-11-27 18:28               ` Jonathan Morton
@ 2017-11-27 21:49                 ` Pete Heist
  2017-11-28 18:15                   ` [Cake] Simple metrics Dave Taht
  0 siblings, 1 reply; 16+ messages in thread
From: Pete Heist @ 2017-11-27 21:49 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Sebastian Moeller, Cake List


> On Nov 27, 2017, at 7:28 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
> An important factor when designing the test is the difference between intra-flow and inter-flow induced latencies, as well as the baseline latency.
> 
> In general, AQM by itself controls intra-flow induced latency, while flow isolation (commonly FQ) controls inter-flow induced latency.  I consider the latter to be more important to measure.
> 
Intra-flow induced latency should also be important for web page load time and websockets, for example. Maybe not as important as inter-flow, because there you’re talking about how voice, videoconferencing and other interactive apps work together with other traffic, which is what people are affected by the most when it doesn’t work.

I don’t think it’s too much to include one public metric for each. People are used to “upload” and “download”, maybe they’d one day get used to “reactivity” and “interactivity”, or some more accessible terms.
> Baseline latency is a factor of the underlying network topology, and is the type of latency most often measured.  It should be measured in the no-load condition, but the choice of remote endpoint is critical.  Large ISPs could gain an unfair advantage if they can provide a qualifying endpoint within their network, closer to the last mile links than most realistic Internet services.  Conversely, ISPs are unlikely to endorse a measurement scheme which places the endpoints too far away from them.
> 
> One reasonable possibility is to use DNS lookups to randomly-selected gTLDs as the benchmark.  There are gTLD DNS servers well-placed in essentially all regions of interest, and effective DNS caching is a legitimate means for an ISP to improve their customers' internet performance.  Random lookups (especially of domains which are known to not exist) should defeat the effects of such caching.
> 
> Induced latency can then be measured by applying a load and comparing the new latency measurement to the baseline.  This load can simultaneously be used to measure available throughput.  The tests on dslreports offer a decent example of how to do this, but it would be necessary to standardise the load.
> 
It would be good to know what an average worst case heavy load is on a typical household Internet connection and standardize on that. Windows updates for example can be pretty bad (many flows).

DNS is an interesting possibility. On the one hand all you get is RTT, but on the other hand your server infrastructure is already available. I use the dslreports speedtest pretty routinely as it’s decent, although results can vary significantly between runs. If they’re using DNS to measure latency, I hadn’t realized it.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Cake] Simple metrics
  2017-11-27 21:49                 ` Pete Heist
@ 2017-11-28 18:15                   ` Dave Taht
  2017-11-28 22:14                     ` Pete Heist
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Taht @ 2017-11-28 18:15 UTC (permalink / raw)
  To: Pete Heist; +Cc: Jonathan Morton, Cake List


Changing the title of the thread.

Pete Heist <peteheist@gmail.com> writes:

>     On Nov 27, 2017, at 7:28 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>     
>     An important factor when designing the test is the difference between
>     intra-flow and inter-flow induced latencies, as well as the baseline
>     latency.
>
>     In general, AQM by itself controls intra-flow induced latency, while flow
>     isolation (commonly FQ) controls inter-flow induced latency. I consider the
>     latter to be more important to measure.
>
> Intra-flow induced latency should also be important for web page load time and
> websockets, for example. Maybe not as important as inter-flow, because there
> you’re talking about how voice, videoconferencing and other interactive apps
> work together with other traffic, which is what people are affected by the most
> when it doesn’t work.
>
> I don’t think it’s too much to include one public metric for each. People are
> used to “upload” and “download”, maybe they’d one day get used to “reactivity”
> and “interactivity”, or some more accessible terms.

Well, what I proposed was using a pfifo as the reference
standard, and "FQ" as one metric name against pfifo 1000/newstuff. 

That normalizes any test we come up with.

>
>         Baseline latency is a factor of the underlying network topology, and is
>     the type of latency most often measured. It should be measured in the
>     no-load condition, but the choice of remote endpoint is critical. Large ISPs
>     could gain an unfair advantage if they can provide a qualifying endpoint
>     within their network, closer to the last mile links than most realistic
>     Internet services. Conversely, ISPs are unlikely to endorse a measurement
>     scheme which places the endpoints too far away from them.
>
>     One reasonable possibility is to use DNS lookups to randomly-selected gTLDs
>     as the benchmark. There are gTLD DNS servers well-placed in essentially all
>     regions of interest, and effective DNS caching is a legitimate means for an
>     ISP to improve their customers' internet performance. Random lookups
>     (especially of domains which are known to not exist) should defeat the
>     effects of such caching.
>
>     Induced latency can then be measured by applying a load and comparing the
>     new latency measurement to the baseline. This load can simultaneously be
>     used to measure available throughput. The tests on dslreports offer a decent
>     example of how to do this, but it would be necessary to standardise the
>     load.
>
> It would be good to know what an average worst case heavy load is on a typical
> household Internet connection and standardize on that. Windows updates for
> example can be pretty bad (many flows).

My mental reference has always been a family of four:

Mom in a videoconference
Dad surfing the web
Son playing a game
Daughter uploading to YouTube

(pick your gender neutral roles at will)

+ Torrenting or Dropbox or Windows Update or Steam or ...

A larger scale reference might be a company of 30 people.


>
> DNS is an interesting possibility. On the one hand all you get is RTT, but on
> the other hand your server infrastructure is already available. I use the
> dslreports speedtest pretty routinely as it’s decent, although results can vary
> significantly between runs. If they’re using DNS to measure latency, I hadn’t
> realized it.
>
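The random-lookup idea quoted above can be sketched roughly as follows. This is only an illustration, not anything dslreports or an existing tool is known to do: the gTLD sample, the label length, and the helper names `random_probe_name` and `timed_lookup` are all assumptions, and the lookup goes through whatever resolver the system is configured to use.

```python
import secrets
import socket
import string
import time

# Hypothetical sample of gTLDs; a real test would draw from a fuller list.
GTLDS = ["com", "net", "org", "info"]


def random_probe_name(label_len=20, tlds=GTLDS):
    """Build a random name that is almost certainly unregistered and uncached."""
    alphabet = string.ascii_lowercase + string.digits
    label = "".join(secrets.choice(alphabet) for _ in range(label_len))
    return f"{label}.{secrets.choice(tlds)}"


def timed_lookup(name):
    """Return the wall-clock time in ms the system resolver takes to answer."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        # A "no such name" answer is the expected outcome for random names.
        pass
    return (time.perf_counter() - start) * 1000.0
```

Calling `timed_lookup(random_probe_name())` repeatedly gives one RTT-like sample per call; because every name is fresh, a previously cached answer should not be able to shortcut the lookup.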


* Re: [Cake] Simple metrics
  2017-11-28 18:15                   ` [Cake] Simple metrics Dave Taht
@ 2017-11-28 22:14                     ` Pete Heist
  2017-11-28 22:41                       ` Dave Taht
  0 siblings, 1 reply; 16+ messages in thread
From: Pete Heist @ 2017-11-28 22:14 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jonathan Morton, Cake List


> On Nov 28, 2017, at 7:15 PM, Dave Taht <dave@taht.net> wrote:
> 
> Pete Heist <peteheist@gmail.com> writes:
> 
>>    On Nov 27, 2017, at 7:28 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>> 
>>    An important factor when designing the test is the difference between
>>    intra-flow and inter-flow induced latencies, as well as the baseline
>>    latency.
>> 
>>    In general, AQM by itself controls intra-flow induced latency, while flow
>>    isolation (commonly FQ) controls inter-flow induced latency. I consider the
>>    latter to be more important to measure.
>> 
>> Intra-flow induced latency should also be important for web page load time and
>> websockets, for example. Maybe not as important as inter-flow, because there
>> you’re talking about how voice, videoconferencing and other interactive apps
>> work together with other traffic, which is what people are affected by the most
>> when it doesn’t work.
>> 
>> I don’t think it’s too much to include one public metric for each. People are
>> used to “upload” and “download”, maybe they’d one day get used to “reactivity”
>> and “interactivity”, or some more accessible terms.
> 
> Well, what I proposed was using a pfifo as the reference
> standard, and "FQ" as one metric name against pfifo 1000/newstuff. 
> 
> That normalizes any test we come up with.

So one could have 6 FQ on the intra-flow latency test and 4 FQ on the inter-flow latency test, for example, because it’s always a factor of pfifo 1000’s result on whatever test is run?

>>        Baseline latency is a factor of the underlying network topology, and is
>>    the type of latency most often measured. It should be measured in the
>>    no-load condition, but the choice of remote endpoint is critical. Large ISPs
>>    could gain an unfair advantage if they can provide a qualifying endpoint
>>    within their network, closer to the last mile links than most realistic
>>    Internet services. Conversely, ISPs are unlikely to endorse a measurement
>>    scheme which places the endpoints too far away from them.
>> 
>>    One reasonable possibility is to use DNS lookups to randomly-selected gTLDs
>>    as the benchmark. There are gTLD DNS servers well-placed in essentially all
>>    regions of interest, and effective DNS caching is a legitimate means for an
>>    ISP to improve their customers' internet performance. Random lookups
>>    (especially of domains which are known to not exist) should defeat the
>>    effects of such caching.
>> 
>>    Induced latency can then be measured by applying a load and comparing the
>>    new latency measurement to the baseline. This load can simultaneously be
>>    used to measure available throughput. The tests on dslreports offer a decent
>>    example of how to do this, but it would be necessary to standardise the
>>    load.
>> 
>> It would be good to know what an average worst case heavy load is on a typical
>> household Internet connection and standardize on that. Windows updates for
>> example can be pretty bad (many flows).
> 
> My mental reference has always been family of four -
> 
> Mom in a videoconference
> Dad surfing the web
> Son playing a game
> Daughter uploading to youtube
> 
> (pick your gender neutral roles at will)
> 
> + Torrenting or dropbox or windows update or steam or …

That sounds like a reasonable family maximum.

> A larger scale reference might be a company of 30 people.

I’m only speculating that an average active company user generates less traffic than an average active home user, depending on the line of work of course.

Could there be a single test that’s independent of scale and intelligently exercises the connection until the practical limits of its RRUL-related variables are known? I think that’s what would make testing much easier. I realize I'm conflating the concept of a simple testing metric with this idea.


* Re: [Cake] Simple metrics
  2017-11-28 22:14                     ` Pete Heist
@ 2017-11-28 22:41                       ` Dave Taht
  2017-11-29  8:08                         ` Sebastian Moeller
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Taht @ 2017-11-28 22:41 UTC (permalink / raw)
  To: Pete Heist; +Cc: Jonathan Morton, Cake List

Pete Heist <peteheist@gmail.com> writes:

>> On Nov 28, 2017, at 7:15 PM, Dave Taht <dave@taht.net> wrote:
>> 
>> Pete Heist <peteheist@gmail.com> writes:
>> 
>>>    On Nov 27, 2017, at 7:28 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>>> 
>>>    An important factor when designing the test is the difference between
>>>    intra-flow and inter-flow induced latencies, as well as the baseline
>>>    latency.
>>> 
>>>    In general, AQM by itself controls intra-flow induced latency, while flow
>>>    isolation (commonly FQ) controls inter-flow induced latency. I consider the
>>>    latter to be more important to measure.
>>> 
>>> Intra-flow induced latency should also be important for web page load time and
>>> websockets, for example. Maybe not as important as inter-flow, because there
>>> you’re talking about how voice, videoconferencing and other interactive apps
>>> work together with other traffic, which is what people are affected by the most
>>> when it doesn’t work.
>>> 
>>> I don’t think it’s too much to include one public metric for each. People are
>>> used to “upload” and “download”, maybe they’d one day get used to “reactivity”
>>> and “interactivity”, or some more accessible terms.
>> 
>> Well, what I proposed was using a pfifo as the reference
>> standard, and "FQ" as one metric name against pfifo 1000/newstuff. 
>> 
>> That normalizes any test we come up with.
>
> So one could have 6 FQ on the intra-flow latency test and 4 FQ on the inter-flow latency test, for example, because it’s always a factor of pfifo 1000’s result on whatever test is run?

Yep. Using 1000 for the FIFO queue length also pleases me, given all the
academic work done at 50 or 100.

It would even work for TCP RTT measurement changes, although "FQ" is
sort of a bad name here. I'd be up for another name. LQ (latency
quotient)? LS (latency stress)?
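
As a minimal sketch of the normalization, assuming the metric is simply the pfifo 1000 result divided by the result of the qdisc under test on the same run (the function name `latency_quotient` and the numbers in the usage note are made up for illustration):

```python
def latency_quotient(pfifo_induced_ms, candidate_induced_ms):
    """Ratio of induced latency under pfifo 1000 to that under the candidate.

    Both arguments are induced latencies in milliseconds measured with the
    same test and load. A result of 6.0 means the candidate added one sixth
    of the latency that the pfifo 1000 reference added.
    """
    if pfifo_induced_ms < 0:
        raise ValueError("reference latency must not be negative")
    if candidate_induced_ms <= 0:
        raise ValueError("candidate latency must be positive")
    return pfifo_induced_ms / candidate_induced_ms
```

With an invented reference of 120 ms and a candidate at 20 ms, `latency_quotient(120, 20)` gives 6.0, which would read as "6 FQ" (or 6 LQ) in the terms above.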


>
>>>        Baseline latency is a factor of the underlying network topology, and is
>>>    the type of latency most often measured. It should be measured in the
>>>    no-load condition, but the choice of remote endpoint is critical. Large ISPs
>>>    could gain an unfair advantage if they can provide a qualifying endpoint
>>>    within their network, closer to the last mile links than most realistic
>>>    Internet services. Conversely, ISPs are unlikely to endorse a measurement
>>>    scheme which places the endpoints too far away from them.
>>> 
>>>    One reasonable possibility is to use DNS lookups to randomly-selected gTLDs
>>>    as the benchmark. There are gTLD DNS servers well-placed in essentially all
>>>    regions of interest, and effective DNS caching is a legitimate means for an
>>>    ISP to improve their customers' internet performance. Random lookups
>>>    (especially of domains which are known to not exist) should defeat the
>>>    effects of such caching.
>>> 
>>>    Induced latency can then be measured by applying a load and comparing the
>>>    new latency measurement to the baseline. This load can simultaneously be
>>>    used to measure available throughput. The tests on dslreports offer a decent
>>>    example of how to do this, but it would be necessary to standardise the
>>>    load.
>>> 
>>> It would be good to know what an average worst case heavy load is on a typical
>>> household Internet connection and standardize on that. Windows updates for
>>> example can be pretty bad (many flows).
>> 
>> My mental reference has always been family of four -
>> 
>> Mom in a videoconference
>> Dad surfing the web
>> Son playing a game
>> Daughter uploading to youtube
>> 
>> (pick your gender neutral roles at will)
>> 
>> + Torrenting or dropbox or windows update or steam or …
>
> That sounds like a pretty good reasonable family maximum.
>
>> A larger scale reference might be a company of 30 people.
>
> I’m only speculating that an average active company user generates less traffic than an average active home user, depending on the line of work of course.
>
> Could there be a single test that’s independent of scale and intelligently
> exercises the connection until the practical limits of its rrul related
> variables are known? I think that’s what would make testing much easier. I
> realize I'm conflating the concept of a simple testing metric with this idea.


* Re: [Cake] Simple metrics
  2017-11-28 22:41                       ` Dave Taht
@ 2017-11-29  8:08                         ` Sebastian Moeller
  0 siblings, 0 replies; 16+ messages in thread
From: Sebastian Moeller @ 2017-11-29  8:08 UTC (permalink / raw)
  To: Dave Taht; +Cc: Pete Heist, Cake List

I know I am simple minded, but still... I think that reporting the RTT increase incurred on a sparse flow during full up- and downstream saturation seems the rawest and most useful metric; I guess one would need to select the number of concurrent flows and define sparseness. This will show the effect of fair queueing quite nicely and will give immediately relevant information to gamers: how much "lag" does the internet link add?

I simply assume that for latency improvements FPS players might be our best hope to popularize the concept. Nothing against a ratio or a quotient, but honestly fractions are not intuitively understood by most people (that is, without actually thinking about them).
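
To make the raw number concrete, here is a small sketch of how it could be computed from RTT samples of the sparse flow. Using the median is my own assumption (a mean or a high percentile would be equally defensible), and the function name and sample values are invented:

```python
from statistics import median


def induced_latency_ms(baseline_rtts_ms, loaded_rtts_ms):
    """RTT increase on the sparse flow: median under load minus median idle."""
    if not baseline_rtts_ms or not loaded_rtts_ms:
        raise ValueError("need at least one sample for each condition")
    return median(loaded_rtts_ms) - median(baseline_rtts_ms)


# Invented samples: sparse-flow RTTs on an idle link, then with both
# directions saturated.
idle = [20, 21, 19, 20]
saturated = [45, 50, 48, 47]
lag = induced_latency_ms(idle, saturated)  # 47.5 - 20.0 = 27.5 ms
```

The result is a plain "milliseconds of added lag" figure, with no fraction to interpret.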

Best Regards
	Sebastian



> On Nov 28, 2017, at 23:41, Dave Taht <dave@taht.net> wrote:
> 
> Pete Heist <peteheist@gmail.com> writes:
> 
>>> On Nov 28, 2017, at 7:15 PM, Dave Taht <dave@taht.net> wrote:
>>> 
>>> Pete Heist <peteheist@gmail.com> writes:
>>> 
>>>>   On Nov 27, 2017, at 7:28 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>>>> 
>>>>   An important factor when designing the test is the difference between
>>>>   intra-flow and inter-flow induced latencies, as well as the baseline
>>>>   latency.
>>>> 
>>>>   In general, AQM by itself controls intra-flow induced latency, while flow
>>>>   isolation (commonly FQ) controls inter-flow induced latency. I consider the
>>>>   latter to be more important to measure.
>>>> 
>>>> Intra-flow induced latency should also be important for web page load time and
>>>> websockets, for example. Maybe not as important as inter-flow, because there
>>>> you’re talking about how voice, videoconferencing and other interactive apps
>>>> work together with other traffic, which is what people are affected by the most
>>>> when it doesn’t work.
>>>> 
>>>> I don’t think it’s too much to include one public metric for each. People are
>>>> used to “upload” and “download”, maybe they’d one day get used to “reactivity”
>>>> and “interactivity”, or some more accessible terms.
>>> 
>>> Well, what I proposed was using a pfifo as the reference
>>> standard, and "FQ" as one metric name against pfifo 1000/newstuff. 
>>> 
>>> That normalizes any test we come up with.
>> 
>> So one could have 6 FQ on the intra-flow latency test and 4 FQ on the inter-flow latency test, for example, because it’s always a factor of pfifo 1000’s result on whatever test is run?
> 
> yep. using 1000 for FIFO queue length also pleases me due to all the
> academic work at 50 or 100.
> 
> It would even work for tcp RTT measurement changes, although "FQ" is
> sort of a bad name here. I'd be up to another name. LQ? (latency
> quotient). LS (latency stress)
> 
> 
>> 
>>>>       Baseline latency is a factor of the underlying network topology, and is
>>>>   the type of latency most often measured. It should be measured in the
>>>>   no-load condition, but the choice of remote endpoint is critical. Large ISPs
>>>>   could gain an unfair advantage if they can provide a qualifying endpoint
>>>>   within their network, closer to the last mile links than most realistic
>>>>   Internet services. Conversely, ISPs are unlikely to endorse a measurement
>>>>   scheme which places the endpoints too far away from them.
>>>> 
>>>>   One reasonable possibility is to use DNS lookups to randomly-selected gTLDs
>>>>   as the benchmark. There are gTLD DNS servers well-placed in essentially all
>>>>   regions of interest, and effective DNS caching is a legitimate means for an
>>>>   ISP to improve their customers' internet performance. Random lookups
>>>>   (especially of domains which are known to not exist) should defeat the
>>>>   effects of such caching.
>>>> 
>>>>   Induced latency can then be measured by applying a load and comparing the
>>>>   new latency measurement to the baseline. This load can simultaneously be
>>>>   used to measure available throughput. The tests on dslreports offer a decent
>>>>   example of how to do this, but it would be necessary to standardise the
>>>>   load.
>>>> 
>>>> It would be good to know what an average worst case heavy load is on a typical
>>>> household Internet connection and standardize on that. Windows updates for
>>>> example can be pretty bad (many flows).
>>> 
>>> My mental reference has always been family of four -
>>> 
>>> Mom in a videoconference
>>> Dad surfing the web
>>> Son playing a game
>>> Daughter uploading to youtube
>>> 
>>> (pick your gender neutral roles at will)
>>> 
>>> + Torrenting or dropbox or windows update or steam or …
>> 
>> That sounds like a pretty good reasonable family maximum.
>> 
>>> A larger scale reference might be a company of 30 people.
>> 
>> I’m only speculating that an average active company user generates less traffic than an average active home user, depending on the line of work of course.
>> 
>> Could there be a single test that’s independent of scale and intelligently
>> exercises the connection until the practical limits of its rrul related
>> variables are known? I think that’s what would make testing much easier. I
>> realize I'm conflating the concept of a simple testing metric with this idea.



end of thread, other threads:[~2017-11-29  8:08 UTC | newest]

Thread overview: 16+ messages
     [not found] <mailman.1.1493740801.18318.cake@lists.bufferbloat.net>
2017-05-02 18:44 ` [Cake] Recomended HW to run cake and fq_codel? Pete Heist
2017-05-03  5:59   ` erik.taraldsen
2017-05-03  7:15     ` Pete Heist
2017-05-03 10:03       ` Andy Furniss
2017-05-03 11:10       ` erik.taraldsen
2017-11-27  8:35     ` Dave Taht
2017-11-27 12:04       ` Jonathan Morton
2017-11-27 12:47         ` Pete Heist
2017-11-27 15:54           ` Sebastian Moeller
2017-11-27 16:12             ` Pete Heist
2017-11-27 18:28               ` Jonathan Morton
2017-11-27 21:49                 ` Pete Heist
2017-11-28 18:15                   ` [Cake] Simple metrics Dave Taht
2017-11-28 22:14                     ` Pete Heist
2017-11-28 22:41                       ` Dave Taht
2017-11-29  8:08                         ` Sebastian Moeller
