* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
@ 2014-05-24 14:03 R.
2014-07-25 18:37 ` Valdis.Kletnieks
2014-07-25 20:48 ` Wes Felter
0 siblings, 2 replies; 51+ messages in thread
From: R. @ 2014-05-24 14:03 UTC (permalink / raw)
To: cerowrt-devel
>> I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the provider's bandwidth.
Pardon my noobiness, but is there a technical obstacle that prevents
the creation of a user-triggered function on the router side that
measures the provider's bandwidth?
The function, when triggered (from the LuCI GUI?), would:
1. Ensure that internet connectivity is present.
2. Disconnect all clients.
3. Run a download and an upload against a dedicated web server, measure the
rates, and use them directly in fq_codel -- or suggest them in the
appropriate QoS GUI fields.
Further, this function could be auto-scheduled or made enabled on
router boot up.
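For illustration, a rough sketch (Python; the test URLs and the 95% margin are placeholder assumptions, and actually pushing the result into the SQM/QoS settings is left out) of what such a user-triggered measurement might do:

    import time
    import urllib.request

    # Placeholder endpoints -- any well-connected server with a large file to
    # download and an upload sink would do.
    DOWNLOAD_URL = "http://speedtest.example.net/100MB.bin"
    UPLOAD_URL = "http://speedtest.example.net/upload"

    def measure_download(url, max_bytes=20 * 1024 * 1024):
        """Fetch up to max_bytes and return the observed rate in kbit/s."""
        start, received = time.time(), 0
        with urllib.request.urlopen(url) as resp:
            while received < max_bytes:
                chunk = resp.read(64 * 1024)
                if not chunk:
                    break
                received += len(chunk)
        return received * 8 / 1000 / (time.time() - start)

    def measure_upload(url, payload_bytes=5 * 1024 * 1024):
        """POST a zero-filled payload and return the observed rate in kbit/s."""
        start = time.time()
        urllib.request.urlopen(urllib.request.Request(url, data=b"\0" * payload_bytes))
        return payload_bytes * 8 / 1000 / (time.time() - start)

    if __name__ == "__main__":
        down = measure_download(DOWNLOAD_URL)
        up = measure_upload(UPLOAD_URL)
        # Shape a bit below the measured rates so the router keeps the queue;
        # feeding these numbers into the SQM/QoS settings is router-specific.
        print("suggested downstream shaper rate: %d kbit/s" % (down * 0.95))
        print("suggested upstream shaper rate:   %d kbit/s" % (up * 0.95))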
I must be missing something important which prevents this. What is it?
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-24 14:03 [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration R.
@ 2014-07-25 18:37 ` Valdis.Kletnieks
2014-07-25 21:03 ` David Lang
2014-07-25 20:48 ` Wes Felter
1 sibling, 1 reply; 51+ messages in thread
From: Valdis.Kletnieks @ 2014-07-25 18:37 UTC (permalink / raw)
To: R.; +Cc: cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 2120 bytes --]
On Sat, 24 May 2014 10:02:53 -0400, "R." said:
> Further, this function could be auto-scheduled or made enabled on
> router boot up.
Yeah, if such a thing worked, it would be good.
(Note in the following that a big part of my *JOB* is doing "What could
possibly go wrong?" analysis on mission-critical systems, which tends to color
my viewpoint on projects. I still think the basic concept is good, just
difficult to do, and am listing the obvious challenges for anybody brave
enough to tackle it... :)
> I must be missing something important which prevents this. What is it?
There's a few biggies. The first is what the linux-kernel calls -ENOPATCH -
nobody's written the code. The second is you need an upstream target someplace
to test against. You need to deal with both the "server is unavailable due
to a backhoe incident 2 time zones away" problem (which isn't *that* hard, just
default to Something Not Obviously Bad(TM)), and "server is slashdotted" (which
is a bit harder to deal with). Remember that there's some really odd corner
cases to worry about - for instance, if there's a power failure in a town, then
when the electric company restores power you're going to have every cerowrt box
hit the server within a few seconds - all over the same uplink most likely. No
good data can result from that... (Holy crap, it's been almost 3 decades since
I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the
network at once when building power was restored).
And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the
analysis to compute the correct values becomes.... interesting.
Dealing with non-obvious error conditions is also a challenge - a router
may only boot once every few months. And if you happen to be booting just
as a BGP routing flap is causing your traffic to take a vastly suboptimal
path, you may end up encoding a vastly inaccurate setting and have it stuck
there, causing suckage for non-obvious reasons for the non-technical, so you
really don't want to enable auto-tuning unless you also have a good plan for
auto-*RE*tuning....
[-- Attachment #2: Type: application/pgp-signature, Size: 848 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-24 14:03 [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration R.
2014-07-25 18:37 ` Valdis.Kletnieks
@ 2014-07-25 20:48 ` Wes Felter
2014-07-25 20:57 ` David Lang
2014-07-26 11:01 ` Sebastian Moeller
1 sibling, 2 replies; 51+ messages in thread
From: Wes Felter @ 2014-07-25 20:48 UTC (permalink / raw)
To: cerowrt-devel
The Netgear stock firmware measures bandwidth on every boot or link up
(not sure which) and I would suggest doing the same for CeroWRT.
Do you need to measure Internet bandwidth or last mile bandwidth? For
link bandwidth it seems like you can solve a lot of problems by
measuring to the first hop router. Does the packet pair technique work
on TDMA link layers like DOCSIS?
--
Wes Felter
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-25 20:48 ` Wes Felter
@ 2014-07-25 20:57 ` David Lang
2014-07-26 11:18 ` Sebastian Moeller
2014-07-26 11:01 ` Sebastian Moeller
1 sibling, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-25 20:57 UTC (permalink / raw)
To: Wes Felter; +Cc: cerowrt-devel
On Fri, 25 Jul 2014, Wes Felter wrote:
> The Netgear stock firmware measures bandwidth on every boot or link up (not
> sure which) and I would suggest doing the same for CeroWRT.
>
> Do you need to measure Internet bandwidth or last mile bandwidth? For link
> bandwidth it seems like you can solve a lot of problems by measuring to the
> first hop router. Does the packet pair technique work on TDMA link layers
> like DOCSIS?
The trouble is that to measure bandwidth, you have to be able to send and
receive a lot of traffic. Unless the router you are connecting to is running
some sort of service to support that, you can't just test that link; you have to
connect to something beyond that.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-25 18:37 ` Valdis.Kletnieks
@ 2014-07-25 21:03 ` David Lang
2014-07-26 11:30 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-25 21:03 UTC (permalink / raw)
To: cerowrt-devel
On Fri, 25 Jul 2014 14:37:34 -0400, Valdis.Kletnieks@vt.edu wrote:
> On Sat, 24 May 2014 10:02:53 -0400, "R." said:
>
>> Further, this function could be auto-scheduled or made enabled on
>> router boot up.
>
> Yeah, if such a thing worked, it would be good.
>
> (Note in the following that a big part of my *JOB* is doing "What
> could
> possibly go wrong?" analysis on mission-critical systems, which tends
> to color
> my viewpoint on projects. I still think the basic concept is good,
> just
> difficult to do, and am listing the obvious challenges for anybody
> brave
> enough to tackle it... :)
>
>> I must be missing something important which prevents this. What is
>> it?
>
> There's a few biggies. The first is what the linux-kernel calls
> -ENOPATCH -
> nobody's written the code. The second is you need an upstream target
> someplace
> to test against. You need to deal with both the "server is
> unavailable due
> to a backhoe incident 2 time zones away" problem (which isn't *that*
> hard, just
> default to Something Not Obviously Bad(TM), and "server is
> slashdotted" (which
> is a bit harder to deal with. Remember that there's some really odd
> corner
> cases to worry about - for instance, if there's a power failure in a
> town, then
> when the electric company restores power you're going to have every
> cerowrt box
> hit the server within a few seconds - all over the same uplink most
> likely. No
> good data can result from that... (Holy crap, it's been almost 3
> decades since
> I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted
> over the
> network at once when building power was restored).
>
> And if you're in Uzbekistan and the closest server netwise is at 60
> Hudson, the
> analysis to compute the correct values becomes.... interesting.
>
> Dealing with non-obvious error conditions is also a challenge - a
> router
> may only boot once every few months. And if you happen to be booting
> just
> as a BGP routing flap is causing your traffic to take a vastly
> suboptimal
> path, you may end up encoding a vastly inaccurate setting and have it
> stuck
> there, causing suckage for non-obvious reasons for the non-technical,
> so you
> really don't want to enable auto-tuning unless you also have a good
> plan for
> auto-*RE*tuning....
Have the router record its finding, and then repeat the test
periodically, recording its finding as well. If the new finding is
substantially different from the prior ones, schedule a retest 'soon'
(or default to the prior setting if it's bad enough); otherwise, if
there aren't many samples, schedule a test 'soon'; if there are a lot of
samples, schedule a test in a while.
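A sketch of that scheduling heuristic (the 20% tolerance and the 'soon'/'later' intervals are invented numbers, just to make the logic concrete):

    def next_step(history, new_sample, rel_tolerance=0.2,
                  soon=3600, later=7 * 24 * 3600, min_samples=5):
        """Return (rate_to_use_kbit, seconds_until_next_test).

        history: previously recorded rates; new_sample: the latest measurement.
        """
        if not history:
            return new_sample, soon        # first result: use it, retest soon
        prior = sum(history) / len(history)
        if abs(new_sample - prior) / prior > rel_tolerance:
            # Substantially different from what we have seen before: keep the
            # prior setting for now and schedule a retest soon.
            return prior, soon
        if len(history) < min_samples:
            return new_sample, soon        # not many samples yet: retest soon
        return new_sample, later           # stable and well sampled: back off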
However, I think the big question is how much tuning is actually required.
If a connection with BQL and fq_codel is 90% as good as a tuned setup,
default to untuned unless the user explicitly hits a button to measure
(and then a second button to accept the measurement).
If BQL and fq_codel by default are only 70% as good as a tuned setup,
there's more space to argue that all setups must be tuned, but then the
question is how do they fare against an old, non-BQL, non-fq_codel setup?
If they are considerably better, it may still be worthwhile.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-25 20:48 ` Wes Felter
2014-07-25 20:57 ` David Lang
@ 2014-07-26 11:01 ` Sebastian Moeller
1 sibling, 0 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 11:01 UTC (permalink / raw)
To: Wes Felter; +Cc: cerowrt-devel
Hi Wes,
On Jul 25, 2014, at 22:48 , Wes Felter <wmf@felter.org> wrote:
> The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
>
> Do you need to measure Internet bandwidth or last mile bandwidth?
I think you want the bandwidth of the usual bottleneck; on DSL that typically is the actual DSL link to the DSLAM (even though the DSLAM is oversubscribed, typically its upstream link is not congested…). I think with DOCSIS it is the same. Realistically, bandwidth measurements are going to be sporadic, so this will only help with pretty constant bottlenecks anyway; no use in trying to track, say, the DSLAM congestion that transiently happens during peak use time...
> For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router.
And that would be sweet, but with DT’s network the first hop does not respond to ICMP probes, nor to anything else under end-user control; also, the bottleneck might actually be in the BRAS, which can be upstream of the DSLAM. What would be great is if all CPE would return the current link rates via SNMP or so… Or if DSLAMs and CMTSs would supply data sinks and sources for easy testing of goodput.
> Does the packet pair technique work on TDMA link layers like DOCSIS?
Toke and Dave dug up a paper showing that packet pair is not a reliable estimator for link bandwidth. So one could send independent packets of differing sizes, but then one needs to synchronize the clocks somehow…
Best Regards
Sebastian
>
> --
> Wes Felter
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-25 20:57 ` David Lang
@ 2014-07-26 11:18 ` Sebastian Moeller
2014-07-26 20:21 ` David Lang
2014-08-01 4:40 ` Michael Richardson
0 siblings, 2 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 11:18 UTC (permalink / raw)
To: David Lang; +Cc: Wes Felter, cerowrt-devel
Hi David,
On Jul 25, 2014, at 22:57 , David Lang <david@lang.hm> wrote:
> On Fri, 25 Jul 2014, Wes Felter wrote:
>
>> The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
>>
>> Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packet pair technique work on TDMA link layers
>
> The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network, sending two packets back to back should give you the bandwidth (packet size / difference in arrival times of the two packets), or you can send two packets of different sizes (this needs synchronized clocks; then bandwidth = difference of packet sizes / difference of transfer times).
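The arithmetic of both estimators as a small sketch (idealized: it assumes trustworthy timestamps and an otherwise idle path):

    def packet_pair_rate(packet_bytes, arrival_gap_s):
        """Back-to-back pair: bottleneck rate = packet size / inter-arrival gap."""
        return packet_bytes * 8 / arrival_gap_s                  # bit/s

    def size_difference_rate(small_bytes, small_delay_s, large_bytes, large_delay_s):
        """Two packets of different size, synchronized clocks:
        rate = difference of sizes / difference of one-way delays."""
        return (large_bytes - small_bytes) * 8 / (large_delay_s - small_delay_s)

    # Example: two 1500-byte packets arriving 750 microseconds apart
    # suggest a ~16 Mbit/s bottleneck.
    print(packet_pair_rate(1500, 750e-6))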
> unless the router you are connecting to is running some sort of service to support that,
But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow you to measure RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment were guaranteed to use NTP for decent clock synchronization and responded to ICMP timestamp messages with timestamp replies, measuring bandwidth might be “cheap” enough to keep running in the background, though.
Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps had required the echo server to also store its incoming timestamp in the echo, but I digress.)
I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and as a signal to throttle the downstream link…
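Roughly, the idea looks like this (a sketch only, not gargoyle's actual implementation; the target address, the 30 ms threshold and the 2 s probe interval are invented for illustration):

    import re, subprocess, time

    TARGET = "192.0.2.1"    # a nearby host, e.g. the ISP's first hop (placeholder)

    def rtt_ms(host):
        """One ping; return the RTT in ms, or None on loss/timeout."""
        out = subprocess.run(["ping", "-c", "1", "-W", "1", host],
                             capture_output=True, text=True).stdout
        m = re.search(r"time=([\d.]+)", out)
        return float(m.group(1)) if m else None

    baseline = None
    rate_kbit = 16000        # current downstream shaper rate (example value)
    while True:
        rtt = rtt_ms(TARGET)
        if rtt is not None:
            baseline = rtt if baseline is None else min(baseline, rtt)
            if rtt > baseline + 30:                      # well above quiet baseline
                rate_kbit = max(int(rate_kbit * 0.9), 1000)   # throttle 10%
                # (re-applying rate_kbit to the ingress shaper is omitted here)
        time.sleep(2)        # keep the probe stream sparse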
> you can't just test that link, you have to connect to something beyond that.
So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding meltdowns caused by synchronized measurement streams…
Best Regards
Sebastian
>
> David Lang
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-25 21:03 ` David Lang
@ 2014-07-26 11:30 ` Sebastian Moeller
2014-07-26 20:39 ` David Lang
2014-08-01 4:21 ` Michael Richardson
0 siblings, 2 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 11:30 UTC (permalink / raw)
To: David Lang; +Cc: cerowrt-devel
Hi David,
On Jul 25, 2014, at 23:03 , David Lang <david@lang.hm> wrote:
> On Fri, 25 Jul 2014 14:37:34 -0400, Valdis.Kletnieks@vt.edu wrote:
>> On Sat, 24 May 2014 10:02:53 -0400, "R." said:
>>
>>> Further, this function could be auto-scheduled or made enabled on
>>> router boot up.
>>
>> Yeah, if such a thing worked, it would be good.
>>
>> (Note in the following that a big part of my *JOB* is doing "What could
>> possibly go wrong?" analysis on mission-critical systems, which tends
>> to color
>> my viewpoint on projects. I still think the basic concept is good, just
>> difficult to do, and am listing the obvious challenges for anybody brave
>> enough to tackle it... :)
>>
>>> I must be missing something important which prevents this. What is it?
>>
>> There's a few biggies. The first is what the linux-kernel calls -ENOPATCH -
>> nobody's written the code. The second is you need an upstream target
>> someplace
>> to test against. You need to deal with both the "server is unavailable due
>> to a backhoe incident 2 time zones away" problem (which isn't *that*
>> hard, just
>> default to Something Not Obviously Bad(TM), and "server is slashdotted" (which
>> is a bit harder to deal with. Remember that there's some really odd corner
>> cases to worry about - for instance, if there's a power failure in a
>> town, then
>> when the electric company restores power you're going to have every
>> cerowrt box
>> hit the server within a few seconds - all over the same uplink most
>> likely. No
>> good data can result from that... (Holy crap, it's been almost 3
>> decades since
>> I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted
>> over the
>> network at once when building power was restored).
>>
>> And if you're in Uzbekistan and the closest server netwise is at 60
>> Hudson, the
>> analysis to compute the correct values becomes.... interesting.
>>
>> Dealing with non-obvious error conditions is also a challenge - a router
>> may only boot once every few months. And if you happen to be booting just
>> as a BGP routing flap is causing your traffic to take a vastly suboptimal
>> path, you may end up encoding a vastly inaccurate setting and have it stuck
>> there, causing suckage for non-obvious reasons for the non-technical, so you
>> really don't want to enable auto-tuning unless you also have a good plan for
>> auto-*RE*tuning....
>
> have the router record it's finding, and then repeat the test periodically, recording it's finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough), otherwise, if there aren't many samples, schedule a test 'soon' if there are a lot of samples, schedule a test in a while.
Yeah, keeping some history to “predict” when to measure next sounds clever.
>
> However, I think the big question is how much the tuning is required.
I assume in most cases you need to measure the home router's bandwidth rarely (say, on DSL only after a re-sync with the DSLAM), but you need to measure the bandwidth early, as only then can you properly shape the downlink. And we need to know the link’s capacity to use traffic shaping so that BQL and fq_codel in the router have control over the bottleneck queue… An equivalent of BQL and fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need, because then BQL and fq_codel on the router would be all that is required. But that does not seem like it is happening anytime soon, so we still need to work around the limitations in the equipment for a long time to come, I fear.
>
> If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement)
>
> If BQL and fq_codel by default are only 70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how do they fare against an old, non-BQL, non-fq_codel setup? If they are considerably better, it may still be worthwhile.
Best Regards
Sebastian
>
> David Lang
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 11:18 ` Sebastian Moeller
@ 2014-07-26 20:21 ` David Lang
2014-07-26 20:54 ` Sebastian Moeller
2014-08-01 4:40 ` Michael Richardson
1 sibling, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-26 20:21 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 4818 bytes --]
On Sat, 26 Jul 2014, Sebastian Moeller wrote:
> Hi David,
>
>
> On Jul 25, 2014, at 22:57 , David Lang <david@lang.hm> wrote:
>
>> On Fri, 25 Jul 2014, Wes Felter wrote:
>>
>>> The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
>>>
>>> Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packet pair technique work on TDMA link layers like DOCSIS?
>>
>> The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
>
> Well that is what you typically do, but you can get away with less
> measurement traffic: in an ideal quiescent network sending two packets back to
> back should give you the bandwidth (packet size / incoming time difference of
> both packets), or send two packets of different size (needs synchronized
> clocks, then difference of packet sizes / difference of transfer times).
Except that your ideal network doesn't exist in the real world. You are never
going to have the entire network quiescent; the router you are going to be
talking to is always going to have other things going on, which can affect its
timing.
>> unless the router you are connecting to is running some sort of service to support that,
>
> But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
> Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
> I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
As you say, anything that requires symmetrical traffic (like ICMP) isn't going to
work, and routers do not currently offer any service that will.
You also can't count on time being synced properly. Top-tier companies have
trouble doing that in their dedicated datacenters; depending on it for this sort
of testing is a non-starter.
>> you can't just test that link, you have to connect to something beyond that.
>
> So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
Well, let's talk about what we would like to have on the router
As I see it, we want to have two services
1. a service you send a small amount of data to and it responds by sending you a
large amount of data (preferably with the most accurate timestamps it has and
the TTL of the packets it received)
2. a service you send a large amount of data to and it responds by sending you
small responses, telling you how much data it has received (with a timestamp and
what the TTL of the packets it received were)
questions:
A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start so it would need substantially more traffic to
flow to reach steady-state.
anything else has the possibility of taking a different path through the
router/switch software and so the performance may not be the same.
B. How much data is needed to be statistically accurate?
Too many things can happen for 1-2 packets to tell you the answer. The systems
on both ends are multi-tasking, and at high speeds, scheduling jitter will throw
off your calculations with too few packets.
C. How can this be prevented from being used for DoS attacks, either against the
thing running the service or against someone else via a reflected attack if it's
a forgeable protocol (i.e. UDP)?
One thought I have is to require a high TTL on the packets for the services to
respond to them. That way any abuse of the service would have to take place from
very close on the network.
Ideally these services would only respond to senders that are directly
connected, but until these services are deployed and enabled by default, there
is going to need to be the ability to 'jump over' old equipment. This need
will probably never go away completely.
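To make service 2 concrete, a minimal sketch (UDP, with an invented port and reply format; the TTL check and any rate limiting are omitted for brevity): it swallows whatever arrives and periodically tells the sender how many bytes have been seen and when.

    import json, socket, time

    def run_sink(port=9999, report_every=1.0):
        """Accept a flood of UDP packets; about once a second, reply to the
        sender with the running byte count and a local receive timestamp."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", port))
        sock.settimeout(report_every)
        received, last_report, sender = 0, time.time(), None
        while True:
            try:
                data, sender = sock.recvfrom(65535)
                received += len(data)
            except socket.timeout:
                pass
            now = time.time()
            if sender and now - last_report >= report_every:
                sock.sendto(json.dumps({"bytes": received, "t": now}).encode(),
                            sender)            # small reply, cheap to send
                last_report = now

    if __name__ == "__main__":
        run_sink()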
Other requirements or restrictions?
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 11:30 ` Sebastian Moeller
@ 2014-07-26 20:39 ` David Lang
2014-07-26 21:25 ` Sebastian Moeller
2014-08-01 4:21 ` Michael Richardson
1 sibling, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-26 20:39 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 6344 bytes --]
On Sat, 26 Jul 2014, Sebastian Moeller wrote:
> Hi David,
>
>
> On Jul 25, 2014, at 23:03 , David Lang <david@lang.hm> wrote:
>
>> On Fri, 25 Jul 2014 14:37:34 -0400, Valdis.Kletnieks@vt.edu wrote:
>>> On Sat, 24 May 2014 10:02:53 -0400, "R." said:
>>>
>>>> Further, this function could be auto-scheduled or made enabled on
>>>> router boot up.
>>>
>>> Yeah, if such a thing worked, it would be good.
>>>
>>> (Note in the following that a big part of my *JOB* is doing "What could
>>> possibly go wrong?" analysis on mission-critical systems, which tends
>>> to color
>>> my viewpoint on projects. I still think the basic concept is good, just
>>> difficult to do, and am listing the obvious challenges for anybody brave
>>> enough to tackle it... :)
>>>
>>>> I must be missing something important which prevents this. What is it?
>>>
>>> There's a few biggies. The first is what the linux-kernel calls -ENOPATCH -
>>> nobody's written the code. The second is you need an upstream target
>>> someplace
>>> to test against. You need to deal with both the "server is unavailable due
>>> to a backhoe incident 2 time zones away" problem (which isn't *that*
>>> hard, just
>>> default to Something Not Obviously Bad(TM), and "server is slashdotted" (which
>>> is a bit harder to deal with. Remember that there's some really odd corner
>>> cases to worry about - for instance, if there's a power failure in a
>>> town, then
>>> when the electric company restores power you're going to have every
>>> cerowrt box
>>> hit the server within a few seconds - all over the same uplink most
>>> likely. No
>>> good data can result from that... (Holy crap, it's been almost 3
>>> decades since
>>> I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted
>>> over the
>>> network at once when building power was restored).
>>>
>>> And if you're in Uzbekistan and the closest server netwise is at 60
>>> Hudson, the
>>> analysis to compute the correct values becomes.... interesting.
>>>
>>> Dealing with non-obvious error conditions is also a challenge - a router
>>> may only boot once every few months. And if you happen to be booting just
>>> as a BGP routing flap is causing your traffic to take a vastly suboptimal
>>> path, you may end up encoding a vastly inaccurate setting and have it stuck
>>> there, causing suckage for non-obvious reasons for the non-technical, so you
>>> really don't want to enable auto-tuning unless you also have a good plan for
>>> auto-*RE*tuning....
>>
>> have the router record it's finding, and then repeat the test periodically, recording it's finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough), otherwise, if there aren't many samples, schedule a test 'soon' if there are a lot of samples, schedule a test in a while.
>
> Yeah, keeping some history to “predict” when to measure next sounds clever.
>
>>
>> However, I think the big question is how much the tuning is required.
>
> I assume in most cases you need to measure the home-routers bandwidth rarely
> (say on DSL only after a re-sync with the DSLAM), but you need to measure the
> bandwidth early as only then you can properly shape the downlink. And we need
> to know the link’s capacity to use traffic shaping so that BQL and fq_codel in
> the router have control over the bottleneck queue… An equivalent of BQL and
> fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need,
> because then BQL and fq_codel on the router would be all that is required. But
> that does not seem like it is happening anytime soon, so we still need to
> workaround the limitations in the equipment fr a long time to come, I fear.
By how much tuning is required, I wasn't meaning how frequently to tune, but how
close default settings can come to the performance of an expertly tuned setup.
Ideally the tuning takes into account the characteristics of the hardware of the
link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN
tagging, ethernet with jumbo packet support for example), then you have overhead
from the encapsulation that you would ideally take into account when tuning
things.
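For the ATM case specifically the effect is easy to quantify: the IP packet plus its encapsulation overhead gets segmented into 48-byte cells, and every cell costs 53 bytes on the wire, so the on-wire size jumps in 53-byte steps. A small sketch of that accounting (the 40-byte overhead figure is only an example; the real per-packet overhead depends on the encapsulation actually in use):

    import math

    def atm_wire_bytes(ip_bytes, overhead=40):
        """Bytes actually sent on an ATM link for one IP packet.

        'overhead' stands in for the encapsulation in use (PPPoE, LLC/SNAP,
        the AAL5 trailer, ...); 40 is just an illustrative value.
        """
        cells = math.ceil((ip_bytes + overhead) / 48)
        return cells * 53

    for size in (64, 150, 1500):
        wire = atm_wire_bytes(size)
        print("%4d byte packet -> %4d bytes on the wire (+%.0f%%)"
              % (size, wire, 100.0 * (wire - size) / size))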
The question I'm talking about below is how much do you lose compared to the
ideal if you ignore this sort of thing and just assume that the wire is dumb and
puts the bits on it as you send them? By dumb I mean don't even allow for
inter-packet gaps, don't measure the bandwidth, don't try to pace inbound
connections by the timing of your acks, etc. Just run BQL and fq_codel and start
the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and
shrink them based on long-term passive observation of the sender.
If you end up only losing 5-10% of your overall network performance by ignoring
the details of the wire, then we should ignore them by default.
If however, not measuring anything first results in significantly worse
performance than a tuned setup, then we need to figure out how to do the
measurements needed for tuning.
Some people seem to have fallen into the "perfect is the enemy of good enough"
trap on this topic. They are so fixated on getting the absolute best performance
out of a link that they are forgetting how bad the status-quo is right now.
If you look at the graph that Dave Taht put on page 6 of his slide deck
http://snapon.lab.bufferbloat.net/~d/Presos/CaseForComprehensiveQueueManagement/assets/player/KeynoteDHTMLPlayer.html#5
it's important to realize that even the worst of the BQL+fq_codel graphs is
worlds better than the default setting; while it would be nice to get to the
green trace on the left, even getting to the middle traces instead of the black
trace on the right would be a huge win for the public.
David Lang
>> If a connection with BQL and fq_codel is 90% as good as a tuned setup,
>> default to untuned unless the user explicitly hits a button to measure (and
>> then a second button to accept the measurement)
>>
>> If BQL and fq_codel by default are only 70% as good as a tuned setup, there's
>> more space to argue that all setups must be tuned, but then the question is
>> how to they fare against a old, non-BQL, non-fq-codel setup? if they are
>> considerably better, it may still be worthwhile.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 20:21 ` David Lang
@ 2014-07-26 20:54 ` Sebastian Moeller
2014-07-26 21:14 ` David Lang
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 20:54 UTC (permalink / raw)
To: David Lang; +Cc: Wes Felter, cerowrt-devel
Hi David,
On Jul 26, 2014, at 22:21 , David Lang <david@lang.hm> wrote:
> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>
>> Hi David,
>>
>>
>> On Jul 25, 2014, at 22:57 , David Lang <david@lang.hm> wrote:
>>
>>> On Fri, 25 Jul 2014, Wes Felter wrote:
>>>
>>>> The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT.
>>>>
>>>> Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packet pair technique work on TDMA link layers like DOCSIS?
>>>
>>> The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
>>
>> Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times).
>
> Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, the router you are going to be talking to is always going to have other things going on, which can affect it's timing.
Sure, only two packets are required per measurement; I guess I would calculate the average and confidence interval over several of these (potentially with a moving window) to get a handle on the variability. I have done some RTT measurements on an ADSL link and can say that realistically one needs on the order of hundreds of data points per packet size. This sounds awful, but at least it does not require saturating the link and hence works without dedicated receivers on the other end...
>
>>> unless the router you are connecting to is running some sort of service to support that,
>>
>> But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
>> Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
>> I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
>
> As you say, anything that requires symmetrical traffic (like ICMP isn't going to work, and routers do not currently offer any service that will.
Well I think the gargoyle idea is feasible given that there is a reference implementation out in the wild ;).
>
> you also can't count on time being synced properly.
Quick testing today drove home that message (ICMP time requests showing receive times before originating times, quite sobering). Naive me had thought that NTP would guarantee <1ms deviation from reference time, but I just figured it is rather in the low-ms to 100ms range, so basically useless for one-way delay measurements to close hosts….
> Top Tier companies have trouble doing that in their dedicated datacenters, depending on it for this sort of testing is a non-starter
Agreed.
>
>>> you can't just test that link, you have to connect to something beyond that.
>>
>> So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
>
> Well, let's talk about what we would like to have on the router
>
> As I see it, we want to have two services
>
> 1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferrably with the most accurate timestamps it has and the TTL of the packets it received)
>
> 2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were)
>
> questions:
>
> A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
>
> TCP has the problem of slow start so it would need substantially more traffic to flow to reach steady-state.
>
> anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same.
You think UDP would not work out?
>
> B. How much data is needed to be statistically accurate?
>
> Too many things can happen for 1-2 packets to tell you the answer. The systems on both ends are multi-tasking, and at high speeds, scheduling jitter will throw off your calculations with too few packets.
Yeah, but you can (to steal an idea from Rick Jones' netperf) just keep measuring until the confidence interval around the mean of the data falls below a set magnitude. But for the purpose of traffic shaping you do not need the exact link bandwidth anyway, just a close-enough proxy to start the search for a decent set point from a reasonable position. I think that the actual shaping rates need to be iteratively optimized.
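That stopping rule might look like this (a sketch; the 95% z-value, the 5% width target and the sample caps are arbitrary illustrative choices, and it assumes roughly normally distributed samples):

    import statistics

    def measure_until_confident(sample_fn, rel_width=0.05, z=1.96,
                                min_n=10, max_n=500):
        """Call sample_fn() (one bandwidth estimate per call) until the ~95%
        confidence interval of the mean is narrower than rel_width * mean."""
        samples = []
        while len(samples) < max_n:
            samples.append(sample_fn())
            if len(samples) >= min_n:
                mean = statistics.mean(samples)
                half_width = z * statistics.stdev(samples) / len(samples) ** 0.5
                if half_width <= rel_width * mean:
                    break
        return statistics.mean(samples), len(samples)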
>
> C. How can this be prevented from being used for DoS attacks, either against the thing running the service or against someone else via a reflected attack if it's a forgable protocol (i.e. UDP)
Well, if it only requires a sparse packet stream it is not going to be too useful for DoS attacks.
>
> One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
>
> Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
>
>
> Other requirements or restrictions?
I think the measurement should be fast and continuous…
Best Regards
Sebastian
>
> David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 20:54 ` Sebastian Moeller
@ 2014-07-26 21:14 ` David Lang
2014-07-26 21:48 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-26 21:14 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 7493 bytes --]
On Sat, 26 Jul 2014, Sebastian Moeller wrote:
> On Jul 26, 2014, at 22:21 , David Lang <david@lang.hm> wrote:
>
>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>
>>>
>>> On Jul 25, 2014, at 22:57 , David Lang <david@lang.hm> wrote:
>>>
>>>> The trouble is that to measure bandwidth, you have to be able to send and
>>>> receive a lot of traffic.
>>>
>>> Well that is what you typically do, but you can get away with less
>>> measurement traffic: in an ideal quiescent network sending two packets back
>>> to back should give you the bandwidth (packet size / incoming time
>>> difference of both packets), or send two packets of different size (needs
>>> synchronized clocks, then difference of packet sizes / difference of
>>> transfer times).
>>
>> Except that your ideal network doesn't exist in the real world. You are never
>> going to have the entire network quiescent, the router you are going to be
>> talking to is always going to have other things going on, which can affect
>> it's timing.
>
> Sure, the two packets a required per measurement, guess I would
> calculate the average and confidence interval over several of these
> (potentially by a moving window) to get a handle on the variability. I have
> done some RTT measurements on a ADSL link and can say that realistically one
> needs in the hundreds data points per packet size. This sounds awe full, but
> at least it does not require to saturate the link and hence works without
> dedicated receivers on the other end...
>
>>
>>>> unless the router you are connecting to is running some sort of service to support that,
>>>
>>> But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow to measure RTT not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply measuring bandwidth might be “cheap” enough to keep running in the background, though.
>>> Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server top also store its incoming timestamp in the echo, but I digress)
>>> I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as proxy for congestion and signal to throttle down stream link…
>>
>> As you say, anything that requires symmetrical traffic (like ICMP isn't going to work, and routers do not currently offer any service that will.
>
> Well I think the gargoyle idea is feasible given that there is a
> reference implementation out in the wild ;).
I'm not worried about an implementation existing as much as the question of
whether it's on the routers/switches by default, and if it isn't, whether the
service is simple enough to avoid causing load on these devices and to avoid
having any security vulnerabilities (or DDoS potential).
>>>> you can't just test that link, you have to connect to something beyond
>>>> that.
>>>
>>> So it would be sweet if we could use services that are running on the
>>> machines anyway, like ping. That way the “load” of all the leaf nodes of the
>>> internet continuously measuring their bandwidth could be handled in a
>>> distributed fashion avoiding melt-downs by synchronized measurement streams…
>>
>> Well, let's talk about what we would like to have on the router
>>
>> As I see it, we want to have two services
>>
>> 1. a service you send a small amount of data to and it responds by sending
>> you a large amount of data (preferrably with the most accurate timestamps it
>> has and the TTL of the packets it received)
>>
>> 2. a service you send a large amount of data to and it responds by sending
>> you small responses, telling you how much data it has received (with a
>> timestamp and what the TTL of the packets it received were)
>>
>> questions:
>>
>> A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
>>
>> TCP has the problem of slow start so it would need substantially more traffic
>> to flow to reach steady-state.
>>
>> anything else has the possibility of taking a different path through the
>> router/switch software and so the performance may not be the same.
>
> You thing UDP would not work out?
I don't trust that UDP would go through the same codepaths and delays as TCP;
even fq_codel handles TCP differently.
So if we measure with UDP, does it really reflect the 'real world' of TCP?
>> B. How much data is needed to be statistically accurate?
>>
>> Too many things can happen for 1-2 packets to tell you the answer. The
>> systems on both ends are multi-tasking, and at high speeds, scheduling jitter
>> will throw off your calculations with too few packets.
>
> Yeah, but you can (to steal an I idea from Rick Jones netperf) just keep
> measuring until the confidence interval around the mean of the data falls
> below a set magnitude. But for the purpose of traffic shaping you do not need
> the exact link bandwidth anyway just a close enough proxy to start the search
> for a decent set point from a reasonable position. I think that the actual
> shaping rates need to be iteratively optimized.
>
>>
>> C. How can this be prevented from being used for DoS attacks, either against
>> the thing running the service or against someone else via a reflected attack
>> if it's a forgable protocol (i.e. UDP)
>
> Well, if it only requires a sparse packet stream it is not going to be
> to useful for DOS attacks,
unless it can be requested a lot
>> One thought I have is to require a high TTL on the packets for the services
>> to respond to them. That way any abuse of the service would have to take
>> place from very close on the network.
>>
>> Ideally these services would only respond to senders that are directly
>> connected, but until these services are deployed and enabled by default,
>> there is going to be a need to be the ability to 'jump over' old equipment.
>> This need will probably never go away completely.
>
> But if we need to modify DSLAMs and CMTSs it would be much nicer if we
> could just ask nicely what the current negotiated bandwidths are ;)
Negotiated bandwidth and effective bandwidth are not the same.
What if you can't talk to the devices directly connected to the DSL line, but
only to a router one hop on either side?
For example, I can't buy (at least not for anything close to a reasonable price)
a router to run at home that has a DSL port on it, so I will always have some
device between me and the DSL.
If you have a shared medium (cable, wireless, etc.), the negotiated speed is
meaningless.
In my other location, I have a wireless link that is Ethernet to the dish on the
roof, and I expect the other end is a similar setup, so I can never see the link
speed directly (not to mention the fact that rain can degrade the effective link
speed).
>> Other requirements or restrictions?
>
> I think the measurement should be fast and continuous…
Fast, yes, because we want to impact the network as little as possible.
Continuous?? I'm not so sure. Do conditions really change that much? And as I
ask in the other thread, how much does it hurt if your estimates are wrong?
For wireless links the conditions are much more variable, but we don't really
know what is going to work well there.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 20:39 ` David Lang
@ 2014-07-26 21:25 ` Sebastian Moeller
2014-07-26 21:45 ` David Lang
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 21:25 UTC (permalink / raw)
To: David Lang; +Cc: cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 8308 bytes --]
Hi David,
On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>
>> Hi David,
>>
>>
>> On Jul 25, 2014, at 23:03 , David Lang <david@lang.hm> wrote:
>>
>>> On Fri, 25 Jul 2014 14:37:34 -0400, Valdis.Kletnieks@vt.edu wrote:
>>>> On Sat, 24 May 2014 10:02:53 -0400, "R." said:
>>>>
>>>>> Further, this function could be auto-scheduled or made enabled on
>>>>> router boot up.
>>>>
>>>> Yeah, if such a thing worked, it would be good.
>>>>
>>>> (Note in the following that a big part of my *JOB* is doing "What could
>>>> possibly go wrong?" analysis on mission-critical systems, which tends
>>>> to color
>>>> my viewpoint on projects. I still think the basic concept is good, just
>>>> difficult to do, and am listing the obvious challenges for anybody brave
>>>> enough to tackle it... :)
>>>>
>>>>> I must be missing something important which prevents this. What is it?
>>>>
>>>> There's a few biggies. The first is what the linux-kernel calls -ENOPATCH -
>>>> nobody's written the code. The second is you need an upstream target
>>>> someplace
>>>> to test against. You need to deal with both the "server is unavailable due
>>>> to a backhoe incident 2 time zones away" problem (which isn't *that*
>>>> hard, just
>>>> default to Something Not Obviously Bad(TM), and "server is slashdotted" (which
>>>> is a bit harder to deal with. Remember that there's some really odd corner
>>>> cases to worry about - for instance, if there's a power failure in a
>>>> town, then
>>>> when the electric company restores power you're going to have every
>>>> cerowrt box
>>>> hit the server within a few seconds - all over the same uplink most
>>>> likely. No
>>>> good data can result from that... (Holy crap, it's been almost 3
>>>> decades since
>>>> I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted
>>>> over the
>>>> network at once when building power was restored).
>>>>
>>>> And if you're in Uzbekistan and the closest server netwise is at 60
>>>> Hudson, the
>>>> analysis to compute the correct values becomes.... interesting.
>>>>
>>>> Dealing with non-obvious error conditions is also a challenge - a router
>>>> may only boot once every few months. And if you happen to be booting just
>>>> as a BGP routing flap is causing your traffic to take a vastly suboptimal
>>>> path, you may end up encoding a vastly inaccurate setting and have it stuck
>>>> there, causing suckage for non-obvious reasons for the non-technical, so you
>>>> really don't want to enable auto-tuning unless you also have a good plan for
>>>> auto-*RE*tuning....
>>>
>>> have the router record it's finding, and then repeat the test periodically, recording it's finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough), otherwise, if there aren't many samples, schedule a test 'soon' if there are a lot of samples, schedule a test in a while.
>>
>> Yeah, keeping some history to “predict” when to measure next sounds clever.
>>
>>>
>>> However, I think the big question is how much the tuning is required.
>>
>> I assume in most cases you need to measure the home-routers bandwidth rarely (say on DSL only after a re-sync with the DSLAM), but you need to measure the bandwidth early as only then you can properly shape the downlink. And we need to know the link’s capacity to use traffic shaping so that BQL and fq_codel in the router have control over the bottleneck queue… An equivalent of BQL and fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need, because then BQL and fq_codel on the router would be all that is required. But that does not seem like it is happening anytime soon, so we still need to workaround the limitations in the equipment fr a long time to come, I fear.
>
> by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
Good question.
>
> Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
>
> the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
As data talks, I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per-packet overhead, into account); the broken lines show the same system with just the link layer adjustments and per-packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 TCP streams up, 4 TCP streams down, while measuring latency with ping and UDP probes). As you can see from the plot, just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping, for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost an 80% increase of latency under load. In other words, getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
>
> If you end up only loosing 5-10% of your overall network performance by ignoring the details of the wire, then we should ignore them by default.
>
> If however, not measuring anything first results in significantly worse performance than a tuned setup, then we need to figure out how to do the measurements needed for tuning.
Agreed.
>
> Some people seem to have fallen into the "perfect is the enemy of good enough" trap on this topic. They are so fixated on getting the absolute best performance out of a link that they are forgetting how bad the status-quo is right now.
>
> If you look at the graph that Dave Taht put on page 6 of his slide deck http://snapon.lab.bufferbloat.net/~d/Presos/CaseForComprehensiveQueueManagement/assets/player/KeynoteDHTMLPlayer.html#5 it's important to realize that even the worst of the BQL+fq_codel graphs is worlds better than the default setting, while it would be nice to get to the green trace on the left, even getting to the middle traces instead of the black trace on the right would be a huge win for the public.
Just to note: in the plot above the connection to the DSL modem was always mediated by fq_codel and BQL(?), and since shaping was used BQL would not come into effect…
Best Regards
Sebastian
>
> David Lang
>
>
>>> If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement)
>>>
>>> If BQL and fq_codel by default are only 70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how do they fare against an old, non-BQL, non-fq_codel setup? If they are considerably better, it may still be worthwhile.
[-- Attachment #2.1: Type: text/html, Size: 9806 bytes --]
[-- Attachment #2.2: ATM_linklayer_encapsulation-effects.png --]
[-- Type: image/png, Size: 344086 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 21:25 ` Sebastian Moeller
@ 2014-07-26 21:45 ` David Lang
2014-07-26 22:24 ` David Lang
2014-07-26 22:39 ` Sebastian Moeller
0 siblings, 2 replies; 51+ messages in thread
From: David Lang @ 2014-07-26 21:45 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: cerowrt-devel
On Sat, 26 Jul 2014, Sebastian Moeller wrote:
> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>
>> by how much tuning is required, I wasn't meaning how frequently to tune, but
>> how close default settings can come to the performance of a expertly tuned
>> setup.
>
> Good question.
>
>>
>> Ideally the tuning takes into account the characteristics of the hardware of
>> the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN,
>> VLAN tagging, ethernet with jumbo packet support for example), then you have
>> overhead from the encapsulation that you would ideally take into account when
>> tuning things.
>>
>> the question I'm talking about below is how much do you loose compared to the
>> idea if you ignore this sort of thing and just assume that the wire is dumb
>> and puts the bits on them as you send them? By dumb I mean don't even allow
>> for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound
>> connections by the timing of your acks, etc. Just run BQL and fq_codel and
>> start the BQL sizes based on the wire speed of your link (Gig-E on the 3800)
>> and shrink them based on long-term passive observation of the sender.
>
> As data talks I just did a quick experiment with my ADSL2+ line at
> home. The solid lines in the attached plot show the results for proper shaping
> with SQM (shaping to 95% of del link rates of downstream and upstream while
> taking the link layer properties, that is ATM encapsulation and per packet
> overhead into account) the broken lines show the same system with just the
> link layer adjustments and per packet overhead adjustments disabled, but still
> shaping to 95% of link rate (this is roughly equivalent to 15% underestimation
> of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams
> up, 4 tcp steams down while measuring latency with ping and UDP probes). As
> you can see from the plot just getting the link layer encapsulation wrong
> destroys latency under load badly. The host is ~52ms RTT away, and with
> fq_codel the ping time per leg is just increased one codel target of 5ms each
> resulting in an modest latency increase of ~10ms with proper shaping for a
> total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost
> double), so RTT increases by ~43ms. Also note how the extremes for the broken
> lines are much worse than for the solid lines. In short I would estimate that
> a slight misjudgment (15%) results in almost 80% increase of latency under
> load. In other words getting the rates right matters a lot. (I should also
> note that in my setup there is a secondary router that limits RTT to max
> 300ms, otherwise the broken lines might look even worse...)
what is the latency like without BQL and codel? the pre-bufferbloat version?
(without any traffic shaping)
I agree that going from 65ms to 95ms seems significant, but if the stock version
goes up above 1000ms, then I think we are talking about things that are
'close'
assuming that latency under load without the improvements got >1000ms
fast-slow (in ms)
ideal=10
untuned=43
bloated > 1000
fast/slow
ideal = 1.25
untuned = 1.83
bloated > 19
slow/fast
ideal = 0.8
untuned = 0.55
bloated = 0.05
rather than looking at how much worse it is than the ideal, look at how much
closer it is to the ideal than to the bloated version.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 21:14 ` David Lang
@ 2014-07-26 21:48 ` Sebastian Moeller
2014-07-26 22:23 ` David Lang
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 21:48 UTC (permalink / raw)
To: David Lang; +Cc: Wes Felter, cerowrt-devel
Hi David,
On Jul 26, 2014, at 23:14 , David Lang <david@lang.hm> wrote:
> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>
>> On Jul 26, 2014, at 22:21 , David Lang <david@lang.hm> wrote:
>>
>>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>>
>>>>
>>>> On Jul 25, 2014, at 22:57 , David Lang <david@lang.hm> wrote:
>>>>
>>>>> The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic.
>>>>
>>>> Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times).
>>>
>>> Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, the router you are going to be talking to is always going to have other things going on, which can affect it's timing.
>>
>> Sure, the two packets are required per measurement; I guess I would calculate the average and confidence interval over several of these (potentially with a moving window) to get a handle on the variability. I have done some RTT measurements on an ADSL link and can say that realistically one needs hundreds of data points per packet size. This sounds awful, but at least it does not require saturating the link and hence works without dedicated receivers on the other end...
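Roughly what I have in mind, as a sketch (the arrival-time gaps would have to come from receiver-side timestamps; none of this is existing code):

# Packet-pair estimate: two packets of `packet_size` bytes sent back to back
# arrive spaced by (roughly) the bottleneck serialization time, so
# bandwidth ~= packet_size * 8 / gap. Single pairs are easily thrown off by
# scheduling jitter, hence many samples and a robust statistic.
def packet_pair_bandwidth_bps(gaps_seconds, packet_size=1500):
    samples = sorted(packet_size * 8 / g for g in gaps_seconds if g > 0)
    return samples[len(samples) // 2] if samples else None   # median

# e.g. gaps of ~0.75 ms per 1500-byte pair suggest a ~16 Mbit/s bottleneck
print(packet_pair_bandwidth_bps([0.00075, 0.00080, 0.00072]))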
>>
>>>
>>>>> unless the router you are connecting to is running some sort of service to support that,
>>>>
>>>> But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow measuring RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment were guaranteed to use NTP for decent clock synchronization and would respond to ICMP timestamp messages with timestamp replies, measuring bandwidth might be “cheap” enough to keep running in the background, though.
>>>> Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps had required the echo server to also store its incoming timestamp in the echo, but I digress)
>>>> I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and a signal to throttle the downstream link…
>>>
>>> As you say, anything that requires symmetrical traffic (like ICMP) isn't going to work, and routers do not currently offer any service that will.
>>
>> Well I think the gargoyle idea is feasible given that there is a reference implementation out in the wild ;).
>
> I'm not worried about an implementation existing as much as the question of whether it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDoS potential)
But with gargoyle the idea is to monitor a sparse ping stream to the closest responding host and interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, using this as the signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
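The detection logic itself would be tiny, something like this sketch (numbers made up, not gargoyle's actual code):

# Keep a baseline of the unloaded RTT to the nearest responding host and treat
# a sustained rise above it as "upstream buffers are filling", i.e. the signal
# for the home router to throttle its shaper a bit.
BASELINE_MS = 20.0     # unloaded RTT to the first responding hop (assumed)
SLACK_MS = 15.0        # extra delay tolerated before we call it congestion

def congested(recent_rtts_ms):
    avg = sum(recent_rtts_ms) / len(recent_rtts_ms)
    return avg > BASELINE_MS + SLACK_MS

print(congested([21.0, 23.5, 22.0]))    # False: link looks idle
print(congested([55.0, 80.0, 120.0]))   # True: buffers are filling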
>
>>>>> you can't just test that link, you have to connect to something beyond that.
>>>>
>>>> So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
>>>
>>> Well, let's talk about what we would like to have on the router
>>>
>>> As I see it, we want to have two services
>>>
>>> 1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferably with the most accurate timestamps it has and the TTL of the packets it received)
>>>
>>> 2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were)
>>>
>>> questions:
>>>
>>> A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
>>>
>>> TCP has the problem of slow start so it would need substantially more traffic to flow to reach steady-state.
>>>
>>> anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same.
>>
>> You think UDP would not work out?
>
> I don't trust that UDP would go through the same codepaths and delays as TCP
Why should a router care
>
> even fq_codel handles TCP differently
Does it? I thought UDP typically reacts differently to fq_codel's dropping strategy, but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might simply be wrong here)
>
> so if we measure with UDP, does it really reflect the 'real world' of TCP?
But we care for UDP as well, no?
>
>>> B. How much data is needed to be statistically accurate?
>>>
>>> Too many things can happen for 1-2 packets to tell you the answer. The systems on both ends are multi-tasking, and at high speeds, scheduling jitter will throw off your calculations with too few packets.
>>
>> Yeah, but you can (to steal an idea from Rick Jones' netperf) just keep measuring until the confidence interval around the mean of the data falls below a set magnitude. But for the purpose of traffic shaping you do not need the exact link bandwidth anyway, just a close enough proxy to start the search for a decent set point from a reasonable position. I think that the actual shaping rates need to be iteratively optimized.
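So roughly this kind of stopping rule (a sketch only loosely modeled on netperf's confidence option; take_sample() stands in for whatever produces one bandwidth estimate):

# Keep sampling until the ~95% confidence interval around the running mean is
# narrower than a chosen fraction of that mean, then stop.
import statistics

def measure_until_confident(take_sample, rel_width=0.05, min_n=5, max_n=200):
    samples = []
    while len(samples) < max_n:
        samples.append(take_sample())
        if len(samples) >= min_n:
            mean = statistics.mean(samples)
            sem = statistics.stdev(samples) / len(samples) ** 0.5
            if 1.96 * sem < rel_width * mean:   # CI narrow enough, stop
                break
    return statistics.mean(samples), len(samples)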
>>
>>>
>>> C. How can this be prevented from being used for DoS attacks, either against the thing running the service or against someone else via a reflected attack if it's a forgable protocol (i.e. UDP)
>>
>> Well, if it only requires a sparse packet stream it is not going to be too useful for DoS attacks,
>
> unless it can be requested a lot
Well yes, hence a sparse stream; if we can make sure to always just send sparse streams we will stay in the backwaters of services useful for DoS, I would guess, we just need not to be the low-hanging fruit :) .
>
>>> One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
>>>
>>> Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
>>
>> But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
>
> negotiated bandwidth and effective bandwidth are not the same
>
> what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment the DSLAM uplink is so congested because of oversubscription of the DSLAM that it now constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonably shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event).
>
> for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
http://wiki.openwrt.org/toh/tp-link/td-w8970 or http://www.traverse.com.au/products ? If you had the DSL modem in the router under cerowrts control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
>
> If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
Not exactly meaningless, it gives you an upper bound...
>
> In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
One more case for measuring the link speed continuously!
>
>>> Other requirements or restrictions?
>>
>> I think the measurement should be fast and continuous…
>
> Fast yes, because we want to impact the network as little as possible
>
> continuous?? I'm not so sure. Do conditions really change that much?
You just gave an example above for changing link conditions, by shared media...
> And as I ask in the other thread, how much does it hurt if your estimates are wrong?
I think I sent a plot to that regard.
>
> for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
Wireless as in point 2 point links or in wifi?
Best Regards
Sebastian
>
> David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 21:48 ` Sebastian Moeller
@ 2014-07-26 22:23 ` David Lang
2014-07-26 23:08 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-26 22:23 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 9323 bytes --]
On Sat, 26 Jul 2014, Sebastian Moeller wrote:
> On Jul 26, 2014, at 23:14 , David Lang <david@lang.hm> wrote:
>
>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>
>>> On Jul 26, 2014, at 22:21 , David Lang <david@lang.hm> wrote:
>>>
>>>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>>>
>>>>>
>>>>> On Jul 25, 2014, at 22:57 , David Lang <david@lang.hm> wrote:
>>>>>
>>>>>> unless the router you are connecting to is running some sort of service to support that,
>>>>>
>>>>> But this still requires some service on the other side. You could try to
>>>>> use ICMP packets, but these will only allow to measure RTT not one-way
>>>>> delays (if you do this on ADSL you will find the RTT dominated by the
>>>>> typically much slower uplink path). If network equipment would be
>>>>> guaranteed to use NTP for decent clock synchronization and would respond
>>>>> to timestamp ICMP messages with timestamp reply measuring bandwidth might
>>>>> be “cheap” enough to keep running in the background, though.
>>>>>
>>>>> Since this looks too simple there must be a simple reason why this would
>>>>> fail. (It would be nice if ping packets with timestamps would have
>>>>> required the echo server to also store its incoming timestamp in the
>>>>> echo, but I digress)
>>>>>
>>>>> I note that gargoyle uses a sparse stream of ping packets to a close
>>>>> host and uses increases in RTT as proxy for congestion and signal to
>>>>> throttle down stream link…
>>>>>
>>>> As you say, anything that requires symmetrical traffic (like ICMP) isn't
>>>> going to work, and routers do not currently offer any service that will.
>>>
>>>
>>> Well I think the gargoyle idea is feasible given that there is a
>>> reference implementation out in the wild ;).
>>
>> I'm not worried about an implementation existing as much as the question of
>> if it's on the routers/switches by default, and if it isn't, is the service
>> simple enough to be able to avoid causing load on these devices and to avoid
>> having any security vulnerabilities (or DDos potential)
>
> But with gargoyle the idea is to monitor a sparse ping stream to the
> closest host responding and interpreting a sudden increase in RTT as a sign
> the the upstreams buffers are filling up and using this as signal to throttle
> on the home router. My limited experience shows that quite often close hosts
> will respond to pings...
that measures latency, but how does it tell you bandwidth unless you are the
only possible thing on the network and you measure what you are receiving?
>>>>>> you can't just test that link, you have to connect to something beyond that.
>>>>>
>>>>> So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion avoiding melt-downs by synchronized measurement streams…
>>>>
>>>> Well, let's talk about what we would like to have on the router
>>>>
>>>> As I see it, we want to have two services
>>>>
>>>> 1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferably with the most accurate timestamps it has and the TTL of the packets it received)
>>>>
>>>> 2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were)
>>>>
>>>> questions:
>>>>
>>>> A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
>>>>
>>>> TCP has the problem of slow start so it would need substantially more traffic to flow to reach steady-state.
>>>>
>>>> anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same.
>>>
>>> You think UDP would not work out?
>>
>> I don't trust that UDP would go through the same codepaths and delays as TCP
>
> Why should a router care
>
>> even fq_codel handles TCP differently
>
> Does it? I thought UDP typically reacts differently to fq_codels
> dropping strategy but fq_codel does not differentiate between protocols (last
> time I looked at the code I came to that conclusion, but I am not very fluent
> in C so I might be simply wrong here)
with TCP, the system can tell the difference between different connections to
the same system, with UDP it needs to infer this from port numbers, this isn't
as accurate and so the systems (fq_codel and routers) handle them in a slightly
different way. This does affect the numbers.
>> so if we measure with UDP, does it really reflect the 'real world' of TCP?
>
> But we care for UDP as well, no?
Yes, but the reality is that the vast majority of traffic is TCP, and that's
what the devices are optimized to handle, so if we measure with UDP we may not
get the same results as if we measure with TCP.
measuring with ICMP is different yet again.
Think of the router ASICs that handle the 'normal' traffic in the ASIC in the
card, but 'unusual' traffic needs to be sent to the core CPU to be processed and
is therefore MUCH slower
>>>> One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
>>>>
>>>> Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
>>>
>>> But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
>>
>> negotiated bandwidth and effective bandwidth are not the same
>>
>> what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
>
> In my limited experience the typical bottleneck is the DSL line, so if
> we shape for that we are fine… Assume for a moment the DSLAM uplink is so
> congested because of oversubscription of the DSLAM, that now this constitutes
> the bottleneck. Now the available bandwidth for each user depends on the
> combined traffic of all users, not a situation we can reasonable shape for
> anyway (I would hope that ISPs monitor this situation and would remedy it by
> adding uplink capacity, so this hopefully is just a transient event).
for DSL you are correct, it's a point-to-point connection (star network
topology), but we have other technologies used in homes that are shared-media
bus topology networks. This includes cablemodems and wireless links.
>>
>> for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
>
> http://wiki.openwrt.org/toh/tp-link/td-w8970 or
no 5GHz wireless?
> http://www.traverse.com.au/products ?
I couldn't figure out where to buy one through their site.
> If you had the DSL modem in the router
> under cerowrts control you would not need to use a traffic shaper for your
> uplink, as you could apply the BQL ideas to the ADSL driver.
>
>>
>> If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
>
> Not exactly meaningless, it gives you an upper bound...
true, but is an upper bound good enough? How close does the estimate need to be?
and does it matter if both sides are doing fq_codel or is this still in the mode
of trying to control the far side indirectly?
>> In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
>
> One more case for measuring the link speed continuously!
at what point does the measuring process interfere with the use of the link? or
cause other upstream issues.
>>>> Other requirements or restrictions?
>>>
>>> I think the measurement should be fast and continuous…
>>
>> Fast yes, because we want to impact the network as little as possible
>>
>> continuous?? I'm not so sure. Do conditions really change that much?
>
> You just gave an example above for changing link conditions, by shared media...
but can you really measure fast enough to handle shared media? at some point you
need to give up measuring because by the time you have your measurement it's
obsolete.
If you look at networking with a tight enough timeframe, it's either idle or
100% utilized depending on if a bit is being sent at that instant, however a
plot at that precision is worthless :-)
>> And as I ask in the other thread, how much does it hurt if your estimates are wrong?
>
> I think I sent a plot to that regard.
yep, our mails are crossing
>>
>> for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
>
> Wireless as in point 2 point links or in wifi?
both, point-to-point is variable based on weather, trees blowing in the wind,
interference, etc. Wifi has a lot more congestion, so interference dominates
everything else.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 21:45 ` David Lang
@ 2014-07-26 22:24 ` David Lang
2014-07-27 9:50 ` Sebastian Moeller
2014-07-26 22:39 ` Sebastian Moeller
1 sibling, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-26 22:24 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: cerowrt-devel
On Sat, 26 Jul 2014, David Lang wrote:
> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>
>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>
>>> by how much tuning is required, I wasn't meaning how frequently to tune,
>>> but how close default settings can come to the performance of a expertly
>>> tuned setup.
>>
>> Good question.
>>
>>>
>>> Ideally the tuning takes into account the characteristics of the hardware
>>> of the link layer. If it's IP encapsulated in something else (ATM, PPPoE,
>>> VPN, VLAN tagging, ethernet with jumbo packet support for example), then
>>> you have overhead from the encapsulation that you would ideally take into
>>> account when tuning things.
>>>
>>> the question I'm talking about below is how much do you loose compared to
>>> the idea if you ignore this sort of thing and just assume that the wire is
>>> dumb and puts the bits on them as you send them? By dumb I mean don't even
>>> allow for inter-packet gaps, don't measure the bandwidth, don't try to
>>> pace inbound connections by the timing of your acks, etc. Just run BQL and
>>> fq_codel and start the BQL sizes based on the wire speed of your link
>>> (Gig-E on the 3800) and shrink them based on long-term passive observation
>>> of the sender.
>>
>> As data talks I just did a quick experiment with my ADSL2+ line at
>> home. The solid lines in the attached plot show the results for proper
>> shaping with SQM (shaping to 95% of the DSL link rates of downstream and
>> upstream while taking the link layer properties, that is ATM encapsulation
>> and per packet overhead into account) the broken lines show the same system
>> with just the link layer adjustments and per packet overhead adjustments
>> disabled, but still shaping to 95% of link rate (this is roughly equivalent
>> to a 15% underestimation of the packet size). The actual test is
>> netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring
>> latency with ping and UDP probes). As you can see from the plot just
>> getting the link layer encapsulation wrong destroys latency under load
>> badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg
>> is just increased by one codel target of 5ms each resulting in a modest
>> latency increase of ~10ms with proper shaping for a total of ~65ms, with
>> improper shaping RTTs increase to ~95ms (they almost double), so RTT
>> increases by ~43ms. Also note how the extremes for the broken lines are
>> much worse than for the solid lines. In short I would estimate that a
>> slight misjudgment (15%) results in almost 80% increase of latency under
>> load. In other words getting the rates right matters a lot. (I should also
>> note that in my setup there is a secondary router that limits RTT to max
>> 300ms, otherwise the broken lines might look even worse...)
is this with BQL/fq_codel in both directions or only in one direction?
David Lang
> what is the latency like without BQL and codel? the pre-bufferbloat version?
> (without any traffic shaping)
>
> I agree that going from 65ms to 95ms seems significant, but if the stock
> version goes up above 1000ms, then I think we are talking about things
> that are 'close'
>
> assuming that latency under load without the improvements got >1000ms
>
> fast-slow (in ms)
> ideal=10
> untuned=43
> bloated > 1000
>
> fast/slow
> ideal = 1.25
> untuned = 1.83
> bloated > 19
>
> slow/fast
> ideal = 0.8
> untuned = 0.55
> bloated = 0.05
>
> rather than looking at how much worse it is than the ideal, look at how much
> closer it is to the ideal than to the bloated version.
>
> David Lang
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 21:45 ` David Lang
2014-07-26 22:24 ` David Lang
@ 2014-07-26 22:39 ` Sebastian Moeller
2014-07-26 22:53 ` David Lang
1 sibling, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 22:39 UTC (permalink / raw)
To: David Lang; +Cc: cerowrt-devel
Hi David,
On Jul 26, 2014, at 23:45 , David Lang <david@lang.hm> wrote:
> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>
>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>
>>> by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
>>
>> Good question.
>>
>>>
>>> Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
>>>
>>> the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
>>
>> As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the DSL link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account); the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to a 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring latency with ping and UDP probes). As you can see from the plot, just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost an 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
>
> what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line, with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
>
> I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are ‘close'
Well if we include outliers (and we should as enough outliers will degrade the FPS and voip suitability of an otherwise responsive system quickly) stock and improper shaping are in the >1000ms worst case range, while proper SQM bounds this to 100ms.
>
> assuming that latency under load without the improvements got >1000ms
>
> fast-slow (in ms)
> ideal=10
> untuned=43
> bloated > 1000
The sign seems off as fast < slow? I like this best ;)
>
> fast/slow
> ideal = 1.25
> untuned = 1.83
> bloated > 19
But Fast < Slow and hence this ratio should be <0?
> slow/fast
> ideal = 0.8
> untuned = 0.55
> bloated = 0.05
>
and this >0?
> rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version.
>
> David Lang
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 22:39 ` Sebastian Moeller
@ 2014-07-26 22:53 ` David Lang
2014-07-26 23:39 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-26 22:53 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 5867 bytes --]
On Sun, 27 Jul 2014, Sebastian Moeller wrote:
> Hi David,
>
> On Jul 26, 2014, at 23:45 , David Lang <david@lang.hm> wrote:
>
>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>
>>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>>
>>>> by how much tuning is required, I wasn't meaning how frequently to tune,
>>>> but how close default settings can come to the performance of a expertly
>>>> tuned setup.
>>>
>>> Good question.
>>>
>>>>
>>>> Ideally the tuning takes into account the characteristics of the hardware
>>>> of the link layer. If it's IP encapsulated in something else (ATM, PPPoE,
>>>> VPN, VLAN tagging, ethernet with jumbo packet support for example), then
>>>> you have overhead from the encapsulation that you would ideally take into
>>>> account when tuning things.
>>>>
>>>> the question I'm talking about below is how much do you loose compared to
>>>> the idea if you ignore this sort of thing and just assume that the wire is
>>>> dumb and puts the bits on them as you send them? By dumb I mean don't even
>>>> allow for inter-packet gaps, don't measure the bandwidth, don't try to pace
>>>> inbound connections by the timing of your acks, etc. Just run BQL and
>>>> fq_codel and start the BQL sizes based on the wire speed of your link
>>>> (Gig-E on the 3800) and shrink them based on long-term passive observation
>>>> of the sender.
>>>
>>> As data talks I just did a quick experiment with my ADSL2+ line at
>>> home. The solid lines in the attached plot show the results for proper
>>> shaping with SQM (shaping to 95% of the DSL link rates of downstream and
>>> upstream while taking the link layer properties, that is ATM encapsulation
>>> and per packet overhead into account) the broken lines show the same system
>>> with just the link layer adjustments and per packet overhead adjustments
>>> disabled, but still shaping to 95% of link rate (this is roughly equivalent
>>> to a 15% underestimation of the packet size). The actual test is
>>> netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring
>>> latency with ping and UDP probes). As you can see from the plot just getting
>>> the link layer encapsulation wrong destroys latency under load badly. The
>>> host is ~52ms RTT away, and with fq_codel the ping time per leg is just
>>> increased by one codel target of 5ms each resulting in a modest latency
>>> increase of ~10ms with proper shaping for a total of ~65ms, with improper
>>> shaping RTTs increase to ~95ms (they almost double), so RTT increases by
>>> ~43ms. Also note how the extremes for the broken lines are much worse than
>>> for the solid lines. In short I would estimate that a slight misjudgment
>>> (15%) results in almost 80% increase of latency under load. In other words
>>> getting the rates right matters a lot. (I should also note that in my setup
>>> there is a secondary router that limits RTT to max 300ms, otherwise the
>>> broken lines might look even worse...)
>>
>> what is the latency like without BQL and codel? the pre-bufferbloat version?
>> (without any traffic shaping)
>
> So I just disabled SQM and the plot looks almost exactly like the broken
> line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings
> delayed for > 1000ms, just as with the broken line, with proper shaping even
> extreme pings stay < 100ms). But as I said before I need to run through my ISP
> supplied primary router (not just a dumb modem) that also tries to bound the
> latencies under load to some degree. Actually I just repeated the test
> connected directly to the primary router and get the same ~95ms average ping
> time with frequent extremes > 1000ms, so it looks like just getting the
> shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely
you have
debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
and are you measuring the latency impact when uploading or downloading?
I think a lot of people would be happy with 95ms average pings on a loaded
connection, even with occasional outliers. It's far better than sustained
multi-second ping times which is what I've seen with stock setups.
but if no estimate is this bad, how bad is it if you use as your estimate the
'rated' speed of your DSL (i.e. what the ISP claims they are providing you)
instead of the fully accurate speed that includes accounting for ATM
encapsulation?
It's also worth figuring out if this problem would remain in place if you didn't
have to go through the ISP router and were running fq_codel on that router. As
long as fixing bufferbloat involves esoteric measurements and tuning, it's not
going to be solved, but if it could be solved by people flashing openwrt onto
their DSL router and then using the defaults, it could gain traction fairly
quickly.
>> I agree that going from 65ms to 95ms seems significant, but if the stock
>> version goes up above 1000ms, then I think we are talking about things
>> that are ‘close'
>
> Well if we include outliers (and we should as enough outliers will
> degrade the FPS and voip suitability of an otherwise responsive system
> quickly) stock and improper shaping are in the >1000ms worst case range, while
> proper SQM bounds this to 100ms.
>
>>
>> assuming that latency under load without the improvements got >1000ms
>>
>> fast-slow (in ms)
> ideal=10
> untuned=43
> bloated > 1000
>
> The sign seems off as fast < slow? I like this best ;)
yep, I reversed fast/slow in all of these
>>
>> fast/slow
>> ideal = 1.25
>> untuned = 1.83
>> bloated > 19
>
> But Fast < Slow and hence this ratio should be <0?
1 not 0, but yes, this is really slow/fast
>> slow/fast
>> ideal = 0.8
>> untuned = 0.55
>> bloated = 0.05
>>
>
> and this >0?
and this is really fast/slow
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 22:23 ` David Lang
@ 2014-07-26 23:08 ` Sebastian Moeller
2014-07-27 1:04 ` David Lang
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 23:08 UTC (permalink / raw)
To: David Lang; +Cc: Wes Felter, cerowrt-devel
Hi David,
On Jul 27, 2014, at 00:23 , David Lang <david@lang.hm> wrote:
[...]
>>> I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDos potential)
>>
>> But with gargoyle the idea is to monitor a sparse ping stream to the closest responding host and interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, using this as the signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
>
> that measures latency, but how does it tell you bandwidth unless you are the only possible thing on the network and you measure what you are receiving?
So the idea would be to start the ping probe with no traffic and increase the traffic until the ping RTT increases; the usable bandwidth is roughly the rate at which the RTTs start to increase.
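In sketch form (all helpers are placeholders, this is just the shape of the search):

# Step the probe load up geometrically until the sparse-ping RTT departs from
# its unloaded baseline; the last rate that left the RTT untouched is (roughly)
# the usable bandwidth.
def probe_bandwidth_kbps(send_at_rate, current_rtt_ms, baseline_ms,
                         start_kbps=500, step=1.25, slack_ms=10.0,
                         cap_kbps=1_000_000):
    rate, good = start_kbps, None
    while rate <= cap_kbps:
        send_at_rate(rate)                       # generate load at `rate` kbit/s
        if current_rtt_ms() > baseline_ms + slack_ms:
            break                                # RTT rose: previous rate was still clean
        good, rate = rate, rate * step
    return good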
[...]
>>
>>> even fq_codel handles TCP differently
>>
>> Does it? I thought UDP typically reacts differently to fq_codels dropping strategy but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might be simply wrong here)
>
> with TCP, the system can tell the difference between different connections to the same system, with UDP it needs to infer this from port numbers, this isn't as accurate and so the systems (fq_codel and routers) handle them in a slightly different way. This does affect the numbers.
But that only affects the hashing into fq_codel bins? From http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c
static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
                                  const struct sk_buff *skb)
{
        struct flow_keys keys;
        unsigned int hash;

        skb_flow_dissect(skb, &keys);
        hash = jhash_3words((__force u32)keys.dst,
                            (__force u32)keys.src ^ keys.ip_proto,
                            (__force u32)keys.ports, q->perturbation);
        return ((u64)hash * q->flows_cnt) >> 32;
}
The way I read this is that it just uses source and destination IP and the ports; all the protocol does is make sure that connections with different protocols but the same src/dst/ports tuple end up in different bins, no? My C is bad so I would not be surprised if my interpretation were wrong, but please show me where?
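Or as a toy recreation of my reading (not the real jhash, just to illustrate):

# A toy stand-in for the hash above: mix the same five pieces (src, dst,
# protocol, ports) plus the perturbation into one key, so a TCP and a UDP flow
# between the same addresses and ports land in different bins, but neither
# gets any special treatment beyond that.
import zlib

def toy_fq_codel_bin(src, dst, ip_proto, sport, dport,
                     perturbation=0x1234, flows_cnt=1024):
    key = f"{dst}|{src}^{ip_proto}|{sport}:{dport}|{perturbation}"
    return zlib.crc32(key.encode()) % flows_cnt

print(toy_fq_codel_bin("10.0.0.1", "10.0.0.2", 6, 5001, 80))   # TCP flow
print(toy_fq_codel_bin("10.0.0.1", "10.0.0.2", 17, 5001, 80))  # same tuple, UDP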
>
>>> so if we measure with UDP, does it really reflect the 'real world' of TCP?
>>
>> But we care for UDP as well, no?
>
> Yes, but the reality is that the vast majority of traffic is TCP, and that's what the devices are optimized to handle, so if we measure with UDP we may not get the same results as if we measure with TCP.
>
> measuing with ICMP is different yet again.
Yes, I have heard stories like that when I set out on my little project to detect ATM quantization from ping RTTs, but to my joy it looks like ICMP still gives reasonable measurements! Based on that data I would assume UDP to be even less exotic and hence handled even less specially, and hence more like TCP?
>
> Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefor MUCH slower
Except that in my ICMP RTT measurements I still saw quantization steps in accordance with the expected best-case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements...
>
>>>>> One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
>>>>>
>>>>> Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
>>>>
>>>> But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
>>>
>>> negotiated bandwidth and effective bandwidth are not the same
>>>
>>> what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
>>
>> In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment the DSLAM uplink is so congested because of oversubscription of the DSLAM, that now this constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonable shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event).
>
> for DSL you are correct, it's a point-to-point connection (star network topology), but we have other technologies used in homes that are shared-media bus topology networks. This includes cablemodems and wireless links.
Well, yes I understand, but then again you would assume that the cable ISP tries to provision the system so that most users are happy, so congestion is not the rule? Even then I think cable guarantees some minimum rates per user, no? With wireless it is worse in that RF events outside of the ISP's and end user's control can ruin the day.
>
>>>
>>> for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
>>
>> http://wiki.openwrt.org/toh/tp-link/td-w8970 or
>
> no 5GHz wireless?
Could be, but definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router does not need HTB traffic shaping on uplink anymore. I might actually go that route since I really dislike my ISP primary router, but I digress...
>
>> http://www.traverse.com.au/products ?
>
> I couldn't figure out where to buy one through their site.
Maybe they only sell in AU, I guess I just wanted to be helpful,
>
>> If you had the DSL modem in the router under cerowrts control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
>>
>>>
>>> If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
>>
>> Not exactly meaningless, it gives you an upper bound...
>
> true, but is an upper bound good enough? How close does the estimate need to be?
If we end up recommending that people use, say, a binary search to find the best tradeoff (maximizing throughput while keeping the maximum latency increase under load bounded to say 10ms) we should have an idea where to start, so a bit too large is fine as a starting point. Traditionally the recommendation was around 85% of link rates, but that never came with a decent justification or data.
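The search itself would be something along these lines (a sketch; loaded_rtt_ms(rate) stands for running an RRUL-style test with the shaper set to that rate):

# Binary-search the shaping rate between one known to keep the loaded RTT
# within bound (lo) and one known to blow it up (hi), stopping when the
# interval has shrunk to a couple of percent.
def tune_shaper_kbps(loaded_rtt_ms, base_rtt_ms, lo_kbps, hi_kbps, bound_ms=10.0):
    while hi_kbps - lo_kbps > 0.02 * hi_kbps:
        mid = (lo_kbps + hi_kbps) / 2.0
        if loaded_rtt_ms(mid) <= base_rtt_ms + bound_ms:
            lo_kbps = mid      # latency still fine: try for more throughput
        else:
            hi_kbps = mid      # too much bloat: back off
    return lo_kbps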
>
> and does it matter if both sides are doing fq_codel or is this still in the mode of trying to control the far side indirectly?
Yes, this is only relevant as long as both sides of the bottleneck link are not de-bloated. But it does not look like DSLAMs/CMTSs will change any time soon from the old ways...
>
>>> In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
>>
>> One more case for measuring the link speed continuously!
>
> at what point does the measuring process interfere with the use of the link? or cause other upstream issues.
If my measuring by sparse stream idea works out the answer to both questions is not much ;)
>
>>>>> Other requirements or restrictions?
>>>>
>>>> I think the measurement should be fast and continuous…
>>>
>>> Fast yes, because we want to impact the network as little as possible
>>>
>>> continuous?? I'm not so sure. Do conditions really change that much?
>>
>> You just gave an example above for changing link conditions, by shared media...
>
> but can you really measure fast enough to handle shared media? at some point you need to give up measuring because by the time you have your measurement it's obsolete.
So this is not going to work well on a wifi WLAN with wildly fluctuating rates (see Dave's upcoming project make-wifi-fast), but for a typical cable node where congestion changes over the day as a function of people being at home it might be fast enough.
>
> If you look at networking with a tight enough timeframe, it's either idle or 100% utilized depending on if a bit is being sent at that instant, however a plot at that precision is worthless :-)
Yes I think a moving average over some time would be required.
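Something as simple as an exponentially weighted moving average would probably do, e.g. (sketch):

# EWMA keeps the estimate from chasing every instantaneous idle/busy sample
# while still tracking slow drifts in shared-media capacity; alpha trades
# smoothing against reaction time.
def ewma(samples, alpha=0.1):
    est = None
    for s in samples:
        est = s if est is None else (1 - alpha) * est + alpha * s
    return est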
>
>>> And as I ask in the other thread, how much does it hurt if your estimates are wrong?
>>
>> I think I sent a plot to that regard.
>
> yep, our mails are crossing
>
>>>
>>> for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
>>
>> Wireless as in point 2 point links or in wifi?
>
> both, point-to-point is variable based on weather, trees blowing in the wind, interference, etc. Wifi has a lot more congestion, so interference dominates everything else.
So maybe that is a different kettle of fish then.
Best Regards
Sebastian
>
> David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 22:53 ` David Lang
@ 2014-07-26 23:39 ` Sebastian Moeller
2014-07-27 0:49 ` David Lang
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-26 23:39 UTC (permalink / raw)
To: David Lang; +Cc: cerowrt-devel
Hi David,
On Jul 27, 2014, at 00:53 , David Lang <david@lang.hm> wrote:
> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
>
>> Hi David,
>>
>> On Jul 26, 2014, at 23:45 , David Lang <david@lang.hm> wrote:
>>
>>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>>
>>>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>>>
>>>>> by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
>>>>
>>>> Good question.
>>>>
>>>>>
>>>>> Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
>>>>>
>>>>> the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
>>>>
>>>> As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the DSL link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account); the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to a 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring latency with ping and UDP probes). As you can see from the plot, just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost an 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
>>>
>>> what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
>>
>> So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line, with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
>
> just so I understand this completely
>
> you have
>
> debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
Well more like:
Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that shapes the traffic -> ISP router -> ADSL -> internet -> server
I assume that Dave debloated these servers well, but it should not really matter as the problem is the buffers on both ends of the bottleneck ADSL link.
>
> and are you measuring the latency impact when uploading or downloading?
No, I measure the latency impact of saturating both up- and downlink, pretty much the worst case scenario.
>
> I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers.
No, that is too low an aim; this still is not usable for real-time applications, we should aim for base RTT plus 10ms. (For very slow links we need to cut some slack, but for > 3Mbps 10ms should be achievable.)
> It's far better than sustained multi-second ping times which is what I've seen with stock setups.
True, but compared to multi seconds even <1000ms would be a really great improvement, but also not enough.
>
> but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation?
Well, ~95ms with outliers > 1000ms, just as bad as with no estimate. I shaped 5% below the rated speed as reported by the DSL modem, so disabling the ATM link layer adjustments (as shown in the broken lines in the plot) basically increased the effective shaped rate by ~13%, or to effectively 107% of line rate; your proposal would be line rate and no link layer adjustments, or effectively 110% of line rate. (I do not feel like repeating this experiment right now, as I think the data so far shows that even with less misjudgment the bloat effect is fully visible.) Not accounting for ATM framing carries a ~10% cost in link speed, as ATM packet size on the wire increases by >= ~10%.
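For the record, the arithmetic behind that ~10% (a back-of-the-envelope sketch; the 40 bytes of per-packet overhead is just one common PPPoE-over-AAL5 case, other encapsulations differ):

# ATM carries everything in 53-byte cells with 48 bytes of payload, and the
# last AAL5 cell is padded up to a full cell, so the wire cost of a packet is
# ceil((ip_len + overhead) / 48) * 53 bytes.
import math

def atm_wire_bytes(ip_len, overhead=40):
    return math.ceil((ip_len + overhead) / 48) * 53

for size in (1500, 100):
    wire = atm_wire_bytes(size)
    print(size, wire, f"{wire / size:.2f}x")
# 1500 -> 1749 bytes on the wire (~1.17x); small packets fare even worse,
# which is why a shaper that ignores the framing overshoots by >= ~10%.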
>
> It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were running fq_codel on that router.
If the DSL modem were debloated, at least on upstream, no shaping would be required anymore; but that does not fix the need for downstream shaping (and bandwidth estimation) until the head-end gear is debloated...
> As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flashing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly.
But as there are only very few DSL modems with open sources (especially of the DSL chips) this is just as esoteric ;) Really, if equipment manufacturers could be convinced to take these issues seriously and actually fix their gear, that would be best. But this does not look like it is happening on the fast track. (Even DOCSIS developer CableLabs punted on requiring codel or fq_codel in DOCSIS modems, since they think that the required timestamps are too “expensive” on the device class they want to use for modems. They opted for PIE, much better than what we have right now but far away from my target of a 10ms latency increase under load...)
>
>>> I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are ‘close'
>>
>> Well if we include outliers (and we should as enough outliers will degrade the FPS and voip suitability of an otherwise responsive system quickly) stock and improper shaping are in the >1000ms worst case range, while proper SQM bounds this to 100ms.
>>
>>>
>>> assuming that latency under load without the improvements got >1000ms
>>>
>>> fast-slow (in ms)
>> ideal=10
>> untuned=43
>> bloated > 1000
>>
>> The sign seems off as fast < slow? I like this best ;)
>
> yep, I reversed fast/slow in all of these
>
>>>
>>> fast/slow
>>> ideal = 1.25
>>> untuned = 1.83
>>> bloated > 19
>>
>> But Fast < Slow and hence this ratio should be <0?
>
> 1 not 0, but yes, this is really slow/fast
>
>>> slow/fast
>>> ideal = 0.8
>>> untuned = 0.55
>>> bloated = 0.05
>>>
>>
>> and this >0?
>
> and this is really fast/slow
What about taking the latency difference and relating it to a reference time, like say the time a photon would take to travel once around the equator, or along the earth's diameter?
Best Regards
Sebastian
>
> David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 23:39 ` Sebastian Moeller
@ 2014-07-27 0:49 ` David Lang
2014-07-27 11:17 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-27 0:49 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 11052 bytes --]
On Sun, 27 Jul 2014, Sebastian Moeller wrote:
> On Jul 27, 2014, at 00:53 , David Lang <david@lang.hm> wrote:
>
>> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
>>
>>> Hi David,
>>>
>>> On Jul 26, 2014, at 23:45 , David Lang <david@lang.hm> wrote:
>>>
>>>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>>>
>>>>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>>>>
>>>>>> by how much tuning is required, I wasn't meaning how frequently to tune,
>>>>>> but how close default settings can come to the performance of a expertly
>>>>>> tuned setup.
>>>>>
>>>>> Good question.
>>>>>
>>>>>>
>>>>>> Ideally the tuning takes into account the characteristics of the hardware
>>>>>> of the link layer. If it's IP encapsulated in something else (ATM, PPPoE,
>>>>>> VPN, VLAN tagging, ethernet with jumbo packet support for example), then
>>>>>> you have overhead from the encapsulation that you would ideally take into
>>>>>> account when tuning things.
>>>>>>
>>>>>> the question I'm talking about below is how much do you lose compared to
>>>>>> the ideal if you ignore this sort of thing and just assume that the wire
>>>>>> is dumb and puts the bits on them as you send them? By dumb I mean don't
>>>>>> even allow for inter-packet gaps, don't measure the bandwidth, don't try
>>>>>> to pace inbound connections by the timing of your acks, etc. Just run BQL
>>>>>> and fq_codel and start the BQL sizes based on the wire speed of your link
>>>>>> (Gig-E on the 3800) and shrink them based on long-term passive
>>>>>> observation of the sender.
>>>>>
>>>>> As data talks I just did a quick experiment with my ADSL2+ line at
>>>>> home. The solid lines in the attached plot show the results for proper
>>>>> shaping with SQM (shaping to 95% of the link rates of downstream and
>>>>> upstream while taking the link layer properties, that is ATM encapsulation
>>>>> and per packet overhead into account) the broken lines show the same
>>>>> system with just the link layer adjustments and per packet overhead
>>>>> adjustments disabled, but still shaping to 95% of link rate (this is
>>>>> roughly equivalent to 15% underestimation of the packet size). The actual
>>>>> test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while
>>>>> measuring latency with ping and UDP probes). As you can see from the plot
>>>>> just getting the link layer encapsulation wrong destroys latency under
>>>>> load badly. The host is ~52ms RTT away, and with fq_codel the ping time
>>>>> per leg is just increased by one codel target of 5ms each, resulting in a
>>>>> modest latency increase of ~10ms with proper shaping for a total of ~65ms,
>>>>> with improper shaping RTTs increase to ~95ms (they almost double), so RTT
>>>>> increases by ~43ms. Also note how the extremes for the broken lines are
>>>>> much worse than for the solid lines. In short I would estimate that a
>>>>> slight misjudgment (15%) results in almost 80% increase of latency under
>>>>> load. In other words getting the rates right matters a lot. (I should also
>>>>> note that in my setup there is a secondary router that limits RTT to max
>>>>> 300ms, otherwise the broken lines might look even worse...)
>>>>
>>>> what is the latency like without BQL and codel? the pre-bufferbloat
>>>> version? (without any traffic shaping)
>>>
>>> So I just disabled SQM and the plot looks almost exactly like the broken
>>> line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings
>>> delayed for > 1000ms, just as with the broken line, with proper shaping even
>>> extreme pings stay < 100ms). But as I said before I need to run through my
>>> ISP supplied primary router (not just a dumb modem) that also tries to bound
>>> the latencies under load to some degree. Actually I just repeated the test
>>> connected directly to the primary router and get the same ~95ms average ping
>>> time with frequent extremes > 1000ms, so it looks like just getting the
>>> shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
>>
>> just so I understand this completely
>>
>> you have
>>
>> debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
>
> Well more like:
>
> Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that
> shapes the traffic -> ISP router -> ADSL -> internet -> server
>
> I assume that Dave de-bloated these servers well, but it should not really matter
> as the problem is the buffers on both ends of the bottleneck ADSL link.
right, I was forgetting that unless you are the bottleneck, you aren't buffering
anything and so debloating makes no difference. In a case like yours where you
can't debloat the actual bottleneck, the best that you can do is to artificially
become the bottleneck by shaping the traffic. but on the download side it's much
harder.
What are we aiming for? something that will show the problem clearly so that
fixes can be put in the right place? or a work-around to use in the meantime?
I think both need to be pursued, but we need to be clear on what is being done
for each one.
If having BQL+fq_codel with defaults would solve the problem if it was on the
right routers, we need to show that.
Then, because we can't get the fixes on the right routers and need to
work-around the problem by artificially becoming the bottleneck, we need to show
that the 95% that we shape to is throwing away 5% of your capacity and make that
clear to the users.
otherwise we will risk getting to the point where it will never get fixed
because the ISPs will look at their routers and say that bufferbloat can't
possibly be a problem as they never have large queues (because we are doing the
workarounds).
>> and are you measuring the latency impact when uploading or downloading?
>
> No I measure the impact of latency of saturating both up- and downlink,
> pretty much the worst case scenario.
I think we need to test this in each direction independently.
Cerowrt can do a pretty good job of keeping the uplink from being saturated, but
it can't do a lot for the downlink.
>>
>> I think a lot of people would be happy with 95ms average pings on a loaded
>> connection, even with occasional outliers.
>
> No that is too low an aim, this still is not useable for real time
> applications, we should aim for base RTT plus 10ms. (For very slow links we
> need to cut some slack but for > 3Mbps 10ms should be achievable )
perfect is the enemy of good enough.
There's achievable if every router is tuned to exactly the right conditions and
there's achievable for coarse settings that can be widely deployed. Get the
second out while continuing to work on making the first easier.
residential connections only come in a smallish number of sizes, it shouldn't be
too hard to do a few probes and guess which size is in use, then set the
bandwidth to 90% of that standard size and you should be pretty good without
further tuning.
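
A rough sketch of that tier-guessing approach; the tier table and the probe below are hypothetical placeholders, not anything CeroWrt ships:

#!/usr/bin/env python3
# Sketch of "guess the standard tier, then shape to 90% of it".
# The tier table and the probe are hypothetical placeholders, not CeroWrt code.

STANDARD_TIERS_KBPS = [1024, 2048, 6016, 16000, 25000, 50000, 100000]  # illustrative

def crude_probe_kbps():
    """Stand-in for a short download probe; returns measured kbit/s."""
    return 14200  # pretend we measured ~14.2 Mbit/s

def pick_shaper_rate(measured_kbps, tiers=STANDARD_TIERS_KBPS, fraction=0.90):
    # Snap to the closest standard tier, then keep only 90% of it so that
    # the shaper, not the modem, stays the bottleneck.
    tier = min(tiers, key=lambda t: abs(t - measured_kbps))
    return int(tier * fraction)

if __name__ == "__main__":
    measured = crude_probe_kbps()
    print(measured, "kbit/s measured -> shape to", pick_shaper_rate(measured), "kbit/s")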
>> It's far better than sustained multi-second ping times which is what I've
>> seen with stock setups.
>
> True, but compared to multi seconds even <1000ms would be a really great
> improvement, but also not enough.
>
>>
>> but if no estimate is this bad, how bad is it if you use as your estimate the
>> 'rated' speed of your DSL (i.e. what the ISP claims they are providing you)
>> instead of the fully accurate speed that includes accounting for ATM
>> encapsulation?
>
> Well ~95ms with outliers > 1000ms, just as bad as no estimate. I shaped
> 5% below rated speed as reported by the DSL modem, so disabling the ATM link
> layer adjustments (as shown in the broken lines in the plot), basically
> increased the effective shaped rate by ~13% or to effectively 107% of line
> rate, your proposal would be line rate and no link layer adjustments or
> effectively 110% of line rate; I do not feel like repeating this experiment
> right now as I think the data so far shows that even with less misjudgment the
> bloat effect is fully visible. Not accounting for ATM framing carries a ~10%
> cost in link speed, as ATM packet size on the wire increases by >= ~10%.
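
For illustration, a back-of-the-envelope rendering of that ATM/AAL5 framing cost (per-packet overhead is left at zero here; real PPPoA/PPPoE/LLC setups add a few tens of bytes on top):

#!/usr/bin/env python3
# Back-of-the-envelope ATM/AAL5 framing cost behind the ">= ~10%" figure.
# per_packet_overhead is left at 0 here; real PPPoA/PPPoE/LLC setups add more.
import math

def atm_wire_bytes(ip_bytes, per_packet_overhead=0):
    # AAL5 appends an 8-byte trailer and pads to whole 48-byte cell payloads;
    # each cell then costs 53 bytes on the wire.
    pdu = ip_bytes + per_packet_overhead + 8
    return math.ceil(pdu / 48) * 53

for size in (1500, 500, 100):
    wire = atm_wire_bytes(size)
    print(f"{size:4d} byte packet -> {wire:4d} bytes on the wire "
          f"(+{100 * (wire - size) / size:.0f}%)")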
so what if you shape to 90% of rated speed (no allowance for ATM vs other
transports)?
>> It's also worth figuring out if this problem would remain in place if you
>> didn't have to go through the ISP router and were runing fq_codel on that
>> router.
>
> If the DSL modem would be debloated at least on upstream no shaping
> would be required any more; but that does not fix the need for downstream
> shaping (and bandwidth estimation) until the head end gear is debloated..
right, I was forgetting this earlier.
>> As long as fixing bufferbloat involves esoteric measurements and tuning, it's
>> not going to be solved, but if it could be solved by people flashing openwrt
>> onto their DSL router and then using the defaults, it could gain traction
>> fairly quickly.
>
> But as there are only very few DSL modems with open sources (especially
> of the DSL chips) this is just as esoteric ;) Really if equipment manufacturers
> could be convinced to take these issues seriously and actually fix their gear
> that would be best. But this does not look like it is happening on the fast
> track. (Even DOCSIS developer cable labs punted on requiring codel or fq_codel
> in DOCSIS modems since they think that the required timestamps are too
> “expensive” on the device class they want to use for modems. They opted for
> PIE, much better than what we have right now but far away from my latency
> under load increase of 10ms...)
>
>>
>>>> I agree that going from 65ms to 95ms seems significant, but if the stock
>>>> version goes up above 1000ms, then I think we are talking about things
>>>> that are ‘close'
>>>
>>> Well if we include outliers (and we should as enough outliers will
>>> degrade the FPS and voip suitability of an otherwise responsive system
>>> quickly) stock and improper shaping are in the >1000ms worst case range,
>>> while proper SQM bounds this to 100ms.
>>>
>>>>
>>>> assuming that latency under load without the improvements got >1000ms
>>>>
>>>> fast-slow (in ms)
>>> ideal=10
>>> untuned=43
>>> bloated > 1000
>>>
>>> The sign seems off as fast < slow? I like this best ;)
>>
>> yep, I reversed fast/slow in all of these
>>
>>>>
>>>> fast/slow
>>>> ideal = 1.25
>>>> untuned = 1.83
>>>> bloated > 19
>>>
>>> But Fast < Slow and hence this ratio should be <0?
>>
>> 1 not 0, but yes, this is really slow/fast
>>
>>>> slow/fast
>>>> ideal = 0.8
>>>> untuned = 0.55
>>>> bloated = 0.05
>>>>
>>>
>>> and this >0?
>>
>> and this is really fast/slow
>
>
> What about taking the latency difference and rescaling it with a reference
> time, like say the time a photon would take to travel once around the equator,
> or the earth's diameter?
how about latency difference scaled by the time to send one 1500 byte packet at
the measured throughput?
This would factor out the data rate and would not be affected by long distance
links.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 23:08 ` Sebastian Moeller
@ 2014-07-27 1:04 ` David Lang
2014-07-27 11:38 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: David Lang @ 2014-07-27 1:04 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 10791 bytes --]
On Sun, 27 Jul 2014, Sebastian Moeller wrote:
> On Jul 27, 2014, at 00:23 , David Lang <david@lang.hm> wrote:
> [...]
>>>> I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDos potential)
>>>
>>> But with gargoyle the idea is to monitor a sparse ping stream to the closest responding host and to interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, using this as a signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
>>
>> that measures latency, but how does it tell you bandwidth unless you are the only possible thing on the network and you measure what you are receiving?
>
> So the idea would be to start the ping probe with no traffic and increase the traffic until the ping RTT increases; the usable bandwidth is roughly the rate at which the RTTs start to increase.
>
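
A toy rendering of that ramp-up idea; nothing below is real Gargoyle or CeroWrt code, both helpers are placeholders:

#!/usr/bin/env python3
# Toy version of the ramp-up idea: increase offered load and call the last
# rate before the ping RTTs jump the "usable" bandwidth.  Both helpers are
# placeholders; no real traffic or ICMP is generated here.
import random

BASELINE_RTT_MS = 20.0
TRUE_CAPACITY_KBPS = 8000  # pretend value so the simulation has a knee

def measure_rtt_ms(load_kbps):
    """Placeholder: RTT stays near baseline until the link saturates."""
    bloat = max(0.0, (load_kbps - TRUE_CAPACITY_KBPS) / 50.0)
    return BASELINE_RTT_MS + bloat + random.uniform(-1.0, 1.0)

def estimate_capacity(step_kbps=500, rtt_margin_ms=10.0):
    load = step_kbps
    while True:
        rtt = measure_rtt_ms(load)      # real code: sparse pings while loading
        if rtt > BASELINE_RTT_MS + rtt_margin_ms:
            return load - step_kbps     # last rate before the RTT inflated
        load += step_kbps

print("estimated usable rate:", estimate_capacity(), "kbit/s")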
> [...]
>>>
>>>> even fq_codel handles TCP differently
>>>
>>> Does it? I thought UDP typically reacts differently to fq_codels dropping strategy but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might be simply wrong here)
>>
>> with TCP, the system can tell the difference between different connections to the same system, with UDP it needs to infer this from port numbers, this isn't as accurate and so the systems (fq_codel and routers) handle them in a slightly different way. This does affect the numbers.
>
> But that only affects the hashing into fq_codel bins? From http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c
> static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
>                                   const struct sk_buff *skb)
> {
>         struct flow_keys keys;
>         unsigned int hash;
>
>         skb_flow_dissect(skb, &keys);
>         hash = jhash_3words((__force u32)keys.dst,
>                             (__force u32)keys.src ^ keys.ip_proto,
>                             (__force u32)keys.ports, q->perturbation);
>         return ((u64)hash * q->flows_cnt) >> 32;
> }
>
> The way I read this is that it just uses source and destination IP and the ports; all the protocol does is make sure connections with different protocols but the same src, dst, and ports tuple end up in different bins, no? My C is bad so I would not be amazed if my interpretation were wrong, but please show me where?
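
A toy rendering of that binning, for illustration only (Python's hashlib stands in for the kernel's jhash and the bin count is arbitrary):

#!/usr/bin/env python3
# Toy rendering of the binning question: hash (src, dst, proto, ports) into
# one of a fixed number of flow queues.  hashlib stands in for the kernel's
# jhash and the bin count is arbitrary; this is not the kernel code.
import hashlib

FLOWS_CNT = 1024

def flow_bin(src, dst, proto, sport, dport, perturbation=0x1234):
    key = f"{src}|{dst}|{proto}|{sport}|{dport}|{perturbation}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % FLOWS_CNT

print(flow_bin("192.0.2.1", "198.51.100.7", 6, 40000, 443))   # a TCP flow
print(flow_bin("192.0.2.1", "198.51.100.7", 17, 40000, 443))  # same addresses/ports, UDP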
>
>
>
>>
>>>> so if we measure with UDP, does it really reflect the 'real world' of TCP?
>>>
>>> But we care for UDP as well, no?
>>
>> Yes, but the reality is that the vast majority of traffic is TCP, and that's what the devices are optimized to handle, so if we measure with UDP we may not get the same results as if we measure with TCP.
>>
>> measuing with ICMP is different yet again.
>
> Yes, I have heard stories like that when I set out for my little detect-ATM-quantization-from-ping-RTTs exercise, but to my joy it looks like ICMP still gives reasonable measurements! Based on that data I would assume UDP to be even less exotic and hence handled even less specially and hence more like TCP?
>
>>
>> Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefor MUCH slower
>
> Except for my ICMP RTT measurements I still saw quantization steps in accordance with the expected best case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements...
yeah, I have to remind myself of the "perfect is the enemy of good enough"
frequently as well. I tend to fall into that trap pretty easily, as this
discussion has shown :-)
ping is easy to test. As a thought, is the response time of NTP queries any more
or less stable?
>>>>>> One thought I have is to require a high TTL on the packets for the
>>>>>> services to respond to them. That way any abuse of the service would have
>>>>>> to take place from very close on the network.
>>>>>>
>>>>>> Ideally these services would only respond to senders that are directly
>>>>>> connected, but until these services are deployed and enabled by default,
>>>>>> there is going to be a need to be the ability to 'jump over' old
>>>>>> equipment. This need will probably never go away completely.
>>>>>
>>>>> But if we need to modify DSLAMs and CMTSs it would be much nicer if we
>>>>> could just ask nicely what the current negotiated bandwidths are ;)
>>>>
>>>> negotiated bandwidth and effective bandwidth are not the same
>>>>
>>>> what if you can't talk to the devices directly connected to the DSL line,
>>>> but only to a router one hop on either side?
>>>
>>> In my limited experience the typical bottleneck is the DSL line, so if
>>> we shape for that we are fine… Assume for a moment the DSLAM uplink is so
>>> congested because of oversubscription of the DSLAM, that now this
>>> constitutes the bottleneck. Now the available bandwidth for each user
>>> depends on the combined traffic of all users, not a situation we can
>>> reasonable shape for anyway (I would hope that ISPs monitor this situation
>>> and would remedy it by adding uplink capacity, so this hopefully is just a
>>> transient event).
>>
>> for DSL you are correct, it's a point-to-point connection (star network
>> topology), but we have other technologies used in homes that are shared-media
>> bus topology networks. This includes cablemodems and wireless links.
>
> Well, yes I understand, but you again would assume that the cable ISP
> tries to provision the system so that most users are happy, so congestion is
> not the rule? Even then I think cable guarantees some minimum rates per user,
> no? With wireless it is worse in that RF events outside of the ISP and end
> users control can ruin the day.
guarantee is too strong a word. It depends on how much competition there is.
15 years or so ago I moved from a 3Mb cablemodem to a 128K IDSL line and saw my
performance increase significantly.
>>>> for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
>>>
>>> http://wiki.openwrt.org/toh/tp-link/td-w8970 or
>>
>> no 5GHz wireless?
>
> Could be, but definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router does not need HTB traffic shaping on uplink anymore. I might actually go that route since I really dislike my ISP primary router, but I digress...
>
>>
>>> http://www.traverse.com.au/products ?
>>
>> I couldn't figure out where to buy one through their site.
>
> Maybe they only sell in AU, I guess I just wanted to be helpful,
>
>>
>>> If you had the DSL modem in the router under cerowrts control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
>>>
>>>>
>>>> If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
>>>
>>> Not exactly meaningless, if gives you an upper bound...
>>
>> true, but is an upper bound good enough? How close does the estimate need to be?
>
> If we end up recommending people use, say, a binary search to find the
> best tradeoff (maximizing throughput while keeping the maximum latency under
> load increase bounded to say 10ms) we should have an idea where to start, so
> a bit too large is fine as a starting point. Traditionally the recommendation was
> around 85% of link rates, but that never came with a decent justification or
> data.
well, if we are doing a binary search, having the initial estimate off by a lot
isn't actually going to hurt much, we'll still converge very quickly on the
right value
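
A sketch of such a binary search, with a placeholder standing in for a real RRUL or betterspeedtest run:

#!/usr/bin/env python3
# Sketch of that binary search: highest shaper rate whose latency-under-load
# increase stays within ~10ms.  The load test is a placeholder standing in
# for a real RRUL / betterspeedtest run.
TRUE_GOODPUT_KBPS = 11000  # pretend link so the placeholder has a knee

def load_test_delta_ms(shaper_kbps):
    """Placeholder: latency increase is small while we are the bottleneck."""
    return 5.0 if shaper_kbps <= TRUE_GOODPUT_KBPS else 80.0

def find_shaper_rate(lo_kbps, hi_kbps, budget_ms=10.0, resolution_kbps=100):
    while hi_kbps - lo_kbps > resolution_kbps:
        mid = (lo_kbps + hi_kbps) // 2
        if load_test_delta_ms(mid) <= budget_ms:
            lo_kbps = mid    # latency ok, try to keep more bandwidth
        else:
            hi_kbps = mid    # bloated, back off
    return lo_kbps

print("shape to about", find_shaper_rate(1000, 20000), "kbit/s")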
>>
>> and does it matter if both sides are doing fq_codel or is this still in the mode of trying to control the far side indirectly?
>
> Yes, this is only relevant as long as both sides of the bottleneck link
> are not de-bloated. But it does not look like DSLAMs/CMTs will change any time
> soon from the old ways...
yep, I had been forgetting this.
>>
>>>> In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
>>>
>>> One more case for measuring the link speed continuously!
>>
>> at what point does the measuring process interfere with the use of the link? or cause other upstream issues.
>
> If my measuring by sparse stream idea works out the answer to both questions is not much ;)
>
>>
>>>>>> Other requirements or restrictions?
>>>>>
>>>>> I think the measurement should be fast and continuous…
>>>>
>>>> Fast yes, because we want to impact the network as little as possible
>>>>
>>>> continuous?? I'm not so sure. Do conditions really change that much?
>>>
>>> You just gave an example above for changing link conditions, by shared media...
>>
>> but can you really measure fast enough to handle shared media? at some point you need to give up measuring because by the time you have your measurement it's obsolete.
>
> So this is not going to work well on a wifi WLAN with wildly fluctuating rates (see Dave’s upcoming project make-wifi-fast) but for a typical cable node where congestion changes over the day as a function of people being at home it might be fast enough.
>
>>
>> If you look at networking with a tight enough timeframe, it's either idle or 100% utilized depending on if a bit is being sent at that instant, however a plot at that precision is worthless :-)
>
> Yes I think a moving average over some time would be required.
>
>>
>>>> And as I ask in the other thread, how much does it hurt if your estimates are wrong?
>>>
>>> I think I sent a plot to that regard.
>>
>> yep, our mails are crossing
>>
>>>>
>>>> for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
>>>
>>> Wireless as in point 2 point links or in wifi?
>>
>> both, point-to-point is variable based on weather, trees blowing in the wind, interference, etc. Wifi has a lot more congestion, so interference dominates everything else.
>
> So maybe that is a different kettle of fish then.
I think we need to get a simple, repeatable test together and then have people
start using it and reporting what they find and the type of connection they are
on, otherwise we are speculating from far too little data.
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 22:24 ` David Lang
@ 2014-07-27 9:50 ` Sebastian Moeller
0 siblings, 0 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-27 9:50 UTC (permalink / raw)
To: David Lang; +Cc: cerowrt-devel
Hi David,
On Jul 27, 2014, at 00:24 , David Lang <david@lang.hm> wrote:
> On Sat, 26 Jul 2014, David Lang wrote:
>
>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>
>>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>>> by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
>>>
>>> Good question.
>>>> Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
>>>> the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
>>>
>>> As data talks I just did a quick experiment with my ADSL2+ koine at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of del link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead into account) the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased one codel target of 5ms each resulting in an modest latency increase of ~10ms with proper shaping for a total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
>
> is this with BQL/fq_codel in both directions or only in one direction?
So by shaping to below line rate the bottleneck is actually happening inside cerowrt and there I run BQL (which does not matter since due to shaping the NICs buffer does not fill up anyway) and fq_codel in both directions.
Best Regards
Sebastian
>
> David Lang
>
>> what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
>>
>> I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close'
>>
>> assuming that latency under load without the improvements got >1000ms
>>
>> fast-slow (in ms)
>> ideal=10
>> untuned=43
>> bloated > 1000
>>
>> fast/slow
>> ideal = 1.25
>> untuned = 1.83
>> bloated > 19
>>
>> slow/fast
>> ideal = 0.8
>> untuned = 0.55
>> bloated = 0.05
>>
>> rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version.
>>
>> David Lang
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-27 0:49 ` David Lang
@ 2014-07-27 11:17 ` Sebastian Moeller
0 siblings, 0 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-27 11:17 UTC (permalink / raw)
To: David Lang; +Cc: cerowrt-devel
Hi David,
On Jul 27, 2014, at 02:49 , David Lang <david@lang.hm> wrote:
> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
>
>> On Jul 27, 2014, at 00:53 , David Lang <david@lang.hm> wrote:
>>
>>> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
>>>
>>>> Hi David,
>>>>
>>>> On Jul 26, 2014, at 23:45 , David Lang <david@lang.hm> wrote:
>>>>
>>>>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>>>>
>>>>>> On Jul 26, 2014, at 22:39 , David Lang <david@lang.hm> wrote:
>>>>>>
>>>>>>> by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of a expertly tuned setup.
>>>>>>
>>>>>> Good question.
>>>>>>
>>>>>>>
>>>>>>> Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
>>>>>>>
>>>>>>> the question I'm talking about below is how much do you loose compared to the idea if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
>>>>>>
>>>>>> As data talks I just did a quick experiment with my ADSL2+ koine at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of del link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead into account) the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual theist is netperf-wrappers RRUL (4 tcp streams up, 4 tcp steams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased one codel target of 5ms each resulting in an modest latency increase of ~10ms with proper shaping for a total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
>>>>>
>>>>> what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
>>>>
>>>> So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line, with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
>>>
>>> just so I understand this completely
>>>
>>> you have
>>>
>>> debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
>>
>> Well more like:
>>
>> Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that shapes the traffic -> ISP router -> ADSL -> internet -> server
>>
>> I assume that Dave de-bloated these servers well, but it should not really matter as the problem is the buffers on both ends of the bottleneck ADSL link.
>
> right, I was forgetting that unless you are the bottleneck, you aren't buffering anything and so debloating makes no difference. In a case like yours where you can't debloat the actual bottleneck, the best that you can do is to artificially become the bottleneck by shaping the traffic. but on the download side it's much harder.
Actually, all RRUL plots that Dave collected show that ingress shaping does work quite well on average. It will fail with a severe DOS, but let’s face it these can only be mitigated by the ISP anyways…
>
> What are we aiming for? something that will show the problem clearly so that fixes can be put in the right place? or a work-around to use in the meantime?
Mmmh, I aim for decent internet connections for home-users like myself. It would be great if ISPs could use their leverage on equipment manufacturers to implement the current state of the art solution in broadband gear; realistically, even if this were to start today we would still face a long transition time, so I am all for putting the smarts into home-routers. At least the end user has enough incentive to put in the (small amount of) work required to mitigate bad buffer management...
>
> I think both need to be pursued, but we need to be clear on what is being done for each one.
I have no connection into telcos, ISPs, or OEMs, so all I can help with is getting the “work-around” in good shape and ready for deployment. Arguably convincing ISPs might be more important.
>
> If having BQL+fq_codel with defaults would solve the problem if it was on the right routers, we need to show that.
I think Dave has pretty much shown this. Note though that it is rather traffic shaping and fq_codel, BQL would be needed in the DSL drivers on both sides of the link.
>
> Then, because we can't get the fixes on the right routers and need to work-around the problem by artificially becoming the bottleneck, we need to show that the 95% that we shape to is throwing away 5% of your capacity and make that clear to the users.
I think if you google for “router qos” you will find plenty of pages already describing the rationale and the bandwidth sacrifice required, so that knowledge might already be public.
>
> otherwise we will risk getting to the point where it will never get fixed because the ISPs will look at their routers and say that bufferbloat can't possibly be a problem as they never have large queues (because we are doing the workarounds).
Honestly, for an ISP the best solution is us shaping our connections, as that reduces the worst case bandwidth use per user and might allow higher oversubscription. We need to find economic incentives for ISPs to implement BQL equivalents in the broadband gear. In theory it should give a competitive advantage to be able to advertise better gaming/voip suitability, but many users really have no real choice of ISP. I could imagine that with the big push away from switched circuit telephony to voip, even for carriers, ISPs might get more interested in improving VOIP resilience and usability under load...
>
>
>>> and are you measuring the latency impact when uploading or downloading?
>>
>> No I measure the impact of latency of saturating both up- and downlink, pretty much the worst case scenario.
>
> I think we need to test this in each direction independently.
Rich Brown has made a nice script to test that, betterspeedtest.sh at https://github.com/richb-hanover/CeroWrtScripts
For figuring out the required shaping point it is easier to work on both “legs” independently, But to assess worst case behavior I think both directions need to be saturated.
There is a pretty good description of a quick bufferloat test on http://www.bufferbloat.net/projects/cerowrt/wiki/Quick_Test_for_Bufferbloat
>
> Cerowrt can do a pretty good job of keeping the uplink from being saturated, but it can't do a lot for the downlink.
Well, except it does. Downlink shaping is less reliable than uplink shaping. Most traffic sources, TCP or UDP, actually need to deal with the variable bandwidth of the internet anyway and implement some congestion control that treats packet loss as a congestion signal. So the downlink shaping mostly works okay (even though I think Dave recommends shaping downlink more aggressively than 95% of link rate).
>
>>>
>>> I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers.
>>
>> No that is too low an aim, this still is not useable for real time applications, we should aim for base RTT plus 10ms. (For very slow links we need to cut some slack but for > 3Mbps 10ms should be achievable )
>
> perfect is the enemy of good enough.
Sure but really according to http://www.hh.se/download/18.70cf2e49129168da015800094780/7_7_delay.pdf we only have a 400ms budget for acceptable voip (I would love real psychophysics papers for that instead of cisco marketing material), or 200ms one-way delay. With ~170ms RTT to the west coast (from a university wired network, so no ADSL delay involved) almost half of the budget is used up in a way that cannot be fixed easily. (It takes 66ms for light to travel the distance of half the earth’s circumference, or 132ms RTT; assuming c(fiber) = 0.7 * c(vacuum) it is rather 95ms one-way or 190ms RTT.) With ~100ms RTT from each end there is barely enough time left for data processing and transcoding.
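
A quick check of those propagation numbers (idealized great-circle path, no routing detours or queueing):

#!/usr/bin/env python3
# Quick check of the propagation numbers above (idealized great-circle path,
# no routing detours or queueing).
C_VACUUM_KM_S = 299_792
HALF_CIRCUMFERENCE_KM = 20_000  # roughly half of Earth's ~40,000 km equator

for label, speed in (("vacuum", C_VACUUM_KM_S), ("fiber ", 0.7 * C_VACUUM_KM_S)):
    one_way_ms = HALF_CIRCUMFERENCE_KM / speed * 1000
    print(f"{label}: {one_way_ms:.0f} ms one-way, {2 * one_way_ms:.0f} ms RTT")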
>
> There's achievable if every router is tuned to exactly the right conditions and there's achievable for coarse settings that can be widely deployed. Get the second out while continuing to work on making the first easier.
Okay so that is easy: if you massively overshape, latency will be great, but bandwidth is compromised...
>
> residential connections only come in a smallish number of sizes,
Except that with, say, DSL there is often a wide corridor for allowed sync speed, e.g. the 50Mbps down / 10Mbps up vdsl2 package of DT actually will synchronize in a corridor of 50 to 27Mbps and 10 to 5.5 Mbps (numbers are approximately right). That is almost a factor of 2, too much for a one-size-fits-all approach (say 90% of advertised speed).
> it shouldn't be too hard to do a few probes and guess which size is in use, then set the bandwidth to 90% of that standard size and you should be pretty good without further tuning.
No, with ATM carriers (ADSL, some VDSL) the encapsulation overhead ranges from ~10% to >50% depending on packet size, so to get the bottleneck queue reliably under our control we would need to shape to ~50% of link speed, obviously a very hard sell. (And it is not easy to figure out whether the bottleneck link uses ATM or not, so there is no one size fits all.) We currently have no easy and quick way of detecting ATM link layers from cerowrt...
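
For illustration, the 48-byte quantization such an ATM link layer produces, which a detector could in principle fit against measured best-case ping RTTs (the uplink rate below is an arbitrary example):

#!/usr/bin/env python3
# Illustration of the quantization an ATM link layer produces: the per-packet
# transmission time only grows when the payload spills into another 48-byte
# cell.  The uplink rate is an arbitrary example; a real detector would fit
# measured best-case ping RTTs against payload size and look for this period.
import math

UPLINK_KBPS = 1024

def tx_time_ms(ip_bytes, atm=True):
    wire = math.ceil((ip_bytes + 8) / 48) * 53 if atm else ip_bytes
    return wire * 8 / UPLINK_KBPS

for size in range(40, 200, 16):
    print(f"{size:3d} B  atm={tx_time_ms(size):5.2f} ms  raw={tx_time_ms(size, atm=False):5.2f} ms")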
>
>>> It's far better than sustained multi-second ping times which is what I've seen with stock setups.
>>
>> True, but compared to multi seconds even <1000ms would be a really great improvement, but also not enough.
>>
>>>
>>> but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation?
>>
>> Well ~95ms with outliers > 1000ms, just as bad as no estimate. I shaped 5% below rated speed as reported by the DSL modem, so disabling the ATM link layer adjustments (as shown in the broken lines in the plot), basically increased the effective shaped rate by ~13% or to effectively 107% of line rate, your proposal would be line rate and no link layer adjustments or effectively 110% of line rate; I do not feel like repeating this experiment right now as I think the data so far shows that even with less misjudgment the bloat effect is fully visible ) Not accounting for ATM framing carries a ~10% cost in link speed, as ATM packet size on the wire increases by >= ~10%.
>
> so what if you shape to 90% of rated speed (no allowance for ATM vs other transports)?
I have not done that, but the typical recommendation for ADSL links for shaping without taking the link layer peculiarities into account is 85% (which should work for large packets, but can easily melt down with lots of smallish packets, like voip calls). I repeat, there is no simple one-size-fits-all shaping that will solve the buffer bloat issue for most home-users in an acceptable fashion. (And I am not talking perfect here, it simply is not good enough.) Note that 90% will just account for the 48-in-53 ATM transport cost, it will not take the increased per-packet header into account.
>
>>> It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were runing fq_codel on that router.
>>
>> If the DSL modem would be debloated at least on upstream no shaping would be required any more; but that does not fix the need for downstream shaping (and bandwidth estimation) until the head end gear is debloated..
>
> right, I was forgetting this earlier.
>
>>> As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flashing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly.
>>
>> But as there are only very few DSL modems with open sources (especially of the DSL chips) this is just as esoteric ;) Really, if equipment manufacturers could be convinced to take these issues seriously and actually fix their gear, that would be best. But this does not look like it is happening on the fast track. (Even DOCSIS developer cable labs punted on requiring codel or fq_codel in DOCSIS modems since they think that the required timestamps are too “expensive” on the device class they want to use for modems. They opted for PIE, much better than what we have right now but far away from my latency under load increase of 10ms...)
>>
>>>
>>>>> I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are ‘close'
>>>>
>>>> Well if we include outliers (and we should as enough outliers will degrade the FPS and voip suitability of an otherwise responsive system quickly) stock and improper shaping are in the >1000ms worst case range, while proper SQM bounds this to 100ms.
>>>>
>>>>>
>>>>> assuming that latency under load without the improvements got >1000ms
>>>>>
>>>>> fast-slow (in ms)
>>>> ideal=10
>>>> untuned=43
>>>> bloated > 1000
>>>>
>>>> The sign seems off as fast < slow? I like this best ;)
>>>
>>> yep, I reversed fast/slow in all of these
>>>
>>>>>
>>>>> fast/slow
>>>>> ideal = 1.25
>>>>> untuned = 1.83
>>>>> bloated > 19
>>>>
>>>> But Fast < Slow and hence this ratio should be <0?
>>>
>>> 1 not 0, but yes, this is really slow/fast
>>>
>>>>> slow/fast
>>>>> ideal = 0.8
>>>>> untuned = 0.55
>>>>> bloated = 0.05
>>>>>
>>>>
>>>> and this >0?
>>>
>>> and this is really fast/slow
>>
>>
>> What about taking the latency difference and rescaling it with a reference time, like say the time a photon would take to travel once around the equator, or the earth's diameter?
>
> how about latency difference scaled by the time to send one 1500 byte packet at the measured throughput?
So you propose latency difference / time to send one full packet at the measured speed
Not sure: think of two de-bloated setups, one fast, one slow: for the slow link we get 10ms divided by a long packet time, for the fast link 10ms divided by a short one; so assuming that both keep the 10ms average latency increase, why should the two links show a different bloat measure?
I really think the raw latency difference is what we should convince the users to look at. All one-number measures are going to be too simplistic, but at least for the difference you can easily estimate the effect on RTTs for relevant traffic...
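
The numbers behind that objection, with two arbitrary example rates:

#!/usr/bin/env python3
# Numbers behind the objection: the same 10ms of extra delay scores very
# differently once divided by the time to send one 1500-byte packet at the
# measured rate.  The two rates are arbitrary examples.
PACKET_BITS = 1500 * 8
DELTA_MS = 10.0

for rate_mbps in (2, 100):
    packet_time_ms = PACKET_BITS / (rate_mbps * 1e6) * 1000
    print(f"{rate_mbps:3d} Mbit/s: packet time {packet_time_ms:5.2f} ms, "
          f"raw delta {DELTA_MS} ms, scaled measure {DELTA_MS / packet_time_ms:6.1f}")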
>
> This would factor out the data rate and would not be affected by long distance links.
I am not convinced that people on a slow link can afford latency increases any better than people on a fast link. I actually think that it is the other way round. During the tuning process your measure might be helpful to find a good tradeoff between bandwidth and latency increase though.
Best Regards
Sebastian
>
> David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-27 1:04 ` David Lang
@ 2014-07-27 11:38 ` Sebastian Moeller
2014-08-01 4:51 ` Michael Richardson
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-07-27 11:38 UTC (permalink / raw)
To: David Lang; +Cc: Wes Felter, cerowrt-devel
On Jul 27, 2014, at 03:04 , David Lang <david@lang.hm> wrote:
> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
[...]
>>
>>>
>>> Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefor MUCH slower
>>
>> Except for my ICMP RTT measurements I still saw quantization steps in accordance with the expected best case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements...
>
> yeah, I have to remind myself of the "perfect is the enemy of good enough" frequently as well. I tend to fall into that trap pretty easily, as this discussion has shown :-)
>
> ping is easy to test. As a thought, is the response time of NTP queries any more or less stable?
No idea? How would you test this (any command line to try?). The good thing with the ping is that often even the DSLAM responds, keeping external sources (i.e. hops further away in the network) of variability out of the measurement...
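
One way to try it would be timing a minimal SNTP query; a sketch, where pool.ntp.org and the five samples are arbitrary choices and a real comparison would aim at the same nearby hop the pings use:

#!/usr/bin/env python3
# Time a minimal SNTP (mode 3) query over UDP.  pool.ntp.org and the five
# samples are arbitrary choices; a real comparison would aim at the same
# nearby hop the pings use.
import socket, time

def ntp_query_rtt_ms(host="pool.ntp.org", timeout=2.0):
    pkt = b"\x1b" + 47 * b"\0"        # LI=0, VN=3, Mode=3 (client request)
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    t0 = time.monotonic()
    s.sendto(pkt, (host, 123))
    s.recvfrom(512)
    rtt = (time.monotonic() - t0) * 1000
    s.close()
    return rtt

if __name__ == "__main__":
    print("RTTs (ms):", [round(ntp_query_rtt_ms(), 1) for _ in range(5)])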
>
>>>>>>> One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
>>>>>>>
>>>>>>> Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
>>>>>>
>>>>>> But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
>>>>>
>>>>> negotiated bandwidth and effective bandwidth are not the same
>>>>>
>>>>> what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
>>>>
>>>> In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment the DSLAM uplink is so congested because of oversubscription of the DSLAM, that now this constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonable shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event).
>>>
>>> for DSL you are correct, it's a point-to-point connection (star network topology), but we have other technologies used in homes that are shared-media bus topology networks. This includes cablemodems and wireless links.
>>
>> Well, yes I understand, but you again would assume that the cable ISP tries to provision the system so that most users are happy, so congestion is not the rule? Even then I think cable guarantees some minimum rates per user, no? With wireless it is worse in that RF events outside of the ISP and end users control can ruin the day.
>
> guarantee is too strong a word. It depends on how much competition there is.
>
> 15 years or so ago I moved from a 3Mb cablemodem to a 128K IDSL line and saw my performance increase significantly.
I used to think exactly the same, but currently I tend to think that the difference is about how well managed a node is, not so much the access technology; with DSL the shared medium is the link connecting the DSLAM to the backbone, and if this is congested it is similar to a busy cable node. In both cases the ISP needs to make sure the shared segment’s congestion is well managed. It might be that DSLAMs are typically better managed, as telcos always dealt with interactive (bi-directional) traffic while cable traditionally was a one-directional transport. So I assume both have different traditions about provisioning. I could be off my rocker here ;)
>
>>>>> for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
>>>>
>>>> http://wiki.openwrt.org/toh/tp-link/td-w8970 or
>>>
>>> no 5GHz wireless?
>>
>> Could be, but definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router does not need HTB traffic shaping on uplink anymore. I might actually go that route since I really dislike my ISP primary router, but I digress...
>>
>>>
>>>> http://www.traverse.com.au/products ?
>>>
>>> I couldn't figure out where to buy one through their site.
>>
>> Maybe they only sell in AU, I guess I just wanted to be helpful,
>>
>>>
>>>> If you had the DSL modem in the router under cerowrts control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
>>>>
>>>>>
>>>>> If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
>>>>
>>>> Not exactly meaningless, if gives you an upper bound...
>>>
>>> true, but is an upper bound good enough? How close does the estimate need to be?
>>
>> If we end up recommending people using say binary search to find the best tradeoff (maximizing throughput while keeping the maximum latency under load increase bounded to say 10ms) we should have an idea where to start, so bit to large is fine as a starting point. Traditionally the recommendation was around 85% of link rates, but that never came with a decent justification or data.
>
> well, if we are doing a binary search, having the initial estimate off by a lot isn't actually going to hurt much, we'll still converge very quickly on the right value
Yes, but we still need to solve the question of what infrastructure to test against ;)
>
>>>
>>> and does it matter if both sides are doing fq_codel or is this still in the mode of trying to control the far side indirectly?
>>
>> Yes, this is only relevant as long as both sides of the bottleneck link are not de-bloated. But it does not look like DSLAMs/CMTs will change any time soon from the old ways...
>
> yep, I had been forgetting this.
>
>>>
>>>>> In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
>>>>
>>>> One more case for measuring the link speed continuously!
>>>
>>> at what point does the measuring process interfere with the use of the link? or cause other upstream issues.
>>
>> If my measuring by sparse stream idea works out the answer to both questions is not much ;)
>>
>>>
>>>>>>> Other requirements or restrictions?
>>>>>>
>>>>>> I think the measurement should be fast and continuous…
>>>>>
>>>>> Fast yes, because we want to impact the network as little as possible
>>>>>
>>>>> continuous?? I'm not so sure. Do conditions really change that much?
>>>>
>>>> You just gave an example above for changing link conditions, by shared media...
>>>
>>> but can you really measure fast enough to handle shared media? at some point you need to give up measuring because by the time you have your measurement it's obsolete.
>>
>> So this is not going to work well on a wifi WLAN with wildly fluctuating rates (see Dave’s upcoming project make-wifi-fast) but for a typical cable node where congestion changes over the day as a function of people being at home it might be fast enough.
>>
>>>
>>> If you look at networking with a tight enough timeframe, it's either idle or 100% utilized depending on if a bit is being sent at that instant, however a plot at that precision is worthless :-)
>>
>> Yes I think a moving average over some time would be required.
>>
>>>
>>>>> And as I ask in the other thread, how much does it hurt if your estimates are wrong?
>>>>
>>>> I think I sent a plot to that regard.
>>>
>>> yep, our mails are crossing
>>>
>>>>>
>>>>> for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
>>>>
>>>> Wireless as in point 2 point links or in wifi?
>>>
>>> both, point-to-point is variable based on weather, trees blowing in the wind, interference, etc. Wifi has a lot more congestion, so interference dominates everything else.
>>
>> So maybe that is a different kettle of fish then.
>
> I think we need to get a simple, repeatable test together and then have people start using it and reporting what they find and the type of connection they are on, otherwise we are speculating from far too little data.
So Rich Brown’s betterspeedtest.sh is a simple test, at least for the crowd of people involved in the buffer bloat discussion right now. I always love to see more data, especially I would be interested to see data from VDSL1 lines and GPON fiber lines…
Best Regards
Sebastian
>
> David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 11:30 ` Sebastian Moeller
2014-07-26 20:39 ` David Lang
@ 2014-08-01 4:21 ` Michael Richardson
2014-08-01 18:28 ` Sebastian Moeller
1 sibling, 1 reply; 51+ messages in thread
From: Michael Richardson @ 2014-08-01 4:21 UTC (permalink / raw)
To: cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 1183 bytes --]
On symmetric links, particularly PPP ones, one can use the LCP layer to do
echo requests to the first layer-3 device. This can be used to measure RTT
and through some math, the bandwidth.
On asymmetric links, my instinct is that if you can measure the downlink
speed through another mechanism, that one might be able to subtract, but I
can't think exactly how right now.
I'm thinking that one can observe the downlink speed by observing packet
arrival times/sizes for awhile --- the calculation might be too low if the
sender is congested otherwise, but the average should go up slowly.
At first, this means that subtracting the downlink bandwidth from the uplink
bandwidth will, I think, result in too high an uplink speed, which will
result in rate limiting to a too high value, which is bad.
But, is there something wrong with my notion?
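
A minimal take on the passive observation idea above, assuming a Linux box where the receive byte counter can be sampled; the interface name and timings are examples:

#!/usr/bin/env python3
# Minimal take on the passive observation idea: sample the kernel's rx byte
# counter and remember the busiest interval seen so far as a lower bound on
# downlink capacity.  Linux-only; "eth0" and the timings are examples.
import time

IFACE = "eth0"

def rx_bytes():
    with open(f"/sys/class/net/{IFACE}/statistics/rx_bytes") as f:
        return int(f.read())

def observe(seconds=60, interval=1.0):
    best_kbps = 0.0
    prev = rx_bytes()
    for _ in range(int(seconds / interval)):
        time.sleep(interval)
        cur = rx_bytes()
        best_kbps = max(best_kbps, (cur - prev) * 8 / 1000 / interval)
        prev = cur
    return best_kbps

if __name__ == "__main__":
    print(f"busiest interval so far: {observe():.0f} kbit/s")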
My other notion is that the LCP packets could be time stamped by the PPP(oE)
gateway, and this would solve the asymmetry. This would take an IETF action
to make standard and a decade to get deployed, but it might be a clearly
measureable marketing win for ISPs.
--
Michael Richardson
-on the road-
[-- Attachment #2: Type: application/pgp-signature, Size: 489 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-26 11:18 ` Sebastian Moeller
2014-07-26 20:21 ` David Lang
@ 2014-08-01 4:40 ` Michael Richardson
1 sibling, 0 replies; 51+ messages in thread
From: Michael Richardson @ 2014-08-01 4:40 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 1945 bytes --]
Sebastian Moeller <moeller0@gmx.de> wrote:
>> The trouble is that to measure bandwidth, you have to be able to send
>> and receive a lot of traffic.
> Well that is what you typically do, but you can get away with less
> measurement traffic: in an ideal quiescent network sending two packets
> back to back should give you the bandwidth (packet size / incoming time
> difference of both packets), or send two packets of different size
> (needs synchronized clocks, then difference of packet sizes /
> difference of transfer times).
Apparently common 802.1ah libraries in most routers can do speed tests at
layer-2 for ethernet doing exactly this. (Apparently, one vendor's code is
in 90% of equipment out there, because some of this stuff involves intimate
knowledge of PHYs and MII buses, and it's not worth anyone's time to write
the code over again vs licensing it...)
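
The packet-pair idea quoted above, reduced to arithmetic with made-up timestamps:

#!/usr/bin/env python3
# The packet-pair estimate quoted above, reduced to arithmetic: two equal-size
# packets sent back to back arrive separated by the transmission time of one
# packet on the bottleneck link.  The timestamps are made-up example values.
def packet_pair_kbps(packet_bytes, t_first_s, t_second_s):
    gap_s = t_second_s - t_first_s
    return packet_bytes * 8 / gap_s / 1000

# e.g. 1500-byte packets arriving 1.2 ms apart -> ~10 Mbit/s bottleneck
print(f"{packet_pair_kbps(1500, 0.0000, 0.0012):.0f} kbit/s")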
> But this still requires some service on the other side. You could try
> to use ICMP packets, but these will only allow to measure RTT not
> one-way delays (if you do this on ADSL you will find the RTT dominated
> by the typically much slower uplink path). If network equipment would
And correct me if I'm wrong, if you naively divide by two, you wind up
overestimating the uplink speed.
>> you can't just test that link, you have to connect to something beyond
>> that.
> So it would be sweet if we could use services that are running on the
> machines anyway, like ping. That way the “load” of all the leaf nodes
> of the internet continuously measuring their bandwidth could be handled
> in a distributed fashion avoiding melt-downs by synchronized
> measurement streams…
sadly, ICMP responses are rate limited, even when they are implemented in the
fast path. PPP's LCP is not, AFAIK.
--
Michael Richardson
-on the road-
[-- Attachment #2: Type: application/pgp-signature, Size: 489 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-07-27 11:38 ` Sebastian Moeller
@ 2014-08-01 4:51 ` Michael Richardson
2014-08-01 18:04 ` Sebastian Moeller
0 siblings, 1 reply; 51+ messages in thread
From: Michael Richardson @ 2014-08-01 4:51 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
[-- Attachment #1.1: Type: text/plain, Size: 972 bytes --]
Sebastian Moeller <moeller0@gmx.de> wrote:
> No idea? How would you test this (any command line to try?). The good
> thing with the ping is that often even the DSLAM responds, keeping
> external sources (i.e. hops further away in the network) of variability
> out of the measurement...
With various third-party-internet-access ("TPIA" in Canada), the DSLAM
is operated by the incumbent (monopoly) telco, and the layer-3 first hop
is connected via PPPoE-VLAN or PPP/L2TP. The incumbent telco has significant
incentive to make the backhaul network as congested and bufferbloated as
possible, and to mis-crimp cables so that the DSL resyncs at different speeds
regularly... my incumbent telco's commercial LAN extension salesperson
proudly told me how they never drop packets, even when their links are
congested!!!
The Third Party ISP has a large incentive to deploy equipment that supports
whatever "bandwidth measurement" service we might cook up.
[-- Attachment #1.2: Signature --]
[-- Type: text/plain, Size: 54 bytes --]
--
Michael Richardson
-on the road-
[-- Attachment #2: Type: application/pgp-signature, Size: 489 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-08-01 4:51 ` Michael Richardson
@ 2014-08-01 18:04 ` Sebastian Moeller
2014-08-02 20:17 ` Michael Richardson
0 siblings, 1 reply; 51+ messages in thread
From: Sebastian Moeller @ 2014-08-01 18:04 UTC (permalink / raw)
To: Michael Richardson; +Cc: Wes Felter, cerowrt-devel
Hi MIchael,
On Aug 1, 2014, at 06:51 , Michael Richardson <mcr@sandelman.ca> wrote:
>
> Sebastian Moeller <moeller0@gmx.de> wrote:
>> No idea? How would you test this (any command line to try)? The good
>> thing with the ping is that often even the DSLAM responds, keeping
>> external sources (i.e. hops further away in the network) of variability
>> out of the measurement...
>
> With various third-party-internet-access ("TPIA" in Canada), the DSLAM
> is operated by the incumbent (monopoly) telco, and the layer-3 first hop
> is connected via PPPoE-VLAN or PPP/L2TP.
So they “own” the copper lines connecting each customer to the DSLAM? And everybody else just rents their DSL service and resells them? Do they really connect to the DSLAM or to the BRAS?
> The incumbent telco has significant
> incentive to make the backhaul network as congested and bufferbloated as
> possible, and to mis-crimp cables so that the DSL resyncs at different speeds
> regularly…
I think in Germany the incumbent has to either rent out the copper lines to competitors (who can put their own line cards in DSLAMs backed by their own back-bone) or rent “bit-stream” access, that is, the incumbent handles the DSL part on both ends and passes the traffic on either at the next central office or at specific transit points. I always assumed competitors renting these services would get much better guarantees than end-customers, but it seems in Canada the incumbent has found more ways to evade efficient regulation.
> my incumbent telco's commercial LAN extension salesperson
> proudly told me how they never drop packets, even when their links are
> congested!!!
I really hope this is the opinion of a sales person and not the network operators who really operate the gear in the “field”. On the other hand, having sufficient buffering in the DSLAM to never have to drop a packet sounds quite manly (and a terrible waste of otherwise fine DRAM chips) ;)
>
> The Third Party ISP has a large incentive to deploy equipment that supports
> whatever "bandwidth measurement" service we might cook up.
As much as I would like to think otherwise, the only way to get a BMS (bandwidth measurement service) into the field is if all national regulators require it by law (or maybe if the ITU would bake it into the next xDSL standard that the DSLAM has to report the current line speeds, say via SNMP, back to all downstream devices asking for them). But I am not holding my breath…
Best Regards
Sebastian
>
> --
> Michael Richardson
> -on the road-
>
>
>
>
>
>
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-08-01 4:21 ` Michael Richardson
@ 2014-08-01 18:28 ` Sebastian Moeller
0 siblings, 0 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-08-01 18:28 UTC (permalink / raw)
To: Michael Richardson; +Cc: cerowrt-devel
HI Michael,
On Aug 1, 2014, at 06:21 , Michael Richardson <mcr@sandelman.ca> wrote:
>
> On symmetric links, particularly PPP ones, one can use the LCP layer to do
> echo requests to the first layer-3 device. This can be used to measure RTT
> and through some math, the bandwidth.
Sure.
>
> On assymetric links, my instinct is that if you can measure the downlink
> speed through another mechanism, that one might be able to subtract, but I
> can't think exactly how right now.
> I'm thinking that one can observe the downlink speed by observing packet
> arrival times/sizes for awhile --- the calculation might be too low if the
> sender is congested otherwise, but the average should go up slowly.
If you go this route, I would rather look at the minimum delay between incoming packets as a function of the size of the second packet.
>
> At first, this means that subtracting the downlink bandwidth from the uplink
> bandwidth will, I think, result in too high an uplink speed, which will
> result in rate limiting to a too high value, which is bad.
Given all the uncertainties, finding the proper shaping bandwidths is an iterative process right now anyway, but one that is best started with a decent initial guess. My thinking is that with a binary search I would want to definitely see decent latency under load after the first reduction...
>
>
> But, if there something wrong with my notion?
>
> My other notion is that the LCP packets could be time stamped by the PPP(oE)
> gateway, and this would solve the asymmetry.
If both devices are time-synchronized to a close enough delta, that would be great. Initial testing with ICMP timestamp requests makes me doubt the quality of synchronization (at least right now).
> This would take an IETF action
> to make standard and a decade to get deployed, but it might be a clearly
> measureable marketing win for ISPs.
But if the “grown ups” can be made to act, wouldn’t we rather see nice end-user-queryable SNMP information about the current up- and downlink rates (and at what protocol level, e.g. 2400Mbps down, 1103Kbps up ATM carrier)? (For all I know the DSLAMs/BRASes might already support this.)
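If such objects were ever exposed to the customer, reading them would be a
one-liner. The object names below come from the ADSL-LINE-MIB (RFC 2662); the
address and community string are placeholders, and whether a given modem or
DSLAM actually answers them is exactly the open question:

  # hypothetical query for the current sync rates
  # (ATU-C transmit = downstream, ATU-R transmit = upstream)
  snmpget -v2c -c public 192.168.1.1 \
      ADSL-LINE-MIB::adslAtucChanCurrTxRate.1 \
      ADSL-LINE-MIB::adslAturChanCurrTxRate.1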
Best Regards
Sebastian
>
> --
> Michael Richardson
> -on the road-
>
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-08-01 18:04 ` Sebastian Moeller
@ 2014-08-02 20:17 ` Michael Richardson
0 siblings, 0 replies; 51+ messages in thread
From: Michael Richardson @ 2014-08-02 20:17 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: Wes Felter, cerowrt-devel
Sebastian Moeller <moeller0@gmx.de> wrote:
>> Sebastian Moeller <moeller0@gmx.de> wrote:
>>> No idea? How would you test this (any command line to try)? The good
>>> thing with the ping is that often even the DSLAM responds, keeping
>>> external sources (i.e. hops further away in the network) of
>>> variability out of the measurement...
>>
>> With various third-party-internet-access ("TPIA" in Canada), the DSLAM
>> is operated by the incumbent (monopoly) telco, and the layer-3 first
>> hop is connected via PPPoE-VLAN or PPP/L2TP.
> So they “own” the copper lines connecting each customer to the DSLAM?
> And everybody else just rents their DSL service and resells them? Do
> they really connect to the DSLAM or to the BRAS?
correct, the copper continues to be regulated; the incumbent was given a
guaranteed 11-14% profit on that service for the past 75 years...
Third parties get an NNI to the incumbent in a data centre.
1) for bridged ethernet DSL service ("HSA" in Bell Canada land), each
customer shows up to the ISP in a VLAN tag.
2) for PPPoE DSL service, the traffic comes in a specific VLAN, over
IP (RFC1918) via L2TP.
Other parties can put copper in the ground, and in some parts of Canada, this
has occurred. Also worth mentioning that
AlbertaGovernmentTelephone/EdmontonTel/BCTel became "TELUS", and then left
the Stentor/Bell-Canada alliance, so Bell can be the third party in the west,
while Telus is the third party in the centre, and Island/Aliant/NBTel/Sasktel
remain government owned... and they actually do different things as a result.
> I think in Germany the incumbent has to either rent out the copper
> lines to competitors (who can put their own lines cards in DSLAMs
> backed by their own back-bone) or rent “bit-stream” access that is the
> incumbent handles the DSL part on both ends and passes the traffic
> either in the next central office or at specific transit points. I
> always assumed competitors renting these services would get much better
> guarantees than end-customers, but it seems in Canada the incumbent has
> more found ways to evade efficient regulation.
This option exists, but the number of CLECs is large, and the move towards
VDSL2 / Fiber-To-The-Neighbourhood (with much shorter copper options!!) means
that this is impractical.
>> my incumbent telco's commercial LAN extension salesperson proudly told
>> me how they never drop packets, even when their links are congested!!!
> I really hope this is the opinion of a sales person and not the
> network operators who really operate the gear in the “field”. On the
> other hand having sufficient buffering in the DSLAM to never having to
> drop a packet sounds quite manly (and a terrible waste of otherwise
> fine DRAM chips) ;)
I think much of the buffer is the legacy Nortel Passport 15K that ties much
of the system together...
>> The Third Party ISP has a large incentive to deploy equipment that
>> supports whatever "bandwidth measurement" service we might cook up.
> As much as I would like to think otherwise, the only way to get a BMS
> in the field is if all national regulators require it by law (well
> maybe if ITU would bake it into the next xDSL standard that the DSLAM
> has to report current line speeds as per SNMP? back to all down stream
> devices asking for it). But I am not holding my breath…
My position is that if there isn't a technical specification, no regulation
could possibly follow...
--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] mcr@sandelman.ca http://www.sandelman.ca/ | ruby on rails [
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-24 14:12 R.
2014-05-24 17:31 ` Sebastian Moeller
@ 2014-05-24 19:05 ` David P. Reed
1 sibling, 0 replies; 51+ messages in thread
From: David P. Reed @ 2014-05-24 19:05 UTC (permalink / raw)
To: R., cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 2259 bytes --]
Depends on the type of the provider. Most providers now have shared paths to the backbone among users and give a peak rate up and down for brief periods that they will not sustain... In fact they usually penalize use of the peak rate by reducing the rate after that.
So at what point they create bloat in their access net is hard to determine. And it depends on your neighbors' behavior as well.
The number you want is the bloatedness of your path through the access provider.
This is measurable by sending small probes back and forth to a measurement server... measuring instantaneous latency in each direction and combining that information with one's recent history in a non-trivial calculation.
Note that that measurement does not directly produce provider speeds that can be input to the shapers used in codel. But it does produce a queue size that can.
So it's a plausible way to proceed as long as the operators refuse to fix their gear to manage the actual link that is problematic.
Personally I'd suggest that the gear makers' feet be held to the fire... by not "fixing" it by an inferior fix at the home router. Keep the pressure on them at IETF and among their customers.
On May 24, 2014, "R." <redag2@gmail.com> wrote:
>>> I should point out that another issue with deploying fq_codel widely
>is that it requires an accurate measurement (currently) of the
>providers bandwidth.
>
>Pardon my noobiness, but is there a technical obstacle that prevents
>the creation of a user-triggered function on the router side that
>measures the provider's bandwidth?
>
>Function, when (luci-gui?) triggered, would:
>
>1. Ensure that internet connectivity is present.
>2. Disconnect all clients.
>3. Engage in DL and UL on a dedicated web server, measure stats and
>straight up use them in fq_codel -- or suggest them in appropriate
>QoS-gui user-boxes.
>
>Further, this function could be auto-scheduled or made enabled on
>router boot up.
>
>I must be missing something important which prevents this. What is it?
>_______________________________________________
>Cerowrt-devel mailing list
>Cerowrt-devel@lists.bufferbloat.net
>https://lists.bufferbloat.net/listinfo/cerowrt-devel
-- Sent from my Android device with K-@ Mail. Please excuse my brevity.
[-- Attachment #2: Type: text/html, Size: 2781 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-24 14:12 R.
@ 2014-05-24 17:31 ` Sebastian Moeller
2014-05-24 19:05 ` David P. Reed
1 sibling, 0 replies; 51+ messages in thread
From: Sebastian Moeller @ 2014-05-24 17:31 UTC (permalink / raw)
To: R.; +Cc: cerowrt-devel
Hi R, hi List,
On May 24, 2014, at 16:12 , "R." <redag2@gmail.com> wrote:
>>> I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth.
>
> Pardon my noobiness, but is there a technical obstacle that prevents
> the creation of a user-triggered function on the router side that
> measures the provider's bandwidth?
>
> Function, when (luci-gui?) triggered, would:
>
> 1. Ensure that internet connectivity is present.
> 2. Disconnect all clients.
> 3. Engage in DL and UL on a dedicated web server, measure stats and
> straight up use them in fq_codel -- or suggest them in appropriate
> QoS-gui user-boxes.
>
> Further, this function could be auto-scheduled or made enabled on
> router boot up.
>
> I must be missing something important which prevents this. What is it?
Well, I see a couple of challenges that need to be overcome before this could work.
In your step 3 you touch on the issue of measuring the current stats, which is somehow trickier than one would think:
1) what to measure precisely, a "dedicated web server" sounds like a great idea, but who is dedicating it and where is it located relative to the link under test?
Rich Brown has made a nice script to measure current throughput and give an estimate on the effect of link saturation on latency (see betterspeedtest.sh from https://github.com/richb-hanover/CeroWrtScripts), but using this from Germany gives:
2014-05-24 15:44:47 Testing against demo.tohojo.dk with 5 simultaneous sessions while pinging gstatic.com (60 seconds in each direction)
Download: 12.06 Mbps
Upload: 1.99 Mbps
against a server in Europe, but:
Download: 10.42 Mbps
Upload: 1.85 Mbps
against a server on the east side of the USA. So the router would need to select a close-by server. Sites such as speedtest.net offer this kind of server selection by proximity, but do not have a very reliable way to load the link and do not measure the effect of link saturation on the latency… yet the whole idea is to find the highest bandwidth that does not cause an indecent increase of latency under load. (Also, speed tests are quite stereotypic in observable behavior and length, so some ISPs special-case these to look good; but that is a different kettle of fish…)
Note that there is also the question of where one would like to measure the link speed; for example, for DSL there is the link to the DSLAM, the link from the DSLAM to the next network node, and sometimes a PPP link to a remote BRAS system (that might throttle the traffic). All of these can be the bottleneck of the ISP connection (depending on circumstances). My take is that one would like to treat the link between modem and DSLAM as the bottleneck, but opinions differ (and then there is cable with its shared first segment...).
2) Some links have quite peculiar properties that are hard to deduce from quick speed tests. For example ATM-based ADSL links (this includes all ADSL1, ADSL2 and to my knowledge all existing ADSL2+ links) will show a packet-size dependent link speed. In short, ATM uses an integer number of 48-byte cells to transport each packet, so in the worst case it adds 47 bytes to the payload; for a small packet that can effectively double the size of the packet on the wire, or, stated differently, halve the link speed for packets of that size. (Note that thanks to the work of Jesper Brouer and Russel Stuart the linux kernel can take care of that issue for you, but you need to tell the kernel explicitly; see the small tc sketch below.)
3) many links actually do not have a constant wire speed available. For docsis (basically cable) the local segment is shared between many users and transmit timeslots are shared between requestors, giving effectively slower links during peak hours. For DSL a resync between DSLAM and modem can (significantly) change the negotiated speed; something cerowrt does not get any notice of…
I guess buffer bloat mitigation needs to move into the modems and DSLAMs to get rid of the bandwidth guessing game. For cable at least the modems are getting better (thanks to PIE being part of the docsis 3.1? standard), but for DSL I do not think there is any generic solution on the horizon…
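To make the ATM cell arithmetic from point 2 concrete, and to show the knob the
kernel exposes for it, here is a rough, illustrative tc sketch using the stab
size table with linklayer atm (roughly what the sqm-scripts automate). The
interface name, the 40-byte overhead and the 1800kbit rate are placeholders,
not recommendations:

  # each IP packet occupies ceil((size + per-packet overhead) / 48) ATM cells
  # of 53 bytes; e.g. a 64-byte packet with 40 bytes of overhead needs 3 cells,
  # i.e. 159 bytes on the wire, about 2.5x its nominal size
  tc qdisc add dev pppoe-wan root handle 1: stab \
      mtu 2048 tsize 128 overhead 40 linklayer atm \
      htb default 10
  tc class add dev pppoe-wan parent 1: classid 1:10 htb rate 1800kbit
  tc qdisc add dev pppoe-wan parent 1:10 fq_codel

The shaper then charges each packet for whole 53-byte cells instead of its
nominal IP size.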
Best Regards
Sebastian
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
@ 2014-05-24 14:12 R.
2014-05-24 17:31 ` Sebastian Moeller
2014-05-24 19:05 ` David P. Reed
0 siblings, 2 replies; 51+ messages in thread
From: R. @ 2014-05-24 14:12 UTC (permalink / raw)
To: cerowrt-devel
>> I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth.
Pardon my noobiness, but is there a technical obstacle that prevents
the creation of a user-triggered function on the router side that
measures the provider's bandwidth?
Function, when (luci-gui?) triggered, would:
1. Ensure that internet connectivity is present.
2. Disconnect all clients.
3. Engage in DL and UL on a dedicated web server, measure stats and
straight up use them in fq_codel -- or suggest them in appropriate
QoS-gui user-boxes.
Further, this function could be auto-scheduled or made enabled on
router boot up.
I must be missing something important which prevents this. What is it?
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 17:57 ` Jim Gettys
@ 2014-05-21 18:31 ` Dave Taht
0 siblings, 0 replies; 51+ messages in thread
From: Dave Taht @ 2014-05-21 18:31 UTC (permalink / raw)
To: Jim Gettys; +Cc: Frits Riep, cerowrt-devel
On Wed, May 21, 2014 at 10:57 AM, Jim Gettys <jg@freedesktop.org> wrote:
>
>
>
> On Wed, May 21, 2014 at 1:56 PM, <dpreed@reed.com> wrote:
>>
>> On Wednesday, May 21, 2014 1:53pm, "Dave Taht" <dave.taht@gmail.com> said:
>>
>>
>>
>> > Or we can take a break, and write books about how we learned to relax
>> > and
>> > stop worrying about the bloat.
>>
>>
>>
>> Leading to waistline bloat?
>
>
> We resemble that remark already....
>
I put on 35 pounds since starting to work on this.
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 15:07 ` Dave Taht
2014-05-21 16:50 ` Michael Richardson
@ 2014-05-21 17:58 ` David Lang
1 sibling, 0 replies; 51+ messages in thread
From: David Lang @ 2014-05-21 17:58 UTC (permalink / raw)
To: Dave Taht; +Cc: Frits Riep, cerowrt-devel
On Wed, 21 May 2014, Dave Taht wrote:
> On Wed, May 21, 2014 at 4:42 AM, Frits Riep <riep@riepnet.com> wrote:
>> Thanks Dave for your responses. Based on this, it is very good that
>> qos-scripts is available now through openwrt, and as I experienced, it
>> provides a huge advantage for most users.
>
> I should point out that another issue with deploying fq_codel widely is that
> it requires an accurate measurement (currently) of the providers bandwidth.
does it need this accurate measurement for sending or for the receiving pacing?
David Lang
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 17:56 ` dpreed
@ 2014-05-21 17:57 ` Jim Gettys
2014-05-21 18:31 ` Dave Taht
0 siblings, 1 reply; 51+ messages in thread
From: Jim Gettys @ 2014-05-21 17:57 UTC (permalink / raw)
To: David P Reed; +Cc: Frits Riep, cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 353 bytes --]
On Wed, May 21, 2014 at 1:56 PM, <dpreed@reed.com> wrote:
> On Wednesday, May 21, 2014 1:53pm, "Dave Taht" <dave.taht@gmail.com> said:
>
>
>
> > Or we can take a break, and write books about how we learned to relax and
> > stop worrying about the bloat.
>
>
>
> Leading to waistline bloat?
>
We resemble that remark already....
[-- Attachment #2: Type: text/html, Size: 1213 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 17:53 ` Dave Taht
@ 2014-05-21 17:56 ` dpreed
2014-05-21 17:57 ` Jim Gettys
0 siblings, 1 reply; 51+ messages in thread
From: dpreed @ 2014-05-21 17:56 UTC (permalink / raw)
To: Dave Taht; +Cc: Frits Riep, cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 224 bytes --]
On Wednesday, May 21, 2014 1:53pm, "Dave Taht" <dave.taht@gmail.com> said:
> Or we can take a break, and write books about how we learned to relax and
> stop worrying about the bloat.
Leading to waistline bloat?
[-- Attachment #2: Type: text/html, Size: 570 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 16:30 ` Dave Taht
@ 2014-05-21 17:55 ` dpreed
0 siblings, 0 replies; 51+ messages in thread
From: dpreed @ 2014-05-21 17:55 UTC (permalink / raw)
To: Dave Taht; +Cc: Frits Riep, cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 4596 bytes --]
The end-to-end argument against putting functionality in the network is a modularity principle, as you know. The exception is when there is a function that you want to provide that is not strictly end-to-end.
Congestion is one of them - there is a fundamental issue with congestion that it happens because of collective actions among independent actors.
So if you want to achieve the goals of the modularity principle, you need to find either a) the minimal sensing and response you can put in the network that allows the independent actors to "cooperate", or b) require the independent actors to discover and communicate amongst each other individually.
Any solution that tries to satisfy the modularity principle has the property that it provides sufficient information, in a sufficiently timely manner, for the independent actors to respond "cooperatively" to resolve the issue (by reducing their transmission volume in some - presumably approximately fair - way).
Sufficiently timely is bounded by the "draining time" of a switch's outbound link's queue. For practical applications of the Internet today, the "draining time" should never exceed about 30-50 msec., at the outbound link's rate. However, the optimal normal depth of the queue should be no larger than the size needed to keep the outbound link continuously busy at its peak rate whatever that is (for a shared WiFi access point the peak rate is highly variable as you know).
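For a feel of what that budget means in bytes (purely illustrative numbers,
just the 50 msec figure above applied to a few example link rates):

  # a 50 ms draining-time budget expressed as a maximum standing queue
  awk 'BEGIN { budget = 0.050
               for (rate = 1e6; rate <= 100e6; rate *= 10) {
                 printf "%4.0f Mbit/s -> max standing queue of %.0f kB\n",
                        rate / 1e6, rate * budget / 8 / 1e3
               } }'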
This suggests that the minimal function the network must provide to the endpoints is the packet's "instantaneous" contribution to the draining time of the most degraded link on the path.
Given this information, a pair of endpoints know what to do. If it is a receiver-managed windowed protocol like TCP, the window needs to be adjusted to minimize the contribution to the "draining time" of the currently bottlenecked node, to stop pipelined flows from its sender as quickly as possible.
In that case, cooperative behavior is implicit. The bottleneck switch needs only to inform all independent flows of their contribution, and with an appropriate control loop on each node, approximate fairness can result.
And this is the most general approach. Switches have no idea of the "meaning" of the flows, so beyond timely and accurate reporting, they can't make useful decisions about fixing congestion.
Note that this all is an argument about architectural principles and the essence of the congestion problem.
I could quibble about whether fq_codel is the simplest or best choice for the minimal functionality an "internetwork" could provide. But it's pretty nice and simple. Not clear it works for a decentralized protocol like WiFi as a link - but something like it would seem to be the right thing.
On Wednesday, May 21, 2014 12:30pm, "Dave Taht" <dave.taht@gmail.com> said:
> On Wed, May 21, 2014 at 9:03 AM, <dpreed@reed.com> wrote:
> > In reality we don't disagree on this:
> >
> >
> >
> > On Wednesday, May 21, 2014 11:19am, "Dave Taht" <dave.taht@gmail.com>
> said:
> >
> >>
> >
> >> Well, I disagree somewhat. The downstream shaper we use works quite
> >> well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
> >> has had the inbound shaper work up a little past 100mbits. So there is
> >> no need (theoretically) to upgrade the big fat head ends if your cpe is
> >> powerful enough to do the job. It would be better if the head ends did
> it,
> >> of course....
> >>
> >
> >
> >
> > There is an advantage for the head-ends doing it, to the extent that each
> > edge device has no clarity about what is happening with all the other cpe
> > that are sharing that head-end. When there is bloat in the head-end even if
> > all cpe's sharing an upward path are shaping themselves to the "up to" speed
> > the provider sells, they can go into serious congestion if the head-end
> > queues can grow to 1 second or more of sustained queueing delay. My
> > understanding is that head-end queues have more than that. They certainly
> > do in LTE access networks.
>
> Compelling argument! I agree it would be best for the devices that have the
> most information about the network to manage themselves better.
>
> It is deeply ironic to me that I'm arguing for an e2e approach on fixing
> the problem in the field, with you!
>
> http://en.wikipedia.org/wiki/End-to-end_principle
>
> >
> >
>
>
>
> --
> Dave Täht
>
> NSFW:
> https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
>
[-- Attachment #2: Type: text/html, Size: 6199 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 17:47 ` Jim Gettys
@ 2014-05-21 17:53 ` Dave Taht
2014-05-21 17:56 ` dpreed
0 siblings, 1 reply; 51+ messages in thread
From: Dave Taht @ 2014-05-21 17:53 UTC (permalink / raw)
To: Jim Gettys; +Cc: Frits Riep, cerowrt-devel
On Wed, May 21, 2014 at 10:47 AM, Jim Gettys <jg@freedesktop.org> wrote:
>
>
>
> On Wed, May 21, 2014 at 12:03 PM, <dpreed@reed.com> wrote:
>>
>> In reality we don't disagree on this:
>>
>>
>>
>> On Wednesday, May 21, 2014 11:19am, "Dave Taht" <dave.taht@gmail.com>
>> said:
>>
>> >
>>
>> > Well, I disagree somewhat. The downstream shaper we use works quite
>> > well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
>> > has had the inbound shaper work up a little past 100mbits. So there is
>> > no need (theoretically) to upgrade the big fat head ends if your cpe is
>> > powerful enough to do the job. It would be better if the head ends did
>> > it,
>> > of course....
>> >
>>
>>
>>
>> There is an advantage for the head-ends doing it, to the extent that each
>> edge device has no clarity about what is happening with all the other cpe
>> that are sharing that head-end. When there is bloat in the head-end even if
>> all cpe's sharing an upward path are shaping themselves to the "up to" speed
>> the provider sells, they can go into serious congestion if the head-end
>> queues can grow to 1 second or more of sustained queueing delay. My
>> understanding is that head-end queues have more than that. They certainly
>> do in LTE access networks.
>
>
> I have measured 200ms on a 28Mbps LTE quadrant to a single station. This
> was using the simplest possible test on an idle cell. Easy to see how that
> can grow to the second range.
>
> Similarly, Dave Taht and I took data recently that showed a large downstream
> buffer at the CMTS end (line card?), IIRC, it was something like .25
> megabyte, using a UDP flooding tool.
No, it was twice that. The udpburst tool is coming along nicely, but still
needs some analytics against the departure rate to get it right.
> As always, there may be multiple different buffers lurking in these complex
> devices, which may only come into play when different parts of them
> "bottleneck", just as we found many different buffering locations inside of
> Linux. In fact, some of these devices include Linux boxes (though I do not
> know if they are on the packet forwarding path or not).
>
> Bandwidth shaping downstream of those bottlenecks can help, but only to a
> degree, and I believe primarily for "well behaved" long lived elephant
> flows. Offload engines on servers and ack coalescing in various equipment
> limit the degree of help; in particular, transient behavior such as
> opening a bunch of TCP connections simultaneously and downloading the
> elements of a web page is, I believe, likely to put large bursts of packets
> into these queues, causing transient poor latency. I think we'll get a bit
> of help out of the packet pacing code that recently went into Linux (for
> well behaved servers) as it deploys. Thanks to Eric Dumazet for that work!
> Ironically, servers get updated much more frequently than these middle
> boxes, as far as I can tell.
>
> Somehow we gotta get the bottlenecks in these devices (broadband & cellular)
> to behave better.
Or we can take a break, and write books about how we learned to relax and
stop worrying about the bloat.
> - Jim
>
>>
>>
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
>
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 16:03 ` dpreed
2014-05-21 16:30 ` Dave Taht
@ 2014-05-21 17:47 ` Jim Gettys
2014-05-21 17:53 ` Dave Taht
1 sibling, 1 reply; 51+ messages in thread
From: Jim Gettys @ 2014-05-21 17:47 UTC (permalink / raw)
To: David P Reed; +Cc: Frits Riep, cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 3039 bytes --]
On Wed, May 21, 2014 at 12:03 PM, <dpreed@reed.com> wrote:
> In reality we don't disagree on this:
>
>
>
> On Wednesday, May 21, 2014 11:19am, "Dave Taht" <dave.taht@gmail.com>
> said:
>
> >
>
> > Well, I disagree somewhat. The downstream shaper we use works quite
> > well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
> > has had the inbound shaper work up a little past 100mbits. So there is
> > no need (theoretically) to upgrade the big fat head ends if your cpe is
> > powerful enough to do the job. It would be better if the head ends did
> it,
> > of course....
> >
>
>
>
> There is an advantage for the head-ends doing it, to the extent that each
> edge device has no clarity about what is happening with all the other cpe
> that are sharing that head-end. When there is bloat in the head-end even if
> all cpe's sharing an upward path are shaping themselves to the "up to"
> speed the provider sells, they can go into serious congestion if the
> head-end queues can grow to 1 second or more of sustained queueing delay.
> My understanding is that head-end queues have more than that. They
> certainly do in LTE access networks.
>
I have measured 200ms on a 28Mbps LTE quadrant to a single station. This
was using the simplest possible test on an idle cell. Easy to see how that
can grow to the second range.
Similarly, Dave Taht and I took data recently that showed a large
downstream buffer at the CMTS end (line card?), IIRC, it was something like
.25 megabyte, using a UDP flooding tool.
As always, there may be multiple different buffers lurking in these complex
devices, which may only come into play when different parts of them
"bottleneck", just as we found many different buffering locations inside of
Linux. In fact, some of these devices include Linux boxes (though I do not
know if they are on the packet forwarding path or not).
Bandwidth shaping downstream of those bottlenecks can help, but only to a
degree, and I believe primarily for "well behaved" long lived elephant
flows. Offload engines on servers and ack coalescing in various equipment
limit the degree of help; in particular, transient behavior such as
opening a bunch of TCP connections simultaneously and downloading the
elements of a web page is, I believe, likely to put large bursts of packets
into these queues, causing transient poor latency. I think we'll get a bit
of help out of the packet pacing code that recently went into Linux (for
well behaved servers) as it deploys. Thanks to Eric Dumazet for that work!
Ironically, servers get updated much more frequently than these middle
boxes, as far as I can tell.
Somehow we gotta get the bottlenecks in these devices (broadband &
cellular) to behave better.
- Jim
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>
[-- Attachment #2: Type: text/html, Size: 4941 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 15:07 ` Dave Taht
@ 2014-05-21 16:50 ` Michael Richardson
2014-05-21 17:58 ` David Lang
1 sibling, 0 replies; 51+ messages in thread
From: Michael Richardson @ 2014-05-21 16:50 UTC (permalink / raw)
To: Dave Taht; +Cc: Frits Riep, cerowrt-devel
Dave Taht <dave.taht@gmail.com> wrote:
> I should point out that another issue with deploying fq_codel widely
> is that it requires an accurate
> measurement (currently) of the providers bandwidth.
I've been thinking about ways to do this over PPP(oE) links if one controls
both ends --- many third party internet access ISPs terminate the PPP
on their equipment, rather than the telco's, so it should be possible
to avoid all the L2 issues.
My ISP now offers fiber-to-the-neighbourhood, 50Mb/s down, 10 up.
(vs 7/640 that I have now). They are offering me an
http://smartrg.com/products/products/sr505n/
which they suggest I run in bridge (layer-2) mode. I'm trying to figure out
what is inside, as it has the DSL interface right on it. I didn't know
of this device before.
> My hope/expectation is that more ISPs that
> provide CPE will ship something that is configured correctly by
> default, following in free.fr's footsteps,
> and trying to beat the cable industry to the punch, now that the core
> code is debugged and documented, creating an out-of-box win.
Agreed.
--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] mcr@sandelman.ca http://www.sandelman.ca/ | ruby on rails [
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 16:03 ` dpreed
@ 2014-05-21 16:30 ` Dave Taht
2014-05-21 17:55 ` dpreed
2014-05-21 17:47 ` Jim Gettys
1 sibling, 1 reply; 51+ messages in thread
From: Dave Taht @ 2014-05-21 16:30 UTC (permalink / raw)
To: David Reed; +Cc: Frits Riep, cerowrt-devel
On Wed, May 21, 2014 at 9:03 AM, <dpreed@reed.com> wrote:
> In reality we don't disagree on this:
>
>
>
> On Wednesday, May 21, 2014 11:19am, "Dave Taht" <dave.taht@gmail.com> said:
>
>>
>
>> Well, I disagree somewhat. The downstream shaper we use works quite
>> well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
>> has had the inbound shaper work up a little past 100mbits. So there is
>> no need (theoretically) to upgrade the big fat head ends if your cpe is
>> powerful enough to do the job. It would be better if the head ends did it,
>> of course....
>>
>
>
>
> There is an advantage for the head-ends doing it, to the extent that each
> edge device has no clarity about what is happening with all the other cpe
> that are sharing that head-end. When there is bloat in the head-end even if
> all cpe's sharing an upward path are shaping themselves to the "up to" speed
> the provider sells, they can go into serious congestion if the head-end
> queues can grow to 1 second or more of sustained queueing delay. My
> understanding is that head-end queues have more than that. They certainly
> do in LTE access networks.
Compelling argument! I agree it would be best for the devices that have the
most information about the network to manage themselves better.
It is deeply ironic to me that I'm arguing for an e2e approach on fixing
the problem in the field, with you!
http://en.wikipedia.org/wiki/End-to-end_principle
>
>
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 15:19 ` Dave Taht
@ 2014-05-21 16:03 ` dpreed
2014-05-21 16:30 ` Dave Taht
2014-05-21 17:47 ` Jim Gettys
0 siblings, 2 replies; 51+ messages in thread
From: dpreed @ 2014-05-21 16:03 UTC (permalink / raw)
To: Dave Taht; +Cc: Frits Riep, cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]
In reality we don't disagree on this:
On Wednesday, May 21, 2014 11:19am, "Dave Taht" <dave.taht@gmail.com> said:
>
> Well, I disagree somewhat. The downstream shaper we use works quite
> well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
> has had the inbound shaper work up a little past 100mbits. So there is
> no need (theoretically) to upgrade the big fat head ends if your cpe is
> powerful enough to do the job. It would be better if the head ends did it,
> of course....
>
There is an advantage for the head-ends doing it, to the extent that each edge device has no clarity about what is happening with all the other cpe that are sharing that head-end. When there is bloat in the head-end even if all cpe's sharing an upward path are shaping themselves to the "up to" speed the provider sells, they can go into serious congestion if the head-end queues can grow to 1 second or more of sustained queueing delay. My understanding is that head-end queues have more than that. They certainly do in LTE access networks.
[-- Attachment #2: Type: text/html, Size: 1647 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 14:51 ` dpreed
@ 2014-05-21 15:19 ` Dave Taht
2014-05-21 16:03 ` dpreed
0 siblings, 1 reply; 51+ messages in thread
From: Dave Taht @ 2014-05-21 15:19 UTC (permalink / raw)
To: David Reed; +Cc: Frits Riep, cerowrt-devel
On Wed, May 21, 2014 at 7:51 AM, <dpreed@reed.com> wrote:
> Besides deployment in cerowrt and openwrt, what would really have high
> leverage is that the techniques developed in cerowrt's exploration
> (including fq_codel) get deployed where they should be deployed: in the
> access network systems: CMTS's, DSLAM's, Enterprise boundary gear, etc. from
> the major players.
+10.
>
>
>
> Cerowrt's fundamental focus has been proving that the techniques really,
> really work at scale.
That they even work on a processor designed in 1990! :)
I also have hoped that along the way we've shown what techniques don't
work...
> However, the fundamental "bloat-induced" experiences are actually occurring
> due to bloat at points where "fast meets slow". Cerowrt can't really fix
> the problem in the download direction (currently not so bad because of high
> download speeds relative to upload speeds in the US - that's in the CMTS's
> and DSLAM's.
Well, I disagree somewhat. The downstream shaper we use works quite
well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter
has had the inbound shaper work up a little past 100mbits. So there is
no need (theoretically) to upgrade the big fat head ends if your cpe is
powerful enough to do the job. It would be better if the head ends did it,
of course....
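For readers who have not seen it spelled out, that downstream shaper is
roughly an ingress redirect to an IFB device with HTB plus fq_codel on top.
A minimal sketch follows; the interface name and the 48mbit figure are
placeholders (sqm-scripts/qos-scripts set this up with all the trimmings):

  # redirect inbound traffic to ifb0 and shape it there with htb + fq_codel
  modprobe ifb numifbs=1
  ip link set dev ifb0 up
  tc qdisc add dev eth0 handle ffff: ingress
  tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 \
      action mirred egress redirect dev ifb0
  tc qdisc add dev ifb0 root handle 1: htb default 10
  tc class add dev ifb0 parent 1: classid 1:10 htb rate 48mbit  # a bit below line rate
  tc qdisc add dev ifb0 parent 1:10 fq_codel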
>
>
>
> What's depressing to me is that the IETF community spends more time trying
> to convince themselves that bloat is only a theoretical problem, never
> encountered in the field. In fact, every lab I've worked at (including the
It isn't all the IETF. Certainly google gets it and has made huge strides.
reduced RTT = money.
My own frustration comes from papers that are testing this stuff at 4mbit
or lower and not seeing the results we get above that, on everything.
https://plus.google.com/u/0/107942175615993706558/posts/AbeHRY9vzLR
ns2 and ns3 could use some improvements...
> startup accelerator where some of my current company work) has had the
> network managers complaining to me that a single heavy FTP I'm running
> causes all of the other users in the site to experience terrible web
> performance. But when they call Cisco or F5 or whomever, they get told
> "there's nothing to do but buy complicated flow-based traffic management
> boxes to stick in line with the traffic (so they can "slow me down").
It is sad that F5, in particular, doesn't have a sane solution. Their whole
approach is to have a "load-balancer" and fq_codel is a load-balancer to
end all load balancers.
I do note nobody I know has ported BQL or fq_codel to bsd (codel is in bsd now)
>
>
>
> Bloat is the most common invisible elephant on the Internet. Just fixing a
+10.
> few access points is a start, but even if we fix all the access points so
> that uploads interfere less, there's still more impact this one thing can
> have.
I was scared silly at the implications 2 years back, I am more sanguine
now.
>
>
>
> So, by all means get this stuff into mainstream, but it's time to start
> pushing on the access network technology companies (and there are now open
> switches from Cumulus and even Arista to hack)
Oh, cool! I keep waiting for my parallella to show up so I could start
fiddling with ethernet in the fpga....
>
>
>
>
>
>
>
> On Wednesday, May 21, 2014 7:42am, "Frits Riep" <riep@riepnet.com> said:
>
>> Thanks Dave for your responses. Based on this, it is very good that
>> qos-scripts
>> is available now through openwrt, and as I experienced, it provides a huge
>> advantage for most users. I would agree prioritizing ping is in and of
>> itself not
>> the key goal, but based on what I've read so far, fq-codel provides
>> dramatically
>> better responsiveness for any interactive application such as
>> web-browsing, voip,
>> or gaming, so qos-scripts would be advantageous for users like your mom
>> if she
>> were in an environment where she had a slow and shared internet
>> connection. Is
>> that a valid interpretation? I am interested in further understanding the
>> differences based on the brief comparison you provide. It is true that
>> few
>> devices provide DSCP marking, but if the latency is controlled for all
>> traffic,
>> latency sensitive traffic benefits tremendously even without prioritizing
>> by l7
>> (layer 7 ?). Is this interpretation also valid?
>>
>> Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but
>> if it
>> were set up for her, or if it could be incorporated into a consumer router
>> with
>> automatically determining speed parameters, she would benefit totally from
>> the
>> performance improvement. So the technology ultimately needs to be taken
>> mainstream, and yes that is a huge task.
>>
>> Frits
>>
>> -----Original Message-----
>> From: Dave Taht [mailto:dave.taht@gmail.com]
>> Sent: Tuesday, May 20, 2014 7:14 PM
>> To: Frits Riep
>> Cc: cerowrt-devel@lists.bufferbloat.net
>> Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize
>> bufferbloat
>> control for consideration.
>>
>> On Tue, May 20, 2014 at 3:11 PM, Frits Riep <riep@riepnet.com> wrote:
>> > The concept of eliminating bufferbloat on many more routers is quite
>> > appealing. Reading some of the recent posts makes it clear there is a
>> > desire to get to a stable code, and also to find a new platform
>> > beyond the current Netgear. However, as good as some of the proposed
>> > platforms maybe for developing and for doing all of the new
>> > capabilities of CeroWRT, I also would like to propose that there also
>> > be some focus on reaching a wider and less sophisticated audience to
>> > help broaden the awareness and make control of bufferbloat more
>> > available and
>> easier to attain for more users.
>>
>> I agree that reaching more users is important. I disagree we need to reach
>> them
>> with cerowrt. More below:
>>
>> >
>> >
>> > · It appears there is a desire to merge the code into an
>> upcoming
>> > OpenWRT barrier breaker release, which is excellent as it will make it
>> > easier to fight buffer bloat on a wide range of platforms and provide
>> > users with a much easier to install firmware release. I’d like to be
>> > able to download luci-qos-scripts and sqm-scripts and have basic
>> > bufferbloat control on a much greater variety of devices and to many
>> > more users. From an awareness perspective this would be a huge win.
>> > Is the above scenario what is being planned, is it likely to happen in
>> > the
>> reasonable future?
>>
>> Yes, I'd submitted sqm for review upstream, got back a few comments.
>> Intend to
>> resubmit again when I get a chance.
>>
>> >
>> > · From my perspective, it would be ideal to have this
>> available to
>> > many users in a more affordable platform, something like an 8mb flash
>> > router like the TP-Link WDR-4300, which is otherwise a very capable
>> > router with dual channels and good performance.
>> >
>> > · (I’ve managed to set up such a WDR-4300, with OpenWRT
>> trunk,
>> > figured how to telnet and install Luci, then luci-app-qos, and
>> > qos-scripts and I thought the bufferbloat control was remarkable.)
>> > How much better would it be if I were able to use luci-qos-scripts and
>> sqm-scripts instead?
>>
>> You can easily add the .ipk files for sqm-scripts and luci-app-sqm to any
>> release
>> of openwrt. They are just scripts. They need some optional kernel modules
>> and
>> tools.
>>
>> I regard the qos-scripts as pretty good - the core differences from sqm
>> are
>>
>> qos vs sqm
>> ---------------
>> both use fq_codel. :)
>> hfsc vs htb # A wash, hfsc mostly behaves like htb
>> ping optimized vs de-optimized # optimizing for ping looks good in benchmarks but it's silly in the real world
>> (l7) classification vs dscp # clear win to qos here, nearly nothing uses dscp
>> no framing compensation vs comprehensive framing compensation # win here for sqm
>> no alternate queue models vs many alternate queue models # with fq_codel the winner, who cares?
>> fits in 4mb flash vs barely fits in 4mb flash
>>
>> The real killer problem for qos-scripts, for me, was that they didn't do
>> ipv6. I'd
>> like to see that fixed, at the very least.
>>
>>
>> >
>> > · For these target users, they need simplicity, good
>> performance,
>> > ease of setup and affordability. They are not able to deal with the
>> > routing between subnets on wireless, IPv6 setup, and any complexities
>> > introduced by DNSSEC. Marketing the advantages of bufferbloat alone
>> > requires lots of education and publicity (and as we have seen there
>> > are many misleading posts by seemingly persuasive nay-sayers that it is
>> > all
>> smoke and mirrors.
>>
>> Well, my intent is to make the successful bits of technology widely
>> available.
>> They are widely available. And being adopted everywhere. Win.
>>
>> As for the additional complexities, well, they will get less complex over
>> time.
>>
>> In one respect, they are a stake in the ground. I have high hopes for the
>> eventual
>> success of hnetd and mdns proxy services, although they are alpha and
>> nearly
>> unusable right now, some are making substantial investments into them.
>>
>> In another the additional complexities of cero - like routing vs bridging
>> - are
>> there to further the research into fixing wifi technologies - which we
>> haven't
>> really even started yet. I'm increasingly convinced we need to do
>> make-wifi-fast
>> as a separate, focused project, building on a stable base.
>>
>> > · Would it be possible to have a simplified release of CeroWRT
>> (in
>> > terms of setup, and features), make It available for a reliable and
>> > affordable platform, and publicize it and have it reach hopefully a
>> > much wider audience? This could potentially be through the OpenWRT
>> > channels.
>>
>> Possible yes. Affordable, no. Given that this has been a nearly full time
>> project
>> for me, for the last 38 months, with nearly zero revenue, I have no intent
>> or
>> interest in gaining anything other than knowledgable, clued users that
>> want to
>> advance the state of the art. My mom doesn't run cerowrt, nor do I want
>> her to.
>>
>> If someone dropped ~1m/year on the project, that could change, but at
>> present
>> levels of funding I'd be better off working at mcdonalds. Even if funding
>> appeared
>> from the sky I'd rather spend it on R&D than GUI...
>>
>> Certainly IF there was some cost model that made sense, awesome! let's go
>> for
>> world domination!
>>
>> I continue to pursue the grant
>> route, but the only thing that resonates even slightly with potential
>> funders is
>> not speed but security issues, which give me nightmares. Another model
>> that works
>> is actually making and selling a router, but that requires up front
>> capital and
>> entry into a very tight, profit-limited market.
>>
>> Biggest problem we have is supporting ONLY one router, even-semi-well, is
>> a PITA.
>>
>> Adding a new one costs more. I'm now on my 4th day of trying to make the
>> archer
>> work. That's 6k of my life I'll never have back. And the ath10k in it
>> sucks, and
>> working to make that work well is not something I want to be doing due to
>> the
>> binary blob wifi firmware.
>>
>> I'm all in favor of handing off future cerowrt development to a nonprofit
>> of
>> interested users, and sitting back and focusing on fixing just the bits I
>> care
>> about, if anyone is interested in forming one...
>>
>> > · Part of the reason why Tomato had been so popular is that
>> the
>> > firmware upgrade, install, configuration, and management was well
>> > within the capabilities of the average weekend hacker, and there were
>> > compelling features and reliability vs the factory firmwares at the
>> > time.
>>
>> Yep. dd-wrt is the same. And various downstream users like buffalo, meraki
>> etc.
>>
>> I'm totally happy that they exist and have a working market.
>>
>> > · Even installing OpenWRT, especially Trunk, and finding,
>> > downloading and enabling packages, while very powerful, and flexible,
>> > is still quite complex to someone who does not spend a lot of time
>> > reading wiki’s and release notes.
>>
>> Yes, CeroWrt is an improvement on OpenWrt in that regard. But it isn't
>> enough.
>> Doing serious UI improvements and simplification IS necessary, and that's
>> not my
>> bag. The EFF is making noises about doing something with the front end of
>> openwrt
>> and/or cero in the next year or so (see their owtech list for more
>> details), that
>> also goes after the security issue.
>>
>> > I’d be interested in feedback on these thoughts.
>>
>> There you go. I LOVE that we have a happy userbase, and love what we've
>> accomplished, and have loved being here to help make it happen, and love
>> that lots
>> of people want to get it more out there to more people, it's gratifying as
>> hell,
>> and there are a lot of negatives too, like chasing bugs for months on
>> end...
>>
>> ... but after we freeze, I need a vacation and to do something else for a
>> while.
>>
>> I'm presently planning on spending the summer working on something that
>> pays, and
>> on improving ns3 with the GSOC, and testing a deployment of cerowrts on a
>> modest
>> scale, and working on a new/improved rate limiter integrated with
>> fq_codel. And
>> only updating cero for CVEs or major new features.
>>
>> That's a full plate.
>>
>> If someone else wants to step up to maintain or continue to push cerowrt
>> forward
>> in some direction or another, I'm all for it.
>>
>> It's kind of my hope a clear winner on the chipset front will emerge and
>> we can
>> move to that, but even if that happens it will be months and months before
>> it
>> could be considered stable...
>>
>>
>>
>> >
>> >
>> > Frits Riep
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > Cerowrt-devel mailing list
>> > Cerowrt-devel@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/cerowrt-devel
>> >
>>
>>
>>
>> --
>> Dave Täht
>>
>> NSFW:
>>
>> https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 11:42 ` Frits Riep
2014-05-21 14:51 ` dpreed
@ 2014-05-21 15:07 ` Dave Taht
2014-05-21 16:50 ` Michael Richardson
2014-05-21 17:58 ` David Lang
1 sibling, 2 replies; 51+ messages in thread
From: Dave Taht @ 2014-05-21 15:07 UTC (permalink / raw)
To: Frits Riep; +Cc: cerowrt-devel
On Wed, May 21, 2014 at 4:42 AM, Frits Riep <riep@riepnet.com> wrote:
> Thanks Dave for your responses. Based on this, it is very good that qos-scripts is available now through openwrt, and as I experienced, it provides a huge advantage for most users.
I should point out that another issue with deploying fq_codel widely
is that it requires an accurate
measurement (currently) of the providers bandwidth.
My hope/expectation is that more ISPs that
provide CPE will ship something that is configured correctly by
default, following in free.fr's footsteps,
and trying to beat the cable industry to the punch, now that the core
code is debugged and documented, creating an out-of-box win.
> I would agree prioritizing ping is in and of itself not the key goal, but based on what I've read so far, fq-codel provides dramatically better responsiveness for any interactive application such as web-browsing, voip, or gaming, so qos-scripts would be advantageous for users like your mom if she were in an environment where she had a slow and shared internet connection. Is that a valid interpretation?
Sure. My mom has a fast, non-shared internet connection. Her biggest
problem is she hasn't
got off of windows despite my brother's decade of attempts to move her
to osx.... :)
Markets where this stuff seriously applies as a rate limiter + qos system
today are small to medium business, cybercafes, shared workspaces,
geek-zones, and so on. It also applies on ethernet and in cases where
you want to artificially have a rate limit like:
http://pieknywidok.blogspot.com/2014/05/10g-1g.html
We're ~5 years ahead of the curve here at cerowrt-central. Tools "just
work" for any sysadmin with chops. Commercial products are in the
pipeline.
While it takes time to build it into a product, I'd kind of expect
barracuda and ubnt to add fq_codel
to their products fairly soon, and for at least one switch vendor to follow.
It's in shorewall, ipfire, streamboost, everything downstream from openwrt,
linux mainline (and thus every linux distro) already. I know of a
couple cloud providers that are running
sch_fq and fq_codel already.
One thing I'm a little frustrated about is that I'd expected sch_fq
to replace pfifo_fast by default
on more linux distros by now. It's a single sysctl...
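For anyone who wants to flip it, a minimal sketch, assuming a 3.12 or newer
kernel where sch_fq and the default_qdisc knob exist:

    sysctl -w net.core.default_qdisc=fq      # qdiscs attached from now on use sch_fq
    echo 'net.core.default_qdisc = fq' > /etc/sysctl.d/90-default-qdisc.conf
    tc qdisc replace dev eth0 root fq        # interfaces that already exist need an explicit replace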
> I am interested in further understanding the differences based on the brief differences you provide. It is true that few devices provide DSCP marking, but if the latency is controlled for all traffic, latency sensitive traffic benefits tremendously even without prioritizing by l7 (layer 7 ?). Is this interpretation also valid?
Very, very true. Most of the need for prioritization goes away
entirely, due to the "sparse" vs "full" (or fast vs slow) queue
concept in fq_codel. In most circumstances things like voip just cut
through other traffic like butter. Videoconferencing is vastly
improved, also.
However, on very, very slow links (<3mbit), nothing helps enough. It's
not just the qos system that needs to be tuned: modern TCPs
and the web are optimized for much faster links and have features that
hurt at low speeds. (What helps most is installing Adblock Plus!)
Torrent is something of a special case - I find it totally bearable at
20mbit/4mbit without classification - but unbearable at 8/1.
I'm pretty satisfied we have the core algorithms and theory in place,
now, to build edge devices that work much better at 3mbit to 200mbit,
at least, possibly 10gbit or higher.
> Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it were set up for her, or if it could be incorporated into a consumer router with automatically determining speed parameters,
That automatic speedtest thing turns out to be hard.
>she would benefit totally from the performance improvement.
Meh. She needs to get off of windows.
>So the technology ultimately needs to be taken mainstream, and yes that is a huge task.
Yep. If we hadn't given away everything perhaps there would be a
business model to fund that - streamboost is trying that route.
My hope was that the technology is merely so compelling that vendors
would be falling over themselves to answer the customer complaints.
But few have tied "bufferbloat" to the problems gamers and small
business are having with their internet uplinks as yet and more
education and demonstration seems necessary.
There is a huge backlog of potential demand for a better dslam, in
particular, as well as better firewalls and cablemodems. I don't have
a lot of hope for the two CMTS vendors to move to improve things
anytime soon.
> Frits
>
> -----Original Message-----
> From: Dave Taht [mailto:dave.taht@gmail.com]
> Sent: Tuesday, May 20, 2014 7:14 PM
> To: Frits Riep
> Cc: cerowrt-devel@lists.bufferbloat.net
> Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
>
> On Tue, May 20, 2014 at 3:11 PM, Frits Riep <riep@riepnet.com> wrote:
>> The concept of eliminating bufferbloat on many more routers is quite
>> appealing. Reading some of the recent posts makes it clear there is a
>> desire to get to a stable code, and also to find a new platform
>> beyond the current Netgear. However, as good as some of the proposed
>> platforms maybe for developing and for doing all of the new
>> capabilities of CeroWRT, I also would like to propose that there also
>> be some focus on reaching a wider and less sophisticated audience to
>> help broaden the awareness and make control of bufferbloat more available and easier to attain for more users.
>
> I agree that reaching more users is important. I disagree we need to reach them with cerowrt. More below:
>
>>
>>
>> · It appears there is a desire to merge the code into an upcoming
>> OpenWRT barrier breaker release, which is excellent as it will make it
>> easier to fight buffer bloat on a wide range of platforms and provide
>> users with a much easier to install firmware release. I’d like to be
>> able to download luci-qos-scripts and sqm-scripts and have basic
>> bufferbloat control on a much greater variety of devices and to many
>> more users. From an awareness perspective this would be a huge win.
>> Is the above scenario what is being planned, is it likely to happen in the reasonable future?
>
> Yes, I'd submitted sqm for review upstream, got back a few comments. Intend to resubmit again when I get a chance.
>
>>
>> · From my perspective, it would be ideal to have this available to
>> many users in a more affordable platform, something like an 8mb flash
>> router like the TP-Link WDR-4300, which is otherwise a very capable
>> router with dual channels and good performance.
>>
>> · (I’ve managed to set up such a WDR-4300, with OpenWRT trunk,
>> figured how to telnet and install Luci, then luci-app-qos, and
>> qos-scripts and I thought the bufferbloat control was remarkable.)
>> How much better would it be if I were able to use luci-qos-scripts and sqm-scripts instead?
>
> You can easily add the .ipk files for sqm-scripts and luci-app-sqm to any release of openwrt. They are just scripts. They need some optional kernel modules and tools.
>
> I regard the qos-scripts as pretty good - the core differences from sqm are
>
> qos vs sqm
> ---------------
> both use fq_codel. :)
> hfsc vs htb # A wash, hfsc mostly behaves like htb
> ping optimized vs de-optimized # optimizing for ping looks good in benchmarks but it's silly in the real world
> (l7) classification vs dscp # clear win to qos here, nearly nothing uses dscp
> no framing compensation vs comprehensive framing compensation # win here for sqm
> no alternate queue models vs many alternate queue models # with fq_codel the winner, who cares?
> fits in 4mb flash vs barely fits in 4mb flash
>
> The real killer problem for qos-scripts, for me, was that they didn't do ipv6. I'd like to see that fixed, at the very least.
>
>
>>
>> · For these target users, they need simplicity, good performance,
>> ease of setup and affordability. They are not able to deal with the
>> routing between subnets on wireless, IPv6 setup, and any complexities
>> introduced by DNSSEC. Marketing the advantages of bufferbloat alone
>> requires lots of education and publicity (and as we have seen there
>> are many misleading posts by seemingly persuasive nay-sayers that it is all smoke and mirrors.
>
> Well, my intent is to make the successful bits of technology widely available.
> They are widely available. And being adopted everywhere. Win.
>
> As for the additional complexities, well, they will get less complex over time.
>
> In one respect, they are a stake in the ground. I have high hopes for the eventual success of hnetd and mdns proxy services, although they are alpha and nearly unusable right now, some are making substantial investments into them.
>
> In another the additional complexities of cero - like routing vs bridging - are there to further the research into fixing wifi technologies - which we haven't really even started yet. I'm increasingly convinced we need to do make-wifi-fast as a separate, focused project, building on a stable base.
>
>> · Would it be possible to have a simplified release of CeroWRT (in
>> terms of setup, and features), make It available for a reliable and
>> affordable platform, and publicize it and have it reach hopefully a
>> much wider audience? This could potentially be through the OpenWRT channels.
>
> Possible yes. Affordable, no. Given that this has been a nearly full time project for me, for the last 38 months, with nearly zero revenue, I have no intent or interest in gaining anything other than knowledgable, clued users that want to advance the state of the art. My mom doesn't run cerowrt, nor do I want her to.
>
> If someone dropped ~1m/year on the project, that could change, but at present levels of funding I'd be better off working at mcdonalds. Even if funding appeared from the sky I'd rather spend it on R&D than GUI...
>
> Certainly IF there was some cost model that made sense, awesome! let's go for world domination!
>
> I continue to pursue the grant
> route, but the only thing that resonates even slightly with potential funders is not speed but security issues, which give me nightmares. Another model that works is actually making and selling a router, but that requires up front capital and entry into a very tight, profit-limited market.
>
> Biggest problem we have is supporting ONLY one router, even-semi-well, is a PITA.
>
> Adding a new one costs more. I'm now on my 4th day of trying to make the archer work. That's 6k of my life I'll never have back. And the ath10k in it sucks, and working to make that work well is not something I want to be doing due to the binary blob wifi firmware.
>
> I'm all in favor of handing off future cerowrt development to a nonprofit of interested users, and sitting back and focusing on fixing just the bits I care about, if anyone is interested in forming one...
>
>> · Part of the reason why Tomato had been so popular is that the
>> firmware upgrade, install, configuration, and management was well
>> within the capabilities of the average weekend hacker, and there were
>> compelling features and reliability vs the factory firmwares at the time.
>
> Yep. dd-wrt is the same. And various downstream users like buffalo, meraki etc.
>
> I'm totally happy that they exist and have a working market.
>
>> · Even installing OpenWRT, especially Trunk, and finding,
>> downloading and enabling packages, while very powerful, and flexible,
>> is still quite complex to someone who does not spend a lot of time
>> reading wiki’s and release notes.
>
> Yes, CeroWrt is an improvement on OpenWrt in that regard. But it isn't enough. Doing serious UI improvements and simplification IS necessary, and that's not my bag. The EFF is making noises about doing something with the front end of openwrt and/or cero in the next year or so (see their owtech list for more details), that also goes after the security issue.
>
>> I’d be interested in feedback on these thoughts.
>
> There you go. I LOVE that we have a happy userbase, and love what we've accomplished, and have loved being here to help make it happen, and love that lots of people want to get it more out there to more people, it's gratifying as hell, and there are a lot of negatives too, like chasing bugs for months on end...
>
> ... but after we freeze, I need a vacation and to do something else for a while.
>
> I'm presently planning on spending the summer working on something that pays, and on improving ns3 with the GSOC, and testing a deployment of cerowrts on a modest scale, and working on a new/improved rate limiter integrated with fq_codel. And only updating cero for CVEs or major new features.
>
> That's a full plate.
>
> If someone else wants to step up to maintain or continue to push cerowrt forward in some direction or another, I'm all for it.
>
> It's kind of my hope a clear winner on the chipset front will emerge and we can move to that, but even if that happens it will be months and months before it could be considered stable...
>
>
>
>>
>>
>> Frits Riep
>>
>>
>>
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
>
>
>
> --
> Dave Täht
>
> NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
>
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-21 11:42 ` Frits Riep
@ 2014-05-21 14:51 ` dpreed
2014-05-21 15:19 ` Dave Taht
2014-05-21 15:07 ` Dave Taht
1 sibling, 1 reply; 51+ messages in thread
From: dpreed @ 2014-05-21 14:51 UTC (permalink / raw)
To: Frits Riep; +Cc: cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 12728 bytes --]
Besides deployment in cerowrt and openwrt, what would really have high leverage is that the techniques developed in cerowrt's exploration (including fq_codel) get deployed where they should be deployed: in the access network systems: CMTS's, DSLAM's, Enterprise boundary gear, etc. from the major players.
Cerowrt's fundamental focus has been proving that the techniques really, really work at scale.
However, the fundamental "bloat-induced" experiences are actually occurring due to bloat at points where "fast meets slow". Cerowrt can't really fix the problem in the download direction (currently not so bad because of high download speeds relative to upload speeds in the US) - that's in the CMTS's and DSLAM's.
What's depressing to me is that the IETF community spends so much time trying to convince themselves that bloat is only a theoretical problem, never encountered in the field. In fact, every lab I've worked at (including the startup accelerator where part of my current company works) has had the network managers complaining to me that a single heavy FTP I'm running causes all of the other users in the site to experience terrible web performance. But when they call Cisco or F5 or whomever, they get told "there's nothing to do but buy complicated flow-based traffic management boxes to stick in line with the traffic (so they can "slow me down")."
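That symptom is easy to reproduce informally; a crude sketch, with placeholder
hostnames standing in for real ones:

    ping -i 0.2 example.com &             # watch latency...
    scp big.iso user@remote.example.com:  # ...while one bulk upload fills the uplink buffer
    # On a bloated link the ping times climb from tens of milliseconds into
    # the hundreds or worse, for everyone sharing the same uplink.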
Bloat is the most common invisible elephant on the Internet. Just fixing a few access points is a start, but even if we fix all the access points so that uploads interfere less, there's still more impact this one thing can have.
So, by all means get this stuff into the mainstream, but it's time to start pushing on the access network technology companies (and there are now open switches from Cumulus and even Arista to hack).
On Wednesday, May 21, 2014 7:42am, "Frits Riep" <riep@riepnet.com> said:
> Thanks Dave for your responses. Based on this, it is very good that qos-scripts
> is available now through openwrt, and as I experienced, it provides a huge
> advantage for most users. I would agree prioritizing ping is in and of itself not
> the key goal, but based on what I've read so far, fq-codel provides dramatically
> better responsiveness for any interactive application such as web-browsing, voip,
> or gaming, so it qos-scripts would be advantageous for users like your mom if she
> were in an environment where she had a slow and shared internet connection. Is
> that a valid interpretation? I am interested in further understanding the
> differences based on the brief differences you provide. It is true that few
> devices provide DSCP marking, but if the latency is controlled for all traffic,
> latency sensitive traffic benefits tremendously even without prioritizing by l7
> (layer 7 ?). Is this interpretation also valid?
>
> Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it
> were set up for her, or if it could be incorporated into a consumer router with
> automatically determining speed parameters, she would benefit totally from the
> performance improvement. So the technology ultimately needs to be taken
> mainstream, and yes that is a huge task.
>
> Frits
>
> -----Original Message-----
> From: Dave Taht [mailto:dave.taht@gmail.com]
> Sent: Tuesday, May 20, 2014 7:14 PM
> To: Frits Riep
> Cc: cerowrt-devel@lists.bufferbloat.net
> Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat
> control for consideration.
>
> On Tue, May 20, 2014 at 3:11 PM, Frits Riep <riep@riepnet.com> wrote:
> > The concept of eliminating bufferbloat on many more routers is quite
> > appealing. Reading some of the recent posts makes it clear there is a
> > desire to get to a stable code, and also to find a new platform
> > beyond the current Netgear. However, as good as some of the proposed
> > platforms maybe for developing and for doing all of the new
> > capabilities of CeroWRT, I also would like to propose that there also
> > be some focus on reaching a wider and less sophisticated audience to
> > help broaden the awareness and make control of bufferbloat more available and
> easier to attain for more users.
>
> I agree that reaching more users is important. I disagree we need to reach them
> with cerowrt. More below:
>
> >
> >
> > · It appears there is a desire to merge the code into an
> upcoming
> > OpenWRT barrier breaker release, which is excellent as it will make it
> > easier to fight buffer bloat on a wide range of platforms and provide
> > users with a much easier to install firmware release. I’d like to be
> > able to download luci-qos-scripts and sqm-scripts and have basic
> > bufferbloat control on a much greater variety of devices and to many
> > more users. From an awareness perspective this would be a huge win.
> > Is the above scenario what is being planned, is it likely to happen in the
> reasonable future?
>
> Yes, I'd submitted sqm for review upstream, got back a few comments. Intend to
> resubmit again when I get a chance.
>
> >
> > · From my perspective, it would be ideal to have this
> available to
> > many users in a more affordable platform, something like an 8mb flash
> > router like the TP-Link WDR-4300, which is otherwise a very capable
> > router with dual channels and good performance.
> >
> > · (I’ve managed to set up such a WDR-4300, with OpenWRT
> trunk,
> > figured how to telnet and install Luci, then luci-app-qos, and
> > qos-scripts and I thought the bufferbloat control was remarkable.)
> > How much better would it be if I were able to use luci-qos-scripts and
> sqm-scripts instead?
>
> You can easily add the .ipk files for sqm-scripts and luci-app-sqm to any release
> of openwrt. They are just scripts. They need some optional kernel modules and
> tools.
>
> I regard the qos-scripts as pretty good - the core differences from sqm are
>
> qos vs sqm
> ---------------
> both use fq_codel. :)
> hfsc vs htb # A wash, hfsc mostly behaves like htb
> ping optimized vs de-optimized # optimizing for ping looks good in benchmarks but it's silly in the real world
> (l7) classification vs dscp # clear win to qos here, nearly nothing uses dscp
> no framing compensation vs comprehensive framing compensation # win here for sqm
> no alternate queue models vs many alternate queue models # with fq_codel the winner, who cares?
> fits in 4mb flash vs barely fits in 4mb flash
>
> The real killer problem for qos-scripts, for me, was that they didn't do ipv6. I'd
> like to see that fixed, at the very least.
>
>
> >
> > · For these target users, they need simplicity, good
> performance,
> > ease of setup and affordability. They are not able to deal with the
> > routing between subnets on wireless, IPv6 setup, and any complexities
> > introduced by DNSSEC. Marketing the advantages of bufferbloat alone
> > requires lots of education and publicity (and as we have seen there
> > are many misleading posts by seemingly persuasive nay-sayers that it is all
> smoke and mirrors.
>
> Well, my intent is to make the successful bits of technology widely available.
> They are widely available. And being adopted everywhere. Win.
>
> As for the additional complexities, well, they will get less complex over time.
>
> In one respect, they are a stake in the ground. I have high hopes for the eventual
> success of hnetd and mdns proxy services, although they are alpha and nearly
> unusable right now, some are making substantial investments into them.
>
> In another the additional complexities of cero - like routing vs bridging - are
> there to further the research into fixing wifi technologies - which we haven't
> really even started yet. I'm increasingly convinced we need to do make-wifi-fast
> as a separate, focused project, building on a stable base.
>
> > · Would it be possible to have a simplified release of CeroWRT
> (in
> > terms of setup, and features), make It available for a reliable and
> > affordable platform, and publicize it and have it reach hopefully a
> > much wider audience? This could potentially be through the OpenWRT channels.
>
> Possible yes. Affordable, no. Given that this has been a nearly full time project
> for me, for the last 38 months, with nearly zero revenue, I have no intent or
> interest in gaining anything other than knowledgable, clued users that want to
> advance the state of the art. My mom doesn't run cerowrt, nor do I want her to.
>
> If someone dropped ~1m/year on the project, that could change, but at present
> levels of funding I'd be better off working at mcdonalds. Even if funding appeared
> from the sky I'd rather spend it on R&D than GUI...
>
> Certainly IF there was some cost model that made sense, awesome! let's go for
> world domination!
>
> I continue to pursue the grant
> route, but the only thing that resonates even slightly with potential funders is
> not speed but security issues, which give me nightmares. Another model that works
> is actually making and selling a router, but that requires up front capital and
> entry into a very tight, profit-limited market.
>
> Biggest problem we have is supporting ONLY one router, even-semi-well, is a PITA.
>
> Adding a new one costs more. I'm now on my 4th day of trying to make the archer
> work. That's 6k of my life I'll never have back. And the ath10k in it sucks, and
> working to make that work well is not something I want to be doing due to the
> binary blob wifi firmware.
>
> I'm all in favor of handing off future cerowrt development to a nonprofit of
> interested users, and sitting back and focusing on fixing just the bits I care
> about, if anyone is interested in forming one...
>
> > · Part of the reason why Tomato had been so popular is that
> the
> > firmware upgrade, install, configuration, and management was well
> > within the capabilities of the average weekend hacker, and there were
> > compelling features and reliability vs the factory firmwares at the time.
>
> Yep. dd-wrt is the same. And various downstream users like buffalo, meraki etc.
>
> I'm totally happy that they exist and have a working market.
>
> > · Even installing OpenWRT, especially Trunk, and finding,
> > downloading and enabling packages, while very powerful, and flexible,
> > is still quite complex to someone who does not spend a lot of time
> > reading wiki’s and release notes.
>
> Yes, CeroWrt is an improvement on OpenWrt in that regard. But it isn't enough.
> Doing serious UI improvements and simplification IS necessary, and that's not my
> bag. The EFF is making noises about doing something with the front end of openwrt
> and/or cero in the next year or so (see their owtech list for more details), that
> also goes after the security issue.
>
> > I’d be interested in feedback on these thoughts.
>
> There you go. I LOVE that we have a happy userbase, and love what we've
> accomplished, and have loved being here to help make it happen, and love that lots
> of people want to get it more out there to more people, it's gratifying as hell,
> and there are a lot of negatives too, like chasing bugs for months on end...
>
> ... but after we freeze, I need a vacation and to do something else for a while.
>
> I'm presently planning on spending the summer working on something that pays, and
> on improving ns3 with the GSOC, and testing a deployment of cerowrts on a modest
> scale, and working on a new/improved rate limiter integrated with fq_codel. And
> only updating cero for CVEs or major new features.
>
> That's a full plate.
>
> If someone else wants to step up to maintain or continue to push cerowrt forward
> in some direction or another, I'm all for it.
>
> It's kind of my hope a clear winner on the chipset front will emerge and we can
> move to that, but even if that happens it will be months and months before it
> could be considered stable...
>
>
>
> >
> >
> > Frits Riep
> >
> >
> >
> >
> > _______________________________________________
> > Cerowrt-devel mailing list
> > Cerowrt-devel@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/cerowrt-devel
> >
>
>
>
> --
> Dave Täht
>
> NSFW:
> https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
[-- Attachment #2: Type: text/html, Size: 15203 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-20 23:14 ` Dave Taht
@ 2014-05-21 11:42 ` Frits Riep
2014-05-21 14:51 ` dpreed
2014-05-21 15:07 ` Dave Taht
0 siblings, 2 replies; 51+ messages in thread
From: Frits Riep @ 2014-05-21 11:42 UTC (permalink / raw)
To: 'Dave Taht'; +Cc: cerowrt-devel
Thanks Dave for your responses. Based on this, it is very good that qos-scripts is available now through openwrt, and as I experienced, it provides a huge advantage for most users. I would agree prioritizing ping is in and of itself not the key goal, but based on what I've read so far, fq-codel provides dramatically better responsiveness for any interactive application such as web-browsing, voip, or gaming, so qos-scripts would be advantageous for users like your mom if she were in an environment where she had a slow and shared internet connection. Is that a valid interpretation? I am interested in further understanding the differences based on the brief comparison you provide. It is true that few devices provide DSCP marking, but if the latency is controlled for all traffic, latency-sensitive traffic benefits tremendously even without prioritizing by l7 (layer 7?). Is this interpretation also valid?
Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it were set up for her, or if it could be incorporated into a consumer router that automatically determines speed parameters, she would benefit totally from the performance improvement. So the technology ultimately needs to be taken mainstream, and yes that is a huge task.
Frits
-----Original Message-----
From: Dave Taht [mailto:dave.taht@gmail.com]
Sent: Tuesday, May 20, 2014 7:14 PM
To: Frits Riep
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Tue, May 20, 2014 at 3:11 PM, Frits Riep <riep@riepnet.com> wrote:
> The concept of eliminating bufferbloat on many more routers is quite
> appealing. Reading some of the recent posts makes it clear there is a
> desire to get to a stable code, and also to find a new platform
> beyond the current Netgear. However, as good as some of the proposed
> platforms maybe for developing and for doing all of the new
> capabilities of CeroWRT, I also would like to propose that there also
> be some focus on reaching a wider and less sophisticated audience to
> help broaden the awareness and make control of bufferbloat more available and easier to attain for more users.
I agree that reaching more users is important. I disagree we need to reach them with cerowrt. More below:
>
>
> · It appears there is a desire to merge the code into an upcoming
> OpenWRT barrier breaker release, which is excellent as it will make it
> easier to fight buffer bloat on a wide range of platforms and provide
> users with a much easier to install firmware release. I’d like to be
> able to download luci-qos-scripts and sqm-scripts and have basic
> bufferbloat control on a much greater variety of devices and to many
> more users. From an awareness perspective this would be a huge win.
> Is the above scenario what is being planned, is it likely to happen in the reasonable future?
Yes, I'd submitted sqm for review upstream, got back a few comments. Intend to resubmit again when I get a chance.
>
> · From my perspective, it would be ideal to have this available to
> many users in a more affordable platform, something like an 8mb flash
> router like the TP-Link WDR-4300, which is otherwise a very capable
> router with dual channels and good performance.
>
> · (I’ve managed to set up such a WDR-4300, with OpenWRT trunk,
> figured how to telnet and install Luci, then luci-app-qos, and
> qos-scripts and I thought the bufferbloat control was remarkable.)
> How much better would it be if I were able to use luci-qos-scripts and sqm-scripts instead?
You can easily add the .ipk files for sqm-scripts and luci-app-sqm to any release of openwrt. They are just scripts. They need some optional kernel modules and tools.
I regard the qos-scripts as pretty good - the core differences from sqm are
qos vs sqm
---------------
both use fq_codel. :)
hfsc vs htb # A wash, hfsc mostly behaves like htb
ping optimized vs de-optimized # optimizing for ping looks good in benchmarks but it's silly in the real world
(l7) classification vs dscp # clear win to qos here, nearly nothing uses dscp
no framing compensation vs comprehensive framing compensation # win here for sqm
no alternate queue models vs many alternate queue models # with fq_codel the winner, who cares?
fits in 4mb flash vs barely fits in 4mb flash
The real killer problem for qos-scripts, for me, was that they didn't do ipv6. I'd like to see that fixed, at the very least.
>
> · For these target users, they need simplicity, good performance,
> ease of setup and affordability. They are not able to deal with the
> routing between subnets on wireless, IPv6 setup, and any complexities
> introduced by DNSSEC. Marketing the advantages of bufferbloat alone
> requires lots of education and publicity (and as we have seen there
> are many misleading posts by seemingly persuasive nay-sayers that it is all smoke and mirrors.
Well, my intent is to make the successful bits of technology widely available.
They are widely available. And being adopted everywhere. Win.
As for the additional complexities, well, they will get less complex over time.
In one respect, they are a stake in the ground. I have high hopes for the eventual success of hnetd and mdns proxy services, although they are alpha and nearly unusable right now, some are making substantial investments into them.
In another the additional complexities of cero - like routing vs bridging - are there to further the research into fixing wifi technologies - which we haven't really even started yet. I'm increasingly convinced we need to do make-wifi-fast as a separate, focused project, building on a stable base.
> · Would it be possible to have a simplified release of CeroWRT (in
> terms of setup, and features), make It available for a reliable and
> affordable platform, and publicize it and have it reach hopefully a
> much wider audience? This could potentially be through the OpenWRT channels.
Possible yes. Affordable, no. Given that this has been a nearly full time project for me, for the last 38 months, with nearly zero revenue, I have no intent or interest in gaining anything other than knowledgable, clued users that want to advance the state of the art. My mom doesn't run cerowrt, nor do I want her to.
If someone dropped ~1m/year on the project, that could change, but at present levels of funding I'd be better off working at mcdonalds. Even if funding appeared from the sky I'd rather spend it on R&D than GUI...
Certainly IF there was some cost model that made sense, awesome! let's go for world domination!
I continue to pursue the grant
route, but the only thing that resonates even slightly with potential funders is not speed but security issues, which give me nightmares. Another model that works is actually making and selling a router, but that requires up front capital and entry into a very tight, profit-limited market.
Biggest problem we have is supporting ONLY one router, even-semi-well, is a PITA.
Adding a new one costs more. I'm now on my 4th day of trying to make the archer work. That's 6k of my life I'll never have back. And the ath10k in it sucks, and working to make that work well is not something I want to be doing due to the binary blob wifi firmware.
I'm all in favor of handing off future cerowrt development to a nonprofit of interested users, and sitting back and focusing on fixing just the bits I care about, if anyone is interested in forming one...
> · Part of the reason why Tomato had been so popular is that the
> firmware upgrade, install, configuration, and management was well
> within the capabilities of the average weekend hacker, and there were
> compelling features and reliability vs the factory firmwares at the time.
Yep. dd-wrt is the same. And various downstream users like buffalo, meraki etc.
I'm totally happy that they exist and have a working market.
> · Even installing OpenWRT, especially Trunk, and finding,
> downloading and enabling packages, while very powerful, and flexible,
> is still quite complex to someone who does not spend a lot of time
> reading wiki’s and release notes.
Yes, CeroWrt is an improvement on OpenWrt in that regard. But it isn't enough. Doing serious UI improvements and simplification IS necessary, and that's not my bag. The EFF is making noises about doing something with the front end of openwrt and/or cero in the next year or so (see their owtech list for more details), that also goes after the security issue.
> I’d be interested in feedback on these thoughts.
There you go. I LOVE that we have a happy userbase, and love what we've accomplished, and have loved being here to help make it happen, and love that lots of people want to get it more out there to more people, it's gratifying as hell, and there are a lot of negatives too, like chasing bugs for months on end...
... but after we freeze, I need a vacation and to do something else for a while.
I'm presently planning on spending the summer working on something that pays, and on improving ns3 with the GSOC, and testing a deployment of cerowrts on a modest scale, and working on a new/improved rate limiter integrated with fq_codel. And only updating cero for CVEs or major new features.
That's a full plate.
If someone else wants to step up to maintain or continue to push cerowrt forward in some direction or another, I'm all for it.
It's kind of my hope a clear winner on the chipset front will emerge and we can move to that, but even if that happens it will be months and months before it could be considered stable...
>
>
> Frits Riep
>
>
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
2014-05-20 22:11 Frits Riep
@ 2014-05-20 23:14 ` Dave Taht
2014-05-21 11:42 ` Frits Riep
0 siblings, 1 reply; 51+ messages in thread
From: Dave Taht @ 2014-05-20 23:14 UTC (permalink / raw)
To: Frits Riep; +Cc: cerowrt-devel
On Tue, May 20, 2014 at 3:11 PM, Frits Riep <riep@riepnet.com> wrote:
> The concept of eliminating bufferbloat on many more routers is quite
> appealing. Reading some of the recent posts makes it clear there is a
> desire to get to a stable code, and also to find a new platform beyond the
> current Netgear. However, as good as some of the proposed platforms maybe
> for developing and for doing all of the new capabilities of CeroWRT, I also
> would like to propose that there also be some focus on reaching a wider and
> less sophisticated audience to help broaden the awareness and make control
> of bufferbloat more available and easier to attain for more users.
I agree that reaching more users is important. I disagree we need to reach
them with cerowrt. More below:
>
>
> · It appears there is a desire to merge the code into an upcoming
> OpenWRT barrier breaker release, which is excellent as it will make it
> easier to fight buffer bloat on a wide range of platforms and provide users
> with a much easier to install firmware release. I’d like to be able to
> download luci-qos-scripts and sqm-scripts and have basic bufferbloat control
> on a much greater variety of devices and to many more users. From an
> awareness perspective this would be a huge win. Is the above scenario what
> is being planned, is it likely to happen in the reasonable future?
Yes, I'd submitted sqm for review upstream, got back a few comments. Intend
to resubmit again when I get a chance.
>
> · From my perspective, it would be ideal to have this available to
> many users in a more affordable platform, something like an 8mb flash router
> like the TP-Link WDR-4300, which is otherwise a very capable router with
> dual channels and good performance.
>
> · (I’ve managed to set up such a WDR-4300, with OpenWRT trunk,
> figured how to telnet and install Luci, then luci-app-qos, and qos-scripts
> and I thought the bufferbloat control was remarkable.) How much better
> would it be if I were able to use luci-qos-scripts and sqm-scripts instead?
You can easily add the .ipk files for sqm-scripts and luci-app-sqm to
any release of openwrt. They are just scripts. They need some optional kernel
modules and tools.
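A minimal sketch of that, assuming the packages are either in a configured
opkg feed or downloaded as local .ipk files (they were not yet in the
official feeds at the time):

    opkg update
    opkg install sqm-scripts luci-app-sqm
    # or, from locally downloaded packages:
    # opkg install ./sqm-scripts_*.ipk ./luci-app-sqm_*.ipk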
I regard the qos-scripts as pretty good - the core differences from sqm are
qos vs sqm
---------------
both use fq_codel. :)
hfsc vs htb # A wash, hfsc mostly behaves like htb
ping optimized vs de-optimized # optimizing for ping looks good in
benchmarks but it's silly in the real world
(l7) classification vs dscp # clear win to qos here, nearly nothing uses dscp
no framing compensation vs comprehensive framing compensation # win here for sqm
no alternate queue models vs many alternate queue models # with
fq_codel the winner, who cares?
fits in 4mb flash vs barely fits in 4mb flash
The real killer problem for qos-scripts, for me, was that they didn't
do ipv6. I'd like to see that fixed, at the very least.
>
> · For these target users, they need simplicity, good performance,
> ease of setup and affordability. They are not able to deal with the routing
> between subnets on wireless, IPv6 setup, and any complexities introduced by
> DNSSEC. Marketing the advantages of bufferbloat alone requires lots of
> education and publicity (and as we have seen there are many misleading posts
> by seemingly persuasive nay-sayers that it is all smoke and mirrors.
Well, my intent is to make the successful bits of technology widely available.
They are widely available. And being adopted everywhere. Win.
As for the additional complexities, well, they will get less complex over time.
In one respect, they are a stake in the ground. I have high hopes for the
eventual success of hnetd and mdns proxy services, although they are
alpha and nearly
unusable right now, some are making substantial investments into them.
In another the additional complexities of cero - like routing vs bridging -
are there to further the research into fixing wifi technologies -
which we haven't
really even started yet. I'm increasingly convinced we need to do
make-wifi-fast as
a separate, focused project, building on a stable base.
> · Would it be possible to have a simplified release of CeroWRT (in
> terms of setup, and features), make It available for a reliable and
> affordable platform, and publicize it and have it reach hopefully a much
> wider audience? This could potentially be through the OpenWRT channels.
Possible yes. Affordable, no. Given that this has been a nearly full
time project for me,
for the last 38 months, with nearly zero revenue, I have no intent or
interest in gaining
anything other than knowledgable, clued users that want to advance the state
of the art. My mom doesn't run cerowrt, nor do I want her to.
If someone dropped ~1m/year on the project, that could change, but at present
levels of funding I'd be better off working at mcdonalds. Even if funding
appeared from the sky I'd rather spend it on R&D than GUI...
Certainly IF there was some cost model that made sense, awesome! let's
go for world
domination!
I continue to pursue the grant
route, but the only thing that resonates even slightly with potential funders
is not speed but security issues, which give me nightmares. Another model
that works is actually making and selling a router, but that requires
up front capital and entry into a very tight, profit-limited market.
Biggest problem we have is supporting ONLY one router, even-semi-well,
is a PITA.
Adding a new one costs more. I'm now on my 4th day of trying to make the
archer work. That's 6k of my life I'll never have back. And the ath10k in it
sucks, and working to make that work well is not something I want to be doing
due to the binary blob wifi firmware.
I'm all in favor of handing off future cerowrt development to a nonprofit
of interested users, and sitting back and focusing on fixing just the bits
I care about, if anyone is interested in forming one...
> · Part of the reason why Tomato had been so popular is that the
> firmware upgrade, install, configuration, and management was well within
> the capabilities of the average weekend hacker, and there were compelling
> features and reliability vs the factory firmwares at the time.
Yep. dd-wrt is the same. And various downstream users like buffalo, meraki
etc.
I'm totally happy that they exist and have a working market.
> · Even installing OpenWRT, especially Trunk, and finding,
> downloading and enabling packages, while very powerful, and flexible, is
> still quite complex to someone who does not spend a lot of time reading
> wiki’s and release notes.
Yes, CeroWrt is an improvement on OpenWrt in that regard. But it
isn't enough. Doing serious UI improvements and simplification IS
necessary, and that's not my bag. The EFF is making noises about
doing something with the front end of openwrt and/or cero in the
next year or so (see their owtech list for more details), that also
goes after the security issue.
> I’d be interested in feedback on these thoughts.
There you go. I LOVE that we have a happy userbase, and love what
we've accomplished, and have loved being here to help make it happen,
and love that lots of people want to get it more out there to more people,
it's gratifying as hell, and there are a lot of negatives too, like chasing
bugs for months on end...
... but after we freeze, I need a vacation and to do something else for a while.
I'm presently planning on spending the summer working on something that pays,
and on improving ns3 with the GSOC, and testing a deployment of cerowrts
on a modest scale, and working on a new/improved rate limiter integrated
with fq_codel. And only updating cero for CVEs or major new features.
That's a full plate.
If someone else wants to step up to maintain or continue to push cerowrt
forward in some direction or another, I'm all for it.
It's kind of my hope a clear winner on the chipset front will emerge and we
can move to that, but even if that happens it will be months and months
before it could be considered stable...
>
>
> Frits Riep
>
>
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
--
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
@ 2014-05-20 22:11 Frits Riep
2014-05-20 23:14 ` Dave Taht
0 siblings, 1 reply; 51+ messages in thread
From: Frits Riep @ 2014-05-20 22:11 UTC (permalink / raw)
To: cerowrt-devel
[-- Attachment #1: Type: text/plain, Size: 3060 bytes --]
The concept of eliminating bufferbloat on many more routers is quite
appealing. Reading some of the recent posts makes it clear there is a
desire to get to a stable code, and also to find a new platform beyond the
current Netgear. However, as good as some of the proposed platforms may be
for developing and for doing all of the new capabilities of CeroWRT, I also
would like to propose that there also be some focus on reaching a wider and
less sophisticated audience to help broaden the awareness and make control
of bufferbloat more available and easier to attain for more users.
. It appears there is a desire to merge the code into an upcoming
OpenWRT barrier breaker release, which is excellent as it will make it
easier to fight buffer bloat on a wide range of platforms and provide users
with a much easier to install firmware release. I'd like to be able to
download luci-qos-scripts and sqm-scripts and have basic bufferbloat control
on a much greater variety of devices and to many more users. From an
awareness perspective this would be a huge win. Is the above scenario what
is being planned, is it likely to happen in the reasonable future?
. From my perspective, it would be ideal to have this available to
many users in a more affordable platform, something like an 8mb flash router
like the TP-Link WDR-4300, which is otherwise a very capable router with
dual channels and good performance.
. (I've managed to set up such a WDR-4300, with OpenWRT trunk,
figured out how to telnet in and install Luci, then luci-app-qos, and qos-scripts,
and I thought the bufferbloat control was remarkable; a rough sketch of those
steps follows after this list.) How much better
would it be if I were able to use luci-qos-scripts and sqm-scripts instead?
. For these target users, they need simplicity, good performance,
ease of setup and affordability. They are not able to deal with the routing
between subnets on wireless, IPv6 setup, and any complexities introduced by
DNSSEC. Marketing the advantages of bufferbloat alone requires lots of
education and publicity (and as we have seen there are many misleading posts
by seemingly persuasive nay-sayers that it is all smoke and mirrors.)
. Would it be possible to have a simplified release of CeroWRT (in
terms of setup, and features), make it available for a reliable and
affordable platform, and publicize it and have it reach hopefully a much
wider audience? This could potentially be through the OpenWRT channels.
. Part of the reason why Tomato had been so popular is that the
firmware upgrade, install, configuration, and management was well within
the capabilities of the average weekend hacker, and there were compelling
features and reliability vs the factory firmwares at the time.
. Even installing OpenWRT, especially Trunk, and finding,
downloading and enabling packages, while very powerful, and flexible, is
still quite complex to someone who does not spend a lot of time reading
wiki's and release notes.
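For the WDR-4300 bullet above, the steps described amount to roughly the
following sketch (addresses, package names and init commands are the usual
OpenWRT defaults and may differ by build):

    telnet 192.168.1.1           # a fresh trunk image has no root password yet
    passwd                       # setting one disables telnet and enables ssh
    opkg update
    opkg install luci luci-app-qos qos-scripts
    /etc/init.d/uhttpd enable    # start the web server LuCI runs behind
    /etc/init.d/uhttpd start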
I'd be interested in feedback on these thoughts.
Frits Riep
[-- Attachment #2: Type: text/html, Size: 9899 bytes --]
^ permalink raw reply [flat|nested] 51+ messages in thread
end of thread, other threads:[~2014-08-02 20:17 UTC | newest]
Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-24 14:03 [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration R.
2014-07-25 18:37 ` Valdis.Kletnieks
2014-07-25 21:03 ` David Lang
2014-07-26 11:30 ` Sebastian Moeller
2014-07-26 20:39 ` David Lang
2014-07-26 21:25 ` Sebastian Moeller
2014-07-26 21:45 ` David Lang
2014-07-26 22:24 ` David Lang
2014-07-27 9:50 ` Sebastian Moeller
2014-07-26 22:39 ` Sebastian Moeller
2014-07-26 22:53 ` David Lang
2014-07-26 23:39 ` Sebastian Moeller
2014-07-27 0:49 ` David Lang
2014-07-27 11:17 ` Sebastian Moeller
2014-08-01 4:21 ` Michael Richardson
2014-08-01 18:28 ` Sebastian Moeller
2014-07-25 20:48 ` Wes Felter
2014-07-25 20:57 ` David Lang
2014-07-26 11:18 ` Sebastian Moeller
2014-07-26 20:21 ` David Lang
2014-07-26 20:54 ` Sebastian Moeller
2014-07-26 21:14 ` David Lang
2014-07-26 21:48 ` Sebastian Moeller
2014-07-26 22:23 ` David Lang
2014-07-26 23:08 ` Sebastian Moeller
2014-07-27 1:04 ` David Lang
2014-07-27 11:38 ` Sebastian Moeller
2014-08-01 4:51 ` Michael Richardson
2014-08-01 18:04 ` Sebastian Moeller
2014-08-02 20:17 ` Michael Richardson
2014-08-01 4:40 ` Michael Richardson
2014-07-26 11:01 ` Sebastian Moeller
-- strict thread matches above, loose matches on Subject: below --
2014-05-24 14:12 R.
2014-05-24 17:31 ` Sebastian Moeller
2014-05-24 19:05 ` David P. Reed
2014-05-20 22:11 Frits Riep
2014-05-20 23:14 ` Dave Taht
2014-05-21 11:42 ` Frits Riep
2014-05-21 14:51 ` dpreed
2014-05-21 15:19 ` Dave Taht
2014-05-21 16:03 ` dpreed
2014-05-21 16:30 ` Dave Taht
2014-05-21 17:55 ` dpreed
2014-05-21 17:47 ` Jim Gettys
2014-05-21 17:53 ` Dave Taht
2014-05-21 17:56 ` dpreed
2014-05-21 17:57 ` Jim Gettys
2014-05-21 18:31 ` Dave Taht
2014-05-21 15:07 ` Dave Taht
2014-05-21 16:50 ` Michael Richardson
2014-05-21 17:58 ` David Lang