Development issues regarding the cerowrt test router project
* [Cerowrt-devel] some kernel updates
@ 2013-08-21 18:42 Dave Taht
       [not found] ` <56B261F1-2277-457C-9A38-FAB89818288F@gmx.de>
  0 siblings, 1 reply; 43+ messages in thread
From: Dave Taht @ 2013-08-21 18:42 UTC (permalink / raw)
  To: cerowrt-devel


It looks like the htb dsl fixes, and the ipv6_subtrees fix, will land in
3.10.9 or later. (they are already patched into my current tree, so yea! I
get to rip them out)

my thanks to everybody for looking into these problems and to those that
fixed them. Too many people to thank here.

Now, are there ANY other must-have features that are missing from the
present kernel that have gotta get in there?

The ONLY other thing that was on my original kernel-feature list for this
release of cero was ipv6 NAT support. I'm totally willing to defer that to
another release. (Any objections to this idea should come with working
packages to kernel and iptables!)

I did get a whole bunch of these radios in and I am thinking of trying to
make their drivers work, but that's about it:

http://www.logicsupply.com/products/uwn100

It's based on the MT7601 chipset,

and my only reason for making 'em work is I'm trying to get up to 30 radios
in a tight space so I can look at the dense mesh problem harder. (as well
as get a grip on usb interfaces)

In addition to 3.10.X in cero and being in sync with openwrt, there's also
a much better version of pie than I've had before, a possible enhancement
to codel, and I forget what else, in the as yet unreleased devbuild...

So:

I would really like to freeze on and profile the current kernel at this
point. And after that, move to fixing userspace stuff only.

I'm aware of a dhcp bug in userspace which is blocking me from getting a
release out. (for some reason se00 doesn't get a configuration file entry
from the openwrt script)

I keep forgetting to add support for not having an outbound or inbound rate
limit in the aqm code, too.
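
A minimal sketch of what I have in mind, untested; whether the aqm scripts
really carry the rates in UPLINK/IFACE like this is an assumption:

  # if no uplink rate limit is configured, skip the HTB shaper entirely
  # and just attach fq_codel to the egress device
  if [ -z "$UPLINK" ] || [ "$UPLINK" -eq 0 ]; then
      tc qdisc replace dev $IFACE root fq_codel
      exit 0
  fi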

-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html



* Re: [Cerowrt-devel] some kernel updates
       [not found]     ` <2148E2EF-A119-4499-BAC1-7E647C53F077@gmx.de>
@ 2013-08-23  0:52       ` Sebastian Moeller
  2013-08-23  5:13         ` Dave Taht
  0 siblings, 1 reply; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23  0:52 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jesper Dangaard Brouer, cerowrt-devel

Hi List, hi Jesper,

So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature. 
	Unfortunately the htb_private link layer adjustments are still broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, the same as without link layer adjustments). On the bright side, the tc_stab method still works as well as before (ping RTT around 40ms). 
	I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as the default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, this path should receive more scrutiny by virtue of having more users.
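	For illustration, the stab variant amounts to something like this (ge00 and the overhead of 40 are from my setup; the rest relies on tc-stab defaults):
		tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 40 htb default 12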
	Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware. I think this kernel should contain all htb fixes, including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling), but am not fully sure.
@Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases? While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments). 
While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)


Best
	Sebastian



* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  0:52       ` Sebastian Moeller
@ 2013-08-23  5:13         ` Dave Taht
  2013-08-23  7:27           ` Jesper Dangaard Brouer
                             ` (3 more replies)
  0 siblings, 4 replies; 43+ messages in thread
From: Dave Taht @ 2013-08-23  5:13 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Jesper Dangaard Brouer, cerowrt-devel


On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi List, hi Jesper,
>
> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
> adjustments to see whether the recent changes resurrected this feature.
>         Unfortunately the htb_private link layer adjustments are still
> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
> as without link layer adjustments). On the bright side the tc_stab method
> still works as well as before (ping RTT around 40ms).
>         I would like to humbly propose to use the tc stab method in
> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
> simply telling the kernel a lie about the packet size seems more robust
> than fudging HTB's rate tables. Especially since the kernel already fudges
> the packet size to account for the ethernet header and then some, so this
> path should receive more scrutiny by virtue of having more users?
>

It's my hope that the atm code works but is misconfigured. You can output
the tc commands by overriding the TC variable with TC="echo tc" and paste
the output here.
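
Something like this should give a dry run (assuming the script picks TC up
from the environment; the path here is a guess):

  TC="echo tc" sh /usr/lib/aqm/simple.qos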


>         Now, I have been testing this using Dave's most recent cerowrt
> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
> restore "linklayer atm" handling) but am not fully sure.
>

It does.


> @Dave, is there an easy way to find which patches you applied to the
> kernels of the cerowrt (testing-)releases?


Normally I DO commit stuff that is in testing, but my big push this time
around was to get everything important into mainline 3.10, as it will be
the "stable" release for a good long time.

So I am still mostly working the x86 side at the moment. I WAS kind of
hoping that everything I just landed would make it up to 3.10. But for your
perusal:

http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the
kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out
due to another weird bug I'm looking at. (It also has support for ipv6 nat
thx to the ever prolific stephen walker heeding the call for patches...).
100% totally untested; I have this weird bug to figure out how to fix next:

http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html

I fear it's a comparison gone south, maybe in bradley's optimizations for
not kernel trapping, don't know.

3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing
the close naming integration, but, had to try this....

If you guys want me to start committing and pushing patches again, I'll do
it, but most of that stuff will end up in 3.10.10, I think, in a couple
days. The rest might make 3.12. Pie has to survive scrutiny on the netdev
list in particular.

> While I have your attention :) I also tested 3.10.9-1's pie and it is way
> better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but
> still worse than fq_codel (ping RTTs around 40ms with proper atm link layer
> adjustments).
>

This is with simple.qos I imagine? Simplest should do better than that with
pie. Judging from how its estimator works I think it will do badly with
multiple queues. But testing will tell...

But, yea, this pie is actually usable, and the previous wasn't. Thank you
for looking at it!

It is different from cisco's last pie drop in that it can do ecn, does
local congestion notification, has a better use of net_random, it's mostly
KernelStyle, and I forget what else.

There is still a major rounding error in the code, and I'd like cisco to
fix the api so it uses identical syntax to codel. Right now you specify
"target 8" to get "target 7", and the "ms" is implied. target 5 becomes
target 3. The default target is a whopping 20 (rounded to 19), which is in
part where your 70+ms of extra delay came from.

Multiple parties have the delusion that 20ms is "good enough".

Part of the remaining delay may also be rounding error. Cisco uses kernels
with HZ=1000, cero uses HZ=250.....

Anyway, to get more comparable tests... you can fiddle with the two $QDISC
lines in simple*.qos to add a target 8 to get closer to a codel 5ms config,
but that would break a codel config which treats target 8 as target 8us.
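
i.e., roughly this kind of edit (the exact $QDISC lines in simple.qos may
differ from this sketch):

  # before:
  # $TC qdisc add dev $IFACE parent 1:11 handle 110: $QDISC limit 600 $ECN
  # after (pie reads this as 8ms; codel would read it as 8us):
  $TC qdisc add dev $IFACE parent 1:11 handle 110: $QDISC limit 600 target 8 $ECN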

I MIGHT, if I get energetic enough, fix the API, the time accounting, and a
few other things in pie; the problem is that ns2_codel still seems more
effective on most workloads and *fq_codel smokes absolutely everything.
There are a few places where pie is a win over straight codel, notably on
packet floods. And it may well be easier to retrofit into existing hardware
fast path designs.

I worry about interactions between pie and other stuff. It seems inevitable
at this point that some form of pie will be widely deployed, and I simply
haven't tried enough traffic types and RTTs to draw a firm conclusion,
period. Long RTTs are the last big place where codel and pie and fq_codel
have to be seriously tested.

ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big
problem I have is getting decent long RTT emulation out of netem (some
preliminary code is up at github).

... and getting cero stable enough for others to actually use - next up is
fixing the userspace problems.

... and trying to make a small dent in the wifi problem along the way
(couple commits coming up)

... and find funding to get through the winter.

There's probably a few other things that are on that list but I forget. Oh,
yea, since the aqm wg was voted on to be formed, I decided I could quit
smoking.


> While I am not able to build kernels, it seems that I am able to quickly
> test whether link layer adjustments work or not. So I am happy to help where
> I can :)
>

Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and
target 7ms, too. fq_codel, same....

tc -s qdisc show dev ge00
tc -s qdisc show dev ifb0

would be useful info to have in general after each test.

TIA.

There are also things like tcp_upload and tcp_download and
tcp_bidirectional that are useful tests in the rrul suite.
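
e.g. (the server name here is just an example):

  netperf-wrapper -H snapon.lab.bufferbloat.net -l 60 tcp_bidirectional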

Thank you for your efforts on these early alpha releases. I hope things
will stabilize more soon, and I'll fold your aqm stuff into my next attempt
this weekend.

This is some of the stuff I know that needs fixing in userspace:

* TODO readlink not found
* TODO netdev user missing
* TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already
running DHCP-server on interface 'se00' refusing to start, use 'option
force 1' to override
* TODO [   18.480468] Mirror/redirect action on
[   18.539062] Failed to load ipt action
* upload and download are reversed in aqm
* BCP38
* Squash CS values
* Replace ntp
* Make ahcp client mode
* Drop more privs for polipo
* upnp
* priv separation
* Review FW rules
* dhcpv6 support
* uci-defaults/make-cert.sh uses a bad path for px5g
* Doesn't configure the web browser either



>
> Best
>         Sebastian
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html



* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  5:13         ` Dave Taht
@ 2013-08-23  7:27           ` Jesper Dangaard Brouer
  2013-08-23 10:15             ` Sebastian Moeller
  2013-08-23 19:51             ` Sebastian Moeller
  2013-08-23  9:16           ` Sebastian Moeller
                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-23  7:27 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

On Thu, 22 Aug 2013 22:13:52 -0700
Dave Taht <dave.taht@gmail.com> wrote:

> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> > Hi List, hi Jesper,
> >
> > So I tested 3.10.9-1 to assess the status of the HTB atm link layer
> > adjustments to see whether the recent changes resurrected this feature.
> >         Unfortunately the htb_private link layer adjustments are still
> > broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
> > as without link layer adjustments). On the bright side the tc_stab method
> > still works as well as before (ping RTT around 40ms).
> >         I would like to humbly propose to use the tc stab method in
> > cerowrt to perform ATM link layer adjustments as default. To repeat myself,
> > simply telling the kernel a lie about the packet size seems more robust
> > than fudging HTB's rate tables.

After the (regression) commit 56b765b79 ("htb: improved accuracy at
high rates"), the kernel no-longer uses the rate tables.  

My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling)
does the ATM cell overhead calculation directly on the packet length,
see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
Thus, the cell calc should actually be more precise now.... but see below
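
Worked example of that calc: a 1500 byte packet plus 40 bytes of overhead
is 1540 bytes, which needs ceil(1540/48) = 33 cells, i.e. 33*53 = 1749
bytes on the wire.  In shell integer math:

  echo $(( ((1540 + 47) / 48) * 53 ))   # prints 1749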

> > Especially since the kernel already fudges
> > the packet size to account for the ethernet header and then some, so this
> > path should receive more scrutiny by virtue of having more users?

As you mention, the default kernel path (not tc stab) fudges the packet
size for Ethernet headers, AND I made a mistake (back in approx 2006,
sorry) such that the "overhead" cannot be a negative number.  Meaning that
some ATM encap overheads simply cannot be configured correctly (as you
need to subtract the ethernet header). (And its quite problematic to
change the kABI to allow for a negative overhead)

Perhaps we should change to use "tc stab" for this reason.  But I'm not
sure "stab" does the right thing either, and its accuracy is also
limited as it's actually also table based.  We could easily change the
kernel to perform the ATM cell overhead calc inside "stab", and we
should also fix the GSO packet overhead problem.
(for now remember to disable GSO packets when shaping)
 
> It's my hope that the atm code works but is misconfigured. You can output
> the tc commands by overriding the TC variable with TC="echo tc" and paste
> here.

I also hope it is a misconfig.  Please show us the config/script.

I would appreciate a link to the scripts you are using... perhaps a git tree?

 
> >         Now, I have been testing this using Dave's most recent cerowrt
> > alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
> > should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
> > restore "linklayer atm" handling) but am not fully sure.
> >
> 
> It does.

It has not hit the stable tree yet, but DaveM promised he would pass it along.

It does seem Dave Taht has my patch applied:
http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch

> > While I am not able to build kernels, it seems that I am able to quickly
> > test whether link layer adjustments work or not. So I am happy to help where
> > I can :)

So, what is your lab setup that allows you to test this quickly?


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  5:13         ` Dave Taht
  2013-08-23  7:27           ` Jesper Dangaard Brouer
@ 2013-08-23  9:16           ` Sebastian Moeller
  2013-08-23 19:38           ` Sebastian Moeller
  2013-08-24 23:08           ` Sebastian Moeller
  3 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23  9:16 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jesper Dangaard Brouer, cerowrt-devel

Hi Dave,

On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:

> 
> 
> 
> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi List, hi Jesper,
> 
> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>         Unfortunately the htb_private link layer adjustments are still broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>         I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
> 
> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.

	I will do this once I am back home. But I did check "tc -d qdisc" and "tc -d class show dev ge00" and got:


>  root@nacktmulle:~# tc -d class show dev ge00
> class htb 1:11 parent 1:1 leaf 110: prio 1 quantum 1500 rate 128000bit overhead 40 ceil 810000bit burst 2Kb/1 mpu 0b overhead 0b cburst 12953b/1 mpu 0b overhead 0b level 0 
> class htb 1:1 root rate 2430Kbit overhead 40 ceil 2430Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 2Kb/1 mpu 0b overhead 0b level 7 
> class htb 1:10 parent 1:1 prio 0 quantum 1500 rate 2430Kbit overhead 40 ceil 2430Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 2Kb/1 mpu 0b overhead 0b level 0 
> class htb 1:13 parent 1:1 leaf 130: prio 3 quantum 1500 rate 405000bit overhead 40 ceil 2366Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 11958b/1 mpu 0b overhead 0b level 0 
> class htb 1:12 parent 1:1 leaf 120: prio 2 quantum 1500 rate 405000bit overhead 40 ceil 2366Kbit burst 2Kb/1 mpu 0b overhead 0b cburst 11958b/1 mpu 0b overhead 0b level 0 
> class fq_codel 110:20e parent 110: 
> class fq_codel 120:10 parent 120: 
> root@nacktmulle:~# tc -d qdisc
> qdisc fq_codel 0: dev se00 root refcnt 2 limit 1024p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn 
> qdisc htb 1: dev ge00 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 ver 3.17
> qdisc fq_codel 110: dev ge00 parent 1:11 limit 600p flows 1024 quantum 300 target 5.0ms interval 100.0ms 
> qdisc fq_codel 120: dev ge00 parent 1:12 limit 600p flows 1024 quantum 300 target 5.0ms interval 100.0ms 
> qdisc fq_codel 130: dev ge00 parent 1:13 limit 600p flows 1024 quantum 300 target 5.0ms interval 100.0ms 
> qdisc ingress ffff: dev ge00 parent ffff:fff1 ---------------- 
> qdisc htb 1: dev ifb0 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 ver 3.17
> qdisc fq_codel 110: dev ifb0 parent 1:11 limit 1000p flows 1024 quantum 500 target 5.0ms interval 100.0ms ecn 
> qdisc fq_codel 120: dev ifb0 parent 1:12 limit 1000p flows 1024 quantum 1500 target 5.0ms interval 100.0ms ecn 
> qdisc fq_codel 130: dev ifb0 parent 1:13 limit 1000p flows 1024 quantum 1500 target 5.0ms interval 100.0ms ecn 
> qdisc mq 0: dev sw00 root 
> qdisc mq 0: dev gw01 root 
> qdisc mq 0: dev gw00 root 
> qdisc mq 0: dev sw10 root 
> qdisc mq 0: dev gw11 root 
> qdisc mq 0: dev gw10 root 

	So at least the configured overhead of 40 bytes shows up when using htb_private. Unlike tc_stab, which reports the link layer in "tc -d qdisc", I never figured out whether htb ever reports the link layer option at all. Changing the overhead value in AQM changes the reported overhead in "tc -d class show dev ge00".
	That said I will collect the tc output and post it here...



>         Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
> 
> It does. 

	You rock!

>  
> @Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
> 
> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time. 

	Oh sorry, I know that I am testing your WIP branch here, and I think it is great that you share this with us so we can test early and often. I just realized that I had no way of knowing which patches made it into 3.10.9-1...

>  
> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
> 
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it.

	Thanks a lot!

> 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
> 
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> 
> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
> 
> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this….

	Getting IPv6 working is my next toy project, once the atm issue is gone for good :)


> 
> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days.

	Oh no, you are in the driver's seat here; you set the pace. I just got carried away by the thought that atm might be fixed and all that was needed was confirmation :)

> The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
> 
> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
> 
> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
> 
> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
> 
> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
> 
> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3.

	Is there a method to this madness?

> The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from. 

	so like 20 up 20 down, totaling 40ms just from a bad target value...

> 
> Multiple parties have the delusion that 20ms is "good enough".

	Hmm, I would have thought that cisco, of all companies, with its IP telephony products, would consider increasing the latency by a factor of 3 to 4 over the unloaded condition "sub-optimal".


> 
> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250…..
> 
> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.

	Ah, I can do this tonight and run a test on pie to see whether the RTT comes down by 40ms - 2*7ms = 26ms...

> 
> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything.

	I agree, fq_codel looks like the winner (well efq_codel and nfq_codel are indiscernible from fq_codel in my RRUL tests, but they too are fq_codel for the most part I guess)

> There are a few places where pie is a win over straight codel, notably on packet floods.

	I am not set up in any way to test this.

> And it may well be easier to retrofit into existing hardware fast path designs. 

	Well, it seems superior to no AQM, so it looks like a decent stopgap measure until fq_codel can migrate to all routers :)


> 
> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested. 

	What do you consider to be a long RTT? From home I have a best-case ping RTT to snapon of 180ms, so if this is sufficient I might be able to help. Would starting netperf on my router help you in testing? My bandwidth of 2430Kbit/s up and 15494Kbit/s down might be a bit measly. I will be taking my family on holiday next week, so there could be another remote test site if you want.

> 
> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github) 

	Or just testing over real long paths?

> 
> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems. 

	I think it actually is pretty usable even in its current CI state.

> 
> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
> 
> ... and find funding to get through the winter.
>  
> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.

	Congrats!

>  
> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
> 
> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same….

	Aye, will do.

>  
> tc -s qdisc show dev ge00
> tc -s qdisc show dev ifb0
> 
> would be useful info to have in general after each test.

	Agreed, that is basically what I do, but so far never saved the results...

> 
> TIA.
> 
> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.

	I might get around to testing those, but in the small niche where I can offer testing, RRUL seems to work quite well.

> 
> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.

	Thanks a lot.

> 
> This is some of the stuff I know that needs fixing in userspace:
> 
> * TODO readlink not found
> * TODO netdev user missing
> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
> * TODO [   18.480468] Mirror/redirect action on
> [   18.539062] Failed to load ipt action
> * upload and download are reversed in aqm

I think that is fixed; at least the rate I set in download is applied to the htb attached to ifb0 and the upload to ge00, which seems quite correct. Or are you concerned about the initial values that show up in the AQM GUI? If the latter, I can try to set the defaults in model/cbi/aqm.lua…

> * BCP38
> * Squash CS values
> * Replace ntp
> * Make ahcp client mode
> * Drop more privs for polipo
> * upnp
> * priv separation
> * Review FW rules
> * dhcpv6 support
> * uci-defaults/make-cert.sh uses a bad path for px5g
> * Doesn't configure the web browser either

	I would love to see the openconnect client package included (https://dev.openwrt.org/browser/packages/net/openconnect/Makefile), but I might be the only one.

Thanks a lot & Best Regards
	Sebastian


> 
> 
> 
> 
> Best
>         Sebastian
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html



* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  7:27           ` Jesper Dangaard Brouer
@ 2013-08-23 10:15             ` Sebastian Moeller
  2013-08-23 11:16               ` Jesper Dangaard Brouer
  2013-08-23 19:51             ` Sebastian Moeller
  1 sibling, 1 reply; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 10:15 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: cerowrt-devel

Hi Jesper,


On Aug 23, 2013, at 09:27 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:

> On Thu, 22 Aug 2013 22:13:52 -0700
> Dave Taht <dave.taht@gmail.com> wrote:
> 
>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>> Hi List, hi Jesper,
>>> 
>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
>>> adjustments to see whether the recent changes resurrected this feature.
>>>        Unfortunately the htb_private link layer adjustments are still
>>> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
>>> as without link layer adjustments). On the bright side the tc_stab method
>>> still works as well as before (ping RTT around 40ms).
>>>        I would like to humbly propose to use the tc stab method in
>>> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
>>> simply telling the kernel a lie about the packet size seems more robust
>>> than fudging HTB's rate tables.
> 
> After the (regression) commit 56b765b79 ("htb: improved accuracy at
> high rates"), the kernel no-longer uses the rate tables.  

	See, I am quite a layman here; spelunking through the tc and kernel source code made me believe that the rate tables are still used (I might have looked at too old versions of both repositories, though).

> 
> My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling),
> does the ATM cell overhead calculation directly on the packet length,
> see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
> Thus, the cell calc should actually be more precise now.... but see below

	Is there any way to make HTB report which link layer it assumes?

> 
>>> Especially since the kernel already fudges
>>> the packet size to account for the ethernet header and then some, so this
>>> path should receive more scrutiny by virtue of having more users?
> 
> As you mention, the default kernel path (not tc stab) fudges the packet
> size for Ethernet headers, AND I made a mistake (back in approx 2006,
> sorry) that the "overhead" cannot be a negative number.  

	Mmh, does this also apply to stab?

> Meaning that
> some ATM encap overheads simply cannot be configured correctly (as you
> need to subtract the ethernet header).

	Yes, I see; luckily PPPoA and IPoA seem quite rare, and setting the overhead to be larger than it actually is is relatively benign, as it will overestimate packet size.

> (And it's quite problematic to
> change the kABI to allow for a negative overhead)

	Again I have no clue, but overhead seems to be an integer, not unsigned, so why can it not be negative?

> 
> Perhaps we should change to use "tc stab" for this reason.  But I'm not
> sure "stab" does the right thing either, and its accuracy is also
> limited as it's actually also table based.

	But why should a table be problematic here? As long as we can assure the table is equal to or larger than the largest packet, we are golden. So either we do the manly and stupid thing and go for 9000 byte jumbo packets for the table size, or we assume that for the most part ATM users will at best use baby jumbo frames (I think BT does this to allow a payload MTU of 1500 in spite of PPPoE encapsulation overhead), but then we are quite fine with the default size table maxMTU of 2048 bytes, no?
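	To spell that out with a command (illustrative; mtu 2048 and tsize 512 are just the tc-stab defaults made explicit, and jumbo frame users would raise mtu):
		tc qdisc add dev ge00 root handle 1: stab mtu 2048 tsize 512 linklayer atm overhead 40 htb default 12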

>  We could easily change the
> kernel to perform the ATM cell overhead calc inside "stab", and we
> should also fix the GSO packet overhead problem.
> (for now remember to disable GSO packets when shaping)

	Yeah, I stumbled over the fact that the stab mechanism does not honor the kernel's earlier adjustments of packet length (but I seem to be unable to find the actual file and line where this is initially handled). It would seem relatively easy to make stab take the earlier adjustment into account. Regarding GSO, I assumed that GSO will not play nicely with an AQM anyway, as a single large packet will hog too much transfer time...

> 
>> It's my hope that the atm code works but is misconfigured. You can output
>> the tc commands by overriding the TC variable with TC="echo tc" and paste
>> here.
> 
> I also hope it is a misconfig.  Please show us the config/script.

	Will do this later. I would be delighted if it is just me being stupid.

> 
> I would appreciate a link to the scripts you are using... perhaps a git tree?

	Unfortunately I have no git tree and no experience with git. I do not think I will be able to set something up quickly. But I use a modified version of cerowrt's AQM scripts which I will post later.

> 
> 
>>>        Now, I have been testing this using Dave's most recent cerowrt
>>> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
>>> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
>>> restore "linklayer atm" handling) but am not fully sure.
>>> 
>> 
>> It does.
> 
> It has not hit the stable tree yet, but DaveM promised he would pass it along.
> 
> It does seem Dave Taht has my patch applied:
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch

	Ah, good, so it should have worked.

> 
>>> While I am not able to build kernels, it seems that I am able to quickly
>>> test whether link layer adjustments work or not. So I am happy to help where
>>> I can :)
> 
> So, what is your lab setup that allows you to test this quickly?

	Oh, Dave and Toke are the giants on whose shoulders I stand here (thanks guys); all I bring to the table basically is the fact that I have an ATM-carried ADSL2+ connection at home.
	Anyway, my theory is that proper link layer adjustments should only show up if not performing them would make my traffic exceed my link speed and hence accumulate in the DSL modem's bloated buffers, leading to measurable increases in latency. So I try to saturate both the up- and downlink while measuring latency under different conditions. Since the worst-case overhead of the ATM encapsulation approaches 50% (with the best case being around 10%), I test the system while shaping to 95% of link rates, where I do expect to see an effect of the link layer adjustments, and while shaping to 50%, where I do not expect to see an effect. And basically that seems to work.
	Practically, I use Toke's netperf-wrapper project with the RRUL test, from my cerowrt router behind an ADSL2+ modem, against a close netperf server in Germany. The link layer adjustments are configured in my cerowrt router, using Dave's simple.qos script (3-band HTB shaper with fq_codel on each leaf, taking my overhead of 40 bytes into account and optionally the link layer).
	It turns out that this test nicely saturates my link with 4 up and 4 down TCP flows and uses a train of ping probes at a 0.2 second period to assess the latency induced by saturating the links. Now I shape down to 95% and 50% of line rates and simply look at the ping RTT plot for different conditions. In my rig I see around 30ms ping RTT without load, 80ms with full saturation and no link layer adjustments, and 40ms with working link layer adjustments (hand in hand with slightly reduced TCP goodput, just as one would expect). In my testing so far, activating the HTB link layer adjustments yielded the same 80ms delay I get without link layer adjustments. If I shape down to 50% of link rates, HTB, stab and no link layer adjustments all yield a ping RTT of ~40ms. Still, with proper link layer adjustments the TCP goodput is reduced even at 50% shaping. As Dave explained, with an unloaded swallow, ermm, ping RTT of 30ms and fq_codel's target set to 5ms, the best case would be 30ms + 2*5ms or 40ms, so I am pretty close to ideal with proper link layer adjustments.
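	(Concretely, a run boils down to something like the following, where the server name is a placeholder:)
		netperf-wrapper -H <netperf-server> -l 60 rrul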

	I guess it should be possible to simply use the reduction in goodput as an easy indicator of whether the link layer adjustments work or not. But to do this properly I would need to be able to control the size of the sent packets, which I cannot, at least not with RRUL. But I am quite sure real computer scientists could easily set something up to test the goodput through a shaping device with differently sized packet streams of the same bandwidth, but I digress.

	On the other hand, I do not claim to be an expert in this field in any way, and my measurement method might be flawed; if you think so, please do not hesitate to let me know how I could improve it.


Best Regards
	Sebastian


> 
> 
> -- 
> Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Sr. Network Kernel Developer at Red Hat
>  Author of http://www.iptv-analyzer.org
>  LinkedIn: http://www.linkedin.com/in/brouer



* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 10:15             ` Sebastian Moeller
@ 2013-08-23 11:16               ` Jesper Dangaard Brouer
  2013-08-23 12:37                 ` Sebastian Moeller
  0 siblings, 1 reply; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-23 11:16 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel

On Fri, 23 Aug 2013 12:15:12 +0200
Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Jesper,
> 
> 
> On Aug 23, 2013, at 09:27 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
> 
> > On Thu, 22 Aug 2013 22:13:52 -0700
> > Dave Taht <dave.taht@gmail.com> wrote:
> > 
> >> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> >> 
> >>> Hi List, hi Jesper,
> >>> 
> >>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
> >>> adjustments to see whether the recent changes resurrected this feature.
> >>>        Unfortunately the htb_private link layer adjustments are still
> >>> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
> >>> as without link layer adjustments). On the bright side the tc_stab method
> >>> still works as well as before (ping RTT around 40ms).
> >>>        I would like to humbly propose to use the tc stab method in
> >>> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
> >>> simply telling the kernel a lie about the packet size seems more robust
> >>> than fudging HTB's rate tables.
> > 
> > After the (regression) commit 56b765b79 ("htb: improved accuracy at
> > high rates"), the kernel no-longer uses the rate tables.  
> 
> 	See, I am quite a layman here, spelunking through the tc and kernel source code made me believe that the rate tables are still used (I might have looked at too old versions of both repositories though).
> 
> > 
> > My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling),
> > does the ATM cell overhead calculation directly on the packet length,
> > see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
> > Thus, the cell calc should actually be more precise now.... but see below
> 
> 	Is there any way to make HTB report which link layer it assumes?

I added some print debug statements in my patch, so you can see this in
the kernel log / dmesg.  To activate the debugging print statements:

  mount -t debugfs none /sys/kernel/debug/
  echo "func __detect_linklayer +p"
> /sys/kernel/debug/dynamic_debug/control

Run your tc script, and run dmesg, or look in kernel-syslog.
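
To switch the debug print off again afterwards:

  echo "func __detect_linklayer -p" > /sys/kernel/debug/dynamic_debug/control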


> > 
> >>> Especially since the kernel already fudges
> >>> the packet size to account for the ethernet header and then some, so this
> >>> path should receive more scrutiny by virtue of having more users?
> > 
> > As you mention, the default kernel path (not tc stab) fudges the packet
> > size for Ethernet headers, AND I made a mistake (back in approx 2006,
> > sorry) that the "overhead" cannot be a negative number.  
> 
> 	Mmh, does this also apply to stab?

This seems to be two question...

Yes, the Ethernet header size gets adjusted/added before the "stab"
call.
For reference
See: net/core/dev.c function __dev_xmit_skb()
 Call qdisc_pkt_len_init(skb); // adjust Ethernet and account for GSO
 Call qdisc_calculate_pkt_len(skb, q); // is the stab call
  (ps calls __qdisc_calculate_pkt_len() in net/sched/sch_api.c)

The qdisc_pkt_len_init() call were introduced by Eric in
v3.9-rc1~139^2~411.

Thus, in kernels >= 3.9, you would need to reduce your tc
"overhead" parameter by 14 bytes (if you accounted for the encapsulated
Ethernet header before).
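
Worked example (assuming your old overhead figure included those 14 bytes):

  # an overhead of 40 that already counted the Ethernet header
  # becomes 40 - 14 = 26 on kernels >= 3.9:
  tc qdisc add dev eth0 root stab overhead 26 linklayer atm htb default 12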

The "overhead" of stab can be negative, so no problem here, in an "int"
for stab.


> > Meaning that
> > some ATM encap overheads simply cannot be configured correctly (as you
> > need to subtract the ethernet header).
> 
> 	Yes, I see; luckily PPPoA and IPoA seem quite rare, and setting the overhead to be larger than it actually is is relatively benign, as it will overestimate packet size.
> 
> > (And it's quite problematic to
> > change the kABI to allow for a negative overhead)
> 
> 	Again I have no clue but overhead seems to be integer, not unsigned, so why can it not be negative?

Nope, for reference in include/uapi/linux/pkt_sched.h

This struct tc_ratespec is used by the normal "HTB/TBF" rate system,
notice "unsigned short	overhead".

struct tc_ratespec {
	unsigned char	cell_log;
	__u8		linklayer; /* lower 4 bits */
	unsigned short	overhead;
	short		cell_align;
	unsigned short	mpu;
	__u32		rate;
};


This struct tc_sizespec is used by stab system, where the overhead is
an int.

struct tc_sizespec {
	unsigned char	cell_log;
	unsigned char	size_log;
	short		cell_align;
	int		overhead;
	unsigned int	linklayer;
	unsigned int	mpu;
	unsigned int	mtu;
	unsigned int	tsize;
};




> > 
> > Perhaps we should change to use "tc stab" for this reason.  But I'm not
> > sure "stab" does the right thing either, and its accuracy is also
> > limited as it's actually also table based.
> 
> 	But why should a table be problematic here? As long as we can assure the table is equal to or larger than the largest packet, we are golden. So either we do the manly and stupid thing and go for 9000 byte jumbo packets for the table size, or we assume that for the most part ATM users will at best use baby jumbo frames (I think BT does this to allow a payload MTU of 1500 in spite of PPPoE encapsulation overhead), but then we are quite fine with the default size table maxMTU of 2048 bytes, no?


It is the GSO problem that I'm worried about.  The kernel will bunch up
packets, and that caused the length calculation issue... just disable
GSO.

 ethtool -K eth63 gso off gro off tso off
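
You can verify they are off with something like:

 ethtool -k eth63 | egrep 'segmentation-offload|receive-offload'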


> >  We could easily change the
> > kernel to perform the ATM cell overhead calc inside "stab", and we
> > should also fix the GSO packet overhead problem.
> > (for now remember to disable GSO packets when shaping)
> 
> 	Yeah, I stumbled over the fact that the stab mechanism does not honor the kernel's earlier adjustments of packet length (but I seem to be unable to find the actual file and line where this is initially handled). It would seem relatively easy to make stab take the earlier adjustment into account. Regarding GSO, I assumed that GSO will not play nicely with an AQM anyway, as a single large packet will hog too much transfer time...
> 

Yes, just disable GSO ;-)


> > 
> >> It's my hope that the atm code works but is misconfigured. You can output
> >> the tc commands by overriding the TC variable with TC="echo tc" and paste
> >> here.
> > 
> > I also hope it is a misconfig.  Please show us the config/script.
> 
> 	Will do this later. I would be delighted if it is just me being stupid.
> 
> > 
> > I would appreciate a link to the scripts you are using... perhaps a git tree?
> 
> 	Unfortunately I have no git tree and no experience with git. I do not think I will be able to set something up quickly. But I use a modified version of cerowrt's AQM scripts which I will post later.
> 

Someone just point me to the cerowrt git repo... please, and point me
at the simple.qos script.

Did you add the "linklayer atm" yourself to Dave's script?



 
> > 
> >>>        Now, I have been testing this using Dave's most recent cerowrt
> >>> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
> >>> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
> >>> restore "linklayer atm" handling) but am not fully sure.
> >>> 
> >> 
> >> It does.
> > 
> > It has not hit the stable tree yet, but DaveM promised he would pass it along.
> > 
> > It does seem Dave Taht has my patch applied:
> > http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch
> 
> 	Ah, good so it should have worked.

It should...

 
> > 
> >>> While I am not able to build kernels, it seems that I am able to quickly
> >>> test whether link layer adjustments work or not. So I am happy to help where
> >>> I can :)
> > 
> > So, what is your lab setup that allows you to test this quickly?
> 
> 	Oh, Dave and Toke are the giants on whose shoulders I stand here (thanks guys), all I bring to the table basically is the fact that I have an ATM carried ADSL2+ connection at home.

I will soon have an ADSL lab again, so I'll try to reproduce your results.
Actually, almost while typing this email, the postman arrived at my
house and delivered a new ADSL modem... as a local Danish ISP
www.fullrate.dk has been so kind to give me a test line for free
(thanks fullrate!).


> 	Anyway, my theory is that proper link layer adjustments should only show up if not performing them would make my traffic exceed my link speed and hence accumulate in the DSL modem's bloated buffers, leading to measurable increases in latency. So I try to saturate both the up- and downlink while measuring latency under different conditions. Since the worst-case overhead of the ATM encapsulation approaches 50% (with the best case being around 10%), I test the system while shaping to 95% of link rates, where I do expect to see an effect of the link layer adjustments, and while shaping to 50%, where I do not expect to see an effect. And basically that seems to work.

> 	Practically, I use Toke's netsurf-wrapper project with the RRUL test from my cerowrt router behind an ADSL2+ modem to a close netperf server in Germany. The link layer adjustments are configured in my cerowrt router, using Dave's simple.qos script (3 band HTB shaper with fq_codel on each leaf, taking my overhead of 40 bytes into account and optionally the link layer).

>	It turns out that this test nicely saturates my link with 4 up
> and 4 down TCP flows and uses a train of ping probes at a 0.2 second period
> to assess the latency induced by saturating the links. Now I shape down
> to 95% and 50% of line rates and simply look at the ping RTT plot for
> different conditions. In my rig I see around 30ms ping RTT without
> load, 80ms with full saturation and no linklayer adjustments, and 40ms
> with working link layer adjustments (hand in hand with slightly reduced
> TCP good put just as one would expect). In my testing so far activating
> the HTB link layer adjustments yielded the same 80ms delay I get
> without link layer adjustments. If I shape down to 50% of link rates
> HTB, stab and no link layer adjustments yield a ping RTT of ~40ms.
> Still with proper link layer adjustments the TCP good-put is reduced
> even at 50% shaping. As Dave explained, with an unloaded swallow, ermm,
> ping RTT and fq_codel's target set to 5ms the best case would be 30ms +
> 2*5ms or 40ms, so I am pretty close to ideal with proper link layer
> adjustments.
> 

> 	I guess it should be possible to simply use the reduction in good-put as an easy indicator whether the link layer adjustments work or not. But to do this properly I would need to be able to control the size of the sent packets which I am not, at least not with RRUL. But I am quite sure real computer scientists could easily set something up to test the good-put through a shaping device at differently sized packet streams of the same bandwidth, but I digress. 
> 
> 	On the other hand I do not claim to be an expert in this field in any way and my measurement method might be flawed, if you think so please do not hesitate to let me know how I could improve it.

I have a hard time following your description... sorry.

So, did you get a working low-latency setup by using 95% shaping and
the "stab" linklayer adjustment?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 11:16               ` Jesper Dangaard Brouer
@ 2013-08-23 12:37                 ` Sebastian Moeller
  2013-08-23 13:02                   ` Fred Stratton
                                     ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 12:37 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: cerowrt-devel


Hi Jesper,

thanks for your time…

On Aug 23, 2013, at 13:16 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:

> On Fri, 23 Aug 2013 12:15:12 +0200
> Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> Hi Jesper,
>> 
>> 
>> On Aug 23, 2013, at 09:27 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
>> 
>>> On Thu, 22 Aug 2013 22:13:52 -0700
>>> Dave Taht <dave.taht@gmail.com> wrote:
>>> 
>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> 
>>>>> Hi List, hi Jesper,
>>>>> 
>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
>>>>> adjustments to see whether the recent changes resurrected this feature.
>>>>>       Unfortunately the htb_private link layer adjustments are still
>>>>> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
>>>>> as without link layer adjustments). On the bright side the tc_stab method
>>>>> still works as well as before (ping RTT around 40ms).
>>>>>       I would like to humbly propose to use the tc stab method in
>>>>> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
>>>>> simply telling the kernel a lie about the packet size seems more robust
>>>>> than fudging HTB's rate tables.
>>> 
>>> After the (regression) commit 56b765b79 ("htb: improved accuracy at
>>> high rates"), the kernel no-longer uses the rate tables.  
>> 
>> 	See, I am quite a layman here, spelunking through the tc and kernel source code made me believe that the rate tables are still used (I might have looked at too old versions of both repositories though).
>> 
>>> 
>>> My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling),
>>> does the ATM cell overhead calculation directly on the packet length,
>>> see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
>>> Thus, the cell calc should actually be more precise now.... but see below
>> 
>> 	Is there any way to make HTB report which link layer it assumes?
> 
> I added some print debug statements in my patch, so you can see this in
> the kernel log / dmesg.  To activate the debugging print statements:
> 
>  mount -t debugfs none /sys/kernel/debug/
>  echo "func __detect_linklayer +p"
>> /sys/kernel/debug/dynamic_debug/control
> 
> Run your tc script, and run dmesg, or look in kernel-syslog.

	Ah, unfortunately I am not set up to build new kernels for the router I am testing on, so I would hereby like to beg Dave to include that patch in one of the next releases. Would it not be a good idea to teach tc to report the link layer for HTB as it does for stab? Having to empirically figure out whether it is applied or not is somewhat cumbersome...


> 
> 
>>> 
>>>>> Especially since the kernel already fudges
>>>>> the packet size to account for the ethernet header and then some, so this
>>>>> path should receive more scrutiny by virtue of having more users?
>>> 
>>> As you mention, the default kernel path (not tc stab) fudges the packet
>>> size for Ethernet headers, AND I made a mistake (back in approx 2006,
>>> sorry) that the "overhead" cannot be a negative number.  
>> 
>> 	Mmh, does this also apply to stab?
> 
> This seems to be two question...
> 
> Yes, the Ethernet header size gets adjusted/added before the "stab"
> call.
> For reference
> See: net/core/dev.c function __dev_xmit_skb()
> Call qdisc_pkt_len_init(skb); // adjust Ethernet and account for GSO
> Call qdisc_calculate_pkt_len(skb, q); // is the stab call
>  (ps calls __qdisc_calculate_pkt_len() in net/sched/sch_api.c)
> 
> The qdisc_pkt_len_init() call were introduced by Eric in
> v3.9-rc1~139^2~411.

	So I look at 3.10 here:

net/core/dev.c, qdisc_pkt_len_init
line 2628: 	qdisc_skb_cb(skb)->pkt_len = skb->len;
and in 
line 2650: qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
so the adjusted size does not seem to end up in skb->len


and then in 
net/sched/sch_api.c, __qdisc_calculate_pkt_len
line 440: pkt_len = skb->len + stab->szopts.overhead;

So to my eyes this looks like stab is not honoring the changes made in qdisc_pkt_len_init, no? At least I fail to see where 
skb->len is assigned qdisc_skb_cb(skb)->pkt_len
But I happily admit that I am truly a novice in these matters and easily intimidated by C code.


> 
> Thus, in kernels >= 3.9, you would need to reduce your tc
> "overhead" parameter by 14 bytes (if you accounted for the encapsulated
> Ethernet header before).

	That is what I thought before, but my kernel spelunking made me reconsider and switch to not subtracting the 14 bytes since, as I understand it, the kernel actively does not do it if stab is used.

> 
> The "overhead" of stab can be negative, so no problem here, in an "int"
> for stab.
> 
> 
>>> Meaning that
>>> some ATM encap overheads simply cannot be configured correctly (as you
>>> need to subtract the ethernet header).
>> 
>> 	Yes, I see; luckily PPPoA and IPoA seem quite rare, and setting the overhead to be larger than it actually is is relatively benign, as it will overestimate packet size.
>> 
>>> (And it's quite problematic to
>>> change the kABI to allow for a negative overhead)
>> 
>> 	Again I have no clue but overhead seems to be integer, not unsigned, so why can it not be negative?
> 
> Nope, for reference in include/uapi/linux/pkt_sched.h
> 
> This struct tc_ratespec is used by the normal "HTB/TBF" rate system,
> notice "unsigned short	overhead".
> 
> struct tc_ratespec {
> 	unsigned char	cell_log;
> 	__u8		linklayer; /* lower 4 bits */
> 	unsigned short	overhead;
> 	short		cell_align;
> 	unsigned short	mpu;
> 	__u32		rate;
> };
> 
> 
> This struct tc_sizespec is used by stab system, where the overhead is
> an int.
> 
> struct tc_sizespec {
> 	unsigned char	cell_log;
> 	unsigned char	size_log;
> 	short		cell_align;
> 	int		overhead;
> 	unsigned int	linklayer;
> 	unsigned int	mpu;
> 	unsigned int	mtu;
> 	unsigned int	tsize;
> };

	Ah, good to know.

> 
> 
> 
> 
>>> 
>>> Perhaps we should change to use "tc stab" for this reason.  But I'm not
>>> sure "stab" does the right thing either, and its accuracy is also
>>> limited as it's actually also table based.
>> 
>> 	But why should a table be problematic here? As long as we can assure the table is equal to or larger than the largest packet, we are golden. So either we do the manly and stupid thing and go for 9000 byte jumbo packets for the table size, or we assume that for the most part ATM users will at best use baby jumbo frames (I think BT does this to allow a payload MTU of 1500 in spite of PPPoE encapsulation overhead), but then we are quite fine with the default size table maxMTU of 2048 bytes, no?
> 
> 
> It is the GSO problem that I'm worried about.  The kernel will bunch up
> packets, and that caused the length calculation issue... just disable
> GSO.
> 
> ethtool -K eth63 gso off gro off tso off

	Oh, as always Dave has this tackled in cerowrt already; all offloads are off (at 16000Kbit/s down / 2500Kbit/s up there should be no need for offloads nowadays, I guess). But I see the issue now. 


> 
> 
>>> We could easily change the
>>> kernel to perform the ATM cell overhead calc inside "stab", and we
>>> should also fix the GSO packet overhead problem.
>>> (for now remember to disable GSO packets when shaping)
>> 
>> 	Yeah, I stumbled over the fact that the stab mechanism does not honor the kernel's earlier adjustments of packet length (but I seem to be unable to find the actual file and line where this initially is handled). It would seem relatively easy to make stab take the earlier adjustment into account. Regarding GSO, I assumed that GSO will not play nicely with an AQM anyway, as a single large packet will hog too much transfer time...
>> 
> 
> Yes, just disable GSO ;-)

	Done.

> 
> 
>>> 
>>>> It's my hope that the atm code works but is misconfigured. You can output
>>>> the tc commands by overriding the TC variable with TC="echo tc" and paste
>>>> here.
>>> 
>>> I also hope it is a misconfig.  Please show us the config/script.
>> 
>> 	Will do this later. I would be delighted if it is just me being stupid.
>> 
>>> 
>>> I would appreciate a link to the scripts you are using... perhaps a git tree?
>> 
>> 	Unfortunately I have no git tree and no experience with git. I do not think I will be able to set something up quickly. But I use a modified version of cerowrt's AQM scripts which I will post later.
>> 
> 
> Someone just point me to the cerowrt git repo... please, and point me
> at simple.qos script.
> 

[-- Attachment #2: simple.qos --]
[-- Type: application/octet-stream, Size: 6424 bytes --]

#!/bin/sh
# Cero3 Shaper
# A 3 bin tc_codel and ipv6 enabled shaping script for
# ethernet gateways

# Copyright (C) 2012 Michael D Taht
# GPLv2

# Compared to the complexity that debloat had become,
# this cleanly shows a means of going from diffserv marking
# to prioritization using the current tools (ip(6)tables
# and tc). I note that the complexity of debloat exists for
# a reason, and it is expected that that script is run first
# to set up various other parameters such as BQL and ethtool
# (and that the debloat script has set up the other interfaces).

# You need to jiggle these parameters. Note limits are tuned towards a <10Mbit uplink, <60Mbit downlink




. /usr/lib/aqm/functions.sh



ipt_setup() {

ipt -t mangle -N QOS_MARK_${IFACE}

ipt -t mangle -A QOS_MARK_${IFACE} -j MARK --set-mark 0x2
# You can go further with classification but...
ipt -t mangle -A QOS_MARK_${IFACE} -m dscp --dscp-class CS1 -j MARK --set-mark 0x3
ipt -t mangle -A QOS_MARK_${IFACE} -m dscp --dscp-class CS6 -j MARK --set-mark 0x1
ipt -t mangle -A QOS_MARK_${IFACE} -m dscp --dscp-class EF -j MARK --set-mark 0x1
ipt -t mangle -A QOS_MARK_${IFACE} -m dscp --dscp-class AF42 -j MARK --set-mark 0x1
ipt -t mangle -A QOS_MARK_${IFACE} -m tos  --tos Minimize-Delay -j MARK --set-mark 0x1

# and it might be a good idea to do it for udp tunnels too

# Turn it on. Preserve classification if already performed

ipt -t mangle -A POSTROUTING -o $DEV -m mark --mark 0x00 -g QOS_MARK_${IFACE} 
ipt -t mangle -A POSTROUTING -o $IFACE -m mark --mark 0x00 -g QOS_MARK_${IFACE} 

# The Syn optimization was nice but fq_codel does it for us
# ipt -t mangle -A PREROUTING -i s+ -p tcp -m tcp --tcp-flags SYN,RST,ACK SYN -j MARK --set-mark 0x01
# Not sure if this will work. Encapsulation is a problem period

ipt -t mangle -A PREROUTING -i vtun+ -p tcp -j MARK --set-mark 0x2 # tcp tunnels need ordering

# Emanating from router, do a little more optimization
# but don't bother with it too much. 

ipt -t mangle -A OUTPUT -p udp -m multiport --ports 123,53 -j DSCP --set-dscp-class AF42

#Not clear if the second line is needed
#ipt -t mangle -A OUTPUT -o $IFACE -g QOS_MARK_${IFACE}

}


# TC rules

egress() {

CEIL=${UPLINK}
PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
BE_RATE=`expr $CEIL / 6`   # Min for best effort
BK_RATE=`expr $CEIL / 6`   # Min for background
BE_CEIL=`expr $CEIL - 64`  # A little slop at the top

LQ="quantum `get_mtu $IFACE`"

$TC qdisc del dev $IFACE root 2> /dev/null
$TC qdisc add dev $IFACE root handle 1: ${STABSTRING} htb default 12
$TC class add dev $IFACE parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
$TC class add dev $IFACE parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
$TC class add dev $IFACE parent 1:1 classid 1:11 htb $LQ rate 128kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
$TC class add dev $IFACE parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
$TC class add dev $IFACE parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL

$TC qdisc add dev $IFACE parent 1:11 handle 110: $QDISC limit 600 $NOECN `get_quantum 300` `get_flows ${PRIO_RATE}`
$TC qdisc add dev $IFACE parent 1:12 handle 120: $QDISC limit 600 $NOECN `get_quantum 300` `get_flows ${BE_RATE}`
$TC qdisc add dev $IFACE parent 1:13 handle 130: $QDISC limit 600 $NOECN `get_quantum 300` `get_flows ${BK_RATE}`

# Need a catchall rule

$TC filter add dev $IFACE parent 1:0 protocol all prio 999 u32 \
        match ip protocol 0 0x00 flowid 1:12  

# FIXME should probably change the filter here to do pre-nat
        
$TC filter add dev $IFACE parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
$TC filter add dev $IFACE parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
$TC filter add dev $IFACE parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13

# ipv6 support. Note that the handle indicates the fw mark bucket that is looked for

$TC filter add dev $IFACE parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
$TC filter add dev $IFACE parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
$TC filter add dev $IFACE parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13

# Arp traffic

$TC filter add dev $IFACE parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11

}

ingress() {

CEIL=$DOWNLINK
PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
BE_RATE=`expr $CEIL / 6`   # Min for best effort
BK_RATE=`expr $CEIL / 6`   # Min for background
BE_CEIL=`expr $CEIL - 64`  # A little slop at the top

LQ="quantum `get_mtu $IFACE`"

$TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
$TC qdisc add dev $IFACE handle ffff: ingress
 
$TC qdisc del dev $DEV root  2> /dev/null
$TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
$TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit
$TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0
$TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1
$TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2
$TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3

# I'd prefer to use a pre-nat filter but that causes permutation...

$TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
$TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
$TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`

diffserv $DEV

ifconfig $DEV up

# redirect all IP packets arriving in $IFACE to ifb0 

$TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
  match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV

}

do_modules
ipt_setup
egress 
ingress

# References:
# This alternate shaper attempts to go for 1/u performance in a clever way
# http://git.coverfire.com/?p=linux-qos-scripts.git;a=blob;f=src-3tos.sh;hb=HEAD

# Comments
# This does the right thing with ipv6 traffic.
# It also tries to leverage diffserv to some sane extent. In particular,
# the 'priority' queue is limited to 33% of the total, so EF, and IMM traffic
# cannot starve other types. The rfc suggested 30%. 30% is probably
# a lot in today's world.

# Flaws
# Many!

[-- Attachment #3: functions.sh --]
[-- Type: application/octet-stream, Size: 4122 bytes --]

insmod() {
  lsmod | grep -q ^$1 || $INSMOD $1
}

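# ipt runs a rule through both iptables and ip6tables; if the rule appends
# (-A), the equivalent delete (-D) is issued first so that repeated runs do
# not stack duplicate rules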
ipt() {
  d=`echo $* | sed s/-A/-D/g`
  [ "$d" != "$*" ] && {
	iptables $d > /dev/null 2>&1
	ip6tables $d > /dev/null 2>&1
  }
  iptables $* > /dev/null 2>&1
  ip6tables $* > /dev/null 2>&1
}

do_modules() {
	insmod sch_$QDISC                                          
	insmod sch_ingress                                      
	insmod act_mirred                                         
	insmod cls_fw                                          
	insmod sch_htb                                              
}
                                                                                                                  
# You need to jiggle these parameters. Note limits are tuned towards a <10Mbit uplink, <60Mbit downlink

[ -z "$UPLINK" ] && UPLINK=2302
[ -z "$DOWNLINK" ] && DOWNLINK=14698
[ -z "$DEV" ] && DEV=ifb0
[ -z "$QDISC" ] && QDISC=fq_codel
[ -z "$IFACE" ] && IFACE=ge00
[ -z "$LLAM" ] && LLAM="none"
[ -z "$LINKLAYER" ] && LINKLAYER=ethernet
[ -z "$OVERHEAD" ] && OVERHEAD=0
[ -z "$STAB_MTU" ] && STAB_MTU=2047
[ -z "$STAB_MPU" ] && STAB_MPU=0
[ -z "$STAB_TSIZE" ] && STAB_TSIZE=512
[ -z "$AUTOFLOW" ] && AUTOFLOW=0
[ -z "$AUTOECN" ] && AUTOECN=1
[ -z "$TC" ] && TC=`which tc`
[ -z "$INSMOD" ] && INSMOD=`which insmod`


#logger "LLAM: ${LLAM}"
#logger "LINKLAYER: ${LINKLAYER}"

CEIL=$UPLINK

ADSLL=""
if [ "$LLAM" = "htb_private" ]; 
then
	# HTB defaults to MTU 1600 and an implicit fixed TSIZE of 256
	ADSLL="mpu ${STAB_MPU} linklayer ${LINKLAYER} overhead ${OVERHEAD} mtu ${STAB_MTU}"
	logger "ADSLL: ${ADSLL}"
fi

if [ "${LLAM}" = "tc_stab" ]; 
then
	STABSTRING="stab mtu ${STAB_MTU} tsize ${STAB_TSIZE} mpu ${STAB_MPU} overhead ${OVERHEAD} linklayer ${LINKLAYER}"
	logger "STAB: ${STABSTRING}"
fi


aqm_stop() {
	$TC qdisc del dev $IFACE ingress
	$TC qdisc del dev $IFACE root
	$TC qdisc del dev $DEV root
}

# Note this has side effects on the prio variable
# and depends on the interface global too

fc() {
$TC filter add dev $interface protocol ip parent $1 prio $prio u32 match ip tos $2 0xfc classid $3
prio=$(($prio + 1))
$TC filter add dev $interface protocol ipv6 parent $1 prio $prio u32 match ip6 priority $2 0xfc classid $3
prio=$(($prio + 1))
}

# FIXME: actually you need to get the underlying MTU on the PPPoE thing

get_mtu() {
	F=`cat /sys/class/net/$1/mtu`
	if [ -z "$F" ]
	then
	echo 1500
	else
	echo $F
	fi
}

# FIXME should also calculate the limit
# Frankly I think Xfq_codel can pretty much always run with high numbers of flows
# now that it does fate sharing
# But right now I'm trying to match the ns2 model behavior better
# So SET the autoflow variable to 1 if you want the cablelabs behavior

get_flows() {
	if [ "$AUTOFLOW" -eq "1" ] 
	then
	FLOWS=8
	[ $1 -gt 999 ] && FLOWS=16
	[ $1 -gt 2999 ] && FLOWS=32
	[ $1 -gt 7999 ] && FLOWS=48
	[ $1 -gt 9999 ] && FLOWS=64
	[ $1 -gt 19999 ] && FLOWS=128
	[ $1 -gt 39999 ] && FLOWS=256
	[ $1 -gt 69999 ] && FLOWS=512
	[ $1 -gt 99999 ] && FLOWS=1024
	case $QDISC in
		codel|ns2_codel|pie) ;;
		fq_codel|*fq_codel|sfq) echo flows $FLOWS ;;
	esac
	fi
}	

# set quantum parameter if available for this qdisc

get_quantum() {
    case $QDISC in
	*fq_codel|fq_pie|drr) echo quantum $1 ;;
	*) ;;
    esac

}

# Set some variables to handle different qdiscs

ECN=""
NOECN=""

# ECN is somewhat useful but it helps to have a way
# to turn it on or off. Note we never do ECN on egress currently.

qdisc_variants() {
    if [ "$AUTOECN" -eq "1" ]
    then
    case $QDISC in
	*codel|*pie) ECN=ecn; NOECN=noecn ;;
	*) ;;
    esac
    fi
}

qdisc_variants

# This could be a complete diffserv implementation

diffserv() {

interface=$1
prio=1

# Catchall

$TC filter add dev $interface parent 1:0 protocol all prio 999 u32 \
        match ip protocol 0 0x00 flowid 1:12

# Find the most common matches fast

fc 1:0 0x00 1:12 # BE
fc 1:0 0x20 1:13 # CS1
fc 1:0 0x10 1:11 # IMM
fc 1:0 0xb8 1:11 # EF
fc 1:0 0xc0 1:11 # CS3
fc 1:0 0xe0 1:11 # CS6
fc 1:0 0x90 1:11 # AF42 (mosh)

# Arp traffic
$TC filter add dev $interface parent 1:0 protocol arp prio $prio handle 1 fw classid 1:11
prio=$(($prio + 1))
}

[-- Attachment #4: Type: text/plain, Size: 7022 bytes --]

> 
> Did you add the "linklayer atm" yourself to Dave's script?

	Well, partly; the option for HTB was already in his script but undertested. I changed the script to add stab and to allow easier configuration of overhead, mpu, mtu and tsize (just for stab) from the GUI, but the code is Dave's. I attached the scripts. functions.sh gets the values from the configuration GUI. I extended the way the linklayer option strings are created, but basically it is the same method that Dave used. And I do see the right overhead values appear in "tc -d qdisc", so at least something is reaching HTB. Sorry that I have no repository for easier access.
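
To give an idea of how the pieces connect, a hypothetical invocation (the variable names are the defaults from functions.sh; the simple.qos path is assumed to sit next to functions.sh):

    UPLINK=2430 DOWNLINK=15494 IFACE=ge00 DEV=ifb0 \
    LLAM=tc_stab LINKLAYER=adsl OVERHEAD=40 \
        sh /usr/lib/aqm/simple.qos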

 


> 
> 
> 
> 
>>> 
>>>>>       Now, I have been testing this using Dave's most recent cerowrt
>>>>> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
>>>>> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
>>>>> restore "linklayer atm" handling) but am not fully sure.
>>>>> 
>>>> 
>>>> It does.
>>> 
>>> It has not hit the stable tree yet, but DaveM promised he would pass it along.
>>> 
>>> It does seem Dave Taht has my patch applied:
>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch
>> 
>> 	Ah, good so it should have worked.
> 
> It should...
> 
> 
>>> 
>>>>> While I am not able to build kernels, it seems that I am able to quickly
>>>>> test whether link layer adjustments work or not. So I am happy to help where
>>>>> I can :)
>>> 
>>> So, what is your lab setup that allows you to test this quickly?
>> 
>> 	Oh, Dave and Toke are the giants on whose shoulders I stand here (thanks guys), all I bring to the table basically is the fact that I have an ATM carried ADSL2+ connection at home.
> 
>> I will soon have an ADSL lab again, so I will try to reproduce your
>> results. Actually, almost while typing this email, the postman arrived at
>> my house and delivered a new ADSL modem... as a local Danish ISP,
>> www.fullrate.dk, has been so kind as to give me a test line for free
>> (thanks fullrate!).

	This is great! Even though I am quite sure that no real DSL link is actually required to test the effect of the link layer adjustments.

> 
> 
>> 	Anyway, my theory is that proper link layer adjustments should only show up if not performing them would make my traffic exceed my link speed and hence accumulate in the DSL modem's bloated buffers, leading to measurable increases in latency. So I try to saturate both the up- and downlink while measuring latency under different conditions. Since the worst-case overhead of the ATM encapsulation approaches 50% (with the best case being around 10%), I try to test the system while shaping to 95% of link rates, where I do expect to see an effect of the link layer adjustments, and while shaping to 50%, where I do not. And basically that seems to work.
> 
>> 	Practically, I use Toke's netperf-wrapper project with the RRUL test from my cerowrt router behind an ADSL2+ modem to a nearby netperf server in Germany. The link layer adjustments are configured in my cerowrt router, using Dave's simple.qos script (3-band HTB shaper with fq_codel on each leaf, taking my overhead of 40 bytes into account and optionally the link layer).
> 
>> 	It turns out that this test nicely saturates my link with 4 up
>> and 4 down TCP flows and uses a train of ping probes at a 0.2 second
>> period to assess the latency induced by saturating the links. Now I
>> shape down to 95% and 50% of line rates and simply look at the ping
>> RTT plot for different conditions. In my rig I see around 30ms ping
>> RTT without load, 80ms with full saturation and no linklayer
>> adjustments, and 40ms with working link layer adjustments (hand in
>> hand with slightly reduced TCP goodput, just as one would expect). In
>> my testing so far, activating the HTB link layer adjustments yielded
>> the same 80ms delay I get without link layer adjustments. If I shape
>> down to 50% of link rates, HTB, stab and no link layer adjustments
>> all yield a ping RTT of ~40ms. Still, with proper link layer
>> adjustments the TCP goodput is reduced even at 50% shaping. As Dave
>> explained, with an unloaded swallow, ermm, ping RTT of 30ms and
>> fq_codel's target set to 5ms, the best case would be 30ms + 2*5ms or
>> 40ms, so I am pretty close to ideal with proper link layer
>> adjustments.
>> 
> 
>> 	I guess it should be possible to simply use the reduction in goodput as an easy indicator of whether the link layer adjustments work or not. But to do this properly I would need to be able to control the size of the sent packets, which I cannot, at least not with RRUL. But I am quite sure real computer scientists could easily set something up to test the goodput through a shaping device with differently sized packet streams of the same bandwidth, but I digress. 
>> 
>> 	On the other hand I do not claim to be an expert in this field in any way and my measurement method might be flawed; if you think so, please do not hesitate to let me know how I could improve it.
> 
> I have a hard time following your description... sorry.

Okay, I see, let me try to present the data in a more ordered fashion:
BW: bandwidth [Kbit/s]
LLAM: link-layer adjustment method
LL: link layer
GP: goodput [Kbit/s]

#  shaped  downBW    (%)  upBW   (%)  LLAM  LL    ping RTT  downGP  upGP
1  no       16309   100   2544  100   none  none  300ms      10000  1600
2  yes      14698    90   2430   95   none  none   80ms      13600  1800
3  yes      14698    90   2430   95   stab  adsl   40ms      11600  1600
4  yes      15494    95   2430   95   stab  adsl   42ms      12400  1600
5  yes      14698    90   2430   95   htb   adsl   75ms      13200  1600

2  yes       7349    45   1215   48   none  none   45ms       6800  1000
4  yes       7349    45   1215   48   stab  adsl   42ms       5800   800
5  yes       7349    45   1215   48   htb   adsl   45ms       6600  1000

Notes: upGP is way noisier than downGP and therefore harder to estimate

So conditions 3 and 4 show the best latency at high link saturation, where link layer adjustments actually make a difference by controlling whether the DSL modem will buffer or not.
At ~50% link saturation there is not much, if any, effect of the link layer adjustments on latency, but they still leave their hallmark as a goodput reduction. (The partial reduction for htb might be caused by the specified 40 bytes of overhead, which seems to have been honored.)
I take the disappearance of the latency effect at 50% as a control data point that shows my measurement approach seems sane enough.
I hope this clears up the information I wanted to give you the first time around.
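
The ATM cell tax behind these goodput reductions can be eyeballed with a little shell arithmetic (purely illustrative: 48 byte payload per 53 byte cell):

    for len in 64 128 256 1500; do
        cells=$(( (len + 47) / 48 ))   # round up to whole ATM cells
        wire=$(( cells * 53 ))
        echo "$len bytes -> $cells cells -> $wire bytes (+$(( (wire - len) * 100 / len ))%)"
    done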



> 
> So, did you get a working-low-latency setup by using 95% shaping and
> "stab" linklayer adjustment?

	Yes. (With a 3-leaf HTB as shaper and fq_codel as leaf qdisc.)

Best Regards
	Sebastian

> 
> -- 
> Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Sr. Network Kernel Developer at Red Hat
>  Author of http://www.iptv-analyzer.org
>  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 12:37                 ` Sebastian Moeller
@ 2013-08-23 13:02                   ` Fred Stratton
  2013-08-23 19:49                     ` Sebastian Moeller
  2013-08-23 15:05                   ` Jesper Dangaard Brouer
  2013-08-23 17:23                   ` Toke Høiland-Jørgensen
  2 siblings, 1 reply; 43+ messages in thread
From: Fred Stratton @ 2013-08-23 13:02 UTC (permalink / raw)
  To: cerowrt-devel


On 23 Aug 2013, at 13:37, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Jesper,
> 
> thanks for your time…
> 
> On Aug 23, 2013, at 13:16 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
> 
>> On Fri, 23 Aug 2013 12:15:12 +0200
>> Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>> Hi Jesper,
>>> 
>>> 
>>> On Aug 23, 2013, at 09:27 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
>>> 
>>>> On Thu, 22 Aug 2013 22:13:52 -0700
>>>> Dave Taht <dave.taht@gmail.com> wrote:
>>>> 
>>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>> 
>>>>>> Hi List, hi Jesper,
>>>>>> 
>>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
>>>>>> adjustments to see whether the recent changes resurrected this feature.
>>>>>>      Unfortunately the htb_private link layer adjustments still is
>>>>>> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
>>>>>> as without link layer adjustments). On the bright side the tc_stab method
>>>>>> still works as well as before (ping RTT around 40ms).
>>>>>>      I would like to humbly propose to use the tc stab method in
>>>>>> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
>>>>>> simply telling the kernel a lie about the packet size seems more robust
>>>>>> than fudging HTB's rate tables.
>>>> 
>>>> After the (regression) commit 56b765b79 ("htb: improved accuracy at
>>>> high rates"), the kernel no-longer uses the rate tables.  
>>> 
>>> 	See, I am quite a layman here; spelunking through the tc and kernel source code made me believe that the rate tables are still used (I might have looked at too old versions of both repositories, though).
>>> 
>>>> 
>>>> My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling),
>>>> does the ATM cell overhead calculation directly on the packet length,
>>>> see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
>>>> Thus, the cell calc should actually be more precise now.... but see below
>>> 
>>> 	Is there any way to make HTB report which link layer it assumes?
>> 
>> I added some print debug statements in my patch, so you can see this in
>> the kernel log / dmesg.  To activate the debugging print statements:
>> 
>> mount -t debugfs none /sys/kernel/debug/
>> echo "func __detect_linklayer +p"
>>> /sys/kernel/debug/dynamic_debug/control
>> 
>> Run your tc script, and run dmesg, or look in kernel-syslog.
> 
> 	Ah, unfortunately I am not set up to build new kernels for the router I am testing on, so I would hereby like to beg Dave to include that patch in one of the next releases. Would it not be a good idea to teach tc to report the link layer for HTB as it does for stab? Having to empirically figure out whether it is applied or not is somewhat cumbersome...
> 
> 
>> 
>> 
>>>> 
>>>>>> Especially since the kernel already fudges
>>>>>> the packet size to account for the ethernet header and then some, so this
>>>>>> path should receive more scrutiny by virtue of having more users?
>>>> 
>>>> As you mention, the default kernel path (not tc stab) fudges the packet
>>>> size for Ethernet headers, AND I made a mistake (back in approx 2006,
>>>> sorry) that the "overhead" cannot be a negative number.  
>>> 
>>> 	Mmh, does this also apply to stab?
>> 
>> This seems to be two questions...
>> 
>> Yes, the Ethernet header size gets adjusted/added before the "stab"
>> call.
>> For reference
>> See: net/core/dev.c function __dev_xmit_skb()
>> Call qdisc_pkt_len_init(skb); // adjust Ethernet and account for GSO
>> Call qdisc_calculate_pkt_len(skb, q); // is the stab call
>> (ps calls __qdisc_calculate_pkt_len() in net/sched/sch_api.c)
>> 
>> The qdisc_pkt_len_init() call was introduced by Eric in
>> v3.9-rc1~139^2~411.
> 
> 	So I look at 3.10 here:
> 
> net/core/dev.c, qdisc_pkt_len_init
> line 2628: 	qdisc_skb_cb(skb)->pkt_len = skb->len;
> and in 
> line 2650: qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> so the adjusted size does not seem to end up in skb->len
> 
> 
> and then in 
> net/sched/sch_api.c, __qdisc_calculate_pkt_len
> line 440: pkt_len = skb->len + stab->szopts.overhead;
> 
> So to my eyes this looks like stab is not honoring the changes made in qdisc_pkt_len_init, no? At least I fail to see where skb->len would ever pick up the value of qdisc_skb_cb(skb)->pkt_len.
> But I happily admit that I am truly a novice in these matters and easily intimidated by C code.
> 
> 
>> 
>> Thus, in kernels >= 3.9, you would need to change/reduce your tc
>> "overhead" parameter with -14 bytes (iif you accounted encapsulated
>> Ethernet header before)
> 
> 	That is what I thought before, but my kernel spelunking made me reconsider and switch to not subtracting the 14 bytes, since as I understand it the kernel actively does not do this if stab is used.
> 
>> 
>> The "overhead" of stab can be negative, so no problem here, in an "int"
>> for stab.
>> 
>> 
>>>> Meaning that
>>>> some ATM encap overheads simply cannot be configured correctly (as you
>>>> need to subtract the ethernet header).
>>> 
>>> 	Yes, I see; luckily PPPoA and IPoA seem quite rare, and setting the overhead to be larger than it actually is is relatively benign, as it will merely overestimate packet size.


As a point of information, the entire UK uses PPPoA rather than PPPoE, and some hundreds of thousands of users use IPoA.



>>> 
>>>> (And its quite problematic to
>>>> change the kABI to allow for a negative overhead)
>>> 
>>> 	Again I have no clue, but overhead seems to be an integer, not unsigned, so why can it not be negative?
>> 
>> Nope, for reference in include/uapi/linux/pkt_sched.h
>> 
>> This struct tc_ratespec is used by the normal "HTB/TBF" rate system,
>> notice "unsigned short	overhead".
>> 
>> struct tc_ratespec {
>> 	unsigned char	cell_log;
>> 	__u8		linklayer; /* lower 4 bits */
>> 	unsigned short	overhead;
>> 	short		cell_align;
>> 	unsigned short	mpu;
>> 	__u32		rate;
>> };
>> 
>> 
>> This struct tc_sizespec is used by stab system, where the overhead is
>> an int.
>> 
>> struct tc_sizespec {
>> 	unsigned char	cell_log;
>> 	unsigned char	size_log;
>> 	short		cell_align;
>> 	int		overhead;
>> 	unsigned int	linklayer;
>> 	unsigned int	mpu;
>> 	unsigned int	mtu;
>> 	unsigned int	tsize;
>> };
> 
> 	Ah, good to know.
> 
>> 
>> 
>> 
>> 
>>>> 
>>>> Perhaps we should change to use "tc stab" for this reason.  But I'm not
>>>> sure "stab" does the right thing either, and its accuracy is also
>>>> limited as it's actually also table-based.
>>> 
>>> 	But why should a table be problematic here? As long as we can ensure the table is equal to or larger than the largest packet, we are golden. So either we do the manly and stupid thing and go for 9000 byte jumbo packets for the table size, or we assume that for the most part ATM users will at best use baby jumbo frames (I think BT does this to allow a payload MTU of 1500 in spite of the PPPoE encapsulation overhead), but then we are quite fine with the default size table maxMTU of 2048 bytes, no?
>> 
>> 
>> It is the GSO problem that I'm worried about.  The kernel will bunch up
>> packets, and that caused the length calculation issue... just disable
>> GSO.
>> 
>> ethtool -K eth63 gso off gro off tso off
> 
> 	Oh, as always Dave has this tackled in cerowrt already; all offloads are off (at 16000Kbit/s down / 2500Kbit/s up there should be no need for offloads nowadays, I guess). But I see the issue now. 
> 
> 
>> 
>> 
>>>> We could easily change the
>>>> kernel to perform the ATM cell overhead calc inside "stab", and we
>>>> should also fix the GSO packet overhead problem.
>>>> (for now remember to disable GSO packets when shaping)
>>> 
>>> 	Yeah, I stumbled over the fact that the stab mechanism does not honor the kernel's earlier adjustments of packet length (but I seem to be unable to find the actual file and line where this initially is handled). It would seem relatively easy to make stab take the earlier adjustment into account. Regarding GSO, I assumed that GSO will not play nicely with an AQM anyway, as a single large packet will hog too much transfer time...
>>> 
>> 
>> Yes, just disable GSO ;-)
> 
> 	Done.
> 
>> 
>> 
>>>> 
>>>>> It's my hope that the atm code works but is misconfigured. You can output
>>>>> the tc commands by overriding the TC variable with TC="echo tc" and paste
>>>>> here.
>>>> 
>>>> I also hope it is a misconfig.  Please show us the config/script.
>>> 
>>> 	Will do this later. I would be delighted if it is just me being stupid.
>>> 
>>>> 
>>>> I would appreciate a link to the scripts you are using... perhaps a git tree?
>>> 
>>> 	Unfortunately I have no git tree and no experience with git. I do not think I will be able to set something up quickly. But I use a modified version of cerowrt's AQM scripts which I will post later.
>>> 
>> 
>> Someone just point me to the cerowrt git repo... please, and point me
>> at simple.qos script.
>> 
> <simple.qos><functions.sh>
>> 
>> 
>> Did you add the "linklayer atm" yourself to Dave's script?
> 
> 	Well, partly; the option for HTB was already in his script but undertested. I changed the script to add stab and to allow easier configuration of overhead, mpu, mtu and tsize (just for stab) from the GUI, but the code is Dave's. I attached the scripts. functions.sh gets the values from the configuration GUI. I extended the way the linklayer option strings are created, but basically it is the same method that Dave used. And I do see the right overhead values appear in "tc -d qdisc", so at least something is reaching HTB. Sorry that I have no repository for easier access.
> 
> 
> 
> 
>> 
>> 
>> 
>> 
>>>> 
>>>>>>      Now, I have been testing this using Dave's most recent cerowrt
>>>>>> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
>>>>>> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
>>>>>> restore "linklayer atm" handling) but am not fully sure.
>>>>>> 
>>>>> 
>>>>> It does.
>>>> 
>>>> It has not hit the stable tree yet, but DaveM promised he would pass it along.
>>>> 
>>>> It does seem Dave Taht has my patch applied:
>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch
>>> 
>>> 	Ah, good so it should have worked.
>> 
>> It should...
>> 
>> 
>>>> 
>>>>>> While I am not able to build kernels, it seems that I am able to quickly
>>>>>> test whether link layer adjustments work or not. So I am happy to help where
>>>>>> I can :)
>>>> 
>>>> So, what is your lab setup that allows you to test this quickly?
>>> 
>>> 	Oh, Dave and Toke are the giants on whose shoulders I stand here (thanks guys), all I bring to the table basically is the fact that I have an ATM carried ADSL2+ connection at home.
>> 
>> I will soon have an ADSL lab again, so I will try to reproduce your
>> results. Actually, almost while typing this email, the postman arrived at
>> my house and delivered a new ADSL modem... as a local Danish ISP,
>> www.fullrate.dk, has been so kind as to give me a test line for free
>> (thanks fullrate!).
> 
> 	This is great! Even though I am quite sure that no real DSL link is actually required to test the effect of the link layer adjustments.
> 
>> 
>> 
>>> 	Anyway, my theory is that proper link layer adjustments should only show up if not performing them would make my traffic exceed my link speed and hence accumulate in the DSL modem's bloated buffers, leading to measurable increases in latency. So I try to saturate both the up- and downlink while measuring latency under different conditions. Since the worst-case overhead of the ATM encapsulation approaches 50% (with the best case being around 10%), I try to test the system while shaping to 95% of link rates, where I do expect to see an effect of the link layer adjustments, and while shaping to 50%, where I do not. And basically that seems to work.
>> 
>>> 	Practically, I use Toke's netperf-wrapper project with the RRUL test from my cerowrt router behind an ADSL2+ modem to a nearby netperf server in Germany. The link layer adjustments are configured in my cerowrt router, using Dave's simple.qos script (3-band HTB shaper with fq_codel on each leaf, taking my overhead of 40 bytes into account and optionally the link layer).
>> 
>>> 	It turns out that this test nicely saturates my link with 4 up
>>> and 4 down TCP flows and uses a train of ping probes at a 0.2 second
>>> period to assess the latency induced by saturating the links. Now I
>>> shape down to 95% and 50% of line rates and simply look at the ping
>>> RTT plot for different conditions. In my rig I see around 30ms ping
>>> RTT without load, 80ms with full saturation and no linklayer
>>> adjustments, and 40ms with working link layer adjustments (hand in
>>> hand with slightly reduced TCP goodput, just as one would expect). In
>>> my testing so far, activating the HTB link layer adjustments yielded
>>> the same 80ms delay I get without link layer adjustments. If I shape
>>> down to 50% of link rates, HTB, stab and no link layer adjustments
>>> all yield a ping RTT of ~40ms. Still, with proper link layer
>>> adjustments the TCP goodput is reduced even at 50% shaping. As Dave
>>> explained, with an unloaded swallow, ermm, ping RTT of 30ms and
>>> fq_codel's target set to 5ms, the best case would be 30ms + 2*5ms or
>>> 40ms, so I am pretty close to ideal with proper link layer
>>> adjustments.
>>> 
>> 
>>> 	I guess it should be possible to simply use the reduction in goodput as an easy indicator of whether the link layer adjustments work or not. But to do this properly I would need to be able to control the size of the sent packets, which I cannot, at least not with RRUL. But I am quite sure real computer scientists could easily set something up to test the goodput through a shaping device with differently sized packet streams of the same bandwidth, but I digress. 
>>> 
>>> 	On the other hand I do not claim to be an expert in this field in any way and my measurement method might be flawed; if you think so, please do not hesitate to let me know how I could improve it.
>> 
>> I have a hard time following your description... sorry.
> 
> Okay, I see, let me try to present the data in a more ordered fashion:
> BW: bandwidth [Kbit/s]
> LLAM: link-layer adjustment method
> LL: link layer
> GP: goodput [Kbit/s]
> 
> #  shaped  downBW    (%)  upBW   (%)  LLAM  LL    ping RTT  downGP  upGP
> 1  no       16309   100   2544  100   none  none  300ms      10000  1600
> 2  yes      14698    90   2430   95   none  none   80ms      13600  1800
> 3  yes      14698    90   2430   95   stab  adsl   40ms      11600  1600
> 4  yes      15494    95   2430   95   stab  adsl   42ms      12400  1600
> 5  yes      14698    90   2430   95   htb   adsl   75ms      13200  1600
> 
> 2  yes       7349    45   1215   48   none  none   45ms       6800  1000
> 4  yes       7349    45   1215   48   stab  adsl   42ms       5800   800
> 5  yes       7349    45   1215   48   htb   adsl   45ms       6600  1000
> 
> Notes: upGP is way noisier than downGP and therefore harder to estimate
> 
> So conditions 3 and 4 show the best latency at high link saturation, where link layer adjustments actually make a difference by controlling whether the DSL modem will buffer or not.
> At ~50% link saturation there is not much, if any, effect of the link layer adjustments on latency, but they still leave their hallmark as a goodput reduction. (The partial reduction for htb might be caused by the specified 40 bytes of overhead, which seems to have been honored.)
> I take the disappearance of the latency effect at 50% as a control data point that shows my measurement approach seems sane enough.
> I hope this clears up the information I wanted to give you the first time around.
> 
> 
> 
>> 
>> So, did you get a working-low-latency setup by using 95% shaping and
>> "stab" linklayer adjustment?
> 
> 	Yes. (With a 3-leaf HTB as shaper and fq_codel as leaf qdisc.)
> 
> Best Regards
> 	Sebastian
> 
>> 
>> -- 
>> Best regards,
>> Jesper Dangaard Brouer
>> MSc.CS, Sr. Network Kernel Developer at Red Hat
>> Author of http://www.iptv-analyzer.org
>> LinkedIn: http://www.linkedin.com/in/brouer
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 12:37                 ` Sebastian Moeller
  2013-08-23 13:02                   ` Fred Stratton
@ 2013-08-23 15:05                   ` Jesper Dangaard Brouer
  2013-08-23 17:23                   ` Toke Høiland-Jørgensen
  2 siblings, 0 replies; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-23 15:05 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel

On Fri, 23 Aug 2013 14:37:47 +0200
Sebastian Moeller <moeller0@gmx.de> wrote:

> On Aug 23, 2013, at 13:16 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
> 
> > On Fri, 23 Aug 2013 12:15:12 +0200
> > Sebastian Moeller <moeller0@gmx.de> wrote:

[...]

> >>>>> Especially since the kernel already fudges
> >>>>> the packet size to account for the ethernet header and then some, so this
> >>>>> path should receive more scrutiny by virtue of having more users?
> >>> 
> >>> As you mention, the default kernel path (not tc stab) fudges the packet
> >>> size for Ethernet headers, AND I made a mistake (back in approx 2006,
> >>> sorry) that the "overhead" cannot be a negative number.  
> >> 
> >> 	Mmh, does this also apply to stab?
> > 
> > This seems to be two questions...
> > 
> > Yes, the Ethernet header size gets adjusted/added before the "stab"
> > call.
> > For reference
> > See: net/core/dev.c function __dev_xmit_skb()
> > Call qdisc_pkt_len_init(skb); // adjust Ethernet and account for GSO
> > Call qdisc_calculate_pkt_len(skb, q); // is the stab call
> >  (ps calls __qdisc_calculate_pkt_len() in net/sched/sch_api.c)
> > 
> > The qdisc_pkt_len_init() call was introduced by Eric in
> > v3.9-rc1~139^2~411.
> 
> 	So I look at 3.10 here:
> 
> net/core/dev.c, qdisc_pkt_len_init
> line 2628: 	qdisc_skb_cb(skb)->pkt_len = skb->len;
> and in 
> line 2650: qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
> so the adjusted size does not seem to end up in skb->len
> 
> 
> and then in 
> net/sched/sch_api.c, __qdisc_calculate_pkt_len
> line 440: pkt_len = skb->len + stab->szopts.overhead;
> 
> So to my eyes this looks like stab is not honoring the changes made in qdisc_pkt_len_init, no? At least I fail to see where skb->len would ever pick up the value of qdisc_skb_cb(skb)->pkt_len.
> But I happily admit that I am truly a novice in these matters and easily intimidated by C code.


You are absolutely correct, and I was wrong. I guess I didn't read
__qdisc_calculate_pkt_len() carefully enough, sorry.

When stab is enabled, it will override the skb pkt_len.


> > Thus, in kernels >= 3.9, you would need to change/reduce your tc
> > "overhead" parameter with -14 bytes (iif you accounted encapsulated
> > Ethernet header before)
> 
> 	That is what I thought before, but my kernel spelunking made
> me reconsider and switch to not subtracting the 14 bytes, since as I
> understand it the kernel actively does not do this if stab is used.

You are correct, and I was wrong.  Good to have more eyeballs on the
code :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 12:37                 ` Sebastian Moeller
  2013-08-23 13:02                   ` Fred Stratton
  2013-08-23 15:05                   ` Jesper Dangaard Brouer
@ 2013-08-23 17:23                   ` Toke Høiland-Jørgensen
  2013-08-23 20:09                     ` Sebastian Moeller
  2 siblings, 1 reply; 43+ messages in thread
From: Toke Høiland-Jørgensen @ 2013-08-23 17:23 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel, Jesper Dangaard Brouer

[-- Attachment #1: Type: text/plain, Size: 954 bytes --]

Sebastian Moeller <moeller0@gmx.de> writes:

> 	Well, partly; the option for HTB was already in his script but
> undertested. I changed the script to add stab and to allow easier
> configuration of overhead, mpu, mtu and tsize (just for stab) from the GUI,
> but the code is Dave's. I attached the scripts. functions.sh gets the values
> from the configuration GUI. I extended the way the linklayer option strings
> are created, but basically it is the same method that Dave used. And I do see
> the right overhead values appear in "tc -d qdisc", so at least something is
> reaching HTB. Sorry that I have no repository for easier access.

The repository containing the cerowrt-specific packages is at
https://github.com/dtaht/ceropackages-3.3 -- the AQM script specifically
is here:

https://github.com/dtaht/ceropackages-3.3/tree/master/net/aqm-scripts

With the gui at:

https://github.com/dtaht/ceropackages-3.3/tree/master/luci/luci-app-aqm

-Toke

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 489 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  5:13         ` Dave Taht
  2013-08-23  7:27           ` Jesper Dangaard Brouer
  2013-08-23  9:16           ` Sebastian Moeller
@ 2013-08-23 19:38           ` Sebastian Moeller
  2013-08-23 19:47             ` Dave Taht
  2013-08-27 10:42             ` [Cerowrt-devel] some kernel updates Jesper Dangaard Brouer
  2013-08-24 23:08           ` Sebastian Moeller
  3 siblings, 2 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 19:38 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jesper Dangaard Brouer, cerowrt-devel

Hi Dave,

On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:

> 
> 
> 
> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi List, hi Jesper,
> 
> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>         Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>         I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
> 
> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.

	So I went for TC="logger tc" and used logread to harvest the output, as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log levels):

  tc qdisc del dev ge00 root
  tc qdisc add dev ge00 root handle 1: htb default 12
  tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
  tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
  tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
  tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
  tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
  tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
  tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
  tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
  tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
  tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
  tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
  tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
  tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
  tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
  tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
  tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
  tc qdisc del dev ge00 handle ffff: ingress
  tc qdisc add dev ge00 handle ffff: ingress
  tc qdisc del dev ifb0 root
  tc qdisc add dev ifb0 root handle 1: htb default 12
  tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
  tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
  tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
  tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
  tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
  tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
  tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
  tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
  tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
  tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
  tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
  tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
  tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
  tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
  tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
  tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
  tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
  tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
  tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0

I notice it seems this only shows up for egress(), but looking at simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.

So I am off to add ADSLL to ingress() as well and then test RRUL again...
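
A sketch of what that change might look like (mirroring the egress() class lines, not a tested patch), e.g. for the best-effort class on ifb0:

    $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL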


Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
but got no output even though debugfs was already mounted…

Best 
	Sebastian

>  
>         Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
> 
> It does. 
>  
> @Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing) releases?
> 
> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time. 
>  
> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
> 
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
> 
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> 
> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
> 
> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
> 
> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
> 
> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
> 
> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
> 
> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
> 
> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
> 
> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from. 
> 
> Multiple parties have the delusion that 20ms is "good enough".
> 
> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
> 
> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
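
Such a fiddle might look like the following (illustrative only, assuming pie is the configured $QDISC; the backquoted helpers are as in simple.qos and emit nothing for pie):

    $TC qdisc add dev $IFACE parent 1:12 handle 120: $QDISC target 8 limit 600 $NOECN `get_quantum 300` `get_flows ${BE_RATE}`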
> 
> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie; the problem is that ns2_codel still seems more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast-path designs. 
> 
> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested. 
> 
> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github) 
> 
> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems. 
> 
> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
> 
> ... and find funding to get through the winter.
>  
> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>  
> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
> 
> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>  
> tc -s qdisc show dev ge00
> tc -s qdisc show dev ifb0
> 
> would be useful info to have in general after each test.
> 
> TIA.
> 
> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
> 
> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
> 
> This is some of the stuff I know that needs fixing in userspace:
> 
> * TODO readlink not found
> * TODO netdev user missing
> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
> * TODO [   18.480468] Mirror/redirect action on
> [   18.539062] Failed to load ipt action
> * upload and download are reversed in aqm
> * BCP38
> * Squash CS values
> * Replace ntp
> * Make ahcp client mode
> * Drop more privs for polipo
> * upnp
> * priv separation
> * Review FW rules
> * dhcpv6 support
> * uci-defaults/make-cert.sh uses a bad path for px5g
> * Doesn't configure the web browser either
> 
> 
> 
> 
> Best
>         Sebastian
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 19:38           ` Sebastian Moeller
@ 2013-08-23 19:47             ` Dave Taht
  2013-08-23 19:56               ` Sebastian Moeller
  2013-08-27 10:42             ` [Cerowrt-devel] some kernel updates Jesper Dangaard Brouer
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Taht @ 2013-08-23 19:47 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Jesper Dangaard Brouer, cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 13510 bytes --]

quick note: running this script requires that you

ifconfig ifb0 up

at some point.


On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Dave,
>
> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>
> >
> >
> >
> > On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de>
> wrote:
> > Hi List, hi Jesper,
> >
> > So I tested 3.10.9-1 to assess the status of the HTB atm link layer
> adjustments to see whether the recent changes resurrected this feature.
> >         Unfortunately the htb_private link layer adjustments still is
> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
> as without link layer adjustments). On the bright side the tc_stab method
> still works as well as before (ping RTT around 40ms).
> >         I would like to humbly propose to use the tc stab method in
> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
> simply telling the kernel a lie about the packet size seems more robust
> than fudging HTB's rate tables. Especially since the kernel already fudges
> the packet size to account for the ethernet header and then some, so this
> path should receive more scrutiny by virtue of having more users?
> >
> > It's my hope that the atm code works but is misconfigured. You can
> output the tc commands by overriding the TC variable with TC="echo tc" and
> paste here.
>
>         So I went for TC="logger tc" and used logread to harvest the
> output, as I could not find the echo output, but I guess that should not
> matter. So here is the result (slightly edited to get rid of the log
> timestamps and log levels):
>
>   tc qdisc del dev ge00 root
>   tc qdisc add dev ge00 root handle 1: htb default 12
>   tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate
> 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate
> 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate
> 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate
> 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate
> 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn
> quantum 300
>   tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn
> quantum 300
>   tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn
> quantum 300
>   tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip
> protocol 0 0x00 flowid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid
> 1:11
>   tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid
> 1:12
>   tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid
> 1:13
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw
> classid 1:11
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw
> classid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw
> classid 1:13
>   tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw
> classid 1:11
>   tc qdisc del dev ge00 handle ffff: ingress
>   tc qdisc add dev ge00 handle ffff: ingress
>   tc qdisc del dev ifb0 root
>   tc qdisc add dev ifb0 root handle 1: htb default 12
>   tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate
> 15494kbit ceil 15494kbit
>   tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate
> 15494kbit ceil 15494kbit prio 0
>   tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate
> 32kbit ceil 5164kbit prio 1
>   tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate
> 2582kbit ceil 15430kbit prio 2
>   tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate
> 2582kbit ceil 15430kbit prio 3
>   tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn
> quantum 500
>   tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn
> quantum 1500
>   tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn
> quantum 1500
>   tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip
> protocol 0 0x00 flowid 1:12
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos
> 0x00 0xfc classid 1:12
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6
> priority 0x00 0xfc classid 1:12
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos
> 0x20 0xfc classid 1:13
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6
> priority 0x20 0xfc classid 1:13
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos
> 0x10 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6
> priority 0x10 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos
> 0xb8 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6
> priority 0xb8 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos
> 0xc0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6
> priority 0xc0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos
> 0xe0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6
> priority 0xe0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos
> 0x90 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6
> priority 0x90 0xfc classid 1:11
>   tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw
> classid 1:11
>   tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0
> 0 flowid 1:1 action mirred egress redirect dev ifb0
>
> I notice this only seems to show up for egress(), but looking at
> simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be
> expected. There is nothing in dmesg at all.
>
> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>
>
> Jesper, please let me know if this looks reasonable; at least to my eye it
> seems to fit with what "tc qdisc add htb help" tells me. I tried your:
> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
> but got no output even though debugfs was already mounted…
>
> Best
>         Sebastian
>
> >
> >         Now, I have been testing this using Dave's most recent cerowrt
> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
> restore "linklayer atm" handling) but am not fully sure.
> >
> > It does.
> >
> > `@Dave is there an easy way to find which patches you applied to the
> kernels of the cerowrt (testing-)releases?
> >
> > Normally I DO commit stuff that is in testing, but my big push this time
> around was to get everything important into mainline 3.10, as it will be
> the "stable" release for a good long time.
> >
> > So I am still mostly working the x86 side at the moment. I WAS kind of
> hoping that everything I just landed would make it up to 3.10. But for your
> perusal:
> >
> > http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of
> the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped
> out due to another weird bug I'm looking at. (It also has support for ipv6
> nat thx to the ever prolific stephen walker heeding the call for
> patches...). 100% totally untested, I have this weird bug to figure out how
> to fix next:
> >
> >
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> >
> > I fear it's a comparison gone south, maybe in bradley's optimizations
> for not kernel trapping, don't know.
> >
> > 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE
> losing the close naming integration, but, had to try this....
> >
> > If you guys want me to start committing and pushing patches again, I'll
> do it, but most of that stuff will end up in 3.10.10, I think, in a couple
> days. The rest might make 3.12. Pie has to survive scrutiny on the netdev
> list in particular.
> >
> > While I have your attention :) I also tested 3.10.9-1's pie and it is
> way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms)
> but still worse than fq_codel (ping RTTs around 40ms with proper atm link
> layer adjustments).
> >
> > This is with simple.qos I imagine? Simplest should do better than that
> with pie. Judging from how its estimator works I think it will do badly
> with multiple queues. But testing will tell...
> >
> > But, yea, this pie is actually usable, and the previous wasn't. Thank
> you for looking at it!
> >
> > It is different from cisco's last pie drop in that it can do ecn, does
> local congestion notification, has a better use of net_random, it's mostly
> KernelStyle, and I forget what else.
> >
> > There is still a major rounding error in the code, and I'd like cisco to
> fix the api so it uses identical syntax to codel. Right now you specify
> "target 8" to get "target 7", and the "ms" is implied. target 5 becomes
> target 3. The default target is a whopping 20 (rounded to 19), which is in
> part where your 70+ms of extra delay came from.
> >
> > Multiple parties have the delusion that 20ms is "good enough".
> >
> > Part of the remaining delay may also be rounding error. Cisco uses
> kernels with HZ=1000, cero uses HZ=250.....
> >
> > Anyway, to get more comparable tests... you can fiddle with the two
> $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms
> config, but that would break a codel config which treats target 8 as target
> 8us.
> >
> > I MIGHT, if I get energetic enough, fix the API, the time accounting,
> and a few other things in pie, the problem is, that ns2_codel seems still
> more effective on most workloads and *fq_codel smokes absolutely
> everything. There are a few places where pie is a win over straight codel,
> notably on packet floods. And it may well be easier to retrofit into
> existing hardware fast path designs.
> >
> > I worry about interactions between pie and other stuff. It seems
> inevitable at this point that some form of pie will be widely deployed, and
> I simply haven't tried enough traffic types and RTTs to draw a firm
> conclusion, period. Long RTTs are the last big place where codel and pie
> and fq_codel have to be seriously tested.
> >
> > ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A
> big problem I have is getting decent long RTT emulation out of netem (some
> preliminary code is up at github)
> >
> > ... and getting cero stable enough for others to actually use - next up
> is fixing the userspace problems.
> >
> > ... and trying to make a small dent in the wifi problem along the way
> (couple commits coming up)
> >
> > ... and find funding to get through the winter.
> >
> > There's probably a few other things that are on that list but I forget.
> Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit
> smoking.
> >
> > While I am not able to build kernels, it seems that I am able to quickly
> test whether link layer adjustments work or not, so I am happy to help where
> I can :)
> >
> > Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and
> target 7ms, too. fq_codel, same....
> >
> > tc -s qdisc show dev ge00
> > tc -s qdisc show dev ifb0
> >
> > would be useful info to have in general after each test.
> >
> > TIA.
> >
> > There are also things like tcp_upload and tcp_download and
> tcp_bidirectional that are useful tests in the rrul suite.
> >
> > Thank you for your efforts on these early alpha releases. I hope things
> will stabilize more soon, and I'll fold your aqm stuff into my next attempt
> this weekend.
> >
> > This is some of the stuff I know that needs fixing in userspace:
> >
> > * TODO readlink not found
> > * TODO netdev user missing
> > * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already
> running DHCP-server on interface 'se00' refusing to start, use 'option
> force 1' to override
> > * TODO [   18.480468] Mirror/redirect action on
> > [   18.539062] Failed to load ipt action
> > * upload and download are reversed in aqm
> > * BCP38
> > * Squash CS values
> > * Replace ntp
> > * Make ahcp client mode
> > * Drop more privs for polipo
> > * upnp
> > * priv separation
> > * Review FW rules
> > * dhcpv6 support
> > * uci-defaults/make-cert.sh uses a bad path for px5g
> > * Doesn't configure the web browser either
> >
> >
> >
> >
> > Best
> >         Sebastian
> >
> >
> >
> >
> > --
> > Dave Täht
> >
> > Fixing bufferbloat with cerowrt:
> http://www.teklibre.com/cerowrt/subscribe.html
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html

[-- Attachment #2: Type: text/html, Size: 14997 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 13:02                   ` Fred Stratton
@ 2013-08-23 19:49                     ` Sebastian Moeller
  0 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 19:49 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

Hi Fred,


On Aug 23, 2013, at 15:02 , Fred Stratton <fredstratton@imap.cc> wrote:
[snipp]
>> 
>>> 
>>> Thus, in kernels >= 3.9, you would need to change/reduce your tc
>>> "overhead" parameter with -14 bytes (iif you accounted encapsulated
>>> Ethernet header before)
>> 
>> 	That is what I thought before, but my kernel spelunking made me reconsider and switch to not subtract the 14 bytes since as I understand it the kernel actively does not do it if stab is used.
>> 
>>> 
>>> The "overhead" of stab can be negative, so no problem here, in an "int"
>>> for stab.
>>> 
>>> 
>>>>> Meaning that
>>>>> some ATM encap overheads simply cannot be configured correctly (as you
>>>>> need to subtract the ethernet header).
>>>> 
>>>> 	Yes, I see; luckily PPPoA and IPoA seem quite rare, and setting the overhead to be larger than it actually is is relatively benign, as it will overestimate packet size.
> 
> 
> As a point of information, the entire UK uses PPPoA rather than PPPoE, and some hundreds of thousands of users use IPoA.

	Lucky you! I guess this is one more reason to switch cerowrt over to stab: PPPoA with VC/mux adds just 10 bytes of overhead, so if the Ethernet header were already accounted for, that would mean an overhead of -4, which HTB cannot represent anyway. That said, unlike Jesper, I am not sure that tc stab currently includes the Ethernet header by itself. Thanks for your input.
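	(Purely as an illustration, a sketch of what such a PPPoA, VC/mux
configuration could look like with stab - the "overhead -4" assumes the
kernel has already added the 14 byte Ethernet header to the packet
length, which is exactly the point I am unsure about; if it has not,
plain "overhead 10" would be the right value:)

  # PPPoA, VC/mux: 10 bytes of real per-packet overhead; if 14 bytes
  # of Ethernet header are already accounted for, the remainder is
  # negative, which stab's signed overhead can express but HTB cannot
  tc qdisc add dev ge00 root handle 1: \
      stab mtu 2048 overhead -4 linklayer atm \
      htb default 12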

[snipp]
Best
	Sebastian



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  7:27           ` Jesper Dangaard Brouer
  2013-08-23 10:15             ` Sebastian Moeller
@ 2013-08-23 19:51             ` Sebastian Moeller
  1 sibling, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 19:51 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: cerowrt-devel

Hi Jesper hi List,


On Aug 23, 2013, at 09:27 , Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:

> On Thu, 22 Aug 2013 22:13:52 -0700
> Dave Taht <dave.taht@gmail.com> wrote:
> 
>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>> Hi List, hi Jesper,
>>> 
>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer
>>> adjustments to see whether the recent changes resurrected this feature.
>>>        Unfortunately the htb_private link layer adjustments still is
>>> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
>>> as without link layer adjustments). On the bright side the tc_stab method
>>> still works as well as before (ping RTT around 40ms).
>>>        I would like to humbly propose to use the tc stab method in
>>> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
>>> simply telling the kernel a lie about the packet size seems more robust
>>> than fudging HTB's rate tables.
> 
> After the (regression) commit 56b765b79 ("htb: improved accuracy at
> high rates"), the kernel no longer uses the rate tables.  
> 
> My commit 8a8e3d84b1719 (net_sched: restore "linklayer atm" handling),
> does the ATM cell overhead calculation directly on the packet length,
> see psched_l2t_ns() doing (DIV_ROUND_UP(len,48)*53).
> Thus, the cell calc should actually be more precise now.... but see below
> 
>>> Especially since the kernel already fudges
>>> the packet size to account for the ethernet header and then some, so this
>>> path should receive more scrutiny by virtue of having more users?
> 
> As you mention, the default kernel path (not tc stab) fudges the packet
> size for Ethernet headers, AND I made a mistake (back in approx 2006,
> sorry) that the "overhead" cannot be a negative number.  Meaning that
> some ATM encap overheads simply cannot be configured correctly (as you
> need to subtract the ethernet header). (And its quite problematic to
> change the kABI to allow for a negative overhead)
> 
> Perhaps we should change to use "tc stab" for this reason.  But I'm not
> sure "stab" does the right thing either, and its accuracy is also
> limited as its actually also table based.  We could easily change the
> kernel to perform the ATM cell overhead calc inside "stab", and we
> should also fix the GSO packet overhead problem.
> (for now remember to disable GSO packets when shaping)
> 
>> It's my hope that the atm code works but is misconfigured. You can output
>> the tc commands by overriding the TC variable with TC="echo tc" and paste
>> here.
> 
> I also hope it is a misconfig.  Please show us the config/script.

	I guess you nailed it. While I got no output whatsoever from echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control, I did follow Dave's advice to dump the tc commands to a file (see earlier mail). It turns out that the script only added the HTB link layer adjustments to egress and not to ingress as well; fixing that pushed the ping RTT for HTB's link layer adjustments (at 95% of line rate) down to ~45ms, which is close enough to what stab delivers.
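	(As a quick sanity check of what the fixed kernel should now be
computing, Jesper's DIV_ROUND_UP(len,48)*53 from above is easy to
reproduce in shell; the 40 bytes are my link's encapsulation overhead,
assumed to be already added to the packet length:)

  # a 1500 byte IP packet plus 40 bytes of overhead gets padded out
  # to whole 48 byte ATM payloads and sent as 53 byte cells
  len=$((1500 + 40))
  cells=$(( (len + 47) / 48 ))   # DIV_ROUND_UP(len, 48) -> 33 cells
  echo $(( cells * 53 ))         # 1749 bytes on the wire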


> 
> I would appreciate a link to the scripts you are using... perhaps a git tree?
> 
> 
>>>        Now, I have been testing this using Dave's most recent cerowrt
>>> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
>>> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
>>> restore "linklayer atm" handling) but am not fully sure.
>>> 
>> 
>> It does.
> 
> It has not hit the stable tree yet, but DaveM promised he would pass it along.
> 
> It does seem Dave Taht has my patch applied:
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/685-net_sched-restore-linklayer-atm-handling.patch
> 
>>> While I am not able to build kernels, it seems that I am able to quickly
>>> test whether link layer adjustments work or not, so I am happy to help where
>>> I can :)
> 
> So, what is your lab setup that allows you to test this quickly?
> 
> 
> -- 
> Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Sr. Network Kernel Developer at Red Hat
>  Author of http://www.iptv-analyzer.org
>  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 19:47             ` Dave Taht
@ 2013-08-23 19:56               ` Sebastian Moeller
  2013-08-23 20:29                 ` Dave Taht
                                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 19:56 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jesper Dangaard Brouer, cerowrt-devel

Hi Dave,

I guess I found the culprit:

once I added $ADSLL to the ingress() in simple.qos:

ingress() {

CEIL=$DOWNLINK
PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
BE_RATE=`expr $CEIL / 6`   # Min for best effort
BK_RATE=`expr $CEIL / 6`   # Min for background
BE_CEIL=`expr $CEIL - 64`  # A little slop at the top

LQ="quantum `get_mtu $IFACE`"

$TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
$TC qdisc add dev $IFACE handle ffff: ingress

$TC qdisc del dev $DEV root  2> /dev/null
$TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
$TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
$TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL

# I'd prefer to use a pre-nat filter but that causes permutation...

$TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
$TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
$TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`

diffserv $DEV

ifconfig $DEV up

# redirect all IP packets arriving in $IFACE to ifb0

$TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
  match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV

}

I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch fixes the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
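	(For anyone who wants to double check their own link: with the fix in
place the link layer parameters should now show up on both devices; the
exact output format varies a bit between tc versions:)

  # verify that htb (or stab) actually took the linklayer/overhead
  # settings, on egress and on the ingress ifb alike
  tc -d qdisc show dev ge00
  tc -d class show dev ge00
  tc -d qdisc show dev ifb0
  tc -d class show dev ifb0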



Best
	Sebastian

P.S.: I am not sure whether I want to tackle the PIE issue today...
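	(If and when I do: per your earlier suggestion, the comparison would
come down to edits along these lines to the two $QDISC lines in
simple.qos - assuming the current pie really does take a bare integer
target with the "ms" implied, as you describe:)

  # pie with the implied-ms syntax ("target 8" yields roughly 7ms)
  tc qdisc add dev ge00 parent 1:12 handle 120: pie limit 1000 target 8 ecn
  # the roughly comparable fq_codel config, where the unit is explicit
  tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 1000 target 7ms ecn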



On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:

> quick note: running this script requires that you 
> 
> ifconfig ifb0 up
> 
> at some point.

	In my case on cerowrt you took care of that already...


> 
> 
> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Dave,
> 
> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
> 
> >
> >
> >
> > On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> > Hi List, hi Jesper,
> >
> > So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
> >         Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
> >         I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
> >
> > It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
> 
>         So I went for TC="logger tc" and used logread to harvest, as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
> 
>   tc qdisc del dev ge00 root
>   tc qdisc add dev ge00 root handle 1: htb default 12
>   tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>   tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>   tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>   tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>   tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>   tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>   tc qdisc del dev ge00 handle ffff: ingress
>   tc qdisc add dev ge00 handle ffff: ingress
>   tc qdisc del dev ifb0 root
>   tc qdisc add dev ifb0 root handle 1: htb default 12
>   tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>   tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>   tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>   tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>   tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>   tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>   tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>   tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>   tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>   tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>   tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
> 
> I notice this only seems to show up for egress(), but looking at simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
> 
> So I am off to add ADSLL to ingress() as well and then test RRUL again...
> 
> 
> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
> but got no output even though debugfs was already mounted…
> 
> Best
>         Sebastian
> 
> >
> >         Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
> >
> > It does.
> >
> > `@Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
> >
> > Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
> >
> > So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
> >
> > http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
> >
> > http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> >
> > I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
> >
> > 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
> >
> > If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
> >
> > While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
> >
> > This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
> >
> > But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
> >
> > It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
> >
> > There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
> >
> > Multiple parties have the delusion that 20ms is "good enough".
> >
> > Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
> >
> > Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
> >
> > I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
> >
> > I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
> >
> > ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
> >
> > ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
> >
> > ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
> >
> > ... and find funding to get through the winter.
> >
> > There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
> >
> > While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not, so I am happy to help where I can :)
> >
> > Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
> >
> > tc -s qdisc show dev ge00
> > tc -s qdisc show dev ifb0
> >
> > would be useful info to have in general after each test.
> >
> > TIA.
> >
> > There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
> >
> > Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
> >
> > This is some of the stuff I know that needs fixing in userspace:
> >
> > * TODO readlink not found
> > * TODO netdev user missing
> > * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
> > * TODO [   18.480468] Mirror/redirect action on
> > [   18.539062] Failed to load ipt action
> > * upload and download are reversed in aqm
> > * BCP38
> > * Squash CS values
> > * Replace ntp
> > * Make ahcp client mode
> > * Drop more privs for polipo
> > * upnp
> > * priv separation
> > * Review FW rules
> > * dhcpv6 support
> > * uci-defaults/make-cert.sh uses a bad path for px5g
> > * Doesn't configure the web browser either
> >
> >
> >
> >
> > Best
> >         Sebastian
> >
> >
> >
> >
> > --
> > Dave Täht
> >
> > Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 17:23                   ` Toke Høiland-Jørgensen
@ 2013-08-23 20:09                     ` Sebastian Moeller
  2013-08-23 20:46                       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-23 20:09 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: cerowrt-devel, Jesper Dangaard Brouer

Hi Toke,

I guess I should have been clearer in stating that you are the author of the AQM scripts.

On Aug 23, 2013, at 19:23 , Toke Høiland-Jørgensen <toke@toke.dk> wrote:

> Sebastian Moeller <moeller0@gmx.de> writes:
> 
>> 	Well, partly the option for HTB was already in his script but
>> undertested; I changed the script to add stab and to allow easier configuration of
>> overhead, mpu, mtu and tsize (just for stab) from the gui, but the code is
>> Dave's. I attached the scripts. functions.sh gets the values from the
>> configuration GUI. I extended the way the linklayer option strings are created,
>> but basically it is the same method that Dave used. And I do see the right
>> overhead values appear in "tc -d qdisc", so at least something is reaching HTB.
>> Sorry that I have no repository for easier access.
> 
> The repository containing the cerowrt-specific packages is at
> https://github.com/dtaht/ceropackages-3.3 -- the AQM script specifically
> is here:
> 
> https://github.com/dtaht/ceropackages-3.3/tree/master/net/aqm-scripts
> 
> With the gui at:
> 
> https://github.com/dtaht/ceropackages-3.3/tree/master/luci/luci-app-aqm

	This is quite helpful; it's just that Jesper would need access to my modified scripts, which are not in the repository (yet).


Best
	Sebastian

> 
> -Toke


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 19:56               ` Sebastian Moeller
@ 2013-08-23 20:29                 ` Dave Taht
  2013-08-24 20:51                   ` Sebastian Moeller
                                     ` (2 more replies)
  2013-08-27 10:45                 ` Jesper Dangaard Brouer
  2013-08-30 15:46                 ` [Cerowrt-devel] some kernel updates + new userspace patch Jesper Dangaard Brouer
  2 siblings, 3 replies; 43+ messages in thread
From: Dave Taht @ 2013-08-23 20:29 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Jesper Dangaard Brouer, cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 17588 bytes --]

On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Dave,
>
> I guess I found the culprit:
>
> once I added $ADSLL to the ingress() in simple.qos:
>
>

I had that in there originally. I ripped it out because doing so seemed to
help with ADSL at the time - I was unaware of the extent to which the whole
subsystem was busted!

I like to think of the process we've just gone through as "wow, we just
fixed the UK, and a few other countries". :) Feels kind of good, doesn't
it? (Too bad the pay sucks.) I mean, jeeze, chopping another 30+ms off the
latency of that many systems should earn medals from the economists
worldwide who monitor productivity.

Does anyone have a date/kernel version on when linklayer overhead
compensation stopped working? There was a bug even prior to 3.8 that looked
bad. (and RED was busted for 3 years).

Another step would be trying to improve openwrt's native qos system
somewhat in the DSL case. They don't use this subsystem (probably because
it didn't work), and it's also broken on ipv6. (They use conntrack)

At some point I'd like to have a mechanism for saner diffserv
classification on egress, and to clamp ingress values to egress ones. There
is a ton of work going on on finding sane codepoints on webrtc in the
ietf....
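
A cheap interim version of the squashing half might be a mangle rule
like the following (purely illustrative - and note that tc's ingress
hook runs before netfilter, so a rule like this cleans up what the LAN
sees rather than what the ifb0 classifier matched on):

  # rewrite all inbound DSCP marks to best effort (CS0)
  iptables  -t mangle -A PREROUTING -i ge00 -j DSCP --set-dscp 0
  ip6tables -t mangle -A PREROUTING -i ge00 -j DSCP --set-dscp 0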







> ingress() {
>
> CEIL=$DOWNLINK
> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
> BE_RATE=`expr $CEIL / 6`   # Min for best effort
> BK_RATE=`expr $CEIL / 6`   # Min for background
> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>
> LQ="quantum `get_mtu $IFACE`"
>
> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
> $TC qdisc add dev $IFACE handle ffff: ingress
>
> $TC qdisc del dev $DEV root  2> /dev/null
> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil
> ${CEIL}kbit $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit
> ceil ${CEIL}kbit prio 0 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil
> ${PRIO_RATE}kbit prio 1 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit
> ceil ${BE_CEIL}kbit prio 2 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit
> ceil ${BE_CEIL}kbit prio 3 $ADSLL
>
> # I'd prefer to use a pre-nat filter but that causes permutation...
>
> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN
> `get_quantum 500` `get_flows ${PRIO_RATE}`
> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN
> `get_quantum 1500` `get_flows ${BE_RATE}`
> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN
> `get_quantum 1500` `get_flows ${BK_RATE}`
>
> diffserv $DEV
>
> ifconfig $DEV up
>
> # redirect all IP packets arriving in $IFACE to ifb0
>
> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>   match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>
> }
>
> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So
> Jesper was right the patch seems to fix the issue. I guess I should send
> out my current version of yours and Toke's AQM scripts soon.
>
>
>
> Best
>         Sebastian
>
> P.S.: I am not sure whether I want to tackle the PIE issue today...
>
>
>
> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>
> > quick note: running this script requires that you
> >
> > ifconfig ifb0 up
> >
> > at some point.
>
>         In my case on cerowrt you took care of that already...
>
>
> >
> >
> > On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de>
> wrote:
> > Hi Dave,
> >
> > On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
> >
> > >
> > >
> > >
> > > On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de>
> wrote:
> > > Hi List, hi Jesper,
> > >
> > > So I tested 3.10.9-1 to assess the status of the HTB atm link layer
> adjustments to see whether the recent changes resurrected this feature.
> > >         Unfortunately the htb_private link layer adjustments still is
> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
> as without link layer adjustments). On the bright side the tc_stab method
> still works as well as before (ping RTT around 40ms).
> > >         I would like to humbly propose to use the tc stab method in
> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
> simply telling the kernel a lie about the packet size seems more robust
> than fudging HTB's rate tables. Especially since the kernel already fudges
> the packet size to account for the ethernet header and then some, so this
> path should receive more scrutiny by virtue of having more users?
> > >
> > > It's my hope that the atm code works but is misconfigured. You can
> output the tc commands by overriding the TC variable with TC="echo tc" and
> paste here.
> >
> >         So I went for TC="logger tc" and used logread to harvest, as I
> could not find the echo output, but I guess that should not matter. So here
> is the result (slightly edited to get rid of the log timestamps and log
> level):
> >
> >   tc qdisc del dev ge00 root
> >   tc qdisc add dev ge00 root handle 1: htb default 12
> >   tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate
> 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
> >   tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate
> 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
> >   tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate
> 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
> >   tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate
> 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
> >   tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate
> 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
> >   tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn
> quantum 300
> >   tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn
> quantum 300
> >   tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn
> quantum 300
> >   tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip
> protocol 0 0x00 flowid 1:12
> >   tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw
> classid 1:11
> >   tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw
> classid 1:12
> >   tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw
> classid 1:13
> >   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw
> classid 1:11
> >   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw
> classid 1:12
> >   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw
> classid 1:13
> >   tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw
> classid 1:11
> >   tc qdisc del dev ge00 handle ffff: ingress
> >   tc qdisc add dev ge00 handle ffff: ingress
> >   tc qdisc del dev ifb0 root
> >   tc qdisc add dev ifb0 root handle 1: htb default 12
> >   tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate
> 15494kbit ceil 15494kbit
> >   tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate
> 15494kbit ceil 15494kbit prio 0
> >   tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate
> 32kbit ceil 5164kbit prio 1
> >   tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate
> 2582kbit ceil 15430kbit prio 2
> >   tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate
> 2582kbit ceil 15430kbit prio 3
> >   tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn
> quantum 500
> >   tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn
> quantum 1500
> >   tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn
> quantum 1500
> >   tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip
> protocol 0 0x00 flowid 1:12
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos
> 0x00 0xfc classid 1:12
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6
> priority 0x00 0xfc classid 1:12
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos
> 0x20 0xfc classid 1:13
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6
> priority 0x20 0xfc classid 1:13
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos
> 0x10 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6
> priority 0x10 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos
> 0xb8 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6
> priority 0xb8 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos
> 0xc0 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6
> priority 0xc0 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos
> 0xe0 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6
> priority 0xe0 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos
> 0x90 0xfc classid 1:11
> >   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6
> priority 0x90 0xfc classid 1:11
> >   tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw
> classid 1:11
> >   tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32
> 0 0 flowid 1:1 action mirred egress redirect dev ifb0
> >
> > I notice this only seems to show up for egress(), but looking at
> simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be
> expected. There is nothing in dmesg at all.
> >
> > So I am off to add ADSLL to ingress() as well and then test RRUL again...
> >
> >
> > Jesper, please let me know if this looks reasonable; at least to my eye
> it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
> > echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
> > but got no output even though debugfs was already mounted…
> >
> > Best
> >         Sebastian
> >
> > >
> > >         Now, I have been testing this using Dave's most recent cerowrt
> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
> restore "linklayer atm" handling) but am not fully sure.
> > >
> > > It does.
> > >
> > > `@Dave is there an easy way to find which patches you applied to the
> kernels of the cerowrt (testing-)releases?
> > >
> > > Normally I DO commit stuff that is in testing, but my big push this
> time around was to get everything important into mainline 3.10, as it will
> be the "stable" release for a good long time.
> > >
> > > So I am still mostly working the x86 side at the moment. I WAS kind of
> hoping that everything I just landed would make it up to 3.10. But for your
> perusal:
> > >
> > > http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most
> of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch
> ripped out due to another weird bug I'm looking at. (It also has support
> for ipv6 nat thx to the ever prolific stephen walker heeding the call for
> patches...). 100% totally untested, I have this weird bug to figure out how
> to fix next:
> > >
> > >
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> > >
> > > I fear it's a comparison gone south, maybe in bradley's optimizations
> for not kernel trapping, don't know.
> > >
> > > 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE
> losing the close naming integration, but, had to try this....
> > >
> > > If you guys want me to start committing and pushing patches again,
> I'll do it, but most of that stuff will end up in 3.10.10, I think, in a
> couple days. The rest might make 3.12. Pie has to survive scrutiny on the
> netdev list in particular.
> > >
> > > While I have your attention :) I also tested 3.10.9-1's pie and it is
> way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms)
> but still worse than fq_codel (ping RTTs around 40ms with proper atm link
> layer adjustments).
> > >
> > > This is with simple.qos I imagine? Simplest should do better than that
> with pie. Judging from how its estimator works I think it will do badly
> with multiple queues. But testing will tell...
> > >
> > > But, yea, this pie is actually usable, and the previous wasn't. Thank
> you for looking at it!
> > >
> > > It is different from cisco's last pie drop in that it can do ecn, does
> local congestion notification, has a better use of net_random, it's mostly
> KernelStyle, and I forget what else.
> > >
> > > There is still a major rounding error in the code, and I'd like cisco
> to fix the api so it uses identical syntax to codel. Right now you specify
> "target 8" to get "target 7", and the "ms" is implied. target 5 becomes
> target 3. The default target is a whopping 20 (rounded to 19), which is in
> part where your 70+ms of extra delay came from.
> > >
> > > Multiple parties have the delusion that 20ms is "good enough".
> > >
> > > Part of the remaining delay may also be rounding error. Cisco uses
> kernels with HZ=1000, cero uses HZ=250.....
> > >
> > > Anyway, to get more comparable tests... you can fiddle with the two
> $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms
> config, but that would break a codel config which treats target 8 as target
> 8us.
> > >
> > > I MIGHT, if I get energetic enough, fix the API, the time accounting,
> and a few other things in pie, the problem is, that ns2_codel seems still
> more effective on most workloads and *fq_codel smokes absolutely
> everything. There are a few places where pie is a win over straight codel,
> notably on packet floods. And it may well be easier to retrofit into
> existing hardware fast path designs.
> > >
> > > I worry about interactions between pie and other stuff. It seems
> inevitable at this point that some form of pie will be widely deployed, and
> I simply haven't tried enough traffic types and RTTs to draw a firm
> conclusion, period. Long RTTs are the last big place where codel and pie
> and fq_codel have to be seriously tested.
> > >
> > > ns2_codel is looking pretty good now, at the shorter RTTs I've tried.
> A big problem I have is getting decent long RTT emulation out of netem
> (some preliminary code is up at github)
> > >
> > > ... and getting cero stable enough for others to actually use - next
> up is fixing the userspace problems.
> > >
> > > ... and trying to make a small dent in the wifi problem along the way
> (couple commits coming up)
> > >
> > > ... and find funding to get through the winter.
> > >
> > > There's probably a few other things that are on that list but I
> forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I
> could quit smoking.
> > >
> > > While I am not able to build kernels, it seems that I am able to
> quickly test whether link layer adjustments work or not, so I am happy to
> help where I can :)
> > >
> > > Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms
> and target 7ms, too. fq_codel, same....
> > >
> > > tc -s qdisc show dev ge00
> > > tc -s qdisc show dev ifb0
> > >
> > > would be useful info to have in general after each test.
> > >
> > > TIA.
> > >
> > > There are also things like tcp_upload and tcp_download and
> tcp_bidirectional that are useful tests in the rrul suite.
> > >
> > > Thank you for your efforts on these early alpha releases. I hope
> things will stabilize more soon, and I'll fold your aqm stuff into my next
> attempt this weekend.
> > >
> > > This is some of the stuff I know that needs fixing in userspace:
> > >
> > > * TODO readlink not found
> > > * TODO netdev user missing
> > > * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already
> running DHCP-server on interface 'se00' refusing to start, use 'option
> force 1' to override
> > > * TODO [   18.480468] Mirror/redirect action on
> > > [   18.539062] Failed to load ipt action
> > > * upload and download are reversed in aqm
> > > * BCP38
> > > * Squash CS values
> > > * Replace ntp
> > > * Make ahcp client mode
> > > * Drop more privs for polipo
> > > * upnp
> > > * priv separation
> > > * Review FW rules
> > > * dhcpv6 support
> > > * uci-defaults/make-cert.sh uses a bad path for px5g
> > > * Doesn't configure the web browser either
> > >
> > >
> > >
> > >
> > > Best
> > >         Sebastian
> > >
> > >
> > >
> > >
> > > --
> > > Dave Täht
> > >
> > > Fixing bufferbloat with cerowrt:
> http://www.teklibre.com/cerowrt/subscribe.html
> >
> >
> >
> >
> > --
> > Dave Täht
> >
> > Fixing bufferbloat with cerowrt:
> http://www.teklibre.com/cerowrt/subscribe.html
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html

[-- Attachment #2: Type: text/html, Size: 20168 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 20:09                     ` Sebastian Moeller
@ 2013-08-23 20:46                       ` Toke Høiland-Jørgensen
  2013-08-24 20:51                         ` Sebastian Moeller
  0 siblings, 1 reply; 43+ messages in thread
From: Toke Høiland-Jørgensen @ 2013-08-23 20:46 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel, Jesper Dangaard Brouer

[-- Attachment #1: Type: text/plain, Size: 631 bytes --]

Sebastian Moeller <moeller0@gmx.de> writes:

> I guess I should have been clearer in stating that you are author of
> the AQM scripts.

Well, the scripts are originally Dave's; I mainly contributed the Luci
GUI for them (and some adjustments to make the scripts work with it).
Glad they're turning out to be useful. :)

> This is quite helpful, only Jesper would need access to my modified
> scripts which are not in the repository (yet)

Well, the repository is Dave's, so he's the one to bug for commit
access. Otherwise, Github has a nifty 'fork' button that'll let you
maintain your own version if that works better...


-Toke

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 489 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 20:46                       ` Toke Høiland-Jørgensen
@ 2013-08-24 20:51                         ` Sebastian Moeller
  0 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-24 20:51 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: cerowrt-devel, Jesper Dangaard Brouer

Hi Toke,


On Aug 23, 2013, at 22:46 , Toke Høiland-Jørgensen <toke@toke.dk> wrote:

> Sebastian Moeller <moeller0@gmx.de> writes:
> 
>> I guess I should have been clearer in stating that you are author of
>> the AQM scripts.
> 
> Well, the scripts are originally Dave's; I mainly contributed the Luci
> GUI for them (and some adjustments to make the scripts work with it).
> Glad they're turning out to be useful. :)

	I love your work on the GUI for the scripts. It made adding a few new variables a piece of cake, which allowed for quick turnaround in testing the different implementations. I am sure I would have gotten confused and dropped the ball if I had had to do everything by hand. So mange tak.

> 
>> This is quite helpful, only Jesper would need access to my modified
>> scripts which are not in the repository (yet)
> 
> Well, the repository is Dave's, so he's the one to bug for commit
> access. Otherwise, Github has a nifty 'fork' button that'll let you
> maintain your own version if that works better…

	Ah, I have to admit I have not gotten any further in getting accustomed to git. Maybe it's really time to look into that…
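	(For my own notes, the fork workflow appears to be just a handful of
commands - the repository name is from Toke's link above, the user name
is obviously a placeholder:)

  # after clicking "fork" on github, work against your own copy
  git clone https://github.com/<your-user>/ceropackages-3.3.git
  cd ceropackages-3.3
  # hack on net/aqm-scripts, then:
  git commit -a -m "aqm-scripts: add linklayer adjustment knobs"
  git push origin master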

Best Regards
	Sebastian

> 
> 
> -Toke


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 20:29                 ` Dave Taht
@ 2013-08-24 20:51                   ` Sebastian Moeller
  2013-08-24 20:51                   ` Sebastian Moeller
  2013-08-27 11:10                   ` Jesper Dangaard Brouer
  2 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-24 20:51 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jesper Dangaard Brouer, cerowrt-devel

Hi Dave,


On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:

> 
> 
> 
> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Dave,
> 
> I guess I found the culprit:
> 
> once I added $ADSLL to the ingress() in simple.qos:
> 
> 
> 
> I had that in there originally. I ripped it out because doing so seemed to help with ADSL at the time - I was unaware of the extent to which the whole subsystem was busted!

	Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed HTB link layer adjustment (LLA) it only worked on the uplink, and in retrospect, looking at my initial test data, I actually see one of the hallmarks of a working LLA for the upstream. (The upstream good-put was reduced compared to the no-LLA test, because the LLA makes the effective packet size larger, so fewer packets fit through the shaped link.) But since I was not expecting only half a working system, I overlooked that in the data.
	Looking at the latency of the ping RTT probes, though, it becomes quite clear that doing link layer adjustments only on the uplink is even worse than not doing them at all (the latency is still almost as bad as without LLA, but the up-link bandwidth is reduced).

> 
> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)

	Oh, I can not complain about pay, I have a day job in totally different field, so this is more of a hobby for me :) 

> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
> 
> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
> 
> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broke on ipv6. (They use conn track)

	Oh, in the bql-40 time frame I hacked the stab based LLA into their generate.sh and it worked quite well, even though at time my measurements were quite crude. SInce their qos scripts are HFSC based the HTB private implementation is not going to do them any good. Luckily now that does not seem to matter as both methods now perform identically as they should. (Well, now Jespers last changes are nicer than the old table lookup, but it should be relatively say to implant the same for stab, heck once I got my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that, ?

> 
> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on on finding sane codepoints on webrtc in the ietf….
> 
> 
> 
> 
> 
> 
> ingress() {
> 
> CEIL=$DOWNLINK
> PRIO_RATE=`expr $CEIL / 3` # Ceiling for prioirty
> BE_RATE=`expr $CEIL / 6`   # Min for best effort
> BK_RATE=`expr $CEIL / 6`   # Min for background
> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
> 
> LQ="quantum `get_mtu $IFACE`"
> 
> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
> $TC qdisc add dev $IFACE handle ffff: ingress
> 
> $TC qdisc del dev $DEV root  2> /dev/null
> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
> 
> # I'd prefer to use a pre-nat filter but that causes permutation...
> 
> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
> 
> diffserv $DEV
> 
> ifconfig $DEV up
> 
> # redirect all IP packets arriving in $IFACE to ifb0
> 
> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
> 
> }
> 
> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
> 
> 
> 
> Best
>    Sebastian
> 
> P.S.: I am not sure whether I want to tackle the PIE issue today...
> 
> 
> 
> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
> 
>> quick note: running this script requires that you
>> 
>> ifconfig ifb0 up
>> 
>> at some point.
> 
>    In my case on cerowrt you took care of that already...
> 
> 
>> 
>> 
>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> Hi Dave,
>> 
>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> 
>>> 
>>> 
>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> Hi List, hi Jesper,
>>> 
>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>    Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>    I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
>>> 
>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>> 
>>    So I went for TC="logger tc" and used log read to harvest as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
>> 
>> tc qdisc del dev ge00 root
>> tc qdisc add dev ge00 root handle 1: htb default 12
>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>> tc qdisc del dev ge00 handle ffff: ingress
>> tc qdisc add dev ge00 handle ffff: ingress
>> tc qdisc del dev ifb0 root
>> tc qdisc add dev ifb0 root handle 1: htb default 12
>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>> 
>> I notice it seem this only shows up for egress(), but looking at simple.qos ingress() is not addend ${ADSLL} at all so that is to be expected. There is nothing in dmesg at all.
>> 
>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>> 
>> 
>> Jesper please let me know if this looks reasonable, at least to my eye it seems to fit with what "tc disc add htb help" tells me. I tried your:
>> echo "func __detect_linklayer +p" /sys/kernel/debug/dynamic_debug/control
>> but got no output even though debugs was already mounted…
>> 
>> Best
>>    Sebastian
>> 
>>> 
>>>    Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
>>> 
>>> It does.
>>> 
>>> `@Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>> 
>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>> 
>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>> 
>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>> 
>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>> 
>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>> 
>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
>>> 
>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>> 
>>> While I have you r attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>> 
>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>> 
>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>> 
>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>> 
>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>> 
>>> Multiple parties have the delusion that 20ms is "good enough".
>>> 
>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>> 
>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
>>> 
>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>> 
>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>> 
>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>> 
>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>> 
>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>> 
>>> ... and find funding to get through the winter.
>>> 
>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>> 
>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. SO aim happy to help where I can :)
>>> 
>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>> 
>>> tc -s qdisc show dev ge00
>>> tc -s qdisc show dev ifb0
>>> 
>>> would be useful info to have in general after each test.
>>> 
>>> TIA.
>>> 
>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
>>> 
>>> Thank you for your efforts on these early alpha releases. I hope things will stablize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>> 
>>> This is some of the stuff I know that needs fixing in userspace:
>>> 
>>> * TODO readlink not found
>>> * TODO netdev user missing
>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>> * TODO [   18.480468] Mirror/redirect action on
>>> [   18.539062] Failed to load ipt action
>>> * upload and download are reversed in aqm
>>> * BCP38
>>> * Squash CS values
>>> * Replace ntp
>>> * Make ahcp client mode
>>> * Drop more privs for polipo
>>> * upnp
>>> * priv separation
>>> * Review FW rules
>>> * dhcpv6 support
>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>> * Doesn't configure the web browser either
>>> 
>>> 
>>> 
>>> 
>>> Best
>>>    Sebastian
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Dave Täht
>>> 
>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>> 
>> 
>> 
>> 
>> --
>> Dave Täht
>> 
>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 20:29                 ` Dave Taht
  2013-08-24 20:51                   ` Sebastian Moeller
@ 2013-08-24 20:51                   ` Sebastian Moeller
  2013-08-25  9:21                     ` Fred Stratton
  2013-08-27 11:10                   ` Jesper Dangaard Brouer
  2 siblings, 1 reply; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-24 20:51 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jesper Dangaard Brouer, cerowrt-devel

Hi Dave,


On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:

> 
> 
> 
> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi Dave,
> 
> I guess I found the culprit:
> 
> once I added $ADSLL to the ingress() in simple.qos:
> 
> 
> 
> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware of the extent to which the whole subsystem was busted!

	Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed htb link layer adjustment (LLA) it only worked on the uplink, and in retrospect my initial test data actually shows one of the hallmarks of a working LLA for the upstream: the upstream good-put was reduced compared to the no-LLA test, because the LLA makes the transmitted packets count as larger, so fewer packets fit through the shaped link. But since I was not expecting only half a working system, I overlooked that in the data.
	But looking at the latency of the ping RTT probes it becomes quite clear that doing link layer adjustments on the uplink only is even worse than not doing them at all (the latency is still almost as bad as without LLA, but the uplink bandwidth is reduced).
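
To make the good-put effect concrete, here is the cell arithmetic (an illustrative sketch only; the 40 byte overhead is taken from the tc dump quoted below, not a general rule). ATM carries 48 payload bytes per 53-byte cell, so the shaper has to account for roughly 17% more wire time per full-size packet:

  # sketch: ATM wire cost of one 1500 byte packet at an assumed 40 bytes overhead
  PKT=1500; OVERHEAD=40
  CELLS=$(( (PKT + OVERHEAD + 47) / 48 ))     # ceil(1540/48) = 33 cells
  echo "$(( CELLS * 53 )) bytes on the wire"  # 1749 bytes for 1500 bytes of payload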

> 
> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)

	Oh, I cannot complain about pay; I have a day job in a totally different field, so this is more of a hobby for me :)

> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
> 
> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
> 
> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broke on ipv6. (They use conn track)

	Oh, in the bql-40 time frame I hacked the stab-based LLA into their generate.sh and it worked quite well, even though at the time my measurements were quite crude. Since their qos scripts are HFSC-based, the HTB-private implementation is not going to do them any good. Luckily that no longer seems to matter, as both methods now perform identically, as they should. (Well, Jesper's latest changes are nicer than the old table lookup, but it should be relatively easy to implement the same for stab; heck, once I get my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that?
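
Since stab is implemented generically in the qdisc layer rather than inside HTB, it should drop in front of an HFSC root exactly the way ${STABSTRING} precedes htb in the script below. A hedged sketch, with rates and overhead purely illustrative:

  # stab is qdisc-agnostic, so the same prefix ought to work for hfsc:
  tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 40 hfsc default 12
  tc class add dev ge00 parent 1: classid 1:12 hfsc sc rate 2000kbit ul rate 2430kbit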

> 
> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on in the ietf on finding sane codepoints for webrtc….
> 
> 
> 
> 
> 
> 
> ingress() {
> 
> CEIL=$DOWNLINK
> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
> BE_RATE=`expr $CEIL / 6`   # Min for best effort
> BK_RATE=`expr $CEIL / 6`   # Min for background
> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
> 
> LQ="quantum `get_mtu $IFACE`"
> 
> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
> $TC qdisc add dev $IFACE handle ffff: ingress
> 
> $TC qdisc del dev $DEV root  2> /dev/null
> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
> 
> # I'd prefer to use a pre-nat filter but that causes permutation...
> 
> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
> 
> diffserv $DEV
> 
> ifconfig $DEV up
> 
> # redirect all IP packets arriving in $IFACE to ifb0
> 
> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
> 
> }
> 
> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
> 
> 
> 
> Best
>   Sebastian
> 
> P.S.: I am not sure whether I want to tackle the PIE issue today...
> 
> 
> 
> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
> 
>> quick note: running this script requires that you
>> 
>> ifconfig ifb0 up
>> 
>> at some point.
> 
>   In my case on cerowrt you took care of that already...
> 
> 
>> 
>> 
>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> Hi Dave,
>> 
>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> 
>>> 
>>> 
>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> Hi List, hi Jesper,
>>> 
>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>   Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>   I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
>>> 
>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>> 
>>   So I went for TC="logger tc" and used logread to harvest the commands, as I could not find the echo output; but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
>> 
>> tc qdisc del dev ge00 root
>> tc qdisc add dev ge00 root handle 1: htb default 12
>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>> tc qdisc del dev ge00 handle ffff: ingress
>> tc qdisc add dev ge00 handle ffff: ingress
>> tc qdisc del dev ifb0 root
>> tc qdisc add dev ifb0 root handle 1: htb default 12
>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>> 
>> I notice this only seems to show up for egress(); but looking at simple.qos, ingress() does not append ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>> 
>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>> 
>> 
>> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>> but got no output even though debugfs was already mounted…
>> 
>> Best
>>   Sebastian
>> 
>>> 
>>>   Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
>>> 
>>> It does.
>>> 
>>> @Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>> 
>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>> 
>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>> 
>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>> 
>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>> 
>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>> 
>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
>>> 
>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>> 
>>> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>> 
>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>> 
>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>> 
>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>> 
>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>> 
>>> Multiple parties have the delusion that 20ms is "good enough".
>>> 
>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>> 
>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
>>> 
>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>> 
>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>> 
>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>> 
>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>> 
>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>> 
>>> ... and find funding to get through the winter.
>>> 
>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>> 
>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
>>> 
>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>> 
>>> tc -s qdisc show dev ge00
>>> tc -s qdisc show dev ifb0
>>> 
>>> would be useful info to have in general after each test.
>>> 
>>> TIA.
>>> 
>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
>>> 
>>> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>> 
>>> This is some of the stuff I know that needs fixing in userspace:
>>> 
>>> * TODO readlink not found
>>> * TODO netdev user missing
>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>> * TODO [   18.480468] Mirror/redirect action on
>>> [   18.539062] Failed to load ipt action
>>> * upload and download are reversed in aqm
>>> * BCP38
>>> * Squash CS values
>>> * Replace ntp
>>> * Make ahcp client mode
>>> * Drop more privs for polipo
>>> * upnp
>>> * priv separation
>>> * Review FW rules
>>> * dhcpv6 support
>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>> * Doesn't configure the web browser either
>>> 
>>> 
>>> 
>>> 
>>> Best
>>>   Sebastian
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Dave Täht
>>> 
>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>> 
>> 
>> 
>> 
>> --
>> Dave Täht
>> 
>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23  5:13         ` Dave Taht
                             ` (2 preceding siblings ...)
  2013-08-23 19:38           ` Sebastian Moeller
@ 2013-08-24 23:08           ` Sebastian Moeller
  3 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-24 23:08 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

Hi Dave,

so I got around to doing the PIE tests...

On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:

> 
> 
> 
> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> Hi List, hi Jesper,
> 
> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>         Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>         I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
> 
> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>  
>         Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
> 
> It does. 
>  
> @Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
> 
> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time. 
>  
> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
> 
> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
> 
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> 
> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
> 
> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
> 
> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
> 
> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
> 
> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
> 
> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
> 
> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
> 
> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3.

	This is as confusing as it is funny….

> The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from. 
> 
> Multiple parties have the delusion that 20ms is "good enough".

	It certainly is better than nothing if the hardware does not allow codel…, but it is not as if changing the target has that big an effect on RRUL ping RTT:

AQM         nominal target [ms]   est. ping RTT [ms]   down avg good-put [Mbit/s]   up avg good-put [Mbit/s]
pie         20                    110                  2.8                          0.4
pie         8                     100                  2.7                          0.41
pie         5                     90                   2.7                          0.4

so the target does not have a very strong effect on pie's latency...

ns2_codel   7                     50                   3.1                          0.39
ns2_codel   3                     45                   3.1                          0.37
fq_codel    7                     39                   3.1                          0.38
fq_codel    3                     38                   3.15                         0.38

Nor does it have a big effect on codel.

But latency-wise, I agree, the codels are in a league of their own. And pie sacrifices more down good-put than codel, but seems to retain a bit more of the up good-put. I hope the fraction of routers that can only ever do PIE is not too large; codel looks like the better solution…

(I am happy to share my plots but due to email size issues I will not attach them without someone asking)
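
For reproducibility: the targets above were set by editing the $QDISC lines in simple.qos, per Dave's suggestion quoted below. A hedged sketch of the two variants, assuming this alpha's pie syntax where the ms unit is implied, while the codels want it spelled out:

  # pie: "target 8" is read as milliseconds (and rounds down, as Dave notes)
  $TC qdisc add dev $DEV parent 1:12 handle 120: pie limit 1000 target 8
  # fq_codel/ns2_codel: give the unit explicitly, or 8 would be taken as 8us
  $TC qdisc add dev $DEV parent 1:12 handle 120: fq_codel limit 1000 target 8ms ecn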

> 
> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
> 
> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
> 
> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs. 
> 
> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested. 
> 
> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github) 
> 
> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems. 
> 
> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
> 
> ... and find funding to get through the winter.
>  
> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>  
> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
> 
> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>  
> tc -s qdisc show dev ge00
> tc -s qdisc show dev ifb0
> 
> would be useful info to have in general after each test.

	So I took these and can post them if you are interested. BTW, so far I did not bother to save the actual netperf-wrapper data; all I keep are the plots (as I am somewhat tight on disk space at my day job), since I do not think I will attempt further analysis on the data (yet).


Best
	Sebastian

> 
> TIA.
> 
> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
> 
> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
> 
> This is some of the stuff I know that needs fixing in userspace:
> 
> * TODO readlink not found
> * TODO netdev user missing
> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
> * TODO [   18.480468] Mirror/redirect action on
> [   18.539062] Failed to load ipt action
> * upload and download are reversed in aqm
> * BCP38
> * Squash CS values
> * Replace ntp
> * Make ahcp client mode
> * Drop more privs for polipo
> * upnp
> * priv separation
> * Review FW rules
> * dhcpv6 support
> * uci-defaults/make-cert.sh uses a bad path for px5g
> * Doesn't configure the web browser either
> 
> 
> 
> 
> Best
>         Sebastian
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-24 20:51                   ` Sebastian Moeller
@ 2013-08-25  9:21                     ` Fred Stratton
  2013-08-25 10:17                       ` Fred Stratton
  0 siblings, 1 reply; 43+ messages in thread
From: Fred Stratton @ 2013-08-25  9:21 UTC (permalink / raw)
  To: cerowrt-devel

As the person with the flakiest ADSL link, I will point out that none of these recent, welcome changes is having any effect here, with an uplink speed of circa 950 kbit/s.

The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time: the iPlayer stream fails. The point of the exercise was to achieve exactly that.

The uplink delay is consistently around 650 ms, which appears to be too high for effective streaming. In addition, the uplink stream shows multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbit/s.
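
For a rough sense of the margin involved (illustrative arithmetic only; the true per-packet overhead depends on the IPoA LLC encapsulation), the fixed ATM cell tax alone costs 5 of every 53 bytes of sync rate:

  # sketch: payload ceiling of a 950 kbit/s ATM uplink, cell tax only
  SYNC=950
  echo "~$(( SYNC * 48 / 53 )) kbit/s usable"  # ~860 kbit/s, so a 700 kbit/s cap should leave headroom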

YouTube has no problems.

I remain unclear on whether tc-stab and the htb link layer adjustments are mutually exclusive options in the present stock interface.
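
For what it is worth, they are not mutually exclusive mechanisms: stab sits in front of whatever root qdisc follows it, while HTB carries its own per-class rate-table variant. A sketch of the two forms (device, rate and overhead values purely illustrative):

  # the tc_stab method: stab wrapping an HTB root
  tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 44 htb default 12
  # the htb_private method: HTB's built-in option, given per class
  tc class add dev ge00 parent 1: classid 1:1 htb rate 900kbit linklayer atm overhead 44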

The current ISP connection is IPoA LLC. Whatever overhead byte value is used for tc-stab makes no difference.

I have applied the ingress modification to simple.qos, keeping the original version, and have tested both.

I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be the rate-limiting step.

I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, which has a Broadcom 6368 SoC.

This device has a permanently-on telnet interface, with a simple password that cannot be changed other than by firmware recompilation…

Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
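
For reference, a sketch of doing that from the device's shell (interface name illustrative; busybox ifconfig works the same way):

  ip link set dev eth0 txqueuelen 0    # or: ifconfig eth0 txqueuelen 0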

None of these changes affect the problematic uplink delay.


On 24 Aug 2013, at 21:51, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Dave,
> 
> 
> On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:
> 
>> 
>> 
>> 
>> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> Hi Dave,
>> 
>> I guess I found the culprit:
>> 
>> once I added $ADSLL to the ingress() in simple.qos:
>> 
>> 
>> 
>> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware of the extent to which the whole subsystem was busted!
> 
> 	Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed htb link layer adjustment (LLA) it only worked on the uplink, and in retrospect my initial test data actually shows one of the hallmarks of a working LLA for the upstream: the upstream good-put was reduced compared to the no-LLA test, because the LLA makes the transmitted packets count as larger, so fewer packets fit through the shaped link. But since I was not expecting only half a working system, I overlooked that in the data.
> 	But looking at the latency of the ping RTT probes it becomes quite clear that doing link layer adjustments on the uplink only is even worse than not doing them at all (the latency is still almost as bad as without LLA, but the uplink bandwidth is reduced).
> 
>> 
>> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)
> 
> 	Oh, I cannot complain about pay; I have a day job in a totally different field, so this is more of a hobby for me :)
> 
>> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
>> 
>> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
>> 
>> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broke on ipv6. (They use conn track)
> 
> 	Oh, in the bql-40 time frame I hacked the stab-based LLA into their generate.sh and it worked quite well, even though at the time my measurements were quite crude. Since their qos scripts are HFSC-based, the HTB-private implementation is not going to do them any good. Luckily that no longer seems to matter, as both methods now perform identically, as they should. (Well, Jesper's latest changes are nicer than the old table lookup, but it should be relatively easy to implement the same for stab; heck, once I get my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that?
> 
>> 
>> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on in the ietf on finding sane codepoints for webrtc….
>> 
>> 
>> 
>> 
>> 
>> 
>> ingress() {
>> 
>> CEIL=$DOWNLINK
>> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
>> BE_RATE=`expr $CEIL / 6`   # Min for best effort
>> BK_RATE=`expr $CEIL / 6`   # Min for background
>> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>> 
>> LQ="quantum `get_mtu $IFACE`"
>> 
>> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
>> $TC qdisc add dev $IFACE handle ffff: ingress
>> 
>> $TC qdisc del dev $DEV root  2> /dev/null
>> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
>> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
>> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
>> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
>> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
>> 
>> # I'd prefer to use a pre-nat filter but that causes permutation...
>> 
>> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
>> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
>> 
>> diffserv $DEV
>> 
>> ifconfig $DEV up
>> 
>> # redirect all IP packets arriving in $IFACE to ifb0
>> 
>> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>> 
>> }
>> 
>> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
>> 
>> 
>> 
>> Best
>>  Sebastian
>> 
>> P.S.: I am not sure whether I want to tackle the PIE issue today...
>> 
>> 
>> 
>> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> quick note: running this script requires that you
>>> 
>>> ifconfig ifb0 up
>>> 
>>> at some point.
>> 
>>  In my case on cerowrt you took care of that already...
>> 
>> 
>>> 
>>> 
>>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> Hi Dave,
>>> 
>>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>>> 
>>>> 
>>>> 
>>>> 
>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> Hi List, hi Jesper,
>>>> 
>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>>  Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>>  I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
>>>> 
>>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>>> 
>>>  So I went for TC="logger tc" and used logread to harvest the commands, as I could not find the echo output; but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
>>> 
>>> tc qdisc del dev ge00 root
>>> tc qdisc add dev ge00 root handle 1: htb default 12
>>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>>> tc qdisc del dev ge00 handle ffff: ingress
>>> tc qdisc add dev ge00 handle ffff: ingress
>>> tc qdisc del dev ifb0 root
>>> tc qdisc add dev ifb0 root handle 1: htb default 12
>>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>>> 
>>> I notice this only seems to show up for egress(); looking at simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>>> 
>>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
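>>> The change itself is just appending $ADSLL to each htb class in ingress(), mirroring what egress() already does; one class as a sketch:
>>> 
>>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL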
>>> 
>>> 
>>> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>> but got no output even though debugfs was already mounted…
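>>> For reference, a fuller sketch of that dynamic-debug check (assuming CONFIG_DYNAMIC_DEBUG is enabled in the kernel; without it the echo succeeds but no output can ever appear):
>>> 
>>> mount | grep -q debugfs || mount -t debugfs none /sys/kernel/debug
>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>> grep __detect_linklayer /sys/kernel/debug/dynamic_debug/control  # flag should now read =p
>>> dmesg | tail  # the pr_debug output lands in the kernel log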
>>> 
>>> Best
>>>  Sebastian
>>> 
>>>> 
>>>>  Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware; I think this kernel should contain all htb fixes, including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling), but am not fully sure.
>>>> 
>>>> It does.
>>>> 
>>>> @Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>>> 
>>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>>> 
>>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>>> 
>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>>> 
>>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>>> 
>>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>>> 
>>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but had to try this....
>>>> 
>>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>>> 
>>>> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>>> 
>>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>>> 
>>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>>> 
>>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>>> 
>>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>>> 
>>>> Multiple parties have the delusion that 20ms is "good enough".
>>>> 
>>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>>> 
>>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
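>>>> As a concrete sketch of that fiddle (hypothetical edit; the stock lines shown above carry no target at all, so pie falls back to its default of 20):
>>>> 
>>>> # in simple*.qos, append an explicit target when testing pie:
>>>> $TC qdisc add dev $IFACE parent 1:12 handle 120: $QDISC limit 600 noecn target 8
>>>> # caveat from above: a bare "target 8" means 8 microseconds to the codel
>>>> # family, so codel configs need explicit units there, e.g. "target 8ms"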
>>>> 
>>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie; the problem is that ns2_codel still seems more effective on most workloads, and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>>> 
>>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>>> 
>>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>>> 
>>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>>> 
>>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>>> 
>>>> ... and find funding to get through the winter.
>>>> 
>>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>>> 
>>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
>>>> 
>>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
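>>>> A sketch of those variants (the pie target is a bare integer with "ms" implied, per the rounding discussion above; ns2_codel and fq_codel take explicit time units; vary the targets as requested):
>>>> 
>>>> tc qdisc replace dev ge00 parent 1:12 pie target 8 limit 600
>>>> tc qdisc replace dev ge00 parent 1:12 ns2_codel target 7ms limit 600
>>>> tc qdisc replace dev ge00 parent 1:12 fq_codel target 7ms limit 600 quantum 300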
>>>> 
>>>> tc -s qdisc show dev ge00
>>>> tc -s qdisc show dev ifb0
>>>> 
>>>> would be useful info to have in general after each test.
>>>> 
>>>> TIA.
>>>> 
>>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
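>>>> A hypothetical invocation sketch, assuming Toke's netperf-wrapper tool (which drives the rrul suite) and a reachable netperf server:
>>>> 
>>>> netperf-wrapper -H netperf.example.org rrul
>>>> netperf-wrapper -H netperf.example.org tcp_upload
>>>> netperf-wrapper -H netperf.example.org tcp_download
>>>> netperf-wrapper -H netperf.example.org tcp_bidirectional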
>>>> 
>>>> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>>> 
>>>> This is some of the stuff I know that needs fixing in userspace:
>>>> 
>>>> * TODO readlink not found
>>>> * TODO netdev user missing
>>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>>> * TODO [   18.480468] Mirror/redirect action on
>>>> [   18.539062] Failed to load ipt action
>>>> * upload and download are reversed in aqm
>>>> * BCP38
>>>> * Squash CS values
>>>> * Replace ntp
>>>> * Make ahcp client mode
>>>> * Drop more privs for polipo
>>>> * upnp
>>>> * priv separation
>>>> * Review FW rules
>>>> * dhcpv6 support
>>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>>> * Doesn't configure the web browser either
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Best
>>>>  Sebastian
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Dave Täht
>>>> 
>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Dave Täht
>>> 
>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>> 
>> 
>> 
>> 
>> -- 
>> Dave Täht
>> 
>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25  9:21                     ` Fred Stratton
@ 2013-08-25 10:17                       ` Fred Stratton
  2013-08-25 13:59                         ` Sebastian Moeller
  0 siblings, 1 reply; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 10:17 UTC (permalink / raw)
  To: cerowrt-devel


On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:

> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
> 
> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
> 
> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
> 
> YouTube has no problems.
> 
> I remain unclear whether tc-stab and htb are mutually exclusive options, using the present stock interface.
> 
> The current ISP connection is IPoA LLC.

Correction - Bridged LLC. 

> Whatever byte value is used for tc-stab makes no difference.
> 
> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
> 
> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
> 
> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
> 
> This device has a permanently-on telnet interface, with a simple password that cannot be changed other than by firmware recompilation…
> 
> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
> 
> None of these changes affect the problematic uplink delay.
> 
> 
> On 24 Aug 2013, at 21:51, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> Hi Dave,
>> 
>> 
>> On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> 
>>> 
>>> 
>>> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> Hi Dave,
>>> 
>>> I guess I found the culprit:
>>> 
>>> once I added $ADSLL to the ingress() in simple.qos:
>>> 
>>> 
>>> 
>>> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware of the extent to which the whole subsystem was busted!
>> 
>> 	Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed htb link layer adjustment (LLA) it only worked on the uplink, and in retrospect, looking at my initial test data, I actually see one of the hallmarks of a working LLA for the upstream: the upstream goodput was reduced compared to the no-LLA test, because the LLA makes each packet count as larger, so fewer packets fit through the shaped link. But since I was not expecting only half a working system I overlooked that in the data.
>> 	But looking at the latency of the ping RTT probes it becomes quite clear that doing link layer adjustments only on the uplink is even worse than not doing them at all (because the latency is still almost as bad as without LLA, but the uplink bandwidth is reduced).
>> 
>>> 
>>> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)
>> 
>> 	Oh, I cannot complain about the pay; I have a day job in a totally different field, so this is more of a hobby for me :)
>> 
>>> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
>>> 
>>> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
>>> 
>>> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broken on ipv6. (They use conn track)
>> 
>> 	Oh, in the bql-40 time frame I hacked the stab-based LLA into their generate.sh and it worked quite well, even though at the time my measurements were quite crude. Since their qos scripts are HFSC-based, the HTB-private implementation is not going to do them any good. Luckily that does not seem to matter now, as both methods perform identically, as they should. (Well, Jesper's last changes are nicer than the old table lookup, but it should be relatively easy to implement the same for stab; heck, once I get my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that?
>> 
>>> 
>>> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on in the ietf on finding sane codepoints for webrtc….
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ingress() {
>>> 
>>> CEIL=$DOWNLINK
>>> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
>>> BE_RATE=`expr $CEIL / 6`   # Min for best effort
>>> BK_RATE=`expr $CEIL / 6`   # Min for background
>>> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>>> 
>>> LQ="quantum `get_mtu $IFACE`"
>>> 
>>> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
>>> $TC qdisc add dev $IFACE handle ffff: ingress
>>> 
>>> $TC qdisc del dev $DEV root  2> /dev/null
>>> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
>>> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
>>> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
>>> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
>>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
>>> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
>>> 
>>> # I'd prefer to use a pre-nat filter but that causes permutation...
>>> 
>>> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
>>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
>>> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
>>> 
>>> diffserv $DEV
>>> 
>>> ifconfig $DEV up
>>> 
>>> # redirect all IP packets arriving in $IFACE to ifb0
>>> 
>>> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>>> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>>> 
>>> }
>>> 
>>> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
>>> 
>>> 
>>> 
>>> Best
>>> Sebastian
>>> 
>>> P.S.: I am not sure whether I want to tackle the PIE issue today...
>>> 
>>> 
>>> 
>>> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>>> 
>>>> quick note: running this script requires that you
>>>> 
>>>> ifconfig ifb0 up
>>>> 
>>>> at some point.
>>> 
>>> In my case on cerowrt you took care of that already...
>>> 
>>> 
>>>> 
>>>> 
>>>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> Hi Dave,
>>>> 
>>>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>> Hi List, hi Jesper,
>>>>> 
>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>>> Unfortunately the htb_private link layer adjustments are still broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>>> I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as the default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, this path should receive more scrutiny by virtue of having more users.
>>>>> 
>>>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>>>> 
>>>> So I went for TC="logger tc" and used logread to harvest the commands, as I could not find the echo output; that should not matter, though. So here is the result (slightly edited to get rid of the log timestamps and log level):
>>>> 
>>>> tc qdisc del dev ge00 root
>>>> tc qdisc add dev ge00 root handle 1: htb default 12
>>>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>>>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>>>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>>>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>>>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>>>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>>>> tc qdisc del dev ge00 handle ffff: ingress
>>>> tc qdisc add dev ge00 handle ffff: ingress
>>>> tc qdisc del dev ifb0 root
>>>> tc qdisc add dev ifb0 root handle 1: htb default 12
>>>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>>>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>>>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>>>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>>>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>>>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>>>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>>>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>>>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>>>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>>>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>>>> 
>>>> I notice this only seems to show up for egress(); looking at simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>>>> 
>>>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>>>> 
>>>> 
>>>> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
>>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>>> but got no output even though debugfs was already mounted…
>>>> 
>>>> Best
>>>> Sebastian
>>>> 
>>>>> 
>>>>> Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware; I think this kernel should contain all htb fixes, including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling), but am not fully sure.
>>>>> 
>>>>> It does.
>>>>> 
>>>>> @Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>>>> 
>>>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>>>> 
>>>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>>>> 
>>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>>>> 
>>>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>>>> 
>>>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>>>> 
>>>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but had to try this....
>>>>> 
>>>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>>>> 
>>>>> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>>>> 
>>>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>>>> 
>>>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>>>> 
>>>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>>>> 
>>>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>>>> 
>>>>> Multiple parties have the delusion that 20ms is "good enough".
>>>>> 
>>>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>>>> 
>>>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
>>>>> 
>>>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie; the problem is that ns2_codel still seems more effective on most workloads, and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>>>> 
>>>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>>>> 
>>>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>>>> 
>>>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>>>> 
>>>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>>>> 
>>>>> ... and find funding to get through the winter.
>>>>> 
>>>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>>>> 
>>>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
>>>>> 
>>>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>>>> 
>>>>> tc -s qdisc show dev ge00
>>>>> tc -s qdisc show dev ifb0
>>>>> 
>>>>> would be useful info to have in general after each test.
>>>>> 
>>>>> TIA.
>>>>> 
>>>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
>>>>> 
>>>>> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>>>> 
>>>>> This is some of the stuff I know that needs fixing in userspace:
>>>>> 
>>>>> * TODO readlink not found
>>>>> * TODO netdev user missing
>>>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>>>> * TODO [   18.480468] Mirror/redirect action on
>>>>> [   18.539062] Failed to load ipt action
>>>>> * upload and download are reversed in aqm
>>>>> * BCP38
>>>>> * Squash CS values
>>>>> * Replace ntp
>>>>> * Make ahcp client mode
>>>>> * Drop more privs for polipo
>>>>> * upnp
>>>>> * priv separation
>>>>> * Review FW rules
>>>>> * dhcpv6 support
>>>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>>>> * Doesn't configure the web browser either
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Best
>>>>> Sebastian
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Dave Täht
>>>>> 
>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Dave Täht
>>>> 
>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Dave Täht
>>> 
>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>> 
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 10:17                       ` Fred Stratton
@ 2013-08-25 13:59                         ` Sebastian Moeller
  2013-08-25 14:26                           ` Fred Stratton
  0 siblings, 1 reply; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-25 13:59 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

Hi Fred,


On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:

> 
> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
> 
>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.

	Okay, how flaky is your link? What rate of errors do you see while testing? I am especially interested in CRC errors and the ES, SES and HEC counts, just to get an idea how flaky the line is...

>> 
>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
>> 
>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.

	Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for their ACKs; so this points to downlink errors, as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmissions, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC).)
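	A sketch of that 50% test, in the variables the simple.qos script uses elsewhere in this thread (values hypothetical: 475 is roughly 50% of the reported 950 kbit/s uplink, the downlink figure is purely illustrative, and UPLINK is assumed to be the egress analogue of the DOWNLINK variable ingress() reads):

UPLINK=475      # ~50% of the uplink sync rate
DOWNLINK=3500   # ~50% of the downlink sync rate (illustrative)
ADSLL=""        # no htb linklayer option...
STABSTRING=""   # ...and no stab either, i.e. all link layer adjustments off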
	Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain: Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to rule out the link layer adjustments as the cause of your issues.


>> 
>> YouTube has no problems.
>> 
>> I remain unclear whether tc-stab and htb are mutually exclusive options, using the present stock interface.

	Well, it depends on the version of cerowrt you use: anything below 3.10.9-1, I believe, lacks a functional HTB link layer adjustment mechanism, so there you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do: best case it will merely apply the overhead twice, worst case it will also do the link layer adjustments (LLA) twice.
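	For clarity, a sketch of the two alternatives, reusing the overhead 40 figure from the tc dump earlier in this thread; only one of them should ever be active:

# (a) tc_stab: adjust the kernel's packet size accounting at the root qdisc
tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 40 htb default 12
# (b) htb-private: fudge htb's own rate tables per class (functional from 3.10.9-1 on)
tc class add dev ge00 parent 1: classid 1:1 htb rate 2430kbit ceil 2430kbit linklayer atm overhead 40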


>> 
>> The current ISP connection is IPoA LLC.
> 
> Correction - Bridged LLC. 

	Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:

#! /bin/bash
# TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)

# Telekom Tuebingen Moltkestrasse 6
TECH=ADSL2
# finding a proper target IP is somewhat of an art: just traceroute a remote site
# and find the nearest host reliably responding to pings with the smallest variation in ping times
TARGET=87.186.197.70		# T
DATESTR=`date +%Y%m%d_%H%M%S`	# to allow multiple sequential records
LOG=ping_sweep_${TECH}_${DATESTR}.txt


# by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
# empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
PINGPERIOD=0.01		# in seconds
PINGSPERSIZE=10000

# Sweep start; needed to find the per-packet overhead dependent on the ATM encapsulation
# to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
SWEEPMINSIZE=16		# 64bit systems seem to require 16 bytes of payload to include a timestamp...
SWEEPMAXSIZE=116
    

n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`


i_sweep=0
i_size=0

while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
do
    (( i_sweep++ ))
    echo "Current iteration: ${i_sweep}"
    # now loop from sweepmin to sweepmax
    i_size=${SWEEPMINSIZE}
    while [ ${i_size} -le ${SWEEPMAXSIZE} ]
    do
	echo "${i_sweep}. repetition of ping size ${i_size}"
	ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
	(( i_size++ ))
	# we need a sleep binary that allows non-integer times (GNU sleep is fine, as is the sleep of macosx 10.8.4)
	sleep ${PINGPERIOD}
    done
done

#tail -f ${LOG}

echo "Done... ($0)"


Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
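Spelled out, those pre-flight checks look like this (host name hypothetical; note that ping intervals below 0.2s typically require root):

ping -c 100 -s 16 your.best.host.ip           # candidate check: look for stable RTTs
ping -c 100 -i 0.01 -s 16 your.best.host.ip   # RTTs should stay in the same ballpark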


> 
>> Whatever byte value is used for tc-stab makes no difference.

	I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while a missing overhead value is more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
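	For orientation, commonly quoted per-packet overheads for the bridged RFC 2684 encapsulations, to be treated as starting points only (these figures include the 8-byte AAL5 trailer; the empirical sweep above is more trustworthy than any table):

# common starting points, in bytes of per-packet overhead:
#   bridged, VC/Mux: 24      bridged, LLC/SNAP: 32
#   PPPoE,   VC/Mux: 32      PPPoE,   LLC/SNAP: 40
# e.g. a first guess for a bridged LLC line (STABSTRING as used by simple.qos):
STABSTRING="stab linklayer atm overhead 32"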

>> 
>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.

	For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that, the HTB linklayer adjustment did NOT work.

>> 
>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>> 
>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>> 
>> This device has a permanently-on telnet interface, with a simple password that cannot be changed other than by firmware recompilation…
>> 
>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>> 
>> None of these changes affect the problematic uplink delay.

	So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay? Netalyzr?

>> 
>> 
>> On 24 Aug 2013, at 21:51, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>> Hi Dave,
>>> 
>>> 
>>> On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:
>>> 
>>>> 
>>>> 
>>>> 
>>>> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> Hi Dave,
>>>> 
>>>> I guess I found the culprit:
>>>> 
>>>> once I added $ADSLL to the ingress() in simple.qos:
>>>> 
>>>> 
>>>> 
>>>> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware of the extent to which the whole subsystem was busted!
>>> 
>>> 	Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed htb link layer adjustment (LLA) it only worked on the uplink, and in retrospect, looking at my initial test data, I actually see one of the hallmarks of a working LLA for the upstream: the upstream goodput was reduced compared to the no-LLA test, because the LLA makes each packet count as larger, so fewer packets fit through the shaped link. But since I was not expecting only half a working system I overlooked that in the data.
>>> 	But looking at the latency of the ping RTT probes it becomes quite clear that doing link layer adjustments only on the uplink is even worse than not doing them at all (because the latency is still almost as bad as without LLA, but the uplink bandwidth is reduced).
>>> 
>>>> 
>>>> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)
>>> 
>>> 	Oh, I cannot complain about the pay; I have a day job in a totally different field, so this is more of a hobby for me :)
>>> 
>>>> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
>>>> 
>>>> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
>>>> 
>>>> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broken on ipv6. (They use conn track)
>>> 
>>> 	Oh, in the bql-40 time frame I hacked the stab-based LLA into their generate.sh and it worked quite well, even though at the time my measurements were quite crude. Since their qos scripts are HFSC-based, the HTB-private implementation is not going to do them any good. Luckily that does not seem to matter now, as both methods perform identically, as they should. (Well, Jesper's last changes are nicer than the old table lookup, but it should be relatively easy to implement the same for stab; heck, once I get my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that?
>>> 
>>>> 
>>>> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on in the ietf on finding sane codepoints for webrtc….
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ingress() {
>>>> 
>>>> CEIL=$DOWNLINK
>>>> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
>>>> BE_RATE=`expr $CEIL / 6`   # Min for best effort
>>>> BK_RATE=`expr $CEIL / 6`   # Min for background
>>>> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>>>> 
>>>> LQ="quantum `get_mtu $IFACE`"
>>>> 
>>>> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
>>>> $TC qdisc add dev $IFACE handle ffff: ingress
>>>> 
>>>> $TC qdisc del dev $DEV root  2> /dev/null
>>>> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
>>>> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
>>>> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
>>>> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
>>>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
>>>> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
>>>> 
>>>> # I'd prefer to use a pre-nat filter but that causes permutation...
>>>> 
>>>> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
>>>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
>>>> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
>>>> 
>>>> diffserv $DEV
>>>> 
>>>> ifconfig $DEV up
>>>> 
>>>> # redirect all IP packets arriving in $IFACE to ifb0
>>>> 
>>>> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>>>> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>>>> 
>>>> }
>>>> 
>>>> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
>>>> 
>>>> 
>>>> 
>>>> Best
>>>> Sebastian
>>>> 
>>>> P.S.: I am not sure whether I want to tackle the PIE issue today...
>>>> 
>>>> 
>>>> 
>>>> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>>>> 
>>>>> quick note: running this script requires that you
>>>>> 
>>>>> ifconfig ifb0 up
>>>>> 
>>>>> at some point.
>>>> 
>>>> In my case on cerowrt you took care of that already...
>>>> 
>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>> Hi Dave,
>>>>> 
>>>>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>> Hi List, hi Jesper,
>>>>>> 
>>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>>>> Unfortunately the htb_private link layer adjustments are still broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>>>> I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as the default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, this path should receive more scrutiny by virtue of having more users.
>>>>>> 
>>>>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>>>>> 
>>>>> So I went for TC="logger tc" and used logread to harvest the commands, as I could not find the echo output; that should not matter, though. So here is the result (slightly edited to get rid of the log timestamps and log level):
>>>>> 
>>>>> tc qdisc del dev ge00 root
>>>>> tc qdisc add dev ge00 root handle 1: htb default 12
>>>>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>>>>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>>>>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>>>>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>>>>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>>>>> tc qdisc del dev ge00 handle ffff: ingress
>>>>> tc qdisc add dev ge00 handle ffff: ingress
>>>>> tc qdisc del dev ifb0 root
>>>>> tc qdisc add dev ifb0 root handle 1: htb default 12
>>>>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>>>>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>>>>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>>>>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>>>>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>>>>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>>>>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>>>>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>>>>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>>>>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>>>>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>>>>> 
>>>>> I notice this only seems to show up for egress(); looking at simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>>>>> 
>>>>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>>>>> 
>>>>> 
>>>>> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
>>>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>>>> but got no output even though debugfs was already mounted…
>>>>> 
>>>>> Best
>>>>> Sebastian
>>>>> 
>>>>>> 
>>>>>> Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware; I think this kernel should contain all htb fixes, including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling), but am not fully sure.
>>>>>> 
>>>>>> It does.
>>>>>> 
>>>>>> @Dave, is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>>>>> 
>>>>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>>>>> 
>>>>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>>>>> 
>>>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>>>>> 
>>>>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>>>>> 
>>>>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>>>>> 
>>>>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but had to try this....
>>>>>> 
>>>>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>>>>> 
>>>>>> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>>>>> 
>>>>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>>>>> 
>>>>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>>>>> 
>>>>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>>>>> 
>>>>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>>>>> 
>>>>>> Multiple parties have the delusion that 20ms is "good enough".
>>>>>> 
>>>>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>>>>> 
>>>>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
>>>>>> 
>>>>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie; the problem is that ns2_codel still seems more effective on most workloads, and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>>>>> 
>>>>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>>>>> 
>>>>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>>>>> 
>>>>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>>>>> 
>>>>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>>>>> 
>>>>>> ... and find funding to get through the winter.
>>>>>> 
>>>>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>>>>> 
>>>>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. So I am happy to help where I can :)
>>>>>> 
>>>>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>>>>> 
>>>>>> tc -s qdisc show dev ge00
>>>>>> tc -s qdisc show dev ifb0
>>>>>> 
>>>>>> would be useful info to have in general after each test.
>>>>>> 
>>>>>> TIA.
>>>>>> 
>>>>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
>>>>>> 
>>>>>> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>>>>> 
>>>>>> This is some of the stuff I know that needs fixing in userspace:
>>>>>> 
>>>>>> * TODO readlink not found
>>>>>> * TODO netdev user missing
>>>>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>>>>> * TODO [   18.480468] Mirror/redirect action on
>>>>>> [   18.539062] Failed to load ipt action
>>>>>> * upload and download are reversed in aqm
>>>>>> * BCP38
>>>>>> * Squash CS values
>>>>>> * Replace ntp
>>>>>> * Make ahcp client mode
>>>>>> * Drop more privs for polipo
>>>>>> * upnp
>>>>>> * priv separation
>>>>>> * Review FW rules
>>>>>> * dhcpv6 support
>>>>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>>>>> * Doesn't configure the web browser either
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Best
>>>>>> Sebastian
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Dave Täht
>>>>>> 
>>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Dave Täht
>>>>> 
>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Dave Täht
>>>> 
>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>> 
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>> 
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 13:59                         ` Sebastian Moeller
@ 2013-08-25 14:26                           ` Fred Stratton
  2013-08-25 14:31                             ` Fred Stratton
  2013-08-25 17:53                             ` Sebastian Moeller
  0 siblings, 2 replies; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 14:26 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 28532 bytes --]

Thank you.

This is an initial response.

Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active.
On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Fred,
> 
> 
> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
> 
>> 
>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>> 
>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
> 
> 	Okay, how flaky is your link? What rate of errors do you see while testing? I am especially interested in CRC errors, ES, SES, and HEC, just to get an idea how flaky the line is...
> 
>>> 
>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
>>> 
>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
> 
> 	Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for ACKs; this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmissions, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC).)

Uptime 100655
downstream 12162 kbits/s
CRC errors 10154
FEC errors 464
HEC errors 758

upstream 1122 kbits/s
no errors in period.

> 	Could you perform the following test by any chance: start iPlayer and your typical downloads, and then have a look at http://gw.home.lan:81 and follow the tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to ignore the link layer adjustments as the cause of your issues.

Am happy reducing the rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
> 
> 
>>> 
>>> YouTube has no problems.
>>> 
>>> I remain unclear on whether tc-stab and htb are mutually exclusive options when using the present stock interface.
> 
> 	Well, depending on the version of cerowrt you use: <3.10.9-1, I believe, lacks a functional HTB link layer adjustment mechanism, so you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do: best case it will only apply the overhead twice, worst case it would also do the link layer adjustments (LLA) twice.
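> 
> 	For concreteness, here is how the two mechanisms look on the tc command line (a sketch assembled from the command dumps later in this thread; the rates and overhead are from my link, and you would pick exactly one):
> 
> # tc_stab: the qdisc layer recomputes each packet's size as if it were ATM-framed
> tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 40 htb default 12
> # htb_private: HTB itself adjusts its internal rate tables instead
> tc class add dev ge00 parent 1: classid 1:1 htb rate 2430kbit ceil 2430kbit linklayer adsl overhead 40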


> See initial comments.
> 
>>> 
>>> The current ISP connection is IPoA LLC.
>> 
>> Correction - Bridged LLC. 
> 
> 	Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
> 
> #! /bin/bash
> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
> 
> # Telekom Tuebingen Moltkestrasse 6
> TECH=ADSL2
> # finding a proper target IP is somewhat of an art, just traceroute a remote site 
> # and find the nearest host reliably responding to pings showing the smallest variation of ping times
> TARGET=87.186.197.70		# T
> DATESTR=`date +%Y%m%d_%H%M%S`	# to allow multiple sequential records
> LOG=ping_sweep_${TECH}_${DATESTR}.txt
> 
> 
> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
> PINGPERIOD=0.01		# in seconds
> PINGSPERSIZE=10000
> 
> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
> SWEEPMINSIZE=16		# 64bit systems seem to require 16 bytes of payload to include a timestamp...
> SWEEPMAXSIZE=116
> 
> 
> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
> 
> 
> i_sweep=0
> i_size=0
> 
> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
> do
>    (( i_sweep++ ))
>    echo "Current iteration: ${i_sweep}"
>    # now loop from sweepmin to sweepmax
>    i_size=${SWEEPMINSIZE}
>    while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>    do
> 	echo "${i_sweep}. repetition of ping size ${i_size}"
> 	ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
> 	(( i_size++ ))
> 	# we need a sleep binary that allows non integer times (GNU sleep is fine as is sleep of macosx 10.8.4)
> 	sleep ${PINGPERIOD}
>    done
> done
> 
> #tail -f ${LOG}
> 
> echo "Done... ($0)"
> 
> 
> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
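> 
> 	A minimal way to run it (assuming you saved the script above as ping_sweep.sh; nohup keeps it going after you log out, which matters for an overnight run):
> 
> chmod +x ping_sweep.sh
> nohup ./ping_sweep.sh > ping_sweep_console.log 2>&1 &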

To follow at some point.
> 
> 
>> 
>>> Whatever byte value is used for tc-stab makes no change.
> 
> 	I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
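> 
> 	To put rough numbers on that (standard ATM cell arithmetic, worst case, not measured on your line): a 64 byte packet with 40 bytes of per-packet overhead needs ceil((64+40)/48) = 3 ATM cells, i.e. 3*53 = 159 bytes on the wire, roughly 2.5 times the IP packet size, while a 1500 byte packet needs ceil((1500+40)/48) = 33 cells = 1749 bytes, only ~17% inflation. A shaper that is ignorant of the cell quantization therefore overestimates the usable rate most for small packets, which is exactly the ACK and interactive traffic you care about.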

The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
> 
>>> 
>>> I have applied the ingress modification to simple.qos, keeping the original version., and tested both.
> 
> 	For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that the HTB linklayer adjustment did NOT work.

Using 3.10.9-2

> 
>>> 
>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>>> 
>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>>> 
>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>>> 
>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>>> 
>>> None of these changes affect the problematic uplink delay.
> 
> 	So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with a 650ms uplink delay? Netalyzr?

Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
> 
>>> 
>>> 
>>> On 24 Aug 2013, at 21:51, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> 
>>>> Hi Dave,
>>>> 
>>>> 
>>>> On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>> Hi Dave,
>>>>> 
>>>>> I guess I found the culprit:
>>>>> 
>>>>> once I added $ADSLL to the ingress() in simple.qos:
>>>>> 
>>>>> 
>>>>> 
>>>>> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware the extent that the whole subsystem was busted!
>>>> 
>>>> 	Ah, and I had added my stab-based version to both ingress() and egress(), assuming that both links need to be kept under control. So with the fixed htb link layer adjustment (LLA) it only worked on the uplink, and in retrospect, looking at my initial test data, I actually see one of the hallmarks of a working LLA for the upstream. (The upstream goodput was reduced compared to the no-LLA test, because the LLA makes the actually-sent packets larger, so fewer packets fit through the shaped link.) But since I was not expecting only half a working system, I overlooked that in the data.
>>>> 	But looking at the latency of the ping RTT probes it becomes quite clear that only doing link layer adjustments on the uplink is even worse than not doing it at all (because the latency is still almost as bad as without LLA while the uplink bandwidth is reduced).
>>>> 
>>>>> 
>>>>> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)
>>>> 
>>>> 	Oh, I cannot complain about pay; I have a day job in a totally different field, so this is more of a hobby for me :) 
>>>> 
>>>>> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
>>>>> 
>>>>> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
>>>>> 
>>>>> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broken on ipv6. (They use conntrack.)
>>>> 
>>>> 	Oh, in the bql-40 time frame I hacked the stab-based LLA into their generate.sh and it worked quite well, even though at the time my measurements were quite crude. Since their qos scripts are HFSC-based, the HTB private implementation is not going to do them any good. Luckily that does not seem to matter now, as both methods perform identically, as they should. (Well, Jesper's latest changes are nicer than the old table lookup, but it should be relatively easy to implement the same for stab; heck, once I get my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that?
>>>> 
>>>>> 
>>>>> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on in the ietf on finding sane codepoints for webrtc….
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ingress() {
>>>>> 
>>>>> CEIL=$DOWNLINK
>>>>> PRIO_RATE=`expr $CEIL / 3` # Ceiling for priority
>>>>> BE_RATE=`expr $CEIL / 6`   # Min for best effort
>>>>> BK_RATE=`expr $CEIL / 6`   # Min for background
>>>>> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>>>>> 
>>>>> LQ="quantum `get_mtu $IFACE`"
>>>>> 
>>>>> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
>>>>> $TC qdisc add dev $IFACE handle ffff: ingress
>>>>> 
>>>>> $TC qdisc del dev $DEV root  2> /dev/null
>>>>> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
>>>>> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
>>>>> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
>>>>> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
>>>>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
>>>>> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
>>>>> 
>>>>> # I'd prefer to use a pre-nat filter but that causes permutation...
>>>>> 
>>>>> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
>>>>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
>>>>> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
>>>>> 
>>>>> diffserv $DEV
>>>>> 
>>>>> ifconfig $DEV up
>>>>> 
>>>>> # redirect all IP packets arriving in $IFACE to ifb0
>>>>> 
>>>>> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>>>>> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>>>>> 
>>>>> }
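>>>>> 
>>>>> (For anyone reading along: ${ADSLL} expands to the HTB link layer arguments, e.g. "linklayer adsl overhead 40 mtu 2047" as visible in the tc dump further down, while ${STABSTRING} holds the equivalent stab arguments when tc_stab is selected; presumably each is empty when its mechanism is not in use.)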
>>>>> 
>>>>> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right: the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
>>>>> 
>>>>> 
>>>>> 
>>>>> Best
>>>>> Sebastian
>>>>> 
>>>>> P.S.: I am not sure whether I want to tackle the PIE issue today...
>>>>> 
>>>>> 
>>>>> 
>>>>> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>> 
>>>>>> quick note: running this script requires that you
>>>>>> 
>>>>>> ifconfig ifb0 up
>>>>>> 
>>>>>> at some point.
>>>>> 
>>>>> In my case on cerowrt you took care of that already...
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>> Hi Dave,
>>>>>> 
>>>>>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>>> Hi List, hi Jesper,
>>>>>>> 
>>>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>>>>> Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>>>>> I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
>>>>>>> 
>>>>>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and pasting them here.
>>>>>> 
>>>>>> So I went for TC="logger tc" and used logread to harvest the commands, as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
>>>>>> 
>>>>>> tc qdisc del dev ge00 root
>>>>>> tc qdisc add dev ge00 root handle 1: htb default 12
>>>>>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>>>>>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>>>>>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>>>>>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>>>>>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>>>>>> tc qdisc del dev ge00 handle ffff: ingress
>>>>>> tc qdisc add dev ge00 handle ffff: ingress
>>>>>> tc qdisc del dev ifb0 root
>>>>>> tc qdisc add dev ifb0 root handle 1: htb default 12
>>>>>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>>>>>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>>>>>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>>>>>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>>>>>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>>>>>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>>>>>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>>>>>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>>>>>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>>>>>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>>>>>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
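>>>>>> 
>>>>>> (Decoding the tos/priority matches above, if I read the DSCP codepoints right: 0x20 is CS1, steered into the background class 1:13; 0x10 (legacy minimize-delay), 0x90 (AF42), 0xb8 (EF), 0xc0 (CS6) and 0xe0 (CS7) all land in the priority class 1:11; everything else falls through to best effort, 1:12.)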
>>>>>> 
>>>>>> I notice it seems this only shows up for egress(), but looking at simple.qos, ingress() does not add ${ADSLL} at all, so that is to be expected. There is nothing in dmesg at all.
>>>>>> 
>>>>>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>>>>>> 
>>>>>> 
>>>>>> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
>>>>>> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
>>>>>> but got no output even though debugfs was already mounted…
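>>>>>> 
>>>>>> (For the record, in case the silence was the tooling rather than the kernel: the "+p" only takes effect when redirected into the control file as above, the kernel must be built with CONFIG_DYNAMIC_DEBUG, and any resulting messages show up in dmesg / the kernel log, not on stdout.)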
>>>>>> 
>>>>>> Best
>>>>>> Sebastian
>>>>>> 
>>>>>>> 
>>>>>>> Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
>>>>>>> 
>>>>>>> It does.
>>>>>>> 
>>>>>>> @Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>>>>>> 
>>>>>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>>>>>> 
>>>>>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>>>>>> 
>>>>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>>>>>> 
>>>>>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>>>>>> 
>>>>>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>>>>>> 
>>>>>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but had to try this....
>>>>>>> 
>>>>>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>>>>>> 
>>>>>>> While I have your attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>>>>>> 
>>>>>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>>>>>> 
>>>>>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>>>>>> 
>>>>>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>>>>>> 
>>>>>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>>>>>> 
>>>>>>> Multiple parties have the delusion that 20ms is "good enough".
>>>>>>> 
>>>>>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>>>>>> 
>>>>>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
>>>>>>> 
>>>>>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>>>>>> 
>>>>>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>>>>>> 
>>>>>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>>>>>> 
>>>>>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>>>>>> 
>>>>>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>>>>>> 
>>>>>>> ... and find funding to get through the winter.
>>>>>>> 
>>>>>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>>>>>> 
>>>>>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not, so I am happy to help where I can :)
>>>>>>> 
>>>>>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>>>>>> 
>>>>>>> tc -s qdisc show dev ge00
>>>>>>> tc -s qdisc show dev ifb0
>>>>>>> 
>>>>>>> would be useful info to have in general after each test.
>>>>>>> 
>>>>>>> TIA.
>>>>>>> 
>>>>>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
>>>>>>> 
>>>>>>> Thank you for your efforts on these early alpha releases. I hope things will stabilize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>>>>>> 
>>>>>>> This is some of the stuff I know that needs fixing in userspace:
>>>>>>> 
>>>>>>> * TODO readlink not found
>>>>>>> * TODO netdev user missing
>>>>>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>>>>>> * TODO [   18.480468] Mirror/redirect action on
>>>>>>> [   18.539062] Failed to load ipt action
>>>>>>> * upload and download are reversed in aqm
>>>>>>> * BCP38
>>>>>>> * Squash CS values
>>>>>>> * Replace ntp
>>>>>>> * Make ahcp client mode
>>>>>>> * Drop more privs for polipo
>>>>>>> * upnp
>>>>>>> * priv separation
>>>>>>> * Review FW rules
>>>>>>> * dhcpv6 support
>>>>>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>>>>>> * Doesn't configure the web browser either
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Best
>>>>>>> Sebastian
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Dave Täht
>>>>>>> 
>>>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Dave Täht
>>>>>> 
>>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Dave Täht
>>>>> 
>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>>>> 
>>>> _______________________________________________
>>>> Cerowrt-devel mailing list
>>>> Cerowrt-devel@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>> 
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>> 
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> 


[-- Attachment #2: Type: text/html, Size: 31317 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 14:26                           ` Fred Stratton
@ 2013-08-25 14:31                             ` Fred Stratton
  2013-08-25 17:53                             ` Sebastian Moeller
  1 sibling, 0 replies; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 14:31 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 29311 bytes --]

Correction.

Using 3.10.9-2


On 25 Aug 2013, at 15:26, Fred Stratton <fredstratton@imap.cc> wrote:

> Thank you.
> 
> This is an initial response.
> 
> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull down menu of your interface, which is why I ask if both are active. 
> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> Hi Fred,
>> 
>> 
>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>> 
>>> 
>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>>> 
>>>> As the person with the most flaky ADSL link, I point out that None of these recent, welcome, changes, are having any effect here, with an uplink sped of circa 950 kbits/s.
>> 
>> 	Okay, how flaky is you link? What rate of Errors do you have while testing? I am especially interested in CRC errors and ES SES and HEC, just to get an idea how flaky the line is...
>> 
>>>> 
>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time, The iPlayer stream fails. The point of the exercise was to achieve this. 
>>>> 
>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>> 
>> 	Well, watching video is going to stress your downlink so the uplink should not saturate by the ACKs and the concurrent downloads also do not stress your uplink except for the ACKs, so this points to downlink errors as far as I can tell from the data you have given. If the up link has repeated outages however, your problems might be unfixable because these, if long enough, will cause lost ACKs and will probably trigger retransmission, independent of whether the link layer adjustments work or not. (You could test this by shaping you up and downlink to <= 50% of the link rates and disable all link layer adjustments, 50% is larger than the ATM worst case so should have you covered. Well unless you del link has an excessive number of tones reserved for forward error correction (FEC)).
> 
> Uptime 100655
> downstream 12162 kbits/s
> CRC errors 10154
> FEC Errors 464
> hEC Errors 758
> 
> upstream 1122 kbits/s
> no errors in period.
> 
>> 	Could you perform the following test by any chance: state iPlayer and yor typical downloads and then have a look at http://gw.home.lan:81und the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below you shaped limit and you still encounter the stream failure I would say it is save to ignore the link layer adjustments as cause of your issues.
> 
> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
>> 
>> 
>>>> 
>>>> YouTube has no problems.
>>>> 
>>>> I remain unclear whether the use of tc-stab and htb are mutually exclusive options, using the present stock interface.
>> 
>> 	Well, depending on the version of the cerowrt you use, <3.10.9-1 I believe lacks a functional HTB link layer adjustment mechanism, so you should select tc_stab. My most recent modifications to Toke and Dave's AQM package does only allow you to select one or the other. In any case selecting BOTH is not a reasonable thing to do, because best case it will only apply overhead twice, worst case it would also do the (link layer adjustments) LLA twice
> 
> 
>> See initial comments.
>> 
>>>> 
>>>> The current ISP connection is IPoA LLC.
>>> 
>>> Correction - Bridged LLC. 
>> 
>> 	Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on you r link over night and send me the log file it produces:
>> 
>> #! /bin/bash
>> # TODO use seq or bash to generate a list of the requested sizes (to alow for non-equdistantly spaced sizes)
>> 
>> # Telekom Tuebingen Moltkestrasse 6
>> TECH=ADSL2
>> # finding a proper target IP is somewhat of an art, just traceroute a remote site 
>> # and find the nearest host reliably responding to pings showing the smallet variation of pingtimes
>> TARGET=87.186.197.70		# T
>> DATESTR=`date +%Y%m%d_%H%M%S`	# to allow multiple sequential records
>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>> 
>> 
>> # by default non-root ping will only end one packet per second, so work around that by calling ping independently for each package
>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>> PINGPERIOD=0.01		# in seconds
>> PINGSPERSIZE=10000
>> 
>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>> SWEEPMINSIZE=16		# 64bit systems seem to require 16 bytes of payload to include a timestamp...
>> SWEEPMAXSIZE=116
>> 
>> 
>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>> 
>> 
>> i_sweep=0
>> i_size=0
>> 
>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>> do
>>    (( i_sweep++ ))
>>    echo "Current iteration: ${i_sweep}"
>>    # now loop from sweepmin to sweepmax
>>    i_size=${SWEEPMINSIZE}
>>    while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>>    do
>> 	echo "${i_sweep}. repetition of ping size ${i_size}"
>> 	ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>> 	(( i_size++ ))
>> 	# we need a sleep binary that allows non integer times (GNU sleep is fine as is sleep of macosx 10.8.4)
>> 	sleep ${PINGPERIOD}
>>    done
>> done
>> 
>> #tail -f ${LOG}
>> 
>> echo "Done... ($0)"
>> 
>> 
>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
> 
> To follow at some point.
>> 
>> 
>>> 
>>>> Whatever byte value is used for tc-stab makes no change.
>> 
>> 	I assume you talk about the overhead? Missing link layer adjustment will eat between 50% and 10% of your link bandwidth, while missing overhead values will be more benign. The only advise I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
> 
> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
>> 
>>>> 
>>>> I have applied the ingress modification to simple.qos, keeping the original version., and tested both.
>> 
>> 	For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards, before that the HTB lionklayer adjustment did NOT work.
> 
> Using 3.10.9-2
> 
>> 
>>>> 
>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a ate-limiting step.
>>>> 
>>>> I have changed the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>>>> 
>>>> This device has a permanently on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>>>> 
>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>>>> 
>>>> None of these changes affect the problematic uplink delay.
>> 
>> 	So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos, how does that reconcile with 650ms uplink delay, netalyzr?
> 
> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
>> 
>>>> 
>>>> 
>>>> On 24 Aug 2013, at 21:51, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> 
>>>>> Hi Dave,
>>>>> 
>>>>> 
>>>>> On Aug 23, 2013, at 22:29 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Aug 23, 2013 at 12:56 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>> Hi Dave,
>>>>>> 
>>>>>> I guess I found the culprit:
>>>>>> 
>>>>>> once I added $ADSLL to the ingress() in simple.qos:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> I had that in there originally. I ripped it out because it seemed to help with ADSL at the time - as I was unaware the extent that the whole subsystem was busted!
>>>>> 
>>>>> 	Ah, and I had added my stab based version to both ingress() and egress() assuming that both links need to be kept under control. So with fixed htb link layer adjustment (LLA) it only worked on the uplink and in retrospect if I look at my initial test data I actually see one of the hallmarks of a working LLA for the upstream. (The upstream good-put was reduced compared to the no LLA test, caused by LLA making the actually sent packets larger so fewer packets fit through the shaped link). But since I was not expecting only half a working system I overlooked that in the data.
>>>>> 	But looking at the latency of the ping RTT probes it becomes quite clear that only doing link layer adjustments on the uplink is even worse than not doing it all (because the latency is still almost as bad as without LLA but the up-link bandwidth is reduced).
>>>>> 
>>>>>> 
>>>>>> I like to think of the process we've just gone through as "wow, we just fixed the uk, and a few other countries". :) Feels kind of good, doesn't it? (Too bad the pay sucks.)
>>>>> 
>>>>> 	Oh, I can not complain about pay, I have a day job in totally different field, so this is more of a hobby for me :) 
>>>>> 
>>>>>> I mean, jeeze, chopping another 30+ms off the latency of that many systems should get medals from economists worldwide monitoring productivity. 
>>>>>> 
>>>>>> Does anyone have a date/kernel version on when linklayer overhead compensation stopped working? There was a bug even prior to 3.8 that looked bad. (and RED was busted for 3 years).
>>>>>> 
>>>>>> Another step would be trying to improve openwrt's native qos system somewhat in the DSL case. They don't use this subsystem (probably because it didn't work), and it's also broke on ipv6. (They use conn track)
>>>>> 
>>>>> 	Oh, in the bql-40 time frame I hacked the stab based LLA into their generate.sh and it worked quite well, even though at time my measurements were quite crude. SInce their qos scripts are HFSC based the HTB private implementation is not going to do them any good. Luckily now that does not seem to matter as both methods now perform identically as they should. (Well, now Jespers last changes are nicer than the old table lookup, but it should be relatively say to implant the same for stab, heck once I got my linux machine up I might take this as my first attempt at making local changes to the kernel :) ). So adding it to openwrt proper should be a piece of cake. Do you know by any chance who would be the best person to contact for that, ?
>>>>> 
>>>>>> 
>>>>>> At some point I'd like to have a mechanism for saner diffserv classification on egress, and to clamp ingress values to egress ones. There is a ton of work going on on finding sane codepoints on webrtc in the ietf….
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ingress() {
>>>>>> 
>>>>>> CEIL=$DOWNLINK
>>>>>> PRIO_RATE=`expr $CEIL / 3` # Ceiling for prioirty
>>>>>> BE_RATE=`expr $CEIL / 6`   # Min for best effort
>>>>>> BK_RATE=`expr $CEIL / 6`   # Min for background
>>>>>> BE_CEIL=`expr $CEIL - 64`  # A little slop at the top
>>>>>> 
>>>>>> LQ="quantum `get_mtu $IFACE`"
>>>>>> 
>>>>>> $TC qdisc del dev $IFACE handle ffff: ingress 2> /dev/null
>>>>>> $TC qdisc add dev $IFACE handle ffff: ingress
>>>>>> 
>>>>>> $TC qdisc del dev $DEV root  2> /dev/null
>>>>>> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
>>>>>> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
>>>>>> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
>>>>>> 
>>>>>> # I'd prefer to use a pre-nat filter but that causes permutation...
>>>>>> 
>>>>>> $TC qdisc add dev $DEV parent 1:11 handle 110: $QDISC limit 1000 $ECN `get_quantum 500` `get_flows ${PRIO_RATE}`
>>>>>> $TC qdisc add dev $DEV parent 1:12 handle 120: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BE_RATE}`
>>>>>> $TC qdisc add dev $DEV parent 1:13 handle 130: $QDISC limit 1000 $ECN `get_quantum 1500` `get_flows ${BK_RATE}`
>>>>>> 
>>>>>> diffserv $DEV
>>>>>> 
>>>>>> ifconfig $DEV up
>>>>>> 
>>>>>> # redirect all IP packets arriving in $IFACE to ifb0
>>>>>> 
>>>>>> $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>>>>>> match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV
>>>>>> 
>>>>>> }
>>>>>> 
>>>>>> I get basically the same RRUL ping RTTs for htb_private as for tc_stab. So Jesper was right the patch seems to fix the issue. I guess I should send out my current version of yours and Toke's AQM scripts soon.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Best
>>>>>> Sebastian
>>>>>> 
>>>>>> P.S.: I am not sure whether I want to tackle the PIE issue today...
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Aug 23, 2013, at 21:47 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>>> 
>>>>>>> quick note: running this script requires that you
>>>>>>> 
>>>>>>> ifconfig ifb0 up
>>>>>>> 
>>>>>>> at some point.
>>>>>> 
>>>>>> In my case on cerowrt you took care of that already...
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>>> Hi Dave,
>>>>>>> 
>>>>>>> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>>>>>> Hi List, hi Jesper,
>>>>>>>> 
>>>>>>>> So I tested 3.10.9-1 to assess the status of the HTB atm link layer adjustments to see whether the recent changes resurrected this feature.
>>>>>>>> Unfortunately the htb_private link layer adjustments still is broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same as without link layer adjustments). On the bright side the tc_stab method still works as well as before (ping RTT around 40ms).
>>>>>>>> I would like to humbly propose to use the tc stab method in cerowrt to perform ATM link layer adjustments as default. To repeat myself, simply telling the kernel a lie about the packet size seems more robust than fudging HTB's rate tables. Especially since the kernel already fudges the packet size to account for the ethernet header and then some, so this path should receive more scrutiny by virtue of having more users?
>>>>>>>> 
>>>>>>>> It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste here.
>>>>>>> 
>>>>>>> So I went for TC="logger tc" and used log read to harvest as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
>>>>>>> 
>>>>>>> tc qdisc del dev ge00 root
>>>>>>> tc qdisc add dev ge00 root handle 1: htb default 12
>>>>>>> tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>>>>>>> tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn quantum 300
>>>>>>> tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn quantum 300
>>>>>>> tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn quantum 300
>>>>>>> tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid 1:11
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid 1:12
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid 1:13
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw classid 1:11
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw classid 1:12
>>>>>>> tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw classid 1:13
>>>>>>> tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw classid 1:11
>>>>>>> tc qdisc del dev ge00 handle ffff: ingress
>>>>>>> tc qdisc add dev ge00 handle ffff: ingress
>>>>>>> tc qdisc del dev ifb0 root
>>>>>>> tc qdisc add dev ifb0 root handle 1: htb default 12
>>>>>>> tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>>>>>>> tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
>>>>>>> tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn quantum 500
>>>>>>> tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn quantum 1500
>>>>>>> tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn quantum 1500
>>>>>>> tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip protocol 0 0x00 flowid 1:12
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos 0x00 0xfc classid 1:12
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6 priority 0x00 0xfc classid 1:12
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos 0x20 0xfc classid 1:13
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6 priority 0x20 0xfc classid 1:13
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos 0x10 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6 priority 0x10 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos 0xb8 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6 priority 0xb8 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos 0xc0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6 priority 0xc0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos 0xe0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6 priority 0xe0 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos 0x90 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6 priority 0x90 0xfc classid 1:11
>>>>>>> tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw classid 1:11
>>>>>>> tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb0
>>>>>>> 
>>>>>>> I notice it seem this only shows up for egress(), but looking at simple.qos ingress() is not addend ${ADSLL} at all so that is to be expected. There is nothing in dmesg at all.
>>>>>>> 
>>>>>>> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>>>>>>> 
>>>>>>> 
>>>>>>> Jesper please let me know if this looks reasonable, at least to my eye it seems to fit with what "tc disc add htb help" tells me. I tried your:
>>>>>>> echo "func __detect_linklayer +p" /sys/kernel/debug/dynamic_debug/control
>>>>>>> but got no output even though debugs was already mounted…
>>>>>>> 
>>>>>>> Best
>>>>>>> Sebastian
>>>>>>> 
>>>>>>>> 
>>>>>>>> Now, I have been testing this using Dave's most recent cerowrt alpha version with a 3.10.9 kernel on mips hardware, I think this kernel should contain all htb fixes including commit 8a8e3d84b17 (net_sched: restore "linklayer atm" handling) but am not fully sure.
>>>>>>>> 
>>>>>>>> It does.
>>>>>>>> 
>>>>>>>> `@Dave is there an easy way to find which patches you applied to the kernels of the cerowrt (testing-)releases?
>>>>>>>> 
>>>>>>>> Normally I DO commit stuff that is in testing, but my big push this time around was to get everything important into mainline 3.10, as it will be the "stable" release for a good long time.
>>>>>>>> 
>>>>>>>> So I am still mostly working the x86 side at the moment. I WAS kind of hoping that everything I just landed would make it up to 3.10. But for your perusal:
>>>>>>>> 
>>>>>>>> http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped out due to another weird bug I'm looking at. (It also has support for ipv6 nat thx to the ever prolific stephen walker heeding the call for patches...). 100% totally untested, I have this weird bug to figure out how to fix next:
>>>>>>>> 
>>>>>>>> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
>>>>>>>> 
>>>>>>>> I fear it's a comparison gone south, maybe in bradley's optimizations for not kernel trapping, don't know.
>>>>>>>> 
>>>>>>>> 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE losing the close naming integration, but, had to try this....
>>>>>>>> 
>>>>>>>> If you guys want me to start committing and pushing patches again, I'll do it, but most of that stuff will end up in 3.10.10, I think, in a couple days. The rest might make 3.12. Pie has to survive scrutiny on the netdev list in particular.
>>>>>>>> 
>>>>>>>> While I have you r attention :) I also tested 3.10.9-1's pie and it is way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms) but still worse than fq_codel (ping RTTs around 40ms with proper atm link layer adjustments).
>>>>>>>> 
>>>>>>>> This is with simple.qos I imagine? Simplest should do better than that with pie. Judging from how its estimator works I think it will do badly with multiple queues. But testing will tell...
>>>>>>>> 
>>>>>>>> But, yea, this pie is actually usable, and the previous wasn't. Thank you for looking at it!
>>>>>>>> 
>>>>>>>> It is different from cisco's last pie drop in that it can do ecn, does local congestion notification, has a better use of net_random, it's mostly KernelStyle, and I forget what else.
>>>>>>>> 
>>>>>>>> There is still a major rounding error in the code, and I'd like cisco to fix the api so it uses identical syntax to codel. Right now you specify "target 8" to get "target 7", and the "ms" is implied. target 5 becomes target 3. The default target is a whopping 20 (rounded to 19), which is in part where your 70+ms of extra delay came from.
>>>>>>>> 
>>>>>>>> Multiple parties have the delusion that 20ms is "good enough".
>>>>>>>> 
>>>>>>>> Part of the remaining delay may also be rounding error. Cisco uses kernels with HZ=1000, cero uses HZ=250.....
>>>>>>>> 
>>>>>>>> Anyway, to get more comparable tests... you can fiddle with the two $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms config, but that would break a codel config which treats target 8 as target 8us.
>>>>>>>> 
>>>>>>>> I MIGHT, if I get energetic enough, fix the API, the time accounting, and a few other things in pie, the problem is, that ns2_codel seems still more effective on most workloads and *fq_codel smokes absolutely everything. There are a few places where pie is a win over straight codel, notably on packet floods. And it may well be easier to retrofit into existing hardware fast path designs.
>>>>>>>> 
>>>>>>>> I worry about interactions between pie and other stuff. It seems inevitable at this point that some form of pie will be widely deployed, and I simply haven't tried enough traffic types and RTTs to draw a firm conclusion, period. Long RTTs are the last big place where codel and pie and fq_codel have to be seriously tested.
>>>>>>>> 
>>>>>>>> ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A big problem I have is getting decent long RTT emulation out of netem (some preliminary code is up at github)
>>>>>>>> 
>>>>>>>> ... and getting cero stable enough for others to actually use - next up is fixing the userspace problems.
>>>>>>>> 
>>>>>>>> ... and trying to make a small dent in the wifi problem along the way (couple commits coming up)
>>>>>>>> 
>>>>>>>> ... and find funding to get through the winter.
>>>>>>>> 
>>>>>>>> There's probably a few other things that are on that list but I forget. Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit smoking.
>>>>>>>> 
>>>>>>>> While I am not able to build kernels, it seems that I am able to quickly test whether link layer adjustments work or not. SO aim happy to help where I can :)
>>>>>>>> 
>>>>>>>> Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and target 7ms, too. fq_codel, same....
>>>>>>>> 
>>>>>>>> tc -s qdisc show dev ge00
>>>>>>>> tc -s qdisc show dev ifb0
>>>>>>>> 
>>>>>>>> would be useful info to have in general after each test.
>>>>>>>> 
>>>>>>>> TIA.
>>>>>>>> 
>>>>>>>> There are also things like tcp_upload and tcp_download and tcp_bidirectional that are useful tests in the rrul suite.
>>>>>>>> 
>>>>>>>> Thank you for your efforts on these early alpha releases. I hope things will stablize more soon, and I'll fold your aqm stuff into my next attempt this weekend.
>>>>>>>> 
>>>>>>>> This is some of the stuff I know that needs fixing in userspace:
>>>>>>>> 
>>>>>>>> * TODO readlink not found
>>>>>>>> * TODO netdev user missing
>>>>>>>> * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already running DHCP-server on interface 'se00' refusing to start, use 'option force 1' to override
>>>>>>>> * TODO [   18.480468] Mirror/redirect action on
>>>>>>>> [   18.539062] Failed to load ipt action
>>>>>>>> * upload and download are reversed in aqm
>>>>>>>> * BCP38
>>>>>>>> * Squash CS values
>>>>>>>> * Replace ntp
>>>>>>>> * Make ahcp client mode
>>>>>>>> * Drop more privs for polipo
>>>>>>>> * upnp
>>>>>>>> * priv separation
>>>>>>>> * Review FW rules
>>>>>>>> * dhcpv6 support
>>>>>>>> * uci-defaults/make-cert.sh uses a bad path for px5g
>>>>>>>> * Doesn't configure the web browser either
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best
>>>>>>>> Sebastian
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Dave Täht
>>>>>>>> 
>>>>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


[-- Attachment #2: Type: text/html, Size: 32292 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 14:26                           ` Fred Stratton
  2013-08-25 14:31                             ` Fred Stratton
@ 2013-08-25 17:53                             ` Sebastian Moeller
  2013-08-25 17:55                               ` Dave Taht
  2013-08-25 18:30                               ` Fred Stratton
  1 sibling, 2 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-25 17:53 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

Hi Fred,


On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:

> Thank you.
> 
> This is an initial response.
> 
> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull down menu of your interface, which is why I ask if both are active. 

	I have seen your follow-up mail that you actually used 3.10.9-2. I think that has the first cut of the script modifications, which still allows selecting both. Since I have not tested it any other way, I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, in the best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages...

> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> Hi Fred,
>> 
>> 
>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>> 
>>> 
>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>>> 
>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
>> 
>> 	Okay, how flaky is your link? What rate of errors do you see while testing? I am especially interested in CRC errors and the ES, SES, and HEC counts, just to get an idea how flaky the line is...
>> 
>>>> 
>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this. 
>>>> 
>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>> 
>> 	Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs, so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmission, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered. Well, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC).)
> 
> Uptime 100655
> downstream 12162 kbits/s
> CRC errors 10154
> FEC Errors 464
> HEC Errors 758
> 
> upstream 1122 kbits/s
> no errors in period.

	Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so syncing at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?


> 
>> 	Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to ignore the link layer adjustments as the cause of your issues.
> 
> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.

	That sounds weird; if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit both the ~50% and the above-500 requirements.


>> 
>> 
>>>> 
>>>> YouTube has no problems.
>>>> 
>>>> I remain unclear whether the use of tc-stab and htb are mutually exclusive options, using the present stock interface.
>> 
>> 	Well, depending on the version of cerowrt you use: versions before 3.10.9-1, I believe, lack a functional HTB link layer adjustment mechanism, so there you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do, because in the best case it will only apply the overhead twice, and in the worst case it would also do the link layer adjustments (LLA) twice.
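>> 
>> 	Purely to illustrate the difference between the two mechanisms (device, rate, and overhead values are made up here; the AQM scripts generate the real commands):
>> 
>> # tc_stab: lie to the qdisc layer about the packet size
>> tc qdisc add dev ge00 root handle 1: stab linklayer atm overhead 40 htb default 12
>> # htb_private: let HTB's own rate tables do the ATM accounting (functional again from 3.10.9 on)
>> tc class add dev ge00 parent 1: classid 1:1 htb rate 950kbit linklayer atm overhead 40
>> # pick ONE mechanism; enabling both applies the overhead (and possibly the LLA) twice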
> 
> 
>> See initial comments.
>> 
>>>> 
>>>> The current ISP connection is IPoA LLC.
>>> 
>>> Correction - Bridged LLC. 
>> 
>> 	Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
>> 
>> #! /bin/bash
>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
>> 
>> # Telekom Tuebingen Moltkestrasse 6
>> TECH=ADSL2
>> # finding a proper target IP is somewhat of an art, just traceroute a remote site 
>> # and find the nearest host reliably responding to pings, showing the smallest variation of ping times
>> TARGET=87.186.197.70		# T
>> DATESTR=`date +%Y%m%d_%H%M%S`	# to allow multiple sequential records
>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>> 
>> 
>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>> PINGPERIOD=0.01		# in seconds
>> PINGSPERSIZE=10000
>> 
>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>> SWEEPMINSIZE=16		# 64bit systems seem to require 16 bytes of payload to include a timestamp...
>> SWEEPMAXSIZE=116
>> 
>> 
>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>> 
>> 
>> i_sweep=0
>> i_size=0
>> 
>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>> do
>>    (( i_sweep++ ))
>>    echo "Current iteration: ${i_sweep}"
>>    # now loop from sweepmin to sweepmax
>>    i_size=${SWEEPMINSIZE}
>>    while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>>    do
>> 	echo "${i_sweep}. repetition of ping size ${i_size}"
>> 	ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>> 	(( i_size++ ))
>> 	# we need a sleep binary that allows non-integer times (GNU sleep is fine, as is the sleep of Mac OS X 10.8.4)
>> 	sleep ${PINGPERIOD}
>>    done
>> done
>> 
>> #tail -f ${LOG}
>> 
>> echo "Done... ($0)"
>> 
>> 
>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
> 
> To follow at some point.

	Oh, I failed to mention that at the given parameters the script takes almost 3 hours, during which the link should be otherwise idle...
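
	For what it's worth, that estimate just falls out of the script's own parameters:

# 10000 repetitions * 101 packet sizes (16..116) * 0.01 s between pings
echo "$(( 10000 * 101 )) pings -> $(( 10000 * 101 / 100 )) s, i.e. roughly 2.8 hours"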

>> 
>> 
>>> 
>>>> Whatever byte value is used for tc-stab makes no change.
>> 
>> 	I assume you are talking about the overhead? A missing link layer adjustment will eat between 50% and 10% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
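>> 
>> 	To make the small-packet end of that range concrete, here is the arithmetic for a hypothetical link with 10 bytes of assumed per-packet encapsulation overhead (ATM cells carry 48 payload bytes of the 53 on the wire, and AAL5 adds an 8 byte trailer):
>> 
>> PKT=40; OH=10	# TCP ACK plus assumed overhead
>> CELLS=$(( (PKT + OH + 8 + 47) / 48 ))	# round up to whole cells
>> echo "$PKT byte ACK -> $CELLS cells -> $(( CELLS * 53 )) bytes on the wire"
>> # prints: 40 byte ACK -> 2 cells -> 106 bytes on the wire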
> 
> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.

	I guess figuring out your exact overhead empirically is going to be fun.

>> 
>>>> 
>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>> 
>> 	For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that, the HTB linklayer adjustment did NOT work.
> 
> Using 3.10.9-2

	Yeah, as stated above, I would recommend using either one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part to be the sustained average up- and download rates here.


> 
>> 
>>>> 
>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>>>> 
>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>>>> 
>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>>>> 
>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>>>> 
>>>> None of these changes affect the problematic uplink delay.
>> 
>> 	So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay from netalyzr?
> 
> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.

	Well, both use the same method, which is not too meaningful if you use fq_codel on a shaped link (unless you want to optimize your system for UDP floods :) )

[snipp]


Best Regards
	Sebastian

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 17:53                             ` Sebastian Moeller
@ 2013-08-25 17:55                               ` Dave Taht
  2013-08-25 18:00                                 ` Fred Stratton
  2013-08-25 18:30                               ` Fred Stratton
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Taht @ 2013-08-25 17:55 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 11195 bytes --]

Netalyzr is not useful in an fq_codel'ed system.


On Sun, Aug 25, 2013 at 10:53 AM, Sebastian Moeller <moeller0@gmx.de> wrote:

> [snipp]



-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html

[-- Attachment #2: Type: text/html, Size: 13239 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 17:55                               ` Dave Taht
@ 2013-08-25 18:00                                 ` Fred Stratton
  0 siblings, 0 replies; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 18:00 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 11075 bytes --]


On 25 Aug 2013, at 18:55, Dave Taht <dave.taht@gmail.com> wrote:

> Netalyzr is not useful in an fq_codel'ed system.

Thank you. I shall stop using it.


> 
> [snipp]


[-- Attachment #2: Type: text/html, Size: 14427 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 17:53                             ` Sebastian Moeller
  2013-08-25 17:55                               ` Dave Taht
@ 2013-08-25 18:30                               ` Fred Stratton
  2013-08-25 18:41                                 ` Dave Taht
  2013-08-25 21:50                                 ` Sebastian Moeller
  1 sibling, 2 replies; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 18:30 UTC (permalink / raw)
  To: cerowrt-devel


On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Fred,
> 
> 
> On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
> 
>> Thank you.
>> 
>> This is an initial response.
>> 
>> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull down menu of your interface, which is why I ask if both are active. 
> 
> 	I have seen your follow-up mail that you actually used 3.10.9-2. I think that has the first cut of the script modifications, which still allows selecting both. Since I have not tested it any other way, I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, in the best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages…

I have retained the unmodified script. I shall return to that.


> 
>> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>> Hi Fred,
>>> 
>>> 
>>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>>> 
>>>> 
>>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>>>> 
>>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
>>> 
>>> 	Okay, how flaky is your link? What rate of errors do you see while testing? I am especially interested in CRC errors and the ES, SES, and HEC counts, just to get an idea how flaky the line is...
>>> 
>>>>> 
>>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this. 
>>>>> 
>>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>>> 
>>> 	Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs, so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmission, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered. Well, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC).)
>> 
>> Uptime 100655
>> downstream 12162 kbits/s
>> CRC errors 10154
>> FEC Errors 464
>> HEC Errors 758
>> 
>> upstream 1122 kbits/s
>> no errors in period.
> 
> 	Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so syncing at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?

The correction period is probably circa 28 hours. I have moved to using the HG612, which uses the Broadcom 6368 SoC. Like most of the devices I use, it fell out of a BT van and onto eBay. It is the standard device used for connecting FTTC installations in the UK. With a simple modification, it will work stably with ADSL2+.

The sync rate has gone up considerably, not because I have changed the Target SNR from 12 decibel, but because I am now using a Broadcom chipset and software blob with a DSLAM which returns BDCM when interrogated.
> 
> 
>> 
>>> 	Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to ignore the link layer adjustments as the cause of your issues.
>> 
>> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
> 
> 	That sounds weird; if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit both the ~50% and the above-500 requirements.

I was basing the judgment on Netalyzr data. DT and you now say this is suspect. However, the netperf-wrapper traces are discontinuous. The actual real-time trace looks perfectly normal.

iPlayer is a Flash-based player which is web page embedded. The IPv4 user address is parsed to see if it is in the UK. It plays BBC TV programs. It is most likely badly designed and written. It is the way I watch TV. Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of about 200 euro. If I do not pay, I will be fined. You will be surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but if a download is commenced whilst it is running, so I can watch the propaganda put out as national news, the video will stall and then continue, but most commonly will stop.
> 
> [snipp]
> 
>>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>>> 
>>> 	For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that, the HTB linklayer adjustment did NOT work.
>> 
>> Using 3.10.9-2
> 
> 	Yeah, as stated above, I would recommend using either one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part to be the sustained average up- and download rates here.

How do you obtain an average, i.e. mean, rate from the RRUL graph?
> 
> [snipp]
> 
> Best Regards
> 	Sebastian


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 18:30                               ` Fred Stratton
@ 2013-08-25 18:41                                 ` Dave Taht
  2013-08-25 19:08                                   ` Fred Stratton
  2013-08-25 21:50                                 ` Sebastian Moeller
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Taht @ 2013-08-25 18:41 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 13779 bytes --]

So it sounds like you need a lower setting for the download than what you
are using? It's not the upload that is your problem.

Netalyzr sends one packet stream and thus measures one queue only.
fq_codel will happily give it one big queue for a while, while still
interleaving other flows' packets into the stream at every opportunity.

As for parsing rrul, I generally draw a line by hand and multiply by 4,
then fudge in the numbers for the reverse ACK and measurement streams.
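
A worked example, with made-up numbers: if each of the four TCP streams
plots at roughly 220 kbit/s,

PER_STREAM=220	# hypothetical eyeballed kbit/s per TCP stream
echo "aggregate ~ $(( PER_STREAM * 4 )) kbit/s, plus the ACK and measurement flows on top"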

As written it was targeted at 4Mbit and up, which is why the samples are
discontinuous in your much lower bandwidth situation.

I do agree that rrul could use a simpler implementation, perhaps one that
tested two download streams only, provided an estimate of the actual
bandwidth usage, and scaled below 4Mbit better.


On Sun, Aug 25, 2013 at 11:30 AM, Fred Stratton <fredstratton@imap.cc> wrote:

>
> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> > Hi Fred,
> >
> >
> > On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
> >
> >> Thank you.
> >>
> >> This is an initial response.
> >>
> >> Am using 3.10.2-1 currently, with the standard AQM interface. This does
> not have the pull down menu of your interface, which is why I ask if both
> are active.
> >
> >       I have seen your follow-up mail that you actually used 3.10.9-2. I
> think that has the first cut of the script modifications that still allow
> to select both. Since I have not tested it any other way I would recommend
> to enable just one of them at the same time. Since the implementation of
> both is somewhat orthogonal and htb_private actually works in 3.10.9, best
> case you might actually get the link layer adjustments (LLA) and the
> overhead applied twice, wasting bandwidth. So please either use the last
> set of modified files I send around or wait for Dave to include them in
> ceropackages…
>
> I have retained the unmodified script. I shall return to that.
>
>
> >
> >> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
> >>
> >>> Hi Fred,
> >>>
> >>>
> >>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc>
> wrote:
> >>>
> >>>>
> >>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
> >>>>
> >>>>> As the person with the most flaky ADSL link, I point out that None
> of these recent, welcome, changes, are having any effect here, with an
> uplink sped of circa 950 kbits/s.
> >>>
> >>>     Okay, how flaky is you link? What rate of Errors do you have while
> testing? I am especially interested in CRC errors and ES SES and HEC, just
> to get an idea how flaky the line is...
> >>>
> >>>>>
> >>>>> The reason I mention this is that it is still impossible to watch
> iPlayer Flash streaming video and download at the same time, The iPlayer
> stream fails. The point of the exercise was to achieve this.
> >>>>>
> >>>>> The uplink delay is consistently around 650ms, which appears to be
> too high for effective streaming. In addition, the uplink stream has
> multiple breaks, presumably outages, if the uplink rate is capped at, say,
> 700 kbits/s.
> >>>
> >>>     Well, watching video is going to stress your downlink so the
> uplink should not saturate by the ACKs and the concurrent downloads also do
> not stress your uplink except for the ACKs, so this points to downlink
> errors as far as I can tell from the data you have given. If the up link
> has repeated outages however, your problems might be unfixable because
> these, if long enough, will cause lost ACKs and will probably trigger
> retransmission, independent of whether the link layer adjustments work or
> not. (You could test this by shaping you up and downlink to <= 50% of the
> link rates and disable all link layer adjustments, 50% is larger than the
> ATM worst case so should have you covered. Well unless you del link has an
> excessive number of tones reserved for forward error correction (FEC)).
> >>
> >> Uptime 100655
> >> downstream 12162 kbits/s
> >> CRC errors 10154
> >> FEC Errors 464
> >> hEC Errors 758
> >>
> >> upstream 1122 kbits/s
> >> no errors in period.
> >
> >       Ah, I think you told me in the past that "Target snr upped to 12
> deciBel.  Line can sustain 10 megabits/s with repeated loss of sync.atlower snr. " so sync at 12162 might be too aggressive, no? But the point is
> that as I understand iPlayer works fine without competing download traffic?
> To my eye the error numbers look small enough to not be concerned about. Do
> you know how long the error correction period is?
>
> The correction period is probably circa 28 hours. Have moved to using the
> HG612. This is uses the Broadcom 6368 SoC. Like most of the devices I use,
> it fell out of a BT van and on to ebay. It is the standard device used for
> connecting FTTC installations in the UK. With a simple modification, it
> will work stably with ADSL2+.
>
> Ihe sync rate has gone up considerably, not because I have changed the
> Target SNR from 12 Decibel, but because I am now using a Broadcom chipset
> and software blob with a DSLAM which returns BDCM when interrogated.
> >
> >
> >>
> >>>     Could you perform the following test by any chance: state iPlayer
> and yor typical downloads and then have a look at http://gw.home.lan:81undthe following tab chain Status -> Realtime Graphs -> Traffic -> Realtime
> Traffic. If during your test the Outbound rate stays well below you shaped
> limit and you still encounter the stream failure I would say it is save to
> ignore the link layer adjustments as cause of your issues.
> >>
> >> Am happy reducing rate to fifty per cent, but the uplink appears to
> have difficulty operating below circa 500 kbits/s. This should not be so. I
> shall try a fourth time.
> >
> >       That sounds weird, if you shape to below 500 upload stops working
> or just gets choppier? Looking at your sync data 561 would fit the ~50% and
> above 500 requirements.
>
> I was basing the judgment on Netalyzr data. DT and you now say this is
> suspect. However, netsurf-wrapper traces are discontinuous. The actual real
> time trace looks perfectly normal.
>
> iPlayer is a Flash based player which is web page embedded.  The ipv4 user
> address is parsed to see if it is in the UK. It plays BBC TV programs. It
> most likely is badly designed and written. It is the way I watch TV. Like
> all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of
> about 200 euro. If I do not pay, I will be fined. You will be surprised
> that I am not a fan of the BBC. iPlayer starts and runs fine, but if a
> download is commenced whilst it is running, so I can watch the propaganda
> put out as national news, the video will stall and the continue, but most
> commonly will stop.
> >
> >
> >>>
> >>>
> >>>>>
> >>>>> YouTube has no problems.
> >>>>>
> >>>>> I remain unclear whether the use of tc-stab and htb are mutually
> exclusive options, using the present stock interface.
> >>>
> >>>     Well, depending on the version of the cerowrt you use, <3.10.9-1 I
> believe lacks a functional HTB link layer adjustment mechanism, so you
> should select tc_stab. My most recent modifications to Toke and Dave's AQM
> package does only allow you to select one or the other. In any case
> selecting BOTH is not a reasonable thing to do, because best case it will
> only apply overhead twice, worst case it would also do the (link layer
> adjustments) LLA twice
> >>
> >>
> >>> See initial comments.
> >>>
> >>>>>
> >>>>> The current ISP connection is IPoA LLC.
> >>>>
> >>>> Correction - Bridged LLC.
> >>>
> >>>     Well, I think you should try to figure out your overhead
> empirically and check the encapsulation. I would recommend you run the
> following script on you r link over night and send me the log file it
> produces:
> >>>
> >>> #! /bin/bash
> >>> # TODO use seq or bash to generate a list of the requested sizes (to
> alow for non-equdistantly spaced sizes)
> >>>
> >>> # Telekom Tuebingen Moltkestrasse 6
> >>> TECH=ADSL2
> >>> # finding a proper target IP is somewhat of an art, just traceroute a
> remote site
> >>> # and find the nearest host reliably responding to pings showing the
> smallet variation of pingtimes
> >>> TARGET=87.186.197.70                # T
> >>> DATESTR=`date +%Y%m%d_%H%M%S`       # to allow multiple sequential
> records
> >>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
> >>>
> >>>
> >>> # by default non-root ping will only end one packet per second, so
> work around that by calling ping independently for each package
> >>> # empirically figure out the shortest period still giving the standard
> ping time (to avoid being slow-pathed by our host)
> >>> PINGPERIOD=0.01             # in seconds
> >>> PINGSPERSIZE=10000
> >>>
> >>> # Start, needed to find the per packet overhead dependent on the ATM
> encapsulation
> >>> # to reliably show ATM quantization one would like to see at least two
> steps, so cover a range > 2 ATM cells (so > 96 bytes)
> >>> SWEEPMINSIZE=16             # 64bit systems seem to require 16 bytes
> of payload to include a timestamp...
> >>> SWEEPMAXSIZE=116
> >>>
> >>>
> >>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
> >>>
> >>>
> >>> i_sweep=0
> >>> i_size=0
> >>>
> >>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
> >>> do
> >>>   (( i_sweep++ ))
> >>>   echo "Current iteration: ${i_sweep}"
> >>>   # now loop from sweepmin to sweepmax
> >>>   i_size=${SWEEPMINSIZE}
> >>>   while [ ${i_size} -le ${SWEEPMAXSIZE} ]
> >>>   do
> >>>     echo "${i_sweep}. repetition of ping size ${i_size}"
> >>>     ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
> >>>     (( i_size++ ))
> >>>     # we need a sleep binary that allows non-integer times (GNU sleep
> is fine, as is the sleep of Mac OS X 10.8.4)
> >>>     sleep ${PINGPERIOD}
> >>>   done
> >>> done
> >>>
> >>> #tail -f ${LOG}
> >>>
> >>> echo "Done... ($0)"
> >>>
> >>>
> >>> Please set TARGET to the closest IP host on the ISP side of your link
> that gives reliable ping RTTs (using "ping -c 100 -s 16
> your.best.host.ip"). Also test whether the RTTs are in the same ballpark
> when you reduce the ping period to 0.01 (you might have to increase the
> period until the RTTs are close to the standard 1 ping per second case). I
> can then run this through my matlab code to detect the actual overhead. (I
> am happy to share the code as well, if you have matlab available; it might
> even run under octave but I have not tested that since the last major
> changes).
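The signature the script is fishing for: on an ATM link the minimum RTT,
plotted against ping packet size, forms a staircase with 48-byte steps, and
the offset of the steps reveals the per-packet overhead. A rough sketch of
the post-processing (assuming GNU ping's usual "24 bytes from ...:
icmp_seq=1 ttl=63 time=12.3 ms" reply format):

  # per-size minimum RTT from the sweep log; on an ATM link the minima
  # step up every 48 bytes of packet size
  awk '/bytes from/ {
         split($7, t, "=")                      # "time=12.3" -> 12.3
         if (!($1 in min) || t[2] < min[$1]) min[$1] = t[2]
       }
       END { for (s in min) print s, min[s] | "sort -n" }' ping_sweep_ADSL2_*.txt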
> >>
> >> To follow at some point.
> >
> >       Oh, I failed to mention that at the given parameters the script
> takes almost 3 hours, during which the link should be otherwise idle...
> >
> >>>
> >>>
> >>>>
> >>>>> Whatever byte value is used for tc-stab makes no change.
> >>>
> >>>     I assume you are talking about the overhead? A missing link layer
> adjustment will eat between 10% and 50% of your link bandwidth, while
> missing overhead values will be more benign. The only advice I can give is
> to pick the overhead that actually describes your link. I am willing to
> help you figure this out.
> >>
> >> The link is bridged LLC. Have been using 18 and 32 for test purposes. I
> shall move to PPPoA VC-MUX in 4 months.
> >
> >       I guess figuring out your exact overhead empirically is going to
> be fun.
> >
> >>>
> >>>>>
> >>>>> I have applied the ingress modification to simple.qos, keeping the
> original version, and tested both.
> >>>
> >>>     For which cerowrt version? It is only expected to do something for
> 3.10.9-1 and upwards; before that the HTB link layer adjustment did NOT
> work.
> >>
> >> Using 3.10.9-2
> >
> >       Yeah, as stated above, I would recommend using one or the other,
> not both. If you took RRUL data you might be able to compare the three
> conditions. I would estimate the most interesting part would be the
> sustained average up- and download rates here.
>
> How do you obtain an average i.e. mean rate from the RRUL graph?
> >
> >
> >>
> >>>
> >>>>>
> >>>>> I have changed the Powerline adaptors I use to ones with known
> smaller buffers, though this is unlikely to be a rate-limiting step.
> >>>>>
> >>>>> I have replaced the 2Wire gateway, known to be heavily buffered,
> with a bridged Huawei HG612, with a Broadcom 6368 SoC.
> >>>>>
> >>>>> This device has a permanently-on telnet interface, with a simple
> password, which cannot be changed other than by firmware recompilation…
> >>>>>
> >>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
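For reference, that reduction is typically a one-liner on the modem's
busybox shell; the interface name below is a guess, not necessarily what
the HG612 calls its DSL interface:

  ifconfig eth0 txqueuelen 0    # or: ip link set dev eth0 txqueuelen 0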
> >>>>>
> >>>>> None of these changes affect the problematic uplink delay.
> >>>
> >>>     So how did you measure the uplink delay? The RRUL plots you sent
> me show an increase in ping RTT from around 50ms to 80ms with tc_stab and
> fq_codel on simplest.qos; how does that reconcile with the 650ms uplink
> delay from Netalyzr?
> >>
> >> Max Planck and Netalyzr produce the same figure. I use both, but Max
> Planck gives you circa 3 tries per IP address per 24 hours.
> >
> >       Well, both use the same method, which is not too meaningful if you
> use fq_codel on a shaped link (unless you want to optimize your system for
> UDP floods :) )
> >
> > [snipp]
> >
> >
> > Best Regards
> >       Sebastian
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>



-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html

[-- Attachment #2: Type: text/html, Size: 16330 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 18:41                                 ` Dave Taht
@ 2013-08-25 19:08                                   ` Fred Stratton
  2013-08-25 19:31                                     ` Fred Stratton
  2013-08-25 20:28                                     ` Dave Taht
  0 siblings, 2 replies; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 19:08 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 14012 bytes --]

That is very helpful.

With a sync rate of about 12000 kbits/s and a download rate of about 10900 kbits/s, I have set the download rate to 5000 kbits/s. For upload similarly 1200/970/500, all kbits/s.

I can now mostly watch video in iPlayer and download at circa 300 - 400 kbits/s simultaneously, using htb, with tc-stab disabled.

QED


On 25 Aug 2013, at 19:41, Dave Taht <dave.taht@gmail.com> wrote:

> So it sounds like you need a lower setting for the download than what you are using? It's not the upload that is your problem. 
> 
> Netalyzr sends one packet stream and thus measures 1 queue only. fq_codel will happily give it one big queue for a while, while still interleaving other flows' packets into the stream at every opportunity. 
> 
> As for parsing rrul, I generally draw a line with my hand and multiply by 4, then fudge in the numbers for the reverse ACK and measurement streams. 

You are saying that you judge the result solely by eye, presumably.
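A scripted alternative to eyeballing, for what it is worth: netperf-wrapper
keeps the raw per-stream samples in its output data file, so once they are
exported to a simple "time,Mbit/s" CSV (the export step and the column
layout here are assumptions, not documented netperf-wrapper features), the
per-stream mean is one awk line away:

  awk -F, 'NR > 1 && $2 != "" { sum += $2; n++ }
           END { if (n) printf "mean: %.3f Mbit/s (%d samples)\n", sum/n, n }' stream.csv

Multiplying the per-stream mean by 4 then approximates the aggregate, much
as Dave describes doing by hand.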
> 
> As written it was targeted at 4Mbit and up, which is why the samples are discontinuous in your much lower bandwidth situation. 

Aha. Problem solved.
> 
> I do agree that rrul could use a simpler implementation, perhaps one that tested two download streams only, provided an estimate as to the actual bandwidth usage, and scaled below 4Mbit better.
> 
> 
> On Sun, Aug 25, 2013 at 11:30 AM, Fred Stratton <fredstratton@imap.cc> wrote:
> 
> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> > Hi Fred,
> >
> >
> > On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
> >
> >> Thank you.
> >>
> >> This is an initial response.
> >>
> >> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active.
> >
> >       I have seen your follow-up mail saying that you actually used 3.10.9-2. I think that has the first cut of the script modifications that still allows selecting both. Since I have not tested it any other way, I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages…
> 
> I have retained the unmodified script. I shall return to that.
> 
> 
> >
> >> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
> >>
> >>> Hi Fred,
> >>>
> >>>
> >>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
> >>>
> >>>>
> >>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
> >>>>
> >>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes is having any effect here, with an uplink speed of circa 950 kbits/s.
> >>>
> >>>     Okay, how flaky is your link? What rate of errors do you have while testing? I am especially interested in CRC errors and ES, SES and HEC counts, just to get an idea how flaky the line is...
> >>>
> >>>>>
> >>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
> >>>>>
> >>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
> >>>
> >>>     Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs; so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable because these, if long enough, will cause lost ACKs and will probably trigger retransmission, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered. Well, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC)).
> >>
> >> Uptime 100655
> >> downstream 12162 kbits/s
> >> CRC errors 10154
> >> FEC Errors 464
> >> HEC Errors 758
> >>
> >> upstream 1122 kbits/s
> >> no errors in period.
> >
> >       Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so sync at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?
> 
> The correction period is probably circa 28 hours. Have moved to using the HG612. This uses the Broadcom 6368 SoC. Like most of the devices I use, it fell out of a BT van and onto eBay. It is the standard device used for connecting FTTC installations in the UK. With a simple modification, it will work stably with ADSL2+.
> 
> The sync rate has gone up considerably, not because I have changed the Target SNR from 12 decibel, but because I am now using a Broadcom chipset and software blob with a DSLAM which returns BDCM when interrogated.
> >
> >
> >>
> >>>     Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure I would say it is safe to ignore the link layer adjustments as cause of your issues.
> >>
> >> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
> >
> >       That sounds weird: if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit the ~50% and above-500 requirements.
> 
> I was basing the judgment on Netalyzr data. DT and you now say this is suspect. However, netperf-wrapper traces are discontinuous. The actual real-time trace looks perfectly normal.
> 
> iPlayer is a Flash based player which is web page embedded.  The ipv4 user address is parsed to see if it is in the UK. It plays BBC TV programs. It most likely is badly designed and written. It is the way I watch TV. Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of about 200 euro. If I do not pay, I will be fined. You will be surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but if a download is commenced whilst it is running, so I can watch the propaganda put out as national news, the video will stall and then continue, but most commonly will stop.
> >
> >
> >>>
> >>>
> >>>>>
> >>>>> YouTube has no problems.
> >>>>>
> >>>>> I remain unclear whether the use of tc-stab and htb are mutually exclusive options, using the present stock interface.
> >>>
> >>>     Well, depending on the version of cerowrt you use, <3.10.9-1 I believe lacks a functional HTB link layer adjustment mechanism, so you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case selecting BOTH is not a reasonable thing to do, because best case it will only apply the overhead twice, and worst case it would also do the link layer adjustments (LLA) twice.
> >>
> >>
> >>> See initial comments.
> >>>
> >>>>>
> >>>>> The current ISP connection is IPoA LLC.
> >>>>
> >>>> Correction - Bridged LLC.
> >>>
> >>>     Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
> >>>
> >>> #! /bin/bash
> >>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
> >>>
> >>> # Telekom Tuebingen Moltkestrasse 6
> >>> TECH=ADSL2
> >>> # finding a proper target IP is somewhat of an art, just traceroute a remote site
> >>> # and find the nearest host reliably responding to pings showing the smallest variation of ping times
> >>> TARGET=87.186.197.70                # T
> >>> DATESTR=`date +%Y%m%d_%H%M%S`       # to allow multiple sequential records
> >>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
> >>>
> >>>
> >>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
> >>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
> >>> PINGPERIOD=0.01             # in seconds
> >>> PINGSPERSIZE=10000
> >>>
> >>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
> >>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
> >>> SWEEPMINSIZE=16             # 64bit systems seem to require 16 bytes of payload to include a timestamp...
> >>> SWEEPMAXSIZE=116
> >>>
> >>>
> >>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
> >>>
> >>>
> >>> i_sweep=0
> >>> i_size=0
> >>>
> >>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
> >>> do
> >>>   (( i_sweep++ ))
> >>>   echo "Current iteration: ${i_sweep}"
> >>>   # now loop from sweepmin to sweepmax
> >>>   i_size=${SWEEPMINSIZE}
> >>>   while [ ${i_size} -le ${SWEEPMAXSIZE} ]
> >>>   do
> >>>     echo "${i_sweep}. repetition of ping size ${i_size}"
> >>>     ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
> >>>     (( i_size++ ))
> >>>     # we need a sleep binary that allows non-integer times (GNU sleep is fine, as is the sleep of Mac OS X 10.8.4)
> >>>     sleep ${PINGPERIOD}
> >>>   done
> >>> done
> >>>
> >>> #tail -f ${LOG}
> >>>
> >>> echo "Done... ($0)"
> >>>
> >>>
> >>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
> >>
> >> To follow at some point.
> >
> >       Oh, I failed to mention that at the given parameters the script takes almost 3 hours, during which the link should be otherwise idle...
> >
> >>>
> >>>
> >>>>
> >>>>> Whatever byte value is used for tc-stab makes no change.
> >>>
> >>>     I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
> >>
> >> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
> >
> >       I guess figuring out your exact overhead empirically is going to be fun.
> >
> >>>
> >>>>>
> >>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
> >>>
> >>>     For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that the HTB link layer adjustment did NOT work.
> >>
> >> Using 3.10.9-2
> >
> >       Yeah, as stated above, I would recommend using one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part would be the sustained average up- and download rates here.
> 
> How do you obtain an average i.e. mean rate from the RRUL graph?
> >
> >
> >>
> >>>
> >>>>>
> >>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
> >>>>>
> >>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
> >>>>>
> >>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
> >>>>>
> >>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
> >>>>>
> >>>>> None of these changes affect the problematic uplink delay.
> >>>
> >>>     So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay from Netalyzr?
> >>
> >> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
> >
> >       Well, both use the same method, which is not too meaningful if you use fq_codel on a shaped link (unless you want to optimize your system for UDP floods :) )
> >
> > [snipp]
> >
> >
> > Best Regards
> >       Sebastian
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


[-- Attachment #2: Type: text/html, Size: 17967 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 19:08                                   ` Fred Stratton
@ 2013-08-25 19:31                                     ` Fred Stratton
  2013-08-25 21:54                                       ` Sebastian Moeller
  2013-08-25 20:28                                     ` Dave Taht
  1 sibling, 1 reply; 43+ messages in thread
From: Fred Stratton @ 2013-08-25 19:31 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 14593 bytes --]

Re-reading your comment, I have reset the upload rate higher to 900 kbits/s.


On 25 Aug 2013, at 20:08, Fred Stratton <fredstratton@imap.cc> wrote:

> That is very helpful.
> 
> With a sync rate of about 12000 kbits/s and a download rate of about 10900 kbits/s, I have set the download rate to 5000 kbits/s. For upload similarly 1200/970/500, all kbits/s.
> 
> I can now mostly watch video in iPlayer and download at circa 300 - 400 kbits/s simultaneously, using htb, with tc-stab disabled.
> 
> QED
> 
> 
> On 25 Aug 2013, at 19:41, Dave Taht <dave.taht@gmail.com> wrote:
> 
>> So it sounds like you need a lower setting for the download than what you are using? It's not the upload that is your problem. 
>> 
>> Netalyzr sends one packet stream and thus measures 1 queue only. fq_codel will happily give it one big queue for a while, while still interleaving other flows' packets into the stream at every opportunity. 
>> 
>> As for parsing rrul, I generally draw a line with my hand and multiply by 4, then fudge in the numbers for the reverse ACK and measurement streams. 
> 
> You are saying that you judge the result solely by eye, presumably.
>> 
>> As written it was targeted at 4Mbit and up, which is why the samples are discontinuous in your much lower bandwidth situation. 
> 
> Aha. Problem solved.
>> 
>> I do agree that rrul could use a simpler implementation, perhaps one that tested two download streams only, provided an estimate as to the actual bandwidth usage, and scaled below 4Mbit better.
>> 
>> 
>> On Sun, Aug 25, 2013 at 11:30 AM, Fred Stratton <fredstratton@imap.cc> wrote:
>> 
>> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>> > Hi Fred,
>> >
>> >
>> > On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
>> >
>> >> Thank you.
>> >>
>> >> This is an initial response.
>> >>
>> >> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active.
>> >
>> >       I have seen your follow-up mail saying that you actually used 3.10.9-2. I think that has the first cut of the script modifications that still allows selecting both. Since I have not tested it any other way, I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages…
>> 
>> I have retained the unmodified script. I shall return to that.
>> 
>> 
>> >
>> >> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>> >>
>> >>> Hi Fred,
>> >>>
>> >>>
>> >>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>> >>>
>> >>>>
>> >>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>> >>>>
>> >>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes is having any effect here, with an uplink speed of circa 950 kbits/s.
>> >>>
>> >>>     Okay, how flaky is your link? What rate of errors do you have while testing? I am especially interested in CRC errors and ES, SES and HEC counts, just to get an idea how flaky the line is...
>> >>>
>> >>>>>
>> >>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
>> >>>>>
>> >>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>> >>>
>> >>>     Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs; so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable because these, if long enough, will cause lost ACKs and will probably trigger retransmission, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered. Well, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC)).
>> >>
>> >> Uptime 100655
>> >> downstream 12162 kbits/s
>> >> CRC errors 10154
>> >> FEC Errors 464
>> >> HEC Errors 758
>> >>
>> >> upstream 1122 kbits/s
>> >> no errors in period.
>> >
>> >       Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so sync at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?
>> 
>> The correction period is probably circa 28 hours. Have moved to using the HG612. This uses the Broadcom 6368 SoC. Like most of the devices I use, it fell out of a BT van and onto eBay. It is the standard device used for connecting FTTC installations in the UK. With a simple modification, it will work stably with ADSL2+.
>> 
>> The sync rate has gone up considerably, not because I have changed the Target SNR from 12 decibel, but because I am now using a Broadcom chipset and software blob with a DSLAM which returns BDCM when interrogated.
>> >
>> >
>> >>
>> >>>     Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure I would say it is safe to ignore the link layer adjustments as cause of your issues.
>> >>
>> >> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
>> >
>> >       That sounds weird: if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit the ~50% and above-500 requirements.
>> 
>> I was basing the judgment on Netalyzr data. DT and you now say this is suspect. However, netperf-wrapper traces are discontinuous. The actual real-time trace looks perfectly normal.
>> 
>> iPlayer is a Flash based player which is web page embedded.  The ipv4 user address is parsed to see if it is in the UK. It plays BBC TV programs. It most likely is badly designed and written. It is the way I watch TV. Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of about 200 euro. If I do not pay, I will be fined. You will be surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but if a download is commenced whilst it is running, so I can watch the propaganda put out as national news, the video will stall and then continue, but most commonly will stop.
>> >
>> >
>> >>>
>> >>>
>> >>>>>
>> >>>>> YouTube has no problems.
>> >>>>>
>> >>>>> I remain unclear whether the use of tc-stab and htb are mutually exclusive options, using the present stock interface.
>> >>>
>> >>>     Well, depending on the version of cerowrt you use, <3.10.9-1 I believe lacks a functional HTB link layer adjustment mechanism, so you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case selecting BOTH is not a reasonable thing to do, because best case it will only apply the overhead twice, and worst case it would also do the link layer adjustments (LLA) twice.
>> >>
>> >>
>> >>> See initial comments.
>> >>>
>> >>>>>
>> >>>>> The current ISP connection is IPoA LLC.
>> >>>>
>> >>>> Correction - Bridged LLC.
>> >>>
>> >>>     Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
>> >>>
>> >>> #! /bin/bash
>> >>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
>> >>>
>> >>> # Telekom Tuebingen Moltkestrasse 6
>> >>> TECH=ADSL2
>> >>> # finding a proper target IP is somewhat of an art, just traceroute a remote site
>> >>> # and find the nearest host reliably responding to pings showing the smallest variation of ping times
>> >>> TARGET=87.186.197.70                # T
>> >>> DATESTR=`date +%Y%m%d_%H%M%S`       # to allow multiple sequential records
>> >>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>> >>>
>> >>>
>> >>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
>> >>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>> >>> PINGPERIOD=0.01             # in seconds
>> >>> PINGSPERSIZE=10000
>> >>>
>> >>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>> >>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>> >>> SWEEPMINSIZE=16             # 64bit systems seem to require 16 bytes of payload to include a timestamp...
>> >>> SWEEPMAXSIZE=116
>> >>>
>> >>>
>> >>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>> >>>
>> >>>
>> >>> i_sweep=0
>> >>> i_size=0
>> >>>
>> >>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>> >>> do
>> >>>   (( i_sweep++ ))
>> >>>   echo "Current iteration: ${i_sweep}"
>> >>>   # now loop from sweepmin to sweepmax
>> >>>   i_size=${SWEEPMINSIZE}
>> >>>   while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>> >>>   do
>> >>>     echo "${i_sweep}. repetition of ping size ${i_size}"
>> >>>     ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>> >>>     (( i_size++ ))
>> >>>     # we need a sleep binary that allows non-integer times (GNU sleep is fine, as is the sleep of Mac OS X 10.8.4)
>> >>>     sleep ${PINGPERIOD}
>> >>>   done
>> >>> done
>> >>>
>> >>> #tail -f ${LOG}
>> >>>
>> >>> echo "Done... ($0)"
>> >>>
>> >>>
>> >>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
>> >>
>> >> To follow at some point.
>> >
>> >       Oh, I failed to mention that at the given parameters the script takes almost 3 hours, during which the link should be otherwise idle...
>> >
>> >>>
>> >>>
>> >>>>
>> >>>>> Whatever byte value is used for tc-stab makes no change.
>> >>>
>> >>>     I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
>> >>
>> >> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
>> >
>> >       I guess figuring out your exact overhead empirically is going to be fun.
>> >
>> >>>
>> >>>>>
>> >>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>> >>>
>> >>>     For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that the HTB link layer adjustment did NOT work.
>> >>
>> >> Using 3.10.9-2
>> >
>> >       Yeah, as stated above, I would recommend using one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part would be the sustained average up- and download rates here.
>> 
>> How do you obtain an average i.e. mean rate from the RRUL graph?
>> >
>> >
>> >>
>> >>>
>> >>>>>
>> >>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>> >>>>>
>> >>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>> >>>>>
>> >>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>> >>>>>
>> >>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>> >>>>>
>> >>>>> None of these changes affect the problematic uplink delay.
>> >>>
>> >>>     So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay from Netalyzr?
>> >>
>> >> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
>> >
>> >       Well, both use the same method, which is not too meaningful if you use fq_codel on a shaped link (unless you want to optimize your system for UDP floods :) )
>> >
>> > [snipp]
>> >
>> >
>> > Best Regards
>> >       Sebastian
>> 
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>> 
>> 
>> 
>> -- 
>> Dave Täht
>> 
>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


[-- Attachment #2: Type: text/html, Size: 18683 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 19:08                                   ` Fred Stratton
  2013-08-25 19:31                                     ` Fred Stratton
@ 2013-08-25 20:28                                     ` Dave Taht
  2013-08-25 21:40                                       ` Sebastian Moeller
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Taht @ 2013-08-25 20:28 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 15516 bytes --]

The rule of thumb for fixing downloads is to start at 85% of your rated dl
and try to get to 95%. It is unfortunately very subject to the RTT of your
last hop, which on DSL is quite a lot, so I would be surprised if you could
crack 90%.  Cutting it by 50% is a bit much tho! (It would, as always, be
best if the provider used something fq_codel-like on their rate limiter,
not yours).
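Applied to the sync figures reported earlier in the thread, the rule of
thumb gives a starting point like this (just Fred's downstream sync rate
run through the percentages):

  echo $(( 12162 * 85 / 100 ))   # 10337 kbit/s: the starting shaper rate
  echo $(( 12162 * 95 / 100 ))   # 11553 kbit/s: creep toward this while latency stays flat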

but I'm glad to hear you are making progress!


On Sun, Aug 25, 2013 at 12:08 PM, Fred Stratton <fredstratton@imap.cc>wrote:

> That is very helpful.
>
> With a sync rate of about 12000 kbits/s and a download rate of about
> 10900 kbits/s, I have set the download rate to 5000 kbits/s. For upload
> similarly 1200/970/500, all kbits/s.
>
> I can now mostly watch video in iPlayer and download at circa 300 - 400
> kbits/s simultaneously, using htb, with tc-stab disabled.
>
> QED
>
>
> On 25 Aug 2013, at 19:41, Dave Taht <dave.taht@gmail.com> wrote:
>
> So it sounds like you need a lower setting for the download than what you
> are using? It's not the upload that is your problem.
>
> Netalyzr sends one packet stream and thus measures 1 queue only.
> fq_codel will happily give it one big queue for a while, while still
> interleaving other flows' packets into the stream at every opportunity.
>
> As for parsing rrul, I generally draw a line with my hand and multiply by
> 4, then fudge in the numbers for the reverse ACK and measurement streams.
>
>
> You are saying that you judge the result solely by eye, presumably.
>
>
> As written it was targeted at 4Mbit and up, which is why the samples are
> discontinuous in your much lower bandwidth situation.
>
>
> Aha. Problem solved.
>
>
> I do agree that rrul could use a simpler implementation, perhaps one that
> tested two download streams only, provided an estimate as to the actual
> bandwidth usage, and scaled below 4Mbit better.
>
>
> On Sun, Aug 25, 2013 at 11:30 AM, Fred Stratton <fredstratton@imap.cc>wrote:
>
>>
>> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
>>
>> > Hi Fred,
>> >
>> >
>> > On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
>> >
>> >> Thank you.
>> >>
>> >> This is an initial response.
>> >>
>> >> Am using 3.10.2-1 currently, with the standard AQM interface. This
>> does not have the pull-down menu of your interface, which is why I ask if
>> both are active.
>> >
>> >       I have seen your follow-up mail saying that you actually used
>> 3.10.9-2. I think that has the first cut of the script modifications that
>> still allows selecting both. Since I have not tested it any other way, I
>> would recommend enabling just one of them at a time. Since the
>> implementation of both is somewhat orthogonal and htb_private actually
>> works in 3.10.9, best case you might actually get the link layer
>> adjustments (LLA) and the overhead applied twice, wasting bandwidth. So
>> please either use the last set of modified files I sent around or wait for
>> Dave to include them in ceropackages…
>>
>> I have retained the unmodified script. I shall return to that.
>>
>>
>> >
>> >> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>> >>
>> >>> Hi Fred,
>> >>>
>> >>>
>> >>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc>
>> wrote:
>> >>>
>> >>>>
>> >>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc>
>> wrote:
>> >>>>
>> >>>>> As the person with the most flaky ADSL link, I point out that none
>> of these recent, welcome changes is having any effect here, with an
>> uplink speed of circa 950 kbits/s.
>> >>>
>> >>>     Okay, how flaky is your link? What rate of errors do you have
>> while testing? I am especially interested in CRC errors and ES, SES and
>> HEC counts, just to get an idea how flaky the line is...
>> >>>
>> >>>>>
>> >>>>> The reason I mention this is that it is still impossible to watch
>> iPlayer Flash streaming video and download at the same time; the iPlayer
>> stream fails. The point of the exercise was to achieve this.
>> >>>>>
>> >>>>> The uplink delay is consistently around 650ms, which appears to be
>> too high for effective streaming. In addition, the uplink stream has
>> multiple breaks, presumably outages, if the uplink rate is capped at, say,
>> 700 kbits/s.
>> >>>
>> >>>     Well, watching video is going to stress your downlink, so the
>> uplink should not be saturated by the ACKs, and the concurrent downloads
>> also do not stress your uplink except for the ACKs; so this points to
>> downlink errors as far as I can tell from the data you have given. If the
>> uplink has repeated outages, however, your problems might be unfixable
>> because these, if long enough, will cause lost ACKs and will probably
>> trigger retransmission, independent of whether the link layer adjustments
>> work or not. (You could test this by shaping your up- and downlink to <=
>> 50% of the link rates and disabling all link layer adjustments; 50% is
>> larger than the ATM worst case, so that should have you covered. Well,
>> unless your DSL link has an excessive number of tones reserved for forward
>> error correction (FEC)).
>> >>
>> >> Uptime 100655
>> >> downstream 12162 kbits/s
>> >> CRC errors 10154
>> >> FEC Errors 464
>> >> HEC Errors 758
>> >>
>> >> upstream 1122 kbits/s
>> >> no errors in period.
>> >
>> >       Ah, I think you told me in the past that "Target snr upped to 12
>> deciBel. Line can sustain 10 megabits/s with repeated loss of sync at
>> lower snr.", so sync at 12162 might be too aggressive, no? But the point
>> is that, as I understand it, iPlayer works fine without competing download
>> traffic? To my eye the error numbers look small enough not to be concerned
>> about. Do you know how long the error correction period is?
>>
>> The correction period is probably circa 28 hours. Have moved to using the
>> HG612. This uses the Broadcom 6368 SoC. Like most of the devices I use,
>> it fell out of a BT van and onto eBay. It is the standard device used for
>> connecting FTTC installations in the UK. With a simple modification, it
>> will work stably with ADSL2+.
>>
>> The sync rate has gone up considerably, not because I have changed the
>> Target SNR from 12 decibel, but because I am now using a Broadcom chipset
>> and software blob with a DSLAM which returns BDCM when interrogated.
>> >
>> >
>> >>
>> >>>     Could you perform the following test by any chance: start iPlayer
>> and your typical downloads and then have a look at
>> http://gw.home.lan:81 and the following tab chain Status -> Realtime
>> Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound
>> rate stays well below your shaped limit and you still encounter the stream
>> failure I would say it is safe to ignore the link layer adjustments as
>> cause of your issues.
>> >>
>> >> Am happy reducing rate to fifty per cent, but the uplink appears to
>> have difficulty operating below circa 500 kbits/s. This should not be so. I
>> shall try a fourth time.
>> >
>> >       That sounds weird: if you shape to below 500, does upload stop
>> working or just get choppier? Looking at your sync data, 561 would fit the
>> ~50% and above-500 requirements.
>>
>> I was basing the judgment on Netalyzr data. DT and you now say this is
>> suspect. However, netperf-wrapper traces are discontinuous. The actual
>> real-time trace looks perfectly normal.
>>
>> iPlayer is a Flash based player which is web page embedded.  The ipv4
>> user address is parsed to see if it is in the UK. It plays BBC TV programs.
>> It most likely is badly designed and written. It is the way I watch TV.
>> Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly
>> fee of about 200 euro. If I do not pay, I will be fined. You will be
>> surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but
>> if a download is commenced whilst it is running, so I can watch the
>> propaganda put out as national news, the video will stall and then
>> continue, but most commonly will stop.
>> >
>> >
>> >>>
>> >>>
>> >>>>>
>> >>>>> YouTube has no problems.
>> >>>>>
>> >>>>> I remain unclear whether the use of tc-stab and htb are mutually
>> exclusive options, using the present stock interface.
>> >>>
>> >>>     Well, depending on the version of cerowrt you use, <3.10.9-1
>> I believe lacks a functional HTB link layer adjustment mechanism, so you
>> should select tc_stab. My most recent modifications to Toke and Dave's AQM
>> package only allow you to select one or the other. In any case selecting
>> BOTH is not a reasonable thing to do, because best case it will only apply
>> the overhead twice, and worst case it would also do the link layer
>> adjustments (LLA) twice.
>> >>
>> >>
>> >>> See initial comments.
>> >>>
>> >>>>>
>> >>>>> The current ISP connection is IPoA LLC.
>> >>>>
>> >>>> Correction - Bridged LLC.
>> >>>
>> >>>     Well, I think you should try to figure out your overhead
>> empirically and check the encapsulation. I would recommend you run the
>> following script on your link overnight and send me the log file it
>> produces:
>> >>>
>> >>> #! /bin/bash
>> >>> # TODO use seq or bash to generate a list of the requested sizes (to
>> allow for non-equidistantly spaced sizes)
>> >>>
>> >>> # Telekom Tuebingen Moltkestrasse 6
>> >>> TECH=ADSL2
>> >>> # finding a proper target IP is somewhat of an art, just traceroute a
>> remote site
>> >>> # and find the nearest host reliably responding to pings showing the
>> smallest variation of ping times
>> >>> TARGET=87.186.197.70                # T
>> >>> DATESTR=`date +%Y%m%d_%H%M%S`       # to allow multiple sequential
>> records
>> >>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>> >>>
>> >>>
>> >>> # by default non-root ping will only send one packet per second, so
>> work around that by calling ping independently for each packet
>> >>> # empirically figure out the shortest period still giving the
>> standard ping time (to avoid being slow-pathed by our host)
>> >>> PINGPERIOD=0.01             # in seconds
>> >>> PINGSPERSIZE=10000
>> >>>
>> >>> # Start, needed to find the per packet overhead dependent on the ATM
>> encapsulation
>> >>> # to reliably show ATM quantization one would like to see at least
>> two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>> >>> SWEEPMINSIZE=16             # 64bit systems seem to require 16 bytes
>> of payload to include a timestamp...
>> >>> SWEEPMAXSIZE=116
>> >>>
>> >>>
>> >>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>> >>>
>> >>>
>> >>> i_sweep=0
>> >>> i_size=0
>> >>>
>> >>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>> >>> do
>> >>>   (( i_sweep++ ))
>> >>>   echo "Current iteration: ${i_sweep}"
>> >>>   # now loop from sweepmin to sweepmax
>> >>>   i_size=${SWEEPMINSIZE}
>> >>>   while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>> >>>   do
>> >>>     echo "${i_sweep}. repetition of ping size ${i_size}"
>> >>>     ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>> >>>     (( i_size++ ))
>> >>>     # we need a sleep binary that allows non-integer times (GNU sleep
>> is fine, as is the sleep of Mac OS X 10.8.4)
>> >>>     sleep ${PINGPERIOD}
>> >>>   done
>> >>> done
>> >>>
>> >>> #tail -f ${LOG}
>> >>>
>> >>> echo "Done... ($0)"
>> >>>
>> >>>
>> >>> Please set TARGET to the closest IP host on the ISP side of your link
>> that gives reliable ping RTTs (using "ping -c 100 -s 16
>> your.best.host.ip"). Also test whether the RTTs are in the same ballpark
>> when you reduce the ping period to 0.01 (you might have to increase the
>> period until the RTTs are close to the standard 1 ping per second case). I
>> can then run this through my matlab code to detect the actual overhead. (I
>> am happy to share the code as well, if you have matlab available; it might
>> even run under octave but I have not tested that since the last major
>> changes).
>> >>
>> >> To follow at some point.
>> >
>> >       Oh, I failed to mention that at the given parameters the script
>> takes almost 3 hours, during which the link should be otherwise idle...
>> >
>> >>>
>> >>>
>> >>>>
>> >>>>> Whatever byte value is used for tc-stab makes no change.
>> >>>
>> >>>     I assume you are talking about the overhead? A missing link layer
>> adjustment will eat between 10% and 50% of your link bandwidth, while
>> missing overhead values will be more benign. The only advice I can give is
>> to pick the overhead that actually describes your link. I am willing to
>> help you figure this out.
>> >>
>> >> The link is bridged LLC. Have been using 18 and 32 for test purposes.
>> I shall move to PPPoA VC-MUX in 4 months.
>> >
>> >       I guess figuring out your exact overhead empirically is going to
>> be fun.
>> >
>> >>>
>> >>>>>
>> >>>>> I have applied the ingress modification to simple.qos, keeping the
>> original version, and tested both.
>> >>>
>> >>>     For which cerowrt version? It is only expected to do something
>> for 3.10.9-1 and upwards; before that the HTB link layer adjustment did
>> NOT work.
>> >>
>> >> Using 3.10.9-2
>> >
>> >       Yeah, as stated above, I would recommend using one or the other,
>> not both. If you took RRUL data you might be able to compare the three
>> conditions. I would estimate the most interesting part would be the
>> sustained average up- and download rates here.
>>
>> How do you obtain an average i.e. mean rate from the RRUL graph?
>> >
>> >
>> >>
>> >>>
>> >>>>>
>> >>>>> I have changed the Powerline adaptors I use to ones with known
>> smaller buffers, though this is unlikely to be a rate-limiting step.
>> >>>>>
>> >>>>> I have replaced the 2Wire gateway, known to be heavily buffered,
>> with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>> >>>>>
>> >>>>> This device has a permanently-on telnet interface, with a simple
>> password, which cannot be changed other than by firmware recompilation…
>> >>>>>
>> >>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>> >>>>>
>> >>>>> None of these changes affect the problematic uplink delay.
>> >>>
>> >>>     So how did you measure the uplink delay? The RRUL plots you sent
>> me show an increase in ping RTT from around 50ms to 80ms with tc_stab and
>> fq_codel on simplest.qos; how does that reconcile with the 650ms uplink
>> delay from Netalyzr?
>> >>
>> >> Max Planck and Netalyzr produce the same figure. I use both, but Max
>> Planck gives you circa 3 tries per IP address per 24 hours.
>> >
>> >       Well, both use the same method, which is not too meaningful if
>> you use fq_codel on a shaped link (unless you want to optimize your system
>> for UDP floods :) )
>> >
>> > [snipp]
>> >
>> >
>> > Best Regards
>> >       Sebastian
>>
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>
>
>
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt:
> http://www.teklibre.com/cerowrt/subscribe.html
>
>
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html

[-- Attachment #2: Type: text/html, Size: 18949 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 20:28                                     ` Dave Taht
@ 2013-08-25 21:40                                       ` Sebastian Moeller
  0 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-25 21:40 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

Hi Dave hi list,


On Aug 25, 2013, at 22:28 , Dave Taht <dave.taht@gmail.com> wrote:

> The rule of thumb for fixing downloads is to start at 85% of your rated dl and try to get to 95%. It is unfortunately very subject to the RTT of your last hop, which on DSL is quite a lot, so I would be surprised if you could crack 90%.  Cutting it by 50% is a bit much tho!

	Well, 50% is close to the worst case for ATM, where you need two ATM cells instead of one, so starting at 50% rules out the ATM encapsulation as the culprit. Starting at 50% is not a bad idea; staying there, though, would be ;) . Start at 50% to get a glimpse of what the connection should be able to do and then bisect your way up again...
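To make the worst case concrete: the wire cost of a packet on ATM is always
a whole number of 53-byte cells carrying 48 payload bytes each, so one
extra byte can double the cost of a small packet. A tiny shell sketch (the
overhead argument is a placeholder):

  atm_wire_bytes() {    # args: packet size, per-packet encapsulation overhead
      local payload=$(( $1 + $2 ))
      local cells=$(( (payload + 47) / 48 ))    # round up to whole cells
      echo $(( cells * 53 ))
  }
  atm_wire_bytes 48 0    # -> 53:  exactly one cell
  atm_wire_bytes 49 0    # -> 106: one byte more, twice the wire bytes

That one-byte step is exactly what the ping sweep elsewhere in this thread
is designed to expose.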

> (It would, as always, be best if the provider used something fq_codel-like on their rate limiter, not yours). 

	That would be sweet. My hopes for Germany are quite low, but from what I have heard about the UK it might be on the cards (PPPoA or IPoA, and baby jumbo frames to allow an MTU of 1500 in spite of PPP overhead, are good signs in my book)

> 
> but I'm glad to hear you are making progress!

	Yepp, I hope that reduced downlink rates make Fred's connection usable again.

Best Regards
	Sebastian

> 
> 
> On Sun, Aug 25, 2013 at 12:08 PM, Fred Stratton <fredstratton@imap.cc> wrote:
> That is very helpful.
> 
> With a sync rate of about 12000 kbits/s and a download rate of about 10900 kbits/s, I have set the download rate to 5000 kbits/s. For upload similarly 1200/970/500, all kbits/s.
> 
> I can now mostly watch video in iPlayer and download at circa 300 - 400 kbits/s simultaneously, using htb, with tc-stab disabled.
> 
> QED
> 
> 
> On 25 Aug 2013, at 19:41, Dave Taht <dave.taht@gmail.com> wrote:
> 
>> So it sounds like you need a lower setting for the download than what you are using? It's not the upload that is your problem. 
>> 
>> Netalyzr sends one packet stream and thus measures 1 queue only. fq_codel will happily give it one big queue for a while, while still interleaving other flows' packets into the stream at every opportunity. 
>> 
>> As for parsing rrul, I generally draw a line with my hand and multiply by 4, then fudge in the numbers for the reverse ACK and measurement streams. 
> 
> You are saying that you judge the result solely by eye, presumably.
> 
>> 
>> As written it was targeted at 4Mbit and up, which is why the samples are discontinuous in your much lower bandwidth situation. 
> 
> Aha. Problem solved.
> 
>> 
>> I do agree that rrul could use a simpler implementation, perhaps one that tested two download streams only, provided an estimate as to the actual bandwidth usage, and scaled below 4Mbit better.
>> 
>> 
>> On Sun, Aug 25, 2013 at 11:30 AM, Fred Stratton <fredstratton@imap.cc> wrote:
>> 
>> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>> > Hi Fred,
>> >
>> >
>> > On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
>> >
>> >> Thank you.
>> >>
>> >> This is an initial response.
>> >>
>> >> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active.
>> >
>> >       I have seen your follow-up mail saying that you actually used 3.10.9-2. I think that has the first cut of the script modifications that still allows selecting both. Since I have not tested it any other way, I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages…
>> 
>> I have retained the unmodified script. I shall return to that.
>> 
>> 
>> >
>> >> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>> >>
>> >>> Hi Fred,
>> >>>
>> >>>
>> >>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>> >>>
>> >>>>
>> >>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>> >>>>
>> >>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes is having any effect here, with an uplink speed of circa 950 kbits/s.
>> >>>
>> >>>     Okay, how flaky is your link? What rate of errors do you have while testing? I am especially interested in CRC errors and ES, SES and HEC counts, just to get an idea how flaky the line is...
>> >>>
>> >>>>>
>> >>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
>> >>>>>
>> >>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>> >>>
>> >>>     Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs; so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable because these, if long enough, will cause lost ACKs and will probably trigger retransmission, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered. Well, unless your DSL link has an excessive number of tones reserved for forward error correction (FEC)).
>> >>
>> >> Uptime 100655
>> >> downstream 12162 kbits/s
>> >> CRC errors 10154
>> >> FEC Errors 464
>> >> HEC Errors 758
>> >>
>> >> upstream 1122 kbits/s
>> >> no errors in period.
>> >
>> >       Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so sync at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?
>> 
>> The correction period is probably circa 28 hours. Have moved to using the HG612. This uses the Broadcom 6368 SoC. Like most of the devices I use, it fell out of a BT van and onto ebay. It is the standard device used for connecting FTTC installations in the UK. With a simple modification, it will work stably with ADSL2+.
>> 
>> The sync rate has gone up considerably, not because I have changed the Target SNR from 12 Decibel, but because I am now using a Broadcom chipset and software blob with a DSLAM which returns BDCM when interrogated.
>> >
>> >
>> >>
>> >>>     Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to ignore the link layer adjustments as the cause of your issues.
>> >>
>> >> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
>> >
>> >       That sounds weird: if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit the ~50% and above-500 requirements.
>> 
>> I was basing the judgment on Netalyzr data. DT and you now say this is suspect. However, netperf-wrapper traces are discontinuous. The actual real time trace looks perfectly normal.
>> 
>> iPlayer is a Flash-based player which is embedded in a web page. The ipv4 user address is parsed to see if it is in the UK. It plays BBC TV programs. It most likely is badly designed and written. It is the way I watch TV. Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of about 200 euro. If I do not pay, I will be fined. You will be surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but if a download is commenced whilst it is running, so I can watch the propaganda put out as national news, the video will stall and then continue, but most commonly will stop.
>> >
>> >
>> >>>
>> >>>
>> >>>>>
>> >>>>> YouTube has no problems.
>> >>>>>
>> >>>>> I remain unclear whether tc-stab and htb are mutually exclusive options, using the present stock interface.
>> >>>
>> >>>     Well, it depends on the version of cerowrt you use: <3.10.9-1, I believe, lacks a functional HTB link layer adjustment mechanism, so there you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do: best case it will only apply the overhead twice, worst case it would also do the link layer adjustments (LLA) twice.
>> >>
>> >>
>> >>> See initial comments.
>> >>>
>> >>>>>
>> >>>>> The current ISP connection is IPoA LLC.
>> >>>>
>> >>>> Correction - Bridged LLC.
>> >>>
>> >>>     Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
>> >>>
>> >>> #! /bin/bash
>> >>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
>> >>>
>> >>> # Telekom Tuebingen Moltkestrasse 6
>> >>> TECH=ADSL2
>> >>> # finding a proper target IP is somewhat of an art, just traceroute a remote site
>> >>> # and find the nearest host reliably responding to pings showing the smallest variation of ping times
>> >>> TARGET=87.186.197.70                # T
>> >>> DATESTR=`date +%Y%m%d_%H%M%S`       # to allow multiple sequential records
>> >>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>> >>>
>> >>>
>> >>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
>> >>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>> >>> PINGPERIOD=0.01             # in seconds
>> >>> PINGSPERSIZE=10000
>> >>>
>> >>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>> >>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>> >>> SWEEPMINSIZE=16             # 64bit systems seem to require 16 bytes of payload to include a timestamp...
>> >>> SWEEPMAXSIZE=116
>> >>>
>> >>>
>> >>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>> >>>
>> >>>
>> >>> i_sweep=0
>> >>> i_size=0
>> >>>
>> >>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>> >>> do
>> >>>   (( i_sweep++ ))
>> >>>   echo "Current iteration: ${i_sweep}"
>> >>>   # now loop from sweepmin to sweepmax
>> >>>   i_size=${SWEEPMINSIZE}
>> >>>   while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>> >>>   do
>> >>>     echo "${i_sweep}. repetition of ping size ${i_size}"
>> >>>     ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>> >>>     (( i_size++ ))
>> >>>     # we need a sleep binary that allows non integer times (GNU sleep is fine as is sleep of macosx 10.8.4)
>> >>>     sleep ${PINGPERIOD}
>> >>>   done
>> >>> done
>> >>>
>> >>> #tail -f ${LOG}
>> >>>
>> >>> echo "Done... ($0)"
>> >>>
>> >>>
>> >>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
>> >>
>> >> To follow at some point.
>> >
>> >       Oh, I failed to mention that at the given parameters the script takes almost 3 hours, during which the link should be otherwise idle...
>> >
>> >>>
>> >>>
>> >>>>
>> >>>>> Whatever byte value is used for tc-stab makes no change.
>> >>>
>> >>>     I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
>> >>
>> >> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
>> >
>> >       I guess figuring out your exact overhead empirically is going to be fun.
>> >
>> >>>
>> >>>>>
>> >>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>> >>>
>> >>>     For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that the HTB linklayer adjustment did NOT work.
>> >>
>> >> Using 3.10.9-2
>> >
>> >       Yeah, as stated above, I would recommend using one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part would be the sustained average up- and download rates here.
>> 
>> How do you obtain an average, i.e. mean, rate from the RRUL graph?
>> >
>> >
>> >>
>> >>>
>> >>>>>
>> >>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>> >>>>>
>> >>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>> >>>>>
>> >>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>> >>>>>
>> >>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>> >>>>>
>> >>>>> None of these changes affect the problematic uplink delay.
>> >>>
>> >>>     So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay from netalyzr?
>> >>
>> >> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
>> >
>> >       Well, both use the same method, which is not too meaningful if you use fq_codel on a shaped link (unless you want to optimize your system for UDP floods :) )
>> >
>> > [snipp]
>> >
>> >
>> > Best Regards
>> >       Sebastian
>> 
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>> 
>> 
>> 
>> -- 
>> Dave Täht
>> 
>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> 
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 18:30                               ` Fred Stratton
  2013-08-25 18:41                                 ` Dave Taht
@ 2013-08-25 21:50                                 ` Sebastian Moeller
  1 sibling, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-25 21:50 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

Hi Fred,


On Aug 25, 2013, at 20:30 , Fred Stratton <fredstratton@imap.cc> wrote:

> 
> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> Hi Fred,
>> 
>> 
>> On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
>> 
>>> Thank you.
>>> 
>>> This is an initial response.
>>> 
>>> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active. 
>> 
>> 	I have seen your follow-up mail saying that you actually used 3.10.9-2. I think that has the first cut of the script modifications that still allows selecting both. Since I have not tested it any other way I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages…
> 
> I have retained the unmodified script. I shall return to that.

	Let me know how you fare (but expect no replies for a week due to holiday)

> 
> 
>> 
>>> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> 
>>>> Hi Fred,
>>>> 
>>>> 
>>>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>>>> 
>>>>> 
>>>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>>>>> 
>>>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
>>>> 
>>>> 	Okay, how flaky is your link? What rate of errors do you have while testing? I am especially interested in CRC errors and ES, SES and HEC counts, just to get an idea how flaky the line is...
>>>> 
>>>>>> 
>>>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this. 
>>>>>> 
>>>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>>>> 
>>>> 	Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs, so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmissions, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered, unless your dsl link has an excessive number of tones reserved for forward error correction (FEC).)
>>> 
>>> Uptime 100655
>>> downstream 12162 kbits/s
>>> CRC errors 10154
>>> FEC Errors 464
>>> HEC Errors 758
>>> 
>>> upstream 1122 kbits/s
>>> no errors in period.
>> 
>> 	Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so sync at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?
> 
> The correction period is probably circa 28 hours.

	Okay, if these errors are logged over 28 hours they are not the cause of your troubles...
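
	For a rough sense of scale (my back-of-the-envelope arithmetic, assuming the counters really span the whole 100655 s of uptime):

# rough CRC error rate from the numbers above
echo "scale=4; 10154 / 100655" | bc    # ~0.1 CRC errors per second
# at ~12 Mbit/s a stream of 1500 byte packets is ~1000 packets/s, so
# roughly 0.01% of packets get hit - far too few to stall a stream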

> Have moved to using the HG612. This uses the Broadcom 6368 SoC. Like most of the devices I use, it fell out of a BT van and onto ebay. It is the standard device used for connecting FTTC installations in the UK. With a simple modification, it will work stably with ADSL2+.
> 
> The sync rate has gone up considerably, not because I have changed the Target SNR from 12 Decibel, but because I am now using a Broadcom chipset and software blob with a DSLAM which returns BDCM when interrogated.

	Ah, good then...

>> 
>> 
>>> 
>>>> 	Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to ignore the link layer adjustments as the cause of your issues.
>>> 
>>> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
>> 
>> 	That sounds weird: if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit the ~50% and above-500 requirements.
> 
> I was basing the judgment on Netalyzr data. DT and you now say this is suspect. However, netperf-wrapper traces are discontinuous. The actual real time trace looks perfectly normal.
> 
> iPlayer is a Flash-based player which is embedded in a web page. The ipv4 user address is parsed to see if it is in the UK. It plays BBC TV programs. It most likely is badly designed and written. It is the way I watch TV. Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of about 200 euro. If I do not pay, I will be fined. You will be surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but if a download is commenced whilst it is running, so I can watch the propaganda put out as national news, the video will stall and then continue, but most commonly will stop.

	Not being in the UK, this is something I cannot really test, but if you have a single iPlayer instance running and watch something, how much traffic shows up in cerowrt's realtime traffic display? Or, asked differently, how much of your link is eaten up by iPlayer in default mode?


>> 
>> 
>>>> 
>>>> 
>>>>>> 
>>>>>> YouTube has no problems.
>>>>>> 
>>>>>> I remain unclear whether tc-stab and htb are mutually exclusive options, using the present stock interface.
>>>> 
>>>> 	Well, it depends on the version of cerowrt you use: <3.10.9-1, I believe, lacks a functional HTB link layer adjustment mechanism, so there you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do: best case it will only apply the overhead twice, worst case it would also do the link layer adjustments (LLA) twice.
>>> 
>>> 
>>>> See initial comments.
>>>> 
>>>>>> 
>>>>>> The current ISP connection is IPoA LLC.
>>>>> 
>>>>> Correction - Bridged LLC. 
>>>> 
>>>> 	Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
>>>> 
>>>> #! /bin/bash
>>>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
>>>> 
>>>> # Telekom Tuebingen Moltkestrasse 6
>>>> TECH=ADSL2
>>>> # finding a proper target IP is somewhat of an art, just traceroute a remote site 
>>>> # and find the nearest host reliably responding to pings showing the smallest variation of ping times
>>>> TARGET=87.186.197.70		# T
>>>> DATESTR=`date +%Y%m%d_%H%M%S`	# to allow multiple sequential records
>>>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>>>> 
>>>> 
>>>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
>>>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>>>> PINGPERIOD=0.01		# in seconds
>>>> PINGSPERSIZE=10000
>>>> 
>>>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>>>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>>>> SWEEPMINSIZE=16		# 64bit systems seem to require 16 bytes of payload to include a timestamp...
>>>> SWEEPMAXSIZE=116
>>>> 
>>>> 
>>>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>>>> 
>>>> 
>>>> i_sweep=0
>>>> i_size=0
>>>> 
>>>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>>>> do
>>>>  (( i_sweep++ ))
>>>>  echo "Current iteration: ${i_sweep}"
>>>>  # now loop from sweepmin to sweepmax
>>>>  i_size=${SWEEPMINSIZE}
>>>>  while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>>>>  do
>>>> 	echo "${i_sweep}. repetition of ping size ${i_size}"
>>>> 	ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>>>> 	(( i_size++ ))
>>>> 	# we need a sleep binary that allows non integer times (GNU sleep is fine as is sleep of macosx 10.8.4)
>>>> 	sleep ${PINGPERIOD}
>>>>  done
>>>> done
>>>> 
>>>> #tail -f ${LOG}
>>>> 
>>>> echo "Done... ($0)"
>>>> 
>>>> 
>>>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
>>> 
>>> To follow at some point.
>> 
>> 	Oh, I failed to mention that at the given parameters the script takes almost 3 hours, during which the link should be otherwise idle...
>> 
>>>> 
>>>> 
>>>>> 
>>>>>> Whatever byte value is used for tc-stab makes no change.
>>>> 
>>>> 	I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
>>> 
>>> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
>> 
>> 	I guess figuring out your exact overhead empirically is going to be fun.
>> 
>>>> 
>>>>>> 
>>>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>>>> 
>>>> 	For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that the HTB linklayer adjustment did NOT work.
>>> 
>>> Using 3.10.9-2
>> 
>> 	Yeah, as stated above, I would recommend using one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part would be the sustained average up- and download rates here.
> 
> How do you obtain an average, i.e. mean, rate from the RRUL graph?

	So far, I am eyeballing it similarly to Dave (except I do not bother to multiply by 4 most of the time). Experience has taught me that this often is good enough, especially as I can easily ignore some periodic events caused by macosx that should not be included in any statistics. But I am thinking about looking into netperf-wrapper to get numerical outputs and ideally less choppy upload graphs…
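
	In the meantime, a crude numerical stand-in (just a sketch; it assumes the per-sample throughput has been exported as two-column "timestamp rate" text, which is an assumption on my part, not netperf-wrapper's native output format):

# mean rate over an assumed steady-state window of 10 s to 50 s
awk '$1 >= 10 && $1 <= 50 { sum += $2; n++ }
     END { if (n > 0) printf "mean over %d samples: %.2f\n", n, sum / n }' samples.txt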


Best
	Sebastian



>> 
>> 
>>> 
>>>> 
>>>>>> 
>>>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>>>>>> 
>>>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>>>>>> 
>>>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>>>>>> 
>>>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>>>>>> 
>>>>>> None of these changes affect the problematic uplink delay.
>>>> 
>>>> 	So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay from netalyzr?
>>> 
>>> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
>> 
>> 	Well, both use the same method, which is not too meaningful if you use fq_codel on a shaped link (unless you want to optimize your system for UDP floods :) )
>> 
>> [snipp]
>> 
>> 
>> Best Regards
>> 	Sebastian
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-25 19:31                                     ` Fred Stratton
@ 2013-08-25 21:54                                       ` Sebastian Moeller
  0 siblings, 0 replies; 43+ messages in thread
From: Sebastian Moeller @ 2013-08-25 21:54 UTC (permalink / raw)
  To: Fred Stratton; +Cc: cerowrt-devel

Hi Fred,

Since you have a very good test with iPlayer, you can simply repeat your experiments to figure out where shaping becomes too unstable for iPlayer. It would be interesting to see whether RRUL (or other netperf-wrapper tests) show qualitative differences around the same numerical shaping values.



On Aug 25, 2013, at 21:31 , Fred Stratton <fredstratton@imap.cc> wrote:

> Re-reading your comment, I have reset the upload rate higher to 900 kbits/s.
> 
> 
> On 25 Aug 2013, at 20:08, Fred Stratton <fredstratton@imap.cc> wrote:
> 
>> That is very helpful.
>> 
>> With a sync rate of about 12000 kbits/s and a download rate of about 10900 kbits/s, I have set the download rate to 5000 kbits/s. For upload, similarly 1200/970/500, all kbits/s.
>> 
>> I can now mostly watch video in iPlayer and download at circa 300 - 400 kbits/s simultaneously, using htb, with tc-stab disabled.
>> 
>> QED

	So slowly increase both shaped rates until iPlayer becomes "unhappy" to better define the threshold?
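
	Something like the following loop is what I have in mind (only a sketch; "apply_shaping" is a hypothetical stand-in for however the rates actually get applied on your build, e.g. through the AQM GUI):

#! /bin/bash
# step the shaped download rate up until the iPlayer test starts failing
for DOWN in 5000 6000 7000 8000 9000 10000 ; do
    apply_shaping --download ${DOWN}kbit --upload 900kbit   # hypothetical helper
    echo "Download shaped to ${DOWN} kbit/s; start iPlayer plus a download now."
    read -p "Did the stream keep playing? [y/n] " OK
    if [ "${OK}" = "n" ] ; then
        echo "iPlayer becomes unhappy somewhere below ${DOWN} kbit/s"
        break
    fi
done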

Best
	Sebastian

>> 
>> 
>> On 25 Aug 2013, at 19:41, Dave Taht <dave.taht@gmail.com> wrote:
>> 
>>> So it sounds like you need a lower setting for the download than what you are using? It's not the upload that is your problem. 
>>> 
>>> Netalyzr sends one packet stream and thus measures one queue only. fq_codel will happily give it one big queue for a while, while still interleaving other flows' packets into the stream at every opportunity. 
>>> 
>>> As for parsing rrul, I generally draw a line with my hand and multiply by 4, then fudge in the numbers for the reverse ACK and measurement streams. 
>> 
>> You are saying that you judge the result solely by eye, presumably.
>>> 
>>> As written it was targeted at 4Mbit and up, which is why the samples are discontinuous in your much lower bandwidth situation. 
>> 
>> Aha. Problem solved.
>>> 
>>> I do agree that rrul could use a simpler implementation, perhaps one that tested two download streams only, provided an estimate as to the actual bandwidth usage, and scaled below 4Mbit better.
>>> 
>>> 
>>> On Sun, Aug 25, 2013 at 11:30 AM, Fred Stratton <fredstratton@imap.cc> wrote:
>>> 
>>> On 25 Aug 2013, at 18:53, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> 
>>> > Hi Fred,
>>> >
>>> >
>>> > On Aug 25, 2013, at 16:26 , Fred Stratton <fredstratton@imap.cc> wrote:
>>> >
>>> >> Thank you.
>>> >>
>>> >> This is an initial response.
>>> >>
>>> >> Am using 3.10.2-1 currently, with the standard AQM interface. This does not have the pull-down menu of your interface, which is why I ask if both are active.
>>> >
>>> >       I have seen your follow-up mail saying that you actually used 3.10.9-2. I think that has the first cut of the script modifications that still allows selecting both. Since I have not tested it any other way I would recommend enabling just one of them at a time. Since the implementation of both is somewhat orthogonal and htb_private actually works in 3.10.9, best case you might actually get the link layer adjustments (LLA) and the overhead applied twice, wasting bandwidth. So please either use the last set of modified files I sent around or wait for Dave to include them in ceropackages…
>>> 
>>> I have retained the unmodified script. I shall return to that.
>>> 
>>> 
>>> >
>>> >> On 25 Aug 2013, at 14:59, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> >>
>>> >>> Hi Fred,
>>> >>>
>>> >>>
>>> >>> On Aug 25, 2013, at 12:17 , Fred Stratton <fredstratton@imap.cc> wrote:
>>> >>>
>>> >>>>
>>> >>>> On 25 Aug 2013, at 10:21, Fred Stratton <fredstratton@imap.cc> wrote:
>>> >>>>
>>> >>>>> As the person with the most flaky ADSL link, I point out that none of these recent, welcome changes are having any effect here, with an uplink speed of circa 950 kbits/s.
>>> >>>
>>> >>>     Okay, how flaky is your link? What rate of errors do you have while testing? I am especially interested in CRC errors and ES, SES and HEC counts, just to get an idea how flaky the line is...
>>> >>>
>>> >>>>>
>>> >>>>> The reason I mention this is that it is still impossible to watch iPlayer Flash streaming video and download at the same time; the iPlayer stream fails. The point of the exercise was to achieve this.
>>> >>>>>
>>> >>>>> The uplink delay is consistently around 650ms, which appears to be too high for effective streaming. In addition, the uplink stream has multiple breaks, presumably outages, if the uplink rate is capped at, say, 700 kbits/s.
>>> >>>
>>> >>>     Well, watching video is going to stress your downlink, so the uplink should not be saturated by the ACKs, and the concurrent downloads also do not stress your uplink except for the ACKs, so this points to downlink errors as far as I can tell from the data you have given. If the uplink has repeated outages, however, your problems might be unfixable, because these, if long enough, will cause lost ACKs and will probably trigger retransmissions, independent of whether the link layer adjustments work or not. (You could test this by shaping your up- and downlink to <= 50% of the link rates and disabling all link layer adjustments; 50% is larger than the ATM worst case, so that should have you covered, unless your dsl link has an excessive number of tones reserved for forward error correction (FEC).)
>>> >>
>>> >> Uptime 100655
>>> >> downstream 12162 kbits/s
>>> >> CRC errors 10154
>>> >> FEC Errors 464
>>> >> HEC Errors 758
>>> >>
>>> >> upstream 1122 kbits/s
>>> >> no errors in period.
>>> >
>>> >       Ah, I think you told me in the past that "Target snr upped to 12 deciBel. Line can sustain 10 megabits/s with repeated loss of sync at lower snr.", so sync at 12162 might be too aggressive, no? But the point is that, as I understand it, iPlayer works fine without competing download traffic? To my eye the error numbers look small enough not to be concerned about. Do you know how long the error correction period is?
>>> 
>>> The correction period is probably circa 28 hours. Have moved to using the HG612. This uses the Broadcom 6368 SoC. Like most of the devices I use, it fell out of a BT van and onto ebay. It is the standard device used for connecting FTTC installations in the UK. With a simple modification, it will work stably with ADSL2+.
>>> 
>>> The sync rate has gone up considerably, not because I have changed the Target SNR from 12 Decibel, but because I am now using a Broadcom chipset and software blob with a DSLAM which returns BDCM when interrogated.
>>> >
>>> >
>>> >>
>>> >>>     Could you perform the following test by any chance: start iPlayer and your typical downloads and then have a look at http://gw.home.lan:81 and the following tab chain Status -> Realtime Graphs -> Traffic -> Realtime Traffic. If during your test the Outbound rate stays well below your shaped limit and you still encounter the stream failure, I would say it is safe to ignore the link layer adjustments as the cause of your issues.
>>> >>
>>> >> Am happy reducing rate to fifty per cent, but the uplink appears to have difficulty operating below circa 500 kbits/s. This should not be so. I shall try a fourth time.
>>> >
>>> >       That sounds weird: if you shape to below 500, does upload stop working or just get choppier? Looking at your sync data, 561 would fit the ~50% and above-500 requirements.
>>> 
>>> I was basing the judgment on Netalyzr data. DT and you now say this is suspect. However, netperf-wrapper traces are discontinuous. The actual real time trace looks perfectly normal.
>>> 
>>> iPlayer is a Flash-based player which is embedded in a web page. The ipv4 user address is parsed to see if it is in the UK. It plays BBC TV programs. It most likely is badly designed and written. It is the way I watch TV. Like all UK residents, I pay the bloated bureaucracy of the BBC a yearly fee of about 200 euro. If I do not pay, I will be fined. You will be surprised that I am not a fan of the BBC. iPlayer starts and runs fine, but if a download is commenced whilst it is running, so I can watch the propaganda put out as national news, the video will stall and then continue, but most commonly will stop.
>>> >
>>> >
>>> >>>
>>> >>>
>>> >>>>>
>>> >>>>> YouTube has no problems.
>>> >>>>>
>>> >>>>> I remain unclear whether tc-stab and htb are mutually exclusive options, using the present stock interface.
>>> >>>
>>> >>>     Well, it depends on the version of cerowrt you use: <3.10.9-1, I believe, lacks a functional HTB link layer adjustment mechanism, so there you should select tc_stab. My most recent modifications to Toke and Dave's AQM package only allow you to select one or the other. In any case, selecting BOTH is not a reasonable thing to do: best case it will only apply the overhead twice, worst case it would also do the link layer adjustments (LLA) twice.
>>> >>
>>> >>
>>> >>> See initial comments.
>>> >>>
>>> >>>>>
>>> >>>>> The current ISP connection is IPoA LLC.
>>> >>>>
>>> >>>> Correction - Bridged LLC.
>>> >>>
>>> >>>     Well, I think you should try to figure out your overhead empirically and check the encapsulation. I would recommend you run the following script on your link overnight and send me the log file it produces:
>>> >>>
>>> >>> #! /bin/bash
>>> >>> # TODO use seq or bash to generate a list of the requested sizes (to allow for non-equidistantly spaced sizes)
>>> >>>
>>> >>> # Telekom Tuebingen Moltkestrasse 6
>>> >>> TECH=ADSL2
>>> >>> # finding a proper target IP is somewhat of an art, just traceroute a remote site
>>> >>> # and find the nearest host reliably responding to pings showing the smallest variation of ping times
>>> >>> TARGET=87.186.197.70                # T
>>> >>> DATESTR=`date +%Y%m%d_%H%M%S`       # to allow multiple sequential records
>>> >>> LOG=ping_sweep_${TECH}_${DATESTR}.txt
>>> >>>
>>> >>>
>>> >>> # by default non-root ping will only send one packet per second, so work around that by calling ping independently for each packet
>>> >>> # empirically figure out the shortest period still giving the standard ping time (to avoid being slow-pathed by our host)
>>> >>> PINGPERIOD=0.01             # in seconds
>>> >>> PINGSPERSIZE=10000
>>> >>>
>>> >>> # Start, needed to find the per packet overhead dependent on the ATM encapsulation
>>> >>> # to reliably show ATM quantization one would like to see at least two steps, so cover a range > 2 ATM cells (so > 96 bytes)
>>> >>> SWEEPMINSIZE=16             # 64bit systems seem to require 16 bytes of payload to include a timestamp...
>>> >>> SWEEPMAXSIZE=116
>>> >>>
>>> >>>
>>> >>> n_SWEEPS=`expr ${SWEEPMAXSIZE} - ${SWEEPMINSIZE}`
>>> >>>
>>> >>>
>>> >>> i_sweep=0
>>> >>> i_size=0
>>> >>>
>>> >>> while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
>>> >>> do
>>> >>>   (( i_sweep++ ))
>>> >>>   echo "Current iteration: ${i_sweep}"
>>> >>>   # now loop from sweepmin to sweepmax
>>> >>>   i_size=${SWEEPMINSIZE}
>>> >>>   while [ ${i_size} -le ${SWEEPMAXSIZE} ]
>>> >>>   do
>>> >>>     echo "${i_sweep}. repetition of ping size ${i_size}"
>>> >>>     ping -c 1 -s ${i_size} ${TARGET} >> ${LOG} &
>>> >>>     (( i_size++ ))
>>> >>>     # we need a sleep binary that allows non integer times (GNU sleep is fine as is sleep of macosx 10.8.4)
>>> >>>     sleep ${PINGPERIOD}
>>> >>>   done
>>> >>> done
>>> >>>
>>> >>> #tail -f ${LOG}
>>> >>>
>>> >>> echo "Done... ($0)"
>>> >>>
>>> >>>
>>> >>> Please set TARGET to the closest IP host on the ISP side of your link that gives reliable ping RTTs (using "ping -c 100 -s 16 your.best.host.ip"). Also test whether the RTTs are in the same ballpark when you reduce the ping period to 0.01 (you might have to increase the period until the RTTs are close to the standard 1 ping per second case). I can then run this through my matlab code to detect the actual overhead. (I am happy to share the code as well, if you have matlab available; it might even run under octave but I have not tested that since the last major changes).
>>> >>
>>> >> To follow at some point.
>>> >
>>> >       Oh, I failed to mention that at the given parameters the script takes almost 3 hours, during which the link should be otherwise idle...
>>> >
>>> >>>
>>> >>>
>>> >>>>
>>> >>>>> Whatever byte value is used for tc-stab makes no change.
>>> >>>
>>> >>>     I assume you are talking about the overhead? A missing link layer adjustment will eat between 10% and 50% of your link bandwidth, while missing overhead values will be more benign. The only advice I can give is to pick the overhead that actually describes your link. I am willing to help you figure this out.
>>> >>
>>> >> The link is bridged LLC. Have been using 18 and 32 for test purposes. I shall move to PPPoA VC-MUX in 4 months.
>>> >
>>> >       I guess figuring out your exact overhead empirically is going to be fun.
>>> >
>>> >>>
>>> >>>>>
>>> >>>>> I have applied the ingress modification to simple.qos, keeping the original version, and tested both.
>>> >>>
>>> >>>     For which cerowrt version? It is only expected to do something for 3.10.9-1 and upwards; before that the HTB linklayer adjustment did NOT work.
>>> >>
>>> >> Using 3.10.9-2
>>> >
>>> >       Yeah, as stated above, I would recommend using one or the other, not both. If you took RRUL data you might be able to compare the three conditions. I would estimate the most interesting part would be the sustained average up- and download rates here.
>>> 
>>> How do you obtain an average, i.e. mean, rate from the RRUL graph?
>>> >
>>> >
>>> >>
>>> >>>
>>> >>>>>
>>> >>>>> I have changed the Powerline adaptors I use to ones with known smaller buffers, though this is unlikely to be a rate-limiting step.
>>> >>>>>
>>> >>>>> I have replaced the 2Wire gateway, known to be heavily buffered, with a bridged Huawei HG612, with a Broadcom 6368 SoC.
>>> >>>>>
>>> >>>>> This device has a permanently-on telnet interface, with a simple password, which cannot be changed other than by firmware recompilation…
>>> >>>>>
>>> >>>>> Telnet, however, allows txqueuelen to be reduced from 1000 to 0.
>>> >>>>>
>>> >>>>> None of these changes affect the problematic uplink delay.
>>> >>>
>>> >>>     So how did you measure the uplink delay? The RRUL plots you sent me show an increase in ping RTT from around 50ms to 80ms with tc_stab and fq_codel on simplest.qos; how does that reconcile with the 650ms uplink delay from netalyzr?
>>> >>
>>> >> Max Planck and Netalyzr produce the same figure. I use both, but Max Planck gives you circa 3 tries per IP address per 24 hours.
>>> >
>>> >       Well, both use the same method, which is not too meaningful if you use fq_codel on a shaped link (unless you want to optimize your system for UDP floods :) )
>>> >
>>> > [snipp]
>>> >
>>> >
>>> > Best Regards
>>> >       Sebastian
>>> 
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>> 
>>> 
>>> 
>>> -- 
>>> Dave Täht
>>> 
>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
>> 
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 19:38           ` Sebastian Moeller
  2013-08-23 19:47             ` Dave Taht
@ 2013-08-27 10:42             ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-27 10:42 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel

On Fri, 23 Aug 2013 21:38:10 +0200
Sebastian Moeller <moeller0@gmx.de> wrote:

> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht@gmail.com> wrote:
> 
> > On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> > Hi List, hi Jesper,
[...]
> > It's my hope that the atm code works but is misconfigured. You can output the tc commands by overriding the TC variable with TC="echo tc" and paste them here.
> 
> 	So I went for TC="logger tc" and used logread to harvest the commands, as I could not find the echo output, but I guess that should not matter. So here is the result (slightly edited to get rid of the log timestamps and log level):
> 
>   tc qdisc del dev ge00 root
>   tc qdisc add dev ge00 root handle 1: htb default 12
>   tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
[...]

Looks good.

>   tc qdisc del dev ifb0 root
>   tc qdisc add dev ifb0 root handle 1: htb default 12
>   tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit
>   tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate 15494kbit ceil 15494kbit prio 0
>   tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate 32kbit ceil 5164kbit prio 1
>   tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 2
>   tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate 2582kbit ceil 15430kbit prio 3
[...]

Looks like the "linklayer adsl" is missing here.
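
For illustration (my sketch, not taken from Sebastian's log): the ifb0
classes would presumably need the same trailing options that the ge00
classes above carry, e.g.:

  tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate 15494kbit ceil 15494kbit mpu 0 linklayer adsl overhead 40 mtu 2047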

> I notice it seems this only shows up for egress(), but looking at
> simple.qos, ingress() is not adding ${ADSLL} at all, so that is to be
> expected. There is nothing in dmesg at all.

You should also shape the downstream direction as an atm/adsl link, or
else you will not be the bottleneck in this direction and thus cannot
control the delay.  (And yes, I do know that we can be overloaded
downstream by UDP packets, and that we depend on TCP to make shaping in
this direction work, but this is the best we can do when not
controlling the DSLAM.)


> So I am off to add ADSLL to ingress() as well and then test RRUL again...
> 
> 
> Jesper, please let me know if this looks reasonable; at least to my eye it seems to fit with what "tc qdisc add htb help" tells me. I tried your:
> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
> but got no output even though debugfs was already mounted…

That sounds weird!  Perhaps the kernel was simply not compiled with
CONFIG_DYNAMIC_DEBUG=y, or else my detection code is flawed.
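
If the router exposes its build configuration (this assumes
CONFIG_IKCONFIG_PROC is enabled, which not every build does), that is
easy to check directly on the box:

 zcat /proc/config.gz | grep DYNAMIC_DEBUG
 # or, on the build host, against the build tree's config:
 grep DYNAMIC_DEBUG /path/to/buildroot/.config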

The expected output should have been something like:

 ./tc_htb_linklayer01.sh eth2 atm 2430 0; dmesg -c | grep "TC linklayer:"
BW limit on dev eth2 at rate:2430kbit/s and ceil:2430kbit/s
tc qdisc del dev eth2 root
tc qdisc add dev eth2 root handle 1: htb default 50
tc class add dev eth2 parent 1: classid 1:1 htb rate 2430kbit ceil 2430kbit burst 1600 cburst 1600 linklayer atm mpu 0
tc class add dev eth2 parent 1:3 classid 1:50 htb rate 2430kbit ceil 2430kbit prio 5 burst 1600 cburst 1600 linklayer atm mpu 0
tc class add dev eth2 parent 1:3 classid 1:60 htb rate 2430kbit ceil 2430kbit prio 5 burst 1600 cburst 1600 linklayer atm mpu 0
tc qdisc add dev eth2 parent 1:50 handle 4250: fq_codel
[109653.174322] TC linklayer: Detected ATM, low(0)=high(5)=2718
[109653.174327] TC linklayer: Detected ATM, low(0)=high(5)=2718
[109653.181403] TC linklayer: Detected ATM, low(0)=high(5)=2718
[109653.181408] TC linklayer: Detected ATM, low(0)=high(5)=2718
[109653.190105] TC linklayer: Detected ATM, low(0)=high(5)=2718
[109653.190108] TC linklayer: Detected ATM, low(0)=high(5)=2718



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 19:56               ` Sebastian Moeller
  2013-08-23 20:29                 ` Dave Taht
@ 2013-08-27 10:45                 ` Jesper Dangaard Brouer
  2013-08-30 15:46                 ` [Cerowrt-devel] some kernel updates + new userspace patch Jesper Dangaard Brouer
  2 siblings, 0 replies; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-27 10:45 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: cerowrt-devel

On Fri, 23 Aug 2013 21:56:02 +0200
Sebastian Moeller <moeller0@gmx.de> wrote:

> Hi Dave,
> 
> I guess I found the culprit:
> 
> once I added $ADSLL to the ingress() in simple.qos:

Yes, exactly!

> ingress() {
[...]
> 
> $TC qdisc del dev $DEV root  2> /dev/null
> $TC qdisc add dev $DEV root handle 1: ${STABSTRING} htb default 12
> $TC class add dev $DEV parent 1: classid 1:1 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:10 htb $LQ rate ${CEIL}kbit ceil ${CEIL}kbit prio 0 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:11 htb $LQ rate 32kbit ceil ${PRIO_RATE}kbit prio 1 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:12 htb $LQ rate ${BE_RATE}kbit ceil ${BE_CEIL}kbit prio 2 $ADSLL
> $TC class add dev $DEV parent 1:1 classid 1:13 htb $LQ rate ${BK_RATE}kbit ceil ${BE_CEIL}kbit prio 3 $ADSLL
[...]
> 
>
> I get basically the same RRUL ping RTTs for htb_private as for tc_stab.
> So Jesper was right: the patch seems to fix the issue. I guess I should
> send out my current version of your and Toke's AQM scripts soon.

Great news, so my patch did work as expected! :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates
  2013-08-23 20:29                 ` Dave Taht
  2013-08-24 20:51                   ` Sebastian Moeller
  2013-08-24 20:51                   ` Sebastian Moeller
@ 2013-08-27 11:10                   ` Jesper Dangaard Brouer
  2 siblings, 0 replies; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-27 11:10 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

On Fri, 23 Aug 2013 13:29:25 -0700
Dave Taht <dave.taht@gmail.com> wrote:

> Does anyone have a date/kernel version on when linklayer overhead
> compensation stopped working?

ADSL/ATM linklayer shaping was broken in kernel releases 3.8 to 3.10
by commit 56b765b79 ("htb: improved accuracy at high rates") in
v3.8-rc1~139^2~455.

> There was a bug even prior to 3.8 that looked bad.

Eric Dumazet also found a general "linklayer atm" regression (dating
way back), which could cause rate tables to get wrongly shared between
classes with the same rates, with and without linklayer atm settings.

Addressed/fixed in (3.10-rc6:):
 - commit 40edeff6e1c (net_sched: qdisc_get_rtab() must check data[] array)

When configuring two completely equal rates, the kernel detects that
these two equal rates can share the same rate table.  But the kernel
didn't check whether the rate table data had been modified, as is done
in the linklayer atm case.

In practice, this often isn't a problem, as the overhead parameter is
usually combined with the linklayer parameter.

See my description here:
 http://thread.gmane.org/gmane.linux.kernel.stable/62528
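
To illustrate the pitfall (a constructed example, not taken from the
original report): two classes created with identical rates, where only
one asks for the ATM linklayer, could end up sharing one rate table:

 tc qdisc add dev eth0 root handle 1: htb default 10
 tc class add dev eth0 parent 1: classid 1:10 htb rate 2430kbit
 tc class add dev eth0 parent 1: classid 1:11 htb rate 2430kbit linklayer atm overhead 40
 # pre-fix kernels could hand 1:11 the cached (unmodified) rate table
 # built for 1:10, silently dropping the ATM cell alignment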

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Cerowrt-devel] some kernel updates + new userspace patch
  2013-08-23 19:56               ` Sebastian Moeller
  2013-08-23 20:29                 ` Dave Taht
  2013-08-27 10:45                 ` Jesper Dangaard Brouer
@ 2013-08-30 15:46                 ` Jesper Dangaard Brouer
  2 siblings, 0 replies; 43+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-30 15:46 UTC (permalink / raw)
  To: Sebastian Moeller, Dave Taht; +Cc: cerowrt-devel


[long-thread-cut]

Sebastian noted at some point that it was difficult to determine
whether the tc configuration was using the linklayer option.

I have posted a patch against iproute2/tc that implements reading out
the linklayer setting (it requires my kernel patch discussed in this
thread).

See:
 http://thread.gmane.org/gmane.linux.network/281876

This patch also sends the linklayer option to the kernel directly,
which means that the kernel doesn't need to perform linklayer detection
based on the rate table (which was not accurate at rates above
100Mbit/s).
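
With the patched tc, the configured linklayer should become visible
when listing the classes, roughly like the following (the exact output
format is my guess here, not copied from a real run):

 tc class show dev ge00
 # class htb 1:1 root rate 2430Kbit ceil 2430Kbit linklayer atm burst 1599b ...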

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2013-08-30 15:46 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-21 18:42 [Cerowrt-devel] some kernel updates Dave Taht
     [not found] ` <56B261F1-2277-457C-9A38-FAB89818288F@gmx.de>
     [not found]   ` <CAA93jw6ku0OOXzNcAUtK4UL5uc7R2zVAOKo1+Fwzmr7gCH1pzA@mail.gmail.com>
     [not found]     ` <2148E2EF-A119-4499-BAC1-7E647C53F077@gmx.de>
2013-08-23  0:52       ` Sebastian Moeller
2013-08-23  5:13         ` Dave Taht
2013-08-23  7:27           ` Jesper Dangaard Brouer
2013-08-23 10:15             ` Sebastian Moeller
2013-08-23 11:16               ` Jesper Dangaard Brouer
2013-08-23 12:37                 ` Sebastian Moeller
2013-08-23 13:02                   ` Fred Stratton
2013-08-23 19:49                     ` Sebastian Moeller
2013-08-23 15:05                   ` Jesper Dangaard Brouer
2013-08-23 17:23                   ` Toke Høiland-Jørgensen
2013-08-23 20:09                     ` Sebastian Moeller
2013-08-23 20:46                       ` Toke Høiland-Jørgensen
2013-08-24 20:51                         ` Sebastian Moeller
2013-08-23 19:51             ` Sebastian Moeller
2013-08-23  9:16           ` Sebastian Moeller
2013-08-23 19:38           ` Sebastian Moeller
2013-08-23 19:47             ` Dave Taht
2013-08-23 19:56               ` Sebastian Moeller
2013-08-23 20:29                 ` Dave Taht
2013-08-24 20:51                   ` Sebastian Moeller
2013-08-24 20:51                   ` Sebastian Moeller
2013-08-25  9:21                     ` Fred Stratton
2013-08-25 10:17                       ` Fred Stratton
2013-08-25 13:59                         ` Sebastian Moeller
2013-08-25 14:26                           ` Fred Stratton
2013-08-25 14:31                             ` Fred Stratton
2013-08-25 17:53                             ` Sebastian Moeller
2013-08-25 17:55                               ` Dave Taht
2013-08-25 18:00                                 ` Fred Stratton
2013-08-25 18:30                               ` Fred Stratton
2013-08-25 18:41                                 ` Dave Taht
2013-08-25 19:08                                   ` Fred Stratton
2013-08-25 19:31                                     ` Fred Stratton
2013-08-25 21:54                                       ` Sebastian Moeller
2013-08-25 20:28                                     ` Dave Taht
2013-08-25 21:40                                       ` Sebastian Moeller
2013-08-25 21:50                                 ` Sebastian Moeller
2013-08-27 11:10                   ` Jesper Dangaard Brouer
2013-08-27 10:45                 ` Jesper Dangaard Brouer
2013-08-30 15:46                 ` [Cerowrt-devel] some kernel updates + new userspace patch Jesper Dangaard Brouer
2013-08-27 10:42             ` [Cerowrt-devel] some kernel updates Jesper Dangaard Brouer
2013-08-24 23:08           ` Sebastian Moeller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox