[Cerowrt-devel] some kernel updates

Dave Taht dave.taht at gmail.com
Fri Aug 23 15:47:35 EDT 2013


quick note: running this script requires that you

ifconfig ifb0 up

at some point.
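
A minimal sketch of how a wrapper could make sure ifb0 is up before the
script runs. The helper name ensure_ifb_up and the IP override variable
are illustrative assumptions, not part of the AQM scripts; "ip link set
dev ifb0 up" is the iproute2 counterpart of the ifconfig call above.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the cerowrt scripts.
# IP may be overridden (e.g. IP="echo ip") to print the command
# instead of executing it, which needs no root privileges.
IP="${IP:-ip}"

ensure_ifb_up() {
    dev="${1:-ifb0}"
    # iproute2 counterpart of "ifconfig ifb0 up"
    $IP link set dev "$dev" up
}

# Dry run: only prints the command that would be executed.
IP="echo ip"
ensure_ifb_up ifb0
```

With IP left at its default the function would run the real command, so
it would have to be called as root on the router.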


On Fri, Aug 23, 2013 at 12:38 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:

> Hi Dave,
>
> On Aug 23, 2013, at 07:13 , Dave Taht <dave.taht at gmail.com> wrote:
>
> >
> >
> >
> > On Thu, Aug 22, 2013 at 5:52 PM, Sebastian Moeller <moeller0 at gmx.de>
> wrote:
> > Hi List, hi Jesper,
> >
> > So I tested 3.10.9-1 to assess the status of the HTB atm link layer
> adjustments to see whether the recent changes resurrected this feature.
> >         Unfortunately the htb_private link layer adjustments are still
> broken (RRUL ping RTT against Toke's netperf host in Germany of ~80ms, same
> as without link layer adjustments). On the bright side the tc_stab method
> still works as well as before (ping RTT around 40ms).
> >         I would like to humbly propose to use the tc stab method in
> cerowrt to perform ATM link layer adjustments as default. To repeat myself,
> simply telling the kernel a lie about the packet size seems more robust
> than fudging HTB's rate tables. Especially since the kernel already fudges
> the packet size to account for the ethernet header and then some, so this
> path should receive more scrutiny by virtue of having more users?
> >
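For reference, the size both approaches are trying to account for can be
sketched with plain ATM cell arithmetic: each 53-byte cell carries 48
bytes of payload, and a packet always occupies a whole number of cells.
This is only an illustration of that arithmetic, not the kernel's or
tc's actual code; the function name atm_wire_size is made up, and the
overhead argument is assumed to already contain all per-packet
encapsulation bytes.

```shell
#!/bin/sh
# atm_wire_size LEN [OVERHEAD]
# Prints the number of bytes a packet of LEN bytes occupies on an ATM
# link, assuming OVERHEAD bytes of per-packet encapsulation.
atm_wire_size() {
    len=$1
    overhead=${2:-0}
    total=$((len + overhead))
    # Round up to whole cells: 48 payload bytes per cell.
    cells=$(( (total + 47) / 48 ))
    # Each cell is 53 bytes on the wire (5 header + 48 payload).
    echo $(( cells * 53 ))
}

atm_wire_size 1500 40   # prints 1749 (33 cells)
atm_wire_size 64 40     # prints 159  (3 cells)
```

The small-packet case shows why the adjustment matters: 104 bytes
including overhead end up costing 159 bytes of link capacity.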
> > It's my hope that the atm code works but is misconfigured. You can
> output the tc commands by overriding the TC variable with TC="echo tc" and
> paste here.
>
>         So I went for TC="logger tc" and used logread to harvest as I
> could not find the echo output, but I guess that should not matter. So here
> is the result (slightly edited to get rid of the log timestamps and log
> level):
>
>   tc qdisc del dev ge00 root
>   tc qdisc add dev ge00 root handle 1: htb default 12
>   tc class add dev ge00 parent 1: classid 1:1 htb quantum 1500 rate
> 2430kbit ceil 2430kbit mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:10 htb quantum 1500 rate
> 2430kbit ceil 2430kbit prio 0 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:11 htb quantum 1500 rate
> 128kbit ceil 810kbit prio 1 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:12 htb quantum 1500 rate
> 405kbit ceil 2366kbit prio 2 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc class add dev ge00 parent 1:1 classid 1:13 htb quantum 1500 rate
> 405kbit ceil 2366kbit prio 3 mpu 0 linklayer adsl overhead 40 mtu 2047
>   tc qdisc add dev ge00 parent 1:11 handle 110: fq_codel limit 600 noecn
> quantum 300
>   tc qdisc add dev ge00 parent 1:12 handle 120: fq_codel limit 600 noecn
> quantum 300
>   tc qdisc add dev ge00 parent 1:13 handle 130: fq_codel limit 600 noecn
> quantum 300
>   tc filter add dev ge00 parent 1:0 protocol all prio 999 u32 match ip
> protocol 0 0x00 flowid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ip prio 1 handle 1 fw classid
> 1:11
>   tc filter add dev ge00 parent 1:0 protocol ip prio 2 handle 2 fw classid
> 1:12
>   tc filter add dev ge00 parent 1:0 protocol ip prio 3 handle 3 fw classid
> 1:13
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 4 handle 1 fw
> classid 1:11
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 5 handle 2 fw
> classid 1:12
>   tc filter add dev ge00 parent 1:0 protocol ipv6 prio 6 handle 3 fw
> classid 1:13
>   tc filter add dev ge00 parent 1:0 protocol arp prio 7 handle 1 fw
> classid 1:11
>   tc qdisc del dev ge00 handle ffff: ingress
>   tc qdisc add dev ge00 handle ffff: ingress
>   tc qdisc del dev ifb0 root
>   tc qdisc add dev ifb0 root handle 1: htb default 12
>   tc class add dev ifb0 parent 1: classid 1:1 htb quantum 1500 rate
> 15494kbit ceil 15494kbit
>   tc class add dev ifb0 parent 1:1 classid 1:10 htb quantum 1500 rate
> 15494kbit ceil 15494kbit prio 0
>   tc class add dev ifb0 parent 1:1 classid 1:11 htb quantum 1500 rate
> 32kbit ceil 5164kbit prio 1
>   tc class add dev ifb0 parent 1:1 classid 1:12 htb quantum 1500 rate
> 2582kbit ceil 15430kbit prio 2
>   tc class add dev ifb0 parent 1:1 classid 1:13 htb quantum 1500 rate
> 2582kbit ceil 15430kbit prio 3
>   tc qdisc add dev ifb0 parent 1:11 handle 110: fq_codel limit 1000 ecn
> quantum 500
>   tc qdisc add dev ifb0 parent 1:12 handle 120: fq_codel limit 1000 ecn
> quantum 1500
>   tc qdisc add dev ifb0 parent 1:13 handle 130: fq_codel limit 1000 ecn
> quantum 1500
>   tc filter add dev ifb0 parent 1:0 protocol all prio 999 u32 match ip
> protocol 0 0x00 flowid 1:12
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 1 u32 match ip tos
> 0x00 0xfc classid 1:12
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 2 u32 match ip6
> priority 0x00 0xfc classid 1:12
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 3 u32 match ip tos
> 0x20 0xfc classid 1:13
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 4 u32 match ip6
> priority 0x20 0xfc classid 1:13
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 5 u32 match ip tos
> 0x10 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 6 u32 match ip6
> priority 0x10 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 7 u32 match ip tos
> 0xb8 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 8 u32 match ip6
> priority 0xb8 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 9 u32 match ip tos
> 0xc0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 10 u32 match ip6
> priority 0xc0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 11 u32 match ip tos
> 0xe0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 12 u32 match ip6
> priority 0xe0 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ip parent 1:0 prio 13 u32 match ip tos
> 0x90 0xfc classid 1:11
>   tc filter add dev ifb0 protocol ipv6 parent 1:0 prio 14 u32 match ip6
> priority 0x90 0xfc classid 1:11
>   tc filter add dev ifb0 parent 1:0 protocol arp prio 15 handle 1 fw
> classid 1:11
>   tc filter add dev ge00 parent ffff: protocol all prio 10 u32 match u32 0
> 0 flowid 1:1 action mirred egress redirect dev ifb0
>
> I notice that the link layer options only show up for egress(), but
> looking at simple.qos, ingress() does not use ${ADSLL} at all, so that is
> to be expected. There is nothing in dmesg at all.
>
> So I am off to add ADSLL to ingress() as well and then test RRUL again...
>
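The TC override used to capture the listing above can be sketched as
follows. This is an assumption about the general pattern only; the real
simple.qos may be structured differently, and the function name
setup_root is invented for the example. The single tc command is copied
from the listing.

```shell
#!/bin/sh
# Sketch of the override pattern: every tc call goes through $TC.
TC="${TC:-tc}"

setup_root() {
    iface="$1"
    $TC qdisc add dev "$iface" root handle 1: htb default 12
}

# TC="echo tc" prints each command to stdout instead of running it;
# TC="logger tc" would send the same text to syslog.
TC="echo tc"
setup_root ge00
```

Because $TC is expanded unquoted, "echo tc" splits into the echo command
plus a literal "tc" argument, so the output can be pasted back into a
shell as is.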
>
> Jesper please let me know if this looks reasonable, at least to my eye it
> seems to fit with what "tc qdisc add htb help" tells me. I tried your:
> echo "func __detect_linklayer +p" > /sys/kernel/debug/dynamic_debug/control
> but got no output even though debugfs was already mounted…
>
> Best
>         Sebastian
>
> >
> >         Now, I have been testing this using Dave's most recent cerowrt
> alpha version with a 3.10.9 kernel on mips hardware, I think this kernel
> should contain all htb fixes including commit 8a8e3d84b17 (net_sched:
> restore "linklayer atm" handling) but am not fully sure.
> >
> > It does.
> >
> > @Dave is there an easy way to find which patches you applied to the
> kernels of the cerowrt (testing-)releases?
> >
> > Normally I DO commit stuff that is in testing, but my big push this time
> around was to get everything important into mainline 3.10, as it will be
> the "stable" release for a good long time.
> >
> > So I am still mostly working the x86 side at the moment. I WAS kind of
> hoping that everything I just landed would make it up to 3.10. But for your
> perusal:
> >
> > http://snapon.lab.bufferbloat.net/~cero2/patches/3.10.9-1/ has most of
> the kernel patches I used in it. 3.10.9-2 has the ipv6subtrees patch ripped
> out due to another weird bug I'm looking at. (It also has support for ipv6
> nat thx to the ever prolific stephen walker heeding the call for
> patches...). 100% totally untested, I have this weird bug to figure out how
> to fix next:
> >
> >
> http://lists.alioth.debian.org/pipermail/babel-users/2013-August/001419.html
> >
> > I fear it's a comparison gone south, maybe in bradley's optimizations
> for not kernel trapping, don't know.
> >
> > 3.10.9-2 also disables dnsmasq's dhcpv6 in favor of 6relayd. I HATE
> losing the close naming integration, but, had to try this....
> >
> > If you guys want me to start committing and pushing patches again, I'll
> do it, but most of that stuff will end up in 3.10.10, I think, in a couple
> days. The rest might make 3.12. Pie has to survive scrutiny on the netdev
> list in particular.
> >
> > While I have your attention :) I also tested 3.10.9-1's pie and it is
> way better than 3.10.6-1's (RRUL ping RTTs around 110 ms instead of 3000ms)
> but still worse than fq_codel (ping RTTs around 40ms with proper atm link
> layer adjustments).
> >
> > This is with simple.qos I imagine? Simplest should do better than that
> with pie. Judging from how its estimator works I think it will do badly
> with multiple queues. But testing will tell...
> >
> > But, yea, this pie is actually usable, and the previous wasn't. Thank
> you for looking at it!
> >
> > It is different from cisco's last pie drop in that it can do ecn, does
> local congestion notification, has a better use of net_random, it's mostly
> KernelStyle, and I forget what else.
> >
> > There is still a major rounding error in the code, and I'd like cisco to
> fix the api so it uses identical syntax to codel. Right now you specify
> "target 8" to get "target 7", and the "ms" is implied. target 5 becomes
> target 3. The default target is a whopping 20 (rounded to 19), which is in
> part where your 70+ms of extra delay came from.
> >
> > Multiple parties have the delusion that 20ms is "good enough".
> >
> > Part of the remaining delay may also be rounding error. Cisco uses
> kernels with HZ=1000, cero uses HZ=250.....
> >
> > Anyway, to get more comparable tests... you can fiddle with the two
> $QDISC lines in simple*.qos to add a target 8 to get closer to a codel 5ms
> config, but that would break a codel config which treats target 8 as target
> 8us.
> >
> > I MIGHT, if I get energetic enough, fix the API, the time accounting,
> and a few other things in pie, the problem is, that ns2_codel seems still
> more effective on most workloads and *fq_codel smokes absolutely
> everything. There are a few places where pie is a win over straight codel,
> notably on packet floods. And it may well be easier to retrofit into
> existing hardware fast path designs.
> >
> > I worry about interactions between pie and other stuff. It seems
> inevitable at this point that some form of pie will be widely deployed, and
> I simply haven't tried enough traffic types and RTTs to draw a firm
> conclusion, period. Long RTTs are the last big place where codel and pie
> and fq_codel have to be seriously tested.
> >
> > ns2_codel is looking pretty good now, at the shorter RTTs I've tried. A
> big problem I have is getting decent long RTT emulation out of netem (some
> preliminary code is up at github)
> >
> > ... and getting cero stable enough for others to actually use - next up
> is fixing the userspace problems.
> >
> > ... and trying to make a small dent in the wifi problem along the way
> (couple commits coming up)
> >
> > ... and find funding to get through the winter.
> >
> > There's probably a few other things that are on that list but I forget.
> Oh, yea, since the aqm wg was voted on to be formed, I decided I could quit
> smoking.
> >
> > While I am not able to build kernels, it seems that I am able to quickly
> test whether link layer adjustments work or not. So I am happy to help where
> I can :)
> >
> > Give pie target 8 and target 5 a shot, please? ns2_codel target 3ms and
> target 7ms, too. fq_codel, same....
> >
> > tc -s qdisc show dev ge00
> > tc -s qdisc show dev ifb0
> >
> > would be useful info to have in general after each test.
> >
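One way to grab those statistics consistently after each run is a small
loop over both interfaces. This is a sketch only; the function name
collect_qdisc_stats and the TC override are assumptions for
illustration, and the "echo tc" setting merely prints the commands so
the example runs without root.

```shell
#!/bin/sh
# Hypothetical helper for collecting qdisc statistics after a test.
TC="${TC:-tc}"

collect_qdisc_stats() {
    for dev in "$@"; do
        echo "== $dev =="
        $TC -s qdisc show dev "$dev"
    done
}

# Dry run: prints the two tc commands instead of executing them.
TC="echo tc"
collect_qdisc_stats ge00 ifb0
```

On the router, leaving TC unset and redirecting the output to a file
named after the test would keep the numbers next to the RRUL results.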
> > TIA.
> >
> > There are also things like tcp_upload and tcp_download and
> tcp_bidirectional that are useful tests in the rrul suite.
> >
> > Thank you for your efforts on these early alpha releases. I hope things
> will stabilize more soon, and I'll fold your aqm stuff into my next attempt
> this weekend.
> >
> > This is some of the stuff I know that needs fixing in userspace:
> >
> > * TODO readlink not found
> > * TODO netdev user missing
> > * TODO Wed Dec  5 17:14:46 2012 authpriv.error dnsmasq: found already
> running DHCP-server on interface 'se00' refusing to start, use 'option
> force 1' to override
> > * TODO [   18.480468] Mirror/redirect action on
> > [   18.539062] Failed to load ipt action
> > * upload and download are reversed in aqm
> > * BCP38
> > * Squash CS values
> > * Replace ntp
> > * Make ahcp client mode
> > * Drop more privs for polipo
> > * upnp
> > * priv separation
> > * Review FW rules
> > * dhcpv6 support
> > * uci-defaults/make-cert.sh uses a bad path for px5g
> > * Doesn't configure the web browser either
> >
> >
> >
> >
> > Best
> >         Sebastian
> >
> >
> >
> >
> > --
> > Dave Täht
> >
> > Fixing bufferbloat with cerowrt:
> http://www.teklibre.com/cerowrt/subscribe.html
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html