[Cerowrt-devel] cerowrt 3.3.8-17 is released
Sebastian Moeller
moeller0 at gmx.de
Thu Aug 16 02:09:47 EDT 2012
Hi Dave,
marvelous.
On Aug 15, 2012, at 9:58 PM, Dave Taht wrote:
> Firstly fq_codel will always stay very flat relative to your workload
> for sparse streams such as ping, VoIP, DNS, or gaming...
>
> It's good stuff.
>
> And, I think the source of your 2.8 second thing is fq_codel's current
> reaction time, the non-responsiveness of the UDP flooding netalyzr
> uses,
> and the huge default queue depth in openwrt's qos scripts.
>
> Try this:
>
> cero1 at snapon:~/src/Cerowrt-3.3.8/package/qos-scripts/files/usr/lib/qos$
> git diff tcrules.awk
> diff --git a/package/qos-scripts/files/usr/lib/qos/tcrules.awk
> b/package/qos-scripts/files/usr/lib/qos/tcrules
> index a19b651..f3e0d3f 100644
> --- a/package/qos-scripts/files/usr/lib/qos/tcrules.awk
> +++ b/package/qos-scripts/files/usr/lib/qos/tcrules.awk
> @@ -79,7 +79,7 @@ END {
> # leaf qdisc
> avpkt = 1200
> for (i = 1; i <= n; i++) {
> - print "tc qdisc add dev "device" parent 1:"class[i]"0 handle "class[i]"00: fq_codel"
> + print "tc qdisc add dev "device" parent 1:"class[i]"0 handle "class[i]"00: fq_codel limit 1200"
> }
>
> # filter rule
>
So OpenWrt's QoS is still at the 10k packet limit for fq_codel? That means a worst-case queue of about 14.3 MB (at 1500-byte packets) and a best case of about 0.61 MB (64-byte packets). The worst case would take around 3 seconds to drain, so maybe that is my issue. I will immediately try your patch.
Done: netalyzr now reports 1100 ms of buffering, down from 2800 ms (and no "ath: skbuff alloc of size 1926 failed" messages in dmesg, but these did not show up during netalyzr runs). The other UDP stress test also works much better now (reporting around 1200 ms uplink buffering) and produces no ath allocation failures. Switching to the higher downlink version of the test gave me:
[75755.714843] hostapd: page allocation failure: order:0, mode:0x4020
[75755.714843] Call Trace:
[75755.714843] [<80287200>] dump_stack+0x8/0x34
[75755.714843] [<800b4e28>] warn_alloc_failed+0xe8/0x10c
[75755.714843] [<800b712c>] __alloc_pages_nodemask+0x5a0/0x600
[75755.714843] [<800da950>] new_slab+0xa8/0x280
[75755.714843] [<80288c74>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
[75755.714843] [<800db4f8>] kmem_cache_alloc+0x38/0xe0
[75755.714843] [<801d1b68>] ag71xx_fill_rx_buf+0x34/0xd8
[75755.714843] [<801d2458>] ag71xx_poll+0x464/0x5f4
[75755.714843] [<801ea3d0>] net_rx_action+0x88/0x1c8
[75755.714843] [<80077458>] __do_softirq+0xa0/0x154
[75755.714843] [<80077668>] do_softirq+0x48/0x68
[75755.714843] [<8007789c>] irq_exit+0x4c/0xb4
[75755.714843] [<80062f8c>] ret_from_irq+0x0/0x4
[75755.714843] [<801757a8>] lzma_main+0x9ec/0xbec
[75755.714843] [<80175ef4>] xz_dec_lzma2_run+0x54c/0x824
[75755.714843] [<801744bc>] xz_dec_run+0x31c/0x8f4
[75755.714843] [<80132e74>] squashfs_xz_uncompress+0x164/0x274
[75755.714843] [<8012f368>] squashfs_read_data+0x4a8/0x660
[75755.714843] [<8012f6f4>] squashfs_cache_get+0x1d4/0x30c
[75755.714843] [<80130be8>] squashfs_readpage+0x56c/0x804
[75755.714843] [<800ba130>] __do_page_cache_readahead+0x1b0/0x22c
[75755.714843] [<800ba4b4>] ra_submit+0x28/0x34
[75755.714843] [<800b2e68>] filemap_fault+0x184/0x3cc
[75755.714843] [<800c7fd4>] __do_fault+0xcc/0x450
[75755.714843] [<800cad5c>] handle_pte_fault+0x330/0x6d4
[75755.714843] [<800cb1b4>] handle_mm_fault+0xb4/0xe0
[75755.714843] [<8006c210>] do_page_fault+0x110/0x350
[75755.714843] [<80062f80>] ret_from_exception+0x0/0xc
[75755.714843]
[75755.714843] Mem-Info:
[75755.714843] Normal per-cpu:
[75755.714843] CPU 0: hi: 18, btch: 3 usd: 5
[75755.714843] active_anon:1493 inactive_anon:2534 isolated_anon:0
[75755.714843] active_file:1623 inactive_file:1944 isolated_file:0
[75755.714843] unevictable:0 dirty:0 writeback:16 unstable:0
[75755.714843] free:95 slab_reclaimable:589 slab_unreclaimable:4876
[75755.714843] mapped:1030 shmem:25 pagetables:163 bounce:0
[75755.714843] Normal free:380kB min:1016kB low:1268kB high:1524kB active_anon:5972kB inactive_anon:10136kB active_file:6492kB inactive_file:7776kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:65024kB mlocked:0kB dirty:0kB writeback:64kB mapped:4120kB shmem:100kB slab_reclaimable:2356kB slab_unreclaimable:19504kB kernel_stack:552kB pagetables:652kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[75755.714843] lowmem_reserve[]: 0 0
[75755.714843] Normal: 57*4kB 19*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 380kB
[75755.714843] 4204 total pagecache pages
[75755.714843] 611 pages in swap cache
[75755.714843] Swap cache stats: add 1899, delete 1288, find 802/926
[75755.714843] Free swap = 973548kB
[75755.714843] Total swap = 976560kB
[75755.714843] 16384 pages RAM
[75755.714843] 973 pages reserved
[75755.714843] 4143 pages shared
[75755.714843] 13118 pages non-shared
[75755.714843] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[75755.714843] cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 2, min order: 0
[75755.714843] node 0: slabs: 0, objs: 0, free: 0
[75755.718750] ge00: out of memory
(I would have loved to try again, but that specific application restricts me to 2 or 3 invocations per 24-hour period, which I have already used up; I really need to find another stress tester one of these days.)
But bind survived intact. So thanks for the quick surgery on QoS, which surely improved things a lot. Shall I try to request this change in OpenWrt proper? I think that for most home routers, allowing >14 MB queues to build up in the device can surely wreak havoc on stability. (I shudder to think about routers with 32 or even 16 MB of RAM, and even these could/should profit from codel; so my take is that the limit needs to be scaled with available memory, with a potential ceiling at 10k. :) )
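To make the numbers above concrete, here is a rough sketch of the queue-size arithmetic and of what a memory-scaled limit could look like. Everything in it beyond the 10k packet limit and the 64/1500-byte packet sizes is an assumption of mine: the function names, the 5% share of RAM, the floor of 100 packets and the 40 Mbit/s drain rate (which is simply the rate at which 10k full-size packets take 3 seconds to drain). It also ignores per-packet kernel overhead, so real memory use will be higher.

```python
MIB = 1024 * 1024


def queue_bytes(limit_packets, packet_size):
    """Payload bytes held by a completely full queue."""
    return limit_packets * packet_size


def drain_seconds(backlog_bytes, link_rate_bps):
    """Time to empty a backlog at a given link rate in bits per second."""
    return backlog_bytes * 8 / link_rate_bps


def scaled_limit(ram_bytes, percent=5, packet_size=1500,
                 ceiling=10_000, floor=100):
    """Hypothetical policy: let the queue use at most `percent` of RAM,
    counted in full-size packets and clamped to [floor, ceiling]."""
    budget = ram_bytes * percent // 100
    return max(floor, min(ceiling, budget // packet_size))


if __name__ == "__main__":
    worst = queue_bytes(10_000, 1500)
    best = queue_bytes(10_000, 64)
    print("worst case: %.1f MB" % (worst / MIB))
    print("best case:  %.2f MB" % (best / MIB))
    # Assumed 40 Mbit/s bottleneck, not a measured value.
    print("drain time: %.1f s" % drain_seconds(worst, 40e6))
    for ram_mib in (16, 32, 64, 1024):
        print("%4d MB RAM -> limit %d" % (ram_mib, scaled_limit(ram_mib * MIB)))
```

With these made-up parameters a 64 MB router would end up with a limit of a bit over 2000 packets, and only machines with several hundred MB of RAM would reach the 10k ceiling; the right fraction would have to be found by testing.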
Thanks again & best regards
Sebastian
>
> --
> Dave Täht
> http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
> with fq_codel!"