[Cerowrt-devel] 3.3.6-2
Robert Bradley
robert.bradley1 at gmail.com
Sun Jun 3 18:24:33 EDT 2012
On 02/06/12 08:03, Sebastian Moeller wrote:
> From my totally unscientific testing I am quite convinced that even 16MB of /tmp used will make the router spiral into reboot if used over the 5GHz radio to the wan port. However, if I use one of the wired ports I get plenty of the following (not always hostapd):
>
>
> Jun 1 23:41:08 nacktmulle kern.warn kernel: [185428.417968] hostapd: page allocation failure: order:0, mode:0x4020
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] Call Trace:
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<802850a4>] dump_stack+0x8/0x34
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b4548>] warn_alloc_failed+0xe8/0x10c
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b684c>] __alloc_pages_nodemask+0x5a0/0x600
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800da070>] new_slab+0xa8/0x280
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<80286b18>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800dba48>] __kmalloc_track_caller+0x88/0x140
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0854>] __alloc_skb+0x80/0x140
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0930>] dev_alloc_skb+0x1c/0x48
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801d0c74>] ag71xx_poll+0x430/0x65c
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e8c10>] net_rx_action+0x88/0x1c8
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] hostapd: page allocation failure: order:0, mode:0x4020
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Call Trace:
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<802850a4>] dump_stack+0x8/0x34
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b4548>] warn_alloc_failed+0xe8/0x10c
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b684c>] __alloc_pages_nodemask+0x5a0/0x600
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800da070>] new_slab+0xa8/0x280
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<80286b18>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800dba48>] __kmalloc_track_caller+0x88/0x140
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0854>] __alloc_skb+0x80/0x140
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0930>] dev_alloc_skb+0x1c/0x48
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801d0c74>] ag71xx_poll+0x430/0x65c
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Mem-Info:
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal per-cpu:
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] CPU 0: hi: 18, btch: 3 usd: 18
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_anon:3826 inactive_anon:63 isolated_anon:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_file:683 inactive_file:561 isolated_file:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] unevictable:0 dirty:0 writeback:0 unstable:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] free:96 slab_reclaimable:408 slab_unreclaimable:7706
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] mapped:501 shmem:109 pagetables:142 bounce:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal free:384kB min:1016kB low:1268kB high:1524kB active_anon:15304kB inactive_anon:252kB active_file:2732kB inactive_file:2244kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:65024kB mlocked:0k
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] lowmem_reserve[]: 0 0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal: 42*4kB 15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 384kB
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1353 total pagecache pages
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 0 pages in swap cache
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Swap cache stats: add 0, delete 0, find 0/0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Free swap = 0kB
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Total swap = 0kB
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 16384 pages RAM
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 965 pages reserved
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1399 pages shared
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 14306 pages non-shared
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 2, min order: 0
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] node 0: slabs: 0, objs: 0, free: 0
>
> But the box seems to survive this… Heck, this even survives my test case with 16000 KB of /tmp used. Under that amount of memory pressure named and ntpd get killed, but the router does not reboot automatically; it just stays up and running, albeit somewhat useless without named.
>
Yes - that stack trace is because the ag71xx driver can't allocate the
memory for a skb structure. Unlike the wireless driver though, the
ag71xx_poll function simply returns immediately with ENOMEM. I had no
real success in tracing what the equivalent is in ath9k.
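As a sanity check on the quoted Mem-Info dump, here is a minimal Python sketch that redoes the arithmetic. The figures are copied from the log above; the 4 kB page size is my assumption (it is consistent with free:96 matching 384kB), and the helper name buddy_free_kb is made up for illustration.

```python
import re

# Values copied from the quoted Mem-Info dump.
BUDDY = ("Normal: 42*4kB 15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB "
         "0*512kB 0*1024kB 0*2048kB 0*4096kB = 384kB")
PAGE_KB = 4          # assumption: 4 kB pages
FREE_PAGES = 96      # "free:96"
MIN_KB = 1016        # "min:1016kB"
TOTAL_PAGES = 16384  # "16384 pages RAM"


def buddy_free_kb(line):
    """Sum count*size over every block size listed in a free-list line."""
    return sum(int(n) * int(size)
               for n, size in re.findall(r"(\d+)\*(\d+)kB", line))


free_kb = buddy_free_kb(BUDDY)
total_kb = TOTAL_PAGES * PAGE_KB

print("free from buddy list:", free_kb, "kB")
print("free from page count:", FREE_PAGES * PAGE_KB, "kB")
print("total RAM:", total_kb // 1024, "MB")
print("below min watermark:", free_kb < MIN_KB)
```

Both ways of counting agree on the free memory, and it sits well under the min watermark, which fits with even order:0 allocations failing in the trace.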
I noticed a possible issue in ath9k_rx_tasklet: if
bf->bf_mpdu == NULL (bf being an Atheros-specific buffer type) you could
potentially get an infinite loop. I can't see, though, whether that can
ever occur in practice. I *think* it uses a list of skb structures
preallocated at init-time for incoming frames, but I'm still trying to
interpret that part of the code. (The exact behaviour is
hardware-dependent.)
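To illustrate the kind of loop I mean, here is a toy Python model. It is not the ath9k code: drain, advance_on_null and max_iters are all hypothetical names, and None merely stands in for a buffer whose bf_mpdu is NULL. The point is only that a "continue" which does not take the empty buffer off the head of the queue never makes progress.

```python
from collections import deque


def drain(buffers, advance_on_null, max_iters=1000):
    """Toy receive loop; returns (frames_processed, iterations).

    Raises RuntimeError if the loop exceeds max_iters, which here
    stands in for "would spin forever".
    """
    queue = deque(buffers)
    processed = 0
    iters = 0
    while queue:
        iters += 1
        if iters > max_iters:
            raise RuntimeError("loop made no progress")
        buf = queue[0]           # peek at the head without removing it
        if buf is None:          # stands in for bf->bf_mpdu == NULL
            if advance_on_null:
                queue.popleft()  # drop the empty buffer so the loop advances
            continue             # without the popleft, same head every time
        queue.popleft()
        processed += 1
    return processed, iters


print(drain(["a", None, "b"], advance_on_null=True))
```

With advance_on_null=False the same input hits the iteration cap and raises instead of returning. Whether the real driver can ever get into that state is exactly the part I have not been able to confirm.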
> The way I interpret my latest test results is that the "assumed leak" should be restricted to the wireless driver; does that sound right to you? Also, with cerowrt 3.3.6-2 even 16MB seems too much for /tmp. I will see what happens if I add some swap space to the router; I hope it will be quite happy with 31MB /tmp and actual usage of that space :). Since Dave only recommends full tftp reflashes, the update scenario might not be such a big issue for cerowrt?
>
I'll leave that to Dave to say - I was assuming that the firmware would
be stored in memory first and then flashed. (There's always tftp at
boot time as an alternative flashing method.)
--
Robert Bradley