[Cerowrt-devel] 3.3.6-2

Robert Bradley robert.bradley1 at gmail.com
Sun Jun 3 18:24:33 EDT 2012


On 02/06/12 08:03, Sebastian Moeller wrote:
> 	 From my totally unscientific testing I am quite convinced that even 16MB of /tmp used will make the router spiral into reboot if used over the 5GHz radio to the wan port. However, if I use one of the wired ports I get plenty of the following (not always hostapd):
>
>
> Jun  1 23:41:08 nacktmulle kern.warn kernel: [185428.417968] hostapd: page allocation failure: order:0, mode:0x4020
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] Call Trace:
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<802850a4>] dump_stack+0x8/0x34
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b4548>] warn_alloc_failed+0xe8/0x10c
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b684c>] __alloc_pages_nodemask+0x5a0/0x600
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800da070>] new_slab+0xa8/0x280
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<80286b18>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800dba48>] __kmalloc_track_caller+0x88/0x140
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0854>] __alloc_skb+0x80/0x140
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0930>] dev_alloc_skb+0x1c/0x48
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801d0c74>] ag71xx_poll+0x430/0x65c
> Jun  1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e8c10>] net_rx_action+0x88/0x1c8
> Jun  1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] hostapd: page allocation failure: order:0, mode:0x4020
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Call Trace:
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<802850a4>] dump_stack+0x8/0x34
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b4548>] warn_alloc_failed+0xe8/0x10c
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b684c>] __alloc_pages_nodemask+0x5a0/0x600
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800da070>] new_slab+0xa8/0x280
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<80286b18>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800dba48>] __kmalloc_track_caller+0x88/0x140
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0854>] __alloc_skb+0x80/0x140
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0930>] dev_alloc_skb+0x1c/0x48
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801d0c74>] ag71xx_poll+0x430/0x65c
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Mem-Info:
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal per-cpu:
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] CPU    0: hi:   18, btch:   3 usd:  18
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_anon:3826 inactive_anon:63 isolated_anon:0
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]  active_file:683 inactive_file:561 isolated_file:0
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]  unevictable:0 dirty:0 writeback:0 unstable:0
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]  free:96 slab_reclaimable:408 slab_unreclaimable:7706
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]  mapped:501 shmem:109 pagetables:142 bounce:0
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal free:384kB min:1016kB low:1268kB high:1524kB active_anon:15304kB inactive_anon:252kB active_file:2732kB inactive_file:2244kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:65024kB mlocked:0k
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] lowmem_reserve[]: 0 0
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal: 42*4kB 15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 384kB
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1353 total pagecache pages
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 0 pages in swap cache
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Swap cache stats: add 0, delete 0, find 0/0
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Free swap  = 0kB
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Total swap = 0kB
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 16384 pages RAM
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 965 pages reserved
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1399 pages shared
> Jun  1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 14306 pages non-shared
> Jun  1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> Jun  1 23:41:09 nacktmulle kern.warn kernel: [185429.484375]   cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 2, min order: 0
> Jun  1 23:41:09 nacktmulle kern.warn kernel: [185429.484375]   node 0: slabs: 0, objs: 0, free: 0
>
> But the box seems to survive this… Heck, this even survives my test case with 16000 KB of /tmp used. Under that amount of memory pressure named and ntpd get killed, but the router does not go into an automatic reboot; it just stays up and running, albeit somewhat useless without named.
>

Yes - that stack trace appears because the ag71xx driver can't allocate 
the memory for an skb structure.  Unlike the wireless driver though, the 
ag71xx_poll function simply returns immediately with ENOMEM.  I had no 
real success in tracing what the equivalent is in ath9k.
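
To put numbers on how little memory was left when that allocation failed, 
here is a rough Python sketch that re-does the arithmetic from two lines of 
the quoted log.  The two strings are copied from the Mem-Info dump above 
(timestamps stripped); the function names are my own and purely 
illustrative, not part of any kernel tool.

```python
import re

# Copied from the quoted kernel log, with the syslog prefix removed.
ZONE_LINE = "Normal free:384kB min:1016kB low:1268kB high:1524kB"
BUDDY_LINE = ("Normal: 42*4kB 15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB "
              "0*512kB 0*1024kB 0*2048kB 0*4096kB = 384kB")


def parse_watermarks(line):
    """Return the name:value pairs of a zone summary line, in kB."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)kB", line)}


def parse_buddy(line):
    """Return [(block_count, block_size_kB), ...] from a free-list line."""
    return [(int(n), int(kb)) for n, kb in re.findall(r"(\d+)\*(\d+)kB", line)]


marks = parse_watermarks(ZONE_LINE)
blocks = parse_buddy(BUDDY_LINE)

# Sum of count * size over all block sizes; should match "free" above.
total_free_kb = sum(n * kb for n, kb in blocks)
# Largest block size that still has at least one free block.
largest_free_kb = max((kb for n, kb in blocks if n > 0), default=0)

print("free per buddy list:", total_free_kb, "kB")
print("free below min watermark:", marks["free"] < marks["min"])
print("largest free block:", largest_free_kb, "kB")
```

The buddy list adds up to the same 384kB that the zone line reports as 
free, which is well under the 1016kB "min" watermark.  That would be 
consistent with even an order:0 (single page) request failing although a 
few 4kB pages are nominally still free.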

I noticed a possible issue in ath9k_rx_tasklet: if 
bf->bf_mpdu == NULL (bf being an Atheros-specific buffer type) you could 
potentially get an infinite loop.  I can't tell, though, whether that can 
ever occur in reality.  I *think* it uses a list of skb structures 
preallocated at init-time for incoming frames, but I'm still trying to 
interpret that part of the code.  (The exact behaviour is 
hardware-dependent.)
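
To make the worry concrete, here is a toy Python model of that kind of 
loop.  It is not the ath9k code, and Buf, drain and advance_on_null are 
hypothetical names invented for the illustration.  The assumption being 
modelled is that the loop peeks at the head of a buffer list and does a 
"continue" when the buffer has no frame attached, without removing that 
buffer first.

```python
from collections import deque


class Buf:
    """Hypothetical receive buffer; mpdu stands in for bf->bf_mpdu."""

    def __init__(self, mpdu):
        self.mpdu = mpdu


def drain(queue, advance_on_null, max_iters=1000):
    """Process buffers from the head of the queue.

    Returns (frames_processed, iterations, stuck).  stuck is True when the
    loop reached max_iters without emptying the queue; the cap exists only
    so that this demonstration terminates.
    """
    processed = 0
    iters = 0
    while queue:
        if iters >= max_iters:
            return processed, iters, True
        iters += 1
        buf = queue[0]               # peek at the head without removing it
        if buf.mpdu is None:
            if advance_on_null:
                queue.popleft()      # drop the empty buffer to make progress
            continue                 # otherwise the same head is seen again
        queue.popleft()
        processed += 1
    return processed, iters, False


def make_queue():
    return deque([Buf("frame-a"), Buf(None), Buf("frame-b")])


stuck_result = drain(make_queue(), advance_on_null=False, max_iters=50)
ok_result = drain(make_queue(), advance_on_null=True, max_iters=50)
print("without advancing:", stuck_result)
print("with advancing:   ", ok_result)
```

In the first call the loop handles one frame and then spins on the empty 
buffer until the iteration cap stops it; in the second it skips the empty 
buffer and finishes.  Whether the real driver can ever reach the first 
situation is exactly the open question.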

> 	The way I interpret my latest test results is that the "assumed leak" should be restricted to the wireless driver; does that sound right to you? Also, with cerowrt 3.3.6-2 even 16MB seems too much for /tmp. I will see what happens if I add some swap space to the router; I hope it will be quite happy with 31MB /tmp and actual usage of that space :). Since Dave only recommends full tftp reflashes, maybe the update scenario is not such a big issue for cerowrt?
>

I'll leave that to Dave to say - I was assuming that the firmware would 
be stored in memory first and then flashed.  (There's always tftp at 
boot time as an alternative flashing method.)
-- 
Robert Bradley


