Message-ID: <4FCBE421.5050400@gmail.com>
Date: Sun, 03 Jun 2012 23:24:33 +0100
From: Robert Bradley
To: Sebastian Moeller
In-Reply-To: <3E3324C9-CF06-4BB3-A7FB-8B2E47A44C0C@gmx.de>
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] 3.3.6-2
List-Id: Development issues regarding the cerowrt test router project

On 02/06/12 08:03, Sebastian Moeller wrote:
> From my totally unscientific testing I am quite convinced that even 16MB of /tmp used will make the router spiral into reboot if used over the 5GHz radio to the wan port.
> However, if I use one of the wired ports I get plenty of the following (not always hostapd):
>
> Jun 1 23:41:08 nacktmulle kern.warn kernel: [185428.417968] hostapd: page allocation failure: order:0, mode:0x4020
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] Call Trace:
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<802850a4>] dump_stack+0x8/0x34
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b4548>] warn_alloc_failed+0xe8/0x10c
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b684c>] __alloc_pages_nodemask+0x5a0/0x600
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800da070>] new_slab+0xa8/0x280
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<80286b18>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800dba48>] __kmalloc_track_caller+0x88/0x140
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0854>] __alloc_skb+0x80/0x140
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0930>] dev_alloc_skb+0x1c/0x48
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801d0c74>] ag71xx_poll+0x430/0x65c
> Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e8c10>] net_rx_action+0x88/0x1c8
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] hostapd: page allocation failure: order:0, mode:0x4020
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Call Trace:
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<802850a4>] dump_stack+0x8/0x34
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b4548>] warn_alloc_failed+0xe8/0x10c
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b684c>] __alloc_pages_nodemask+0x5a0/0x600
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800da070>] new_slab+0xa8/0x280
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<80286b18>] __slab_alloc.isra.60.constprop.63+0x25c/0x2fc
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800dba48>] __kmalloc_track_caller+0x88/0x140
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0854>] __alloc_skb+0x80/0x140
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0930>] dev_alloc_skb+0x1c/0x48
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801d0c74>] ag71xx_poll+0x430/0x65c
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Mem-Info:
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal per-cpu:
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] CPU 0: hi: 18, btch: 3 usd: 18
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_anon:3826 inactive_anon:63 isolated_anon:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_file:683 inactive_file:561 isolated_file:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] unevictable:0 dirty:0 writeback:0 unstable:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] free:96 slab_reclaimable:408 slab_unreclaimable:7706
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] mapped:501 shmem:109 pagetables:142 bounce:0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal free:384kB min:1016kB low:1268kB high:1524kB active_anon:15304kB inactive_anon:252kB active_file:2732kB inactive_file:2244kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:65024kB mlocked:0k
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] lowmem_reserve[]: 0 0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal: 42*4kB 15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 384kB
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1353 total pagecache pages
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 0 pages in swap cache
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Swap cache stats: add 0, delete 0, find 0/0
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Free swap = 0kB
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Total swap = 0kB
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 16384 pages RAM
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 965 pages reserved
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1399 pages shared
> Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 14306 pages non-shared
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 2, min order: 0
> Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] node 0: slabs: 0, objs: 0, free: 0
>
> But the box seems to survive this… Heck, this even survives my test case with 16000 KB of /tmp used. Under that amount of memory pressure named and ntpd get killed, but the router does not automatically reboot; it just stays up and running, albeit somewhat useless without named.
>

Yes, that stack trace appears because the ag71xx driver can't allocate memory for an skb. Unlike the wireless driver, though, ag71xx_poll simply returns immediately with ENOMEM. I had no real success tracing the equivalent path in ath9k. I did notice a possible issue in ath9k_rx_tasklet: if bf->bf_mpdu == NULL (bf being an Atheros-specific buffer type), you could potentially get an infinite loop, though I can't tell whether that can ever occur in practice. I *think* it uses a list of skb structures preallocated at init time for incoming frames, but I'm still trying to interpret that part of the code. (The exact behaviour is hardware-dependent.)

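As a rough cross-check of the quoted Mem-Info dump (not part of the original test run), the short Python sketch below sums the buddy-allocator line and compares the result with the zone's min watermark. All figures are copied from the log above; the helper name free_kb_from_buddy_line and the 4 kB page size are assumptions made for illustration.

```python
import re

# Buddy-allocator line copied from the quoted Mem-Info dump.
BUDDY_LINE = ("Normal: 42*4kB 15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB "
              "0*512kB 0*1024kB 0*2048kB 0*4096kB = 384kB")

PAGE_KB = 4  # assumption: 4 kB pages, matching the smallest block size listed


def free_kb_from_buddy_line(line):
    """Sum count*size over every 'N*SkB' term in a buddy-allocator line."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size) for count, size in terms)


free_kb = free_kb_from_buddy_line(BUDDY_LINE)
min_watermark_kb = 1016  # "min:1016kB" in the zone summary
free_pages = 96          # "free:96" in the page counters
total_pages = 16384      # "16384 pages RAM"

print(free_kb)                         # 384 (kB free, summed per block size)
print(free_pages * PAGE_KB)            # 384 (same figure from the page count)
print(total_pages * PAGE_KB // 1024)   # 64 (MB of RAM)
print(free_kb < min_watermark_kb)      # True (free memory is below "min")
```

With these figures the zone has 384 kB free against a 1016 kB min watermark, which fits the order:0 allocation failures in the trace.
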
> The way I interpret my latest test results is that the "assumed leak" should be restricted to the wireless driver, does that sound right to you? Also with cerowrt 3.3.6-2 even 16MB seems too much for /tmp. I will see what happens if I add some swap space to the router; I hope it will be quite happy with 31MB /tmp and actual usage of that space :). Since Dave only recommends full tftp reflashes, maybe the update scenario might not be such a big issue for cerowrt?
>

I'll leave that to Dave to say. I was assuming that the firmware would be stored in memory first and then flashed. (There's always tftp at boot time as an alternative flashing method.)

-- 
Robert Bradley