From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sebastian Moeller
To: Jim Gettys
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Fwd: 3.3.6-2
Date: Thu, 24 May 2012 17:04:01 -0700
In-Reply-To: <4FBE7AAB.5080307@freedesktop.org>
References: <00404BC8-3761-409D-A1C8-9213D7D9A3DF@gmx.de>
 <1E435715-5C95-49AF-99D0-E8AD6EAD5B44@gmx.de>
 <4FBE5767.6080704@gmail.com>
 <4D0F5C65-2401-470F-A6D8-BE18E8BA25C7@gmx.de>
 <4FBE6290.9000701@freedesktop.org>
 <0E4C11DB-2B8A-411B-A61F-34B2A6BF57B9@gmx.de>
 <4FBE7AAB.5080307@freedesktop.org>

Hi Jim,

On May 24, 2012, at 11:15 AM, Jim Gettys wrote:

> On 05/24/2012 02:12 PM, Sebastian Moeller wrote:
>> Hi Jim,
>>
>> good point, I will go and see whether that is the cause for my crashes... Will return to this post if/when I have new data in either direction...
>
> If you do, see if you can grab the babeld.conf file and add it to:
> https://www.bufferbloat.net/issues/392

Done, attached to your issue.
Turns out my babeld.log has grown to a similar size over 16:38 hours of uptime. But:

root@nacktmulle:~# df -h
Filesystem                Size      Used Available Use% Mounted on
rootfs                    5.8M    940.0K      4.9M  16% /
/dev/root                 8.8M      8.8M         0 100% /rom
tmpfs                    30.1M    688.0K     29.4M   2% /tmp
tmpfs                   512.0K         0    512.0K   0% /dev
/dev/mtdblock4            5.8M    940.0K      4.9M  16% /overlay
overlayfs:/overlay        5.8M    940.0K      4.9M  16% /
root@nacktmulle:~# free
             total         used         free       shared      buffers
Mem:         61676        59868         1808            0         6388
-/+ buffers:              53480         8196
Swap:            0            0            0

(No allocation failure logged yet)

Best
	Sebastian

>
>> best
>> Sebastian
>>
>> On May 24, 2012, at 9:32 AM, Jim Gettys wrote:
>>
>>> On 05/24/2012 12:18 PM, Sebastian Moeller wrote:
>>>> Hi Robert,
>>>>
>>>> On May 24, 2012, at 8:44 AM, Robert Bradley wrote:
>>>>
>>>>> On 24/05/12 04:48, Sebastian Moeller wrote:
>>>>>> A) under moderate wireless stress I get a lot of allocation failures from slub, like:
>>>>>> [ 1221.664062] ath: skbuff alloc of size 1926 failed
>>>>>> In the router's dmesg. And every now and then the router crashes and reboots (I have not yet found a way to make this happen reliably, it seems to require some uptime)
>>>>> This looks to me like a possible memory leak somewhere, but I'm no expert.
>>>> Not being an expert I concur.
>>> My router's /tmp/log/babeld.log had grown to almost 256k. (and my router
>>> had been flaky).
>>>
>>> So I suspect that's making grim trouble as /tmp is a tmpfs: e.g. coming
>>> out of ram.
>>> -rw-r--r-- 1 root root 247936 May 24 12:27 babeld.log
>>>
>>> Tail on the babeld file had:
>>>
>>> Couldn't determine channel of interface gw00: Invalid argument.
>>> Couldn't determine channel of interface gw10: Invalid argument.
>>> Couldn't determine channel of interface gw00: Invalid argument.
>>> Couldn't determine channel of interface gw10: Invalid argument.
>>> Couldn't determine channel of interface gw00: Invalid argument.
>>> Couldn't determine channel of interface gw10: Invalid argument.
>>> Couldn't determine channel of interface gw00: Invalid argument.
>>> Couldn't determine channel of interface gw10: Invalid argument.
>>> Couldn't determine channel of interface gw00: Invalid argument.
>>> Couldn't determine channel of interface gw10: Invalid argument.
>>>
>>> I should probably have grabbed a copy before nuking the file. /me bad....
>>>
>>> Will put into redmine...
>>>
>>> - Jim
>>>
>>>>> (Unless cerowrt is using tmpfs and filling up memory with logs, of course.)
>>>> I tried to check that, but since I cannot reproduce the crashes easily yet I have not been able to test that hypothesis (when I checked "df -h" on the router there always was some room left, but heck for all I know it might be the log entries for the allocation failures that quickly eat up all the remaining memory). I will try to test this hypothesis. Currently I tried to check dmesg and free in rapid succession during the test runs that are prone to cause the crash; free memory fluctuates some but I never saw it reach 0 just before crashing.
>>>>
>>>>> Is UDP from the wired side to the Internet also OK? I'm assuming it is, but it would be nice to prove that it is actually a leak in ath9k and/or the wireless stack first!
>>>> Actually I have not tested that yet (again, with the crash somewhat hard to reproduce, I will have to take the wireless out of use for 24 to 48 hours to be reasonably sure that the issue does not occur under wired connections). That said, I will go and work on that. So I have my testing work charted out and will post again once I have more data.
>>>>
>>>> Best
>>>> Sebastian
>>>>
>>>>> _______________________________________________
>>>>> Cerowrt-devel mailing list
>>>>> Cerowrt-devel@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
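
For anyone wanting to repeat the "check free and df in rapid succession" test from the quoted thread, here is a rough sketch of how the sampling might be scripted. It is only an illustration under assumptions: the helper names (mem_free_kb, tmpfs_use_pct, file_bytes, sample) are made up for this example and are not part of CeroWrt, and it assumes awk is available and that free and df print columns in the same layout as the output shown in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; none of these ship with CeroWrt.

# Read `free` output on stdin and print the "free" column (in KiB) of the
# "Mem:" row. Assumes the column layout shown in this thread.
mem_free_kb() {
    awk '$1 == "Mem:" { print $4 }'
}

# Read `df` output on stdin and print the Use% figure (without the % sign)
# for the filesystem whose mount point is given as $1.
tmpfs_use_pct() {
    awk -v mnt="$1" '$NF == mnt { sub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# Print the size of file $1 in bytes, or 0 if it does not exist.
file_bytes() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# Emit one timestamped sample line: free memory, /tmp usage, babeld.log size.
sample() {
    printf '%s mem_free_kb=%s tmp_use_pct=%s babeld_log_bytes=%s\n' \
        "$(date +%s)" \
        "$(free | mem_free_kb)" \
        "$(df /tmp | tmpfs_use_pct /tmp)" \
        "$(file_bytes /tmp/log/babeld.log)"
}
```

One possible use is to call sample in a loop (for example `while :; do sample; sleep 1; done`) from an ssh session and capture the output on another machine. Writing the samples to a file under /tmp would itself consume the RAM being watched, since /tmp is a tmpfs.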