From: Sebastian Moeller
Date: Fri, 25 May 2012 11:25:34 -0700
To: Robert Bradley
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] 3.3.6-2
Message-Id: <844EF766-4E37-4B31-AA5D-B51FB22A05A8@gmx.de>
List-Id: Development issues regarding the cerowrt test router project

Hi Robert,

On May 25, 2012, at 4:11 AM, Robert Bradley wrote:

> On 25 May 2012 07:41, Sebastian Moeller wrote:
>>
>> Hi Robert,
>>
>> since I see the same log file on my router as Jim, I just want to report
>> my observations below.
>>
>> On May 24, 2012, at 11:58 AM, Robert Bradley wrote:
>
> (re.
guest interfaces on wireless)
>>> Are these disabled on your routers at the moment? I suppose in the
>>> worst case you could try setting an explicit channel for both of the
>>> non-mesh guest interfaces and see if the logs clear up (or somehow pass
>>> "-L /dev/null" to babeld).
>>
>> After setting the 2.4GHz channel to 1 instead of auto,
>> /tmp/babeld.log still grows with the same entries. And on a WNDR3700v2
>> there are 30840 KB of tmpfs on /tmp, so the babeld.log size of 256KB
>> should not by itself cause the router to crash. That said, while testing
>> this hypothesis by filling most of /tmp (dd if=/dev/zero
>> of=/tmp/delete_me bs=1024 count=30000, so that around 340KB stayed free)
>> the router reliably went first into OOM and then rebooted itself. Might
>> it be that the size of the /tmp filesystem is too large if actually
>> used? If I naively add the VSZs of most processes I end up at around 90%
>> of available memory, so worst case there actually only seems to be room
>> for a much smaller /tmp than 30MB. Maybe restricting /tmp to 6000 KB
>> might make this problem go away (or hooking up a swap device). Does this
>> reasoning sound sane? Once I figure out how to reduce the size of /tmp I
>> will test this.
>>
>
> Using "mount -o remount -o size=6000k /tmp" should apparently work for
> that. The reasoning sounds good to me, too.

I will go and test that.

> That said, unless we can
> find an obvious reason for /tmp overfilling, I'm not sure we should do
> that, since it will cause problems upgrading.

But if I create a file of 30000 1KB blocks in /tmp (so that around 400 KB
stay available), the router goes into OOM, so I do not think that
upgrading would work well if it really needs so much memory.
I have a hunch that the OpenWrt base under cerowrt does not expect
something as big and demanding as the 11MB bind9 named process running :)

> There's also the issue
> that in bug #379, only wireless traffic caused problems. I think that
> even if excessive logs are the problem, the real issue must be
> somewhere within the wireless driver, but I could well be wrong…

Oh, I agree the /tmp issue is a tangent, but it does not seem healthy
that the router spirals into a reboot once /tmp fills up. (BTW, if I
remove my 30000KB file from /tmp while the first OOM is in progress, the
router recovers.) My hunch is that the almost fully instantiated tmpfs
takes too much memory from the system for it to handle its usual
business. On top of that come the wireless issues: what about a kernel
memory leak caused by ath wireless that grows and grows until the
problematic /tmp size is in the single-digit MBs, which then starts the
spiral to reboot?

>
> I'm thinking that maybe flooding wireless->wired with UDP traffic for
> 5-10 minutes is the right approach, and then vice-versa (restarting
> the router inbetween?). If there are problems like infinite retries
> or packet memory leaks, that might show them up quickly.

That sounds like the right way to proceed, except I am no expert at
setting netperf up, so it might take a while until I get around to
actually testing that hypothesis. (Do you by any chance know of a publicly
available netserver process running in the internets to which I could
point a local netperf, and do you have any recommendations on how to
create the UDP flood with netperf?)

Best
	Sebastian

>
> --
> Robert Bradley
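
For reference, the naive VSZ tally mentioned in the quoted text (adding up the VSZ of all processes and comparing the sum against available memory) could be sketched roughly as follows. This is only an illustration, not what was actually run on the router: the helper names sum_vsz_kb and mem_total_kb are made up for this sketch, and it assumes a Linux /proc that exposes VmSize and MemTotal plus an awk with printf. Keep in mind that VSZ overstates real usage, since it counts shared and not-yet-touched mappings.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Sum the VmSize fields (in kB) of all processes under a proc root.
# The optional argument (default /proc) only exists to ease testing.
sum_vsz_kb() {
    proc_root="${1:-/proc}"
    # Processes may exit mid-scan and kernel threads have no VmSize line;
    # both cases are silently skipped.
    { cat "$proc_root"/[0-9]*/status 2>/dev/null || true; } |
        awk '/^VmSize:/ { total += $2 } END { printf "%.0f\n", total }'
}

# Print MemTotal (in kB) from meminfo under the same proc root.
mem_total_kb() {
    proc_root="${1:-/proc}"
    awk '/^MemTotal:/ { print $2 }' "$proc_root/meminfo" 2>/dev/null || true
}

vsz=$(sum_vsz_kb)
total=$(mem_total_kb)
if [ -n "$total" ] && [ "$total" -gt 0 ]; then
    echo "sum of VSZ: ${vsz} kB = $((vsz * 100 / total))% of ${total} kB MemTotal"
else
    echo "sum of VSZ: ${vsz} kB (MemTotal not available)"
fi
```

Running this before and after filling /tmp would give a rough idea of how much headroom is left, though MemFree and the tmpfs usage reported by df are probably the more telling numbers.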