From: Sebastian Moeller
To: Dave Taht
Cc: "cerowrt-devel@lists.bufferbloat.net"
Date: Wed, 29 Jan 2014 13:45:50 +0100
Subject: Re: [Cerowrt-devel] cerowrt issues (3.10.24-8)
List-Id: Development issues regarding the cerowrt test router project

Hi Dave,

quick question: how does one turn off logging for babeld? It seems that if
daemonized it defaults to logging to /var/log/babeld.log (or similar). Is
setting the log file to /dev/null really the answer? (Since I have not yet
resolved the IPv6 issue, I assume babeld is unhappy.) I resorted to stopping
babeld completely, but that feels like a crutch…

Best
	Sebastian

On Jan 27, 2014, at 22:14 , Dave Taht wrote:

> certainly turn off the babeld log! I will leave it off in the next release.
>
> On Mon, Jan 27, 2014 at 4:10 PM, Steve Jenson wrote:
>> Looking more, the buffer errors are showing up in syslog well before tmpfs
>> fills up. Is the memtester openwrt package available for cerowrt? I don't
>> see it under `Available packages`.
>>
>> Thanks,
>> Steve
>>
>>
>> On Mon, Jan 27, 2014 at 1:06 PM, Steve Jenson wrote:
>>>
>>> On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht wrote:
>>>>
>>>> On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson wrote:
>>>>> Hi everybody,
>>>>>
>>>>> I've been using cerowrt as a secondary wifi network (just a single AP
>>>>> for now) for a few weeks now. Recently, my wndr3800 got stuck in a bad
>>>>> state and eventually rebooted. I've had this happen a few times now and
>>>>> am looking for ways to debug the issue. I'm new to cerowrt and openwrt
>>>>> so any advice is appreciated.
>>>>>
>>>>> Since I use it as a secondary network, this is in no way critical.
>>>>
>>>> Yea! I appreciate caution before putting alpha software on your gw.
>>>>
>>>>> I'm not looking for free tech support but I couldn't find anything on
>>>>> the wiki about troubleshooting. I'd love to start a page and write some
>>>>> shell scripts to diagnose and report issues. I know that a cerowrt
>>>>> router is meant to be a research project rather than a consumer device
>>>>> but these things seem helpful regardless.
>>>>
>>>> Sure, let me know your wiki account. I have been lax about granting
>>>> access of late as the signup process is overrun by spammers.
>>>
>>>
>>> My username is stevej on the wiki. Thanks!
>>>
>>>
>>>>
>>>>> Please let me know if you'd prefer I not email the list with these
>>>>> issues or if you'd rather I used trac or a different forum.
>>>>>
>>>>
>>>> The list is where most stuff happens. Also in the irc channel.
>>>>
>>>> If it gets to where it needs to be tracked we have a bugtracker at
>>>>
>>>> http://www.bufferbloat.net/projects/cerowrt/issues
>>>>
>>>> The first question I have is: Are you on comcast? Cerowrt
>>>> had a dhcpv6-pd implementation that "just worked" from February through
>>>> December. Regrettably they changed the RA announcement interval
>>>> to a really low number around then... and this triggers a firewall reload
>>>> every minute on everything prior to the release I point to below.
>>>>
>>>> If there is a memory leak somewhere that would have triggered it.
>>>
>>>
>>> I am on AT&T ADSL2+ with a Motorola NVG510 modem.
>>>
>>>
>>>>
>>>>> In this state, I can connect to the cerowrt base station via wifi but
>>>>> am unable to route packets to the internet. I can connect to :81 and
>>>>> see the login page but logging in results in a lua error at
>>>>> `/cgi-bin/luci`
>>>>>
>>>>>
>>>>> /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function
>>>>> dispatcher target for entry '/'.
>>>>> The called action terminated with an exception:
>>>>> /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
>>>>> stack traceback:
>>>>> [C]: in function 'assert'
>>>>> /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
>>>>> /usr/lib/lua/luci/dispatcher.lua:195: in function
>>>>>
>>>>>
>>>>> I can ssh into the device and cat various log files until the router
>>>>> hangs and reboots. Here are a few relevant lines from my terminal
>>>>> history before the device rebooted (I'm assuming a watchdog kicked in
>>>>> and rebooted it).
>>>>>
>>>>> root@buffy2-1:~# ping google.com
>>>>> ping: bad address 'google.com'
>>>>> root@buffy2-1:~# free
>>>>>              total         used         free       shared      buffers
>>>>> Mem:        126336       110332        16004            0         5616
>>>>> -/+ buffers:             104716        21620
>>>>> Swap:            0            0            0
>>>>> root@buffy2-1:~# uptime
>>>>> 02:08:54 up 2 days, 1:26, load average: 0.10, 0.21, 0.17
>>>>> root@buffy2-1:~# dmesg
>>>>> [ 0.000000] Linux version 3.10.24 (cero2@snapon) (gcc version 4.6.4
>>>>> (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24
>>>>> 10:50:15 PST 2013
>>>>> [skipping some lines]
>>>>>
>>>>> [ 13.156250] Error: Driver 'gpio-keys-polled' is already registered,
>>>>> aborting...
>>>>> [ 19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready
>>>>> [ 19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
>>>>> [ 19.429687] se00: link up (1000Mbps/Full duplex)
>>>>> [ 22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready
>>>>> [ 23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready
>>>>> [ 23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
>>>>> [ 23.757812] ge00: link up (1000Mbps/Full duplex)
>>>>> [ 23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready
>>>>>
>>>>> root@buffy2-1:~# ifconfig
>>>>> ge00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B1
>>>>>           inet addr:192.168.1.138  Bcast:192.168.1.255
>>>>>           Mask:255.255.255.0
>>>>>           inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
>>>>>           inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>>           Scope:Global
>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>           RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0
>>>>>           TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0
>>>>>           collisions:0 txqueuelen:1000
>>>>>           RX bytes:229243410 (218.6 MiB)  TX bytes:57304808 (54.6 MiB)
>>>>>           Interrupt:5
>>>>>
>>>>> lo        Link encap:Local Loopback
>>>>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>>>>           inet6 addr: ::1/128 Scope:Host
>>>>>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>>>>>           RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
>>>>>           TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0
>>>>>           collisions:0 txqueuelen:0
>>>>>           RX bytes:2612713 (2.4 MiB)  TX bytes:2612713 (2.4 MiB)
>>>>>
>>>>> pimreg    Link encap:UNSPEC  HWaddr
>>>>>           00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>           UP RUNNING NOARP  MTU:1472  Metric:1
>>>>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>           collisions:0 txqueuelen:0
>>>>>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>>>>>
>>>>> se00      Link encap:Ethernet  HWaddr 2E:B0:5D:A0:C5:B0
>>>>>           inet addr:172.30.42.1  Bcast:172.30.42.31
>>>>>           Mask:255.255.255.224
>>>>>           inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>>           Scope:Global
>>>>
>>>> How are you assigning your ipv6 addresses?
>>>
>>>
>>> It's been a while since I messed with this but I think IPv6 is assigned
>>> thanks to 6relayd? My modem has IPv6 enabled but no DHCPv6 options that I
>>> can find. Here's how cerowrt is configured.
>>>
>>> root@buffy2-1:/overlay/etc/config# cat 6relayd
>>> config server 'default'
>>>         option fallback_relay 'rd dhcpv6 ndp'
>>>         list network 'ge00'
>>>         list network 'ge01'
>>>         list network 'gw00'
>>>         list network 'gw01'
>>>         list network 'gw10'
>>>         list network 'gw11'
>>>         list network 'se00'
>>>         list network 'sw00'
>>>         list network 'sw10'
>>>         option rd 'relay'
>>>         option dhcpv6 'relay'
>>>         option ndp 'relay'
>>>         option master 'ge00'
>>>
>>>>
>>>>>           inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>           TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0
>>>>>           collisions:0 txqueuelen:1000
>>>>>           RX bytes:0 (0.0 B)  TX bytes:42184988 (40.2 MiB)
>>>>>           Interrupt:4
>>>>>
>>>>> sw00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B0
>>>>>           inet addr:172.30.42.65  Bcast:172.30.42.95
>>>>>           Mask:255.255.255.224
>>>>>           inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>>           Scope:Global
>>>>>           inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>           RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
>>>>>           TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0
>>>>>           collisions:0 txqueuelen:1000
>>>>>           RX bytes:15590189 (14.8 MiB)  TX bytes:127357293 (121.4 MiB)
>>>>>
>>>>> root@buffy2-1:~# less /var/log/babeld.log
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> send: Cannot assign requested address
>>>>> send: Cannot assign requested address
>>>>> send: Cannot assign requested address
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> netlink_read: recvmsg(): No buffer space available
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>
>>>> This is a problem in babel detecting the channel on a "normal"
>>>> rather than a mesh interface. It's bugged me a long while, but
>>>> haven't got around to finding what triggers it. Might "fix" it by
>>>> acquiring the channel at babel start time from /etc/config/wireless.
>>>>
>>>> It messes up the diversity routing calculation, grump.
>>>>
>>>> There is a possibility a logfile got really big, but this one
>>>> generally doesn't, but I should turn off logging in some
>>>> future release...
>>>
>>>
>>> I believe I've tracked down part of what's going on. It looks like my
>>> tmpfs is filling up 100% and then the device enters a bad state:
>>>
>>> After 24 hours, with tmpfs at 50%, babeld.log is the largest file by far
>>> in tmpfs and the only file that appears to be growing (based on `du`). It
>>> takes about 48 hours from reboot to fill up tmpfs on my device.
>>>
>>> # sort babeld.log | uniq -c |sort -rn |head
>>>
>>> 503236 Couldn't determine channel of interface sw00: Invalid argument.
>>>
>>> 1376 netlink_read: recvmsg(): No buffer space available
>>>
>>> 3 send: Cannot assign requested address
>>>
>>> # wc -l babeld.log
>>>
>>> 504617 babeld.log
>>>
>>> I sped up system failure by using `dd` to fill up tmpfs and the system
>>> became immediately unusable.
>>>
>>> This also explains the luci session store errors as sessions are stored in
>>> tmpfs.
>>>
>>> The other buffer issues may or may not be related to this.
>>>
>>> Best,
>>> Steve
>>
>>
>
>
>
> -- 
> Dave Täht
>
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
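
P.S. In case it helps anyone checking a copy of their own babeld.log, the
`sort | uniq -c | sort -rn | head` tally quoted above can be roughly sketched
in Python. This is only an illustration: the helper name `top_messages` and
the sample data are made up, and only the three message strings are taken
from the thread.

```python
from collections import Counter


def top_messages(lines, limit=10):
    """Count identical log lines and return the most frequent first.

    Roughly mirrors `sort | uniq -c | sort -rn | head`. `top_messages` is a
    hypothetical helper for illustration, not part of babeld or cerowrt.
    """
    # Strip the trailing newline so lines read from a file compare equal.
    counts = Counter(line.rstrip("\n") for line in lines)
    return counts.most_common(limit)


if __name__ == "__main__":
    # Tiny made-up sample built from the three messages seen in the thread.
    sample = (
        ["Couldn't determine channel of interface sw00: Invalid argument."] * 5
        + ["netlink_read: recvmsg(): No buffer space available"] * 2
        + ["send: Cannot assign requested address"]
    )
    for message, count in top_messages(sample):
        print(f"{count:7d} {message}")
```

To run it against a real log, pass an open file object instead of the sample
list, e.g. `with open("babeld.log") as f: top_messages(f)`. I would do that
on a copy of the file on another machine, since I am not assuming Python is
installed on the router.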