From: Sebastian Moeller
To: Dave Taht
Cc: "cerowrt-devel@lists.bufferbloat.net"
Subject: Re: [Cerowrt-devel] cerowrt issues (3.10.24-8)
Date: Wed, 29 Jan 2014 20:51:15 +0100
Message-Id: <49E93736-D90E-4E07-803B-26B84CD428C2@gmx.de>
List-Id: Development issues regarding the cerowrt test router project

Hi Dave,

On Jan 29, 2014, at 19:21, Dave Taht wrote:

> On Wed, Jan 29, 2014 at 9:44 AM, Sebastian Moeller wrote:
>> On January 29, 2014 5:10:18 PM CET, Dave Taht wrote:
>>> On Wed, Jan 29, 2014 at 4:45 AM, Sebastian Moeller wrote:
>>>> Hi Dave,
>>>>
>>>> quick question, how does one turn off logging for babeld? It seems
>>>> that if daemonized it defaults to logging to /var/log/babeld.log (or
>>>> similar). Is setting the log file to /dev/null really the answer?
>>>
>>> seems so.
>>
>> Okay, I guess I will try that then...
>>
>>>
>>>> (Since I have not yet resolved the IPv6 issue, I assume
>>>> babeld is unhappy.) I resorted to stopping babeld completely, but that
>>>> feels like a crutch...
>>>
>>> no daemon in an embedded system should ever write to flash in an
>>> uncontrollable manner.
>>
>> What bugs me is that it basically keeps repeating the same error
>> over and over again. If it rate limited and/or pushed messages to the
>> system log, it would be nicer.
>>
>>>
>>> I will also argue that not being able to find the channel is a bug
>>> that messes with diversity routing
>>> in particular.
>>
>> I have actually not yet understood what it wants to tell me ;).
>> Since I got your attention, is there an easy way to run a babel client
>> under Mac OS X?
>
> for coping with the mac I use "macports" to get a compiler and support
> for open source software.

Ah, same here (even though I would have thought you a homebrew user, no idea why) ;) Alas, "port search babel" does not find anything babeld related...

>
> I haven't ever tried to run babeld on the mac I have, I will put it on
> my list...

I just thought it would be nice to finally get into the "stay connected while switching between wired and wireless" fun, but this is in no way essential for me.

Best Regards
Sebastian

>
>>
>> Best Regards
>> Sebastian
>>
>>>
>>>>
>>>> Best
>>>> Sebastian
>>>>
>>>> On Jan 27, 2014, at 22:14, Dave Taht wrote:
>>>>
>>>>> certainly turn off the babeld log! I will leave it off in the next
>>>>> release.
>>>>>
>>>>> On Mon, Jan 27, 2014 at 4:10 PM, Steve Jenson wrote:
>>>>>> Looking more, the buffer errors are showing up in syslog well
>>>>>> before tmpfs fills up. Is the memtester openwrt package available
>>>>>> for cerowrt? I don't see it under `Available packages`.
>>>>>>
>>>>>> Thanks,
>>>>>> Steve
>>>>>>
>>>>>> On Mon, Jan 27, 2014 at 1:06 PM, Steve Jenson wrote:
>>>>>>>
>>>>>>> On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht wrote:
>>>>>>>>
>>>>>>>> On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson wrote:
>>>>>>>>> Hi everybody,
>>>>>>>>>
>>>>>>>>> I've been using cerowrt as a secondary wifi network (just a
>>>>>>>>> single AP for now) for a few weeks now. Recently, my wndr3800
>>>>>>>>> got stuck in a bad state and eventually rebooted. I've had this
>>>>>>>>> happen a few times now and am looking for ways to debug the
>>>>>>>>> issue. I'm new to cerowrt and openwrt so any advice is
>>>>>>>>> appreciated.
>>>>>>>>>
>>>>>>>>> Since I use it as a secondary network, this is in no way critical.
>>>>>>>>
>>>>>>>> Yea! I appreciate caution before putting alpha software on your gw.
>>>>>>>>
>>>>>>>>> I'm not looking for free tech support but I couldn't find
>>>>>>>>> anything on the wiki about troubleshooting. I'd love to start a
>>>>>>>>> page and write some shell scripts to diagnose and report issues.
>>>>>>>>> I know that a cerowrt router is meant to be a research project
>>>>>>>>> rather than a consumer device but these things seem helpful
>>>>>>>>> regardless.
>>>>>>>>
>>>>>>>> Sure, let me know your wiki account. I have been lax about granting
>>>>>>>> access of late as the signup process is overrun by spammers.
>>>>>>>
>>>>>>> My username is stevej on the wiki. Thanks!
>>>>>>>
>>>>>>>>
>>>>>>>>> Please let me know if you'd prefer I not email the list with
>>>>>>>>> these issues or if you'd rather I used trac or a different forum.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The list is where most stuff happens. Also in the irc channel.
>>>>>>>>
>>>>>>>> If it gets to where it needs to be tracked we have a bugtracker at
>>>>>>>>
>>>>>>>> http://www.bufferbloat.net/projects/cerowrt/issues
>>>>>>>>
>>>>>>>> The first question I have is: Are you on Comcast? Cerowrt
>>>>>>>> had a dhcpv6-pd implementation that "just worked" from February
>>>>>>>> through December. Regrettably they changed the RA announcement
>>>>>>>> interval to a really low number around then... and this triggers a
>>>>>>>> firewall reload every minute on everything prior to the release I
>>>>>>>> point to below.
>>>>>>>>
>>>>>>>> If there is a memory leak somewhere, that would have triggered it.
>>>>>>>
>>>>>>> I am on AT&T ADSL2+ with a Motorola NVG510 modem.
>>>>>>>
>>>>>>>>
>>>>>>>>> In this state, I can connect to the cerowrt base station via
>>>>>>>>> wifi but am unable to route packets to the internet. I can
>>>>>>>>> connect to :81 and see the login page but logging in results in
>>>>>>>>> a lua error at `/cgi-bin/luci`
>>>>>>>>>
>>>>>>>>> /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function
>>>>>>>>> dispatcher target for entry '/'.
>>>>>>>>> The called action terminated with an exception:
>>>>>>>>> /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
>>>>>>>>> stack traceback:
>>>>>>>>> [C]: in function 'assert'
>>>>>>>>> /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
>>>>>>>>> /usr/lib/lua/luci/dispatcher.lua:195: in function
>>>>>>>>>
>>>>>>>>> I can ssh into the device and cat various log files until the
>>>>>>>>> router hangs and reboots. Here are a few relevant lines from my
>>>>>>>>> terminal history before the device rebooted (I'm assuming a
>>>>>>>>> watchdog kicked in and rebooted it).
>>>>>>>>>
>>>>>>>>> root@buffy2-1:~# ping google.com
>>>>>>>>> ping: bad address 'google.com'
>>>>>>>>> root@buffy2-1:~# free
>>>>>>>>>              total         used         free       shared      buffers
>>>>>>>>> Mem:        126336       110332        16004            0         5616
>>>>>>>>> -/+ buffers:             104716        21620
>>>>>>>>> Swap:            0            0            0
>>>>>>>>> root@buffy2-1:~# uptime
>>>>>>>>>  02:08:54 up 2 days, 1:26, load average: 0.10, 0.21, 0.17
>>>>>>>>> root@buffy2-1:~# dmesg
>>>>>>>>> [    0.000000] Linux version 3.10.24 (cero2@snapon) (gcc version 4.6.4 (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24 10:50:15 PST 2013
>>>>>>>>> [skipping some lines]
>>>>>>>>>
>>>>>>>>> [   13.156250] Error: Driver 'gpio-keys-polled' is already registered, aborting...
>>>>>>>>> [   19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready
>>>>>>>>> [   19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
>>>>>>>>> [   19.429687] se00: link up (1000Mbps/Full duplex)
>>>>>>>>> [   22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready
>>>>>>>>> [   23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready
>>>>>>>>> [   23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
>>>>>>>>> [   23.757812] ge00: link up (1000Mbps/Full duplex)
>>>>>>>>> [   23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready
>>>>>>>>>
>>>>>>>>> root@buffy2-1:~# ifconfig
>>>>>>>>> ge00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B1
>>>>>>>>>           inet addr:192.168.1.138  Bcast:192.168.1.255  Mask:255.255.255.0
>>>>>>>>>           inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
>>>>>>>>>           inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
>>>>>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>>>>>           RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0
>>>>>>>>>           TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>>           collisions:0 txqueuelen:1000
>>>>>>>>>           RX bytes:229243410 (218.6 MiB)  TX bytes:57304808 (54.6 MiB)
>>>>>>>>>           Interrupt:5
>>>>>>>>>
>>>>>>>>> lo        Link encap:Local Loopback
>>>>>>>>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>>>>>>>>           inet6 addr: ::1/128 Scope:Host
>>>>>>>>>           UP LOOPBACK RUNNING  MTU:65536  Metric:1
>>>>>>>>>           RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>>           TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>>           collisions:0 txqueuelen:0
>>>>>>>>>           RX bytes:2612713 (2.4 MiB)  TX bytes:2612713 (2.4 MiB)
>>>>>>>>>
>>>>>>>>> pimreg    Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>>>>>           UP RUNNING NOARP  MTU:1472  Metric:1
>>>>>>>>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>>           collisions:0 txqueuelen:0
>>>>>>>>>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>>>>>>>>>
>>>>>>>>> se00      Link encap:Ethernet  HWaddr 2E:B0:5D:A0:C5:B0
>>>>>>>>>           inet addr:172.30.42.1  Bcast:172.30.42.31  Mask:255.255.255.224
>>>>>>>>>           inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
>>>>>>>>
>>>>>>>> How are you assigning your ipv6 addresses?
>>>>>>>
>>>>>>> It's been a while since I messed with this but I think IPv6 is
>>>>>>> assigned thanks to 6relayd? My modem has IPv6 enabled but no DHCPv6
>>>>>>> options that I can find. Here's how cerowrt is configured.
>>>>>>>
>>>>>>> root@buffy2-1:/overlay/etc/config# cat 6relayd
>>>>>>> config server 'default'
>>>>>>>         option fallback_relay 'rd dhcpv6 ndp'
>>>>>>>         list network 'ge00'
>>>>>>>         list network 'ge01'
>>>>>>>         list network 'gw00'
>>>>>>>         list network 'gw01'
>>>>>>>         list network 'gw10'
>>>>>>>         list network 'gw11'
>>>>>>>         list network 'se00'
>>>>>>>         list network 'sw00'
>>>>>>>         list network 'sw10'
>>>>>>>         option rd 'relay'
>>>>>>>         option dhcpv6 'relay'
>>>>>>>         option ndp 'relay'
>>>>>>>         option master 'ge00'
>>>>>>>
>>>>>>>>
>>>>>>>>>           inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>>>>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>>           TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>>           collisions:0 txqueuelen:1000
>>>>>>>>>           RX bytes:0 (0.0 B)  TX bytes:42184988 (40.2 MiB)
>>>>>>>>>           Interrupt:4
>>>>>>>>>
>>>>>>>>> sw00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B0
>>>>>>>>>           inet addr:172.30.42.65  Bcast:172.30.42.95  Mask:255.255.255.224
>>>>>>>>>           inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
>>>>>>>>>           inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>>>>>           RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>>>           TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>>>           collisions:0 txqueuelen:1000
>>>>>>>>>           RX bytes:15590189 (14.8 MiB)  TX bytes:127357293 (121.4 MiB)
>>>>>>>>>
>>>>>>>>> root@buffy2-1:~# less /var/log/babeld.log
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> send: Cannot assign requested address >>>>>>>>> send: Cannot assign requested address >>>>>>>>> send: Cannot assign requested address >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> netlink_read: recvmsg(): No buffer space available >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. >>>>>>>>> Couldn't determine channel of interface sw00: Invalid = argument. 
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>>
>>>>>>>> This is a problem in babel detecting the channel on a "normal"
>>>>>>>> rather than a mesh interface. It's bugged me a long while, but I
>>>>>>>> haven't got around to finding what triggers it. Might "fix" it by
>>>>>>>> acquiring the channel at babel start time from /etc/config/wireless.
>>>>>>>>
>>>>>>>> It messes up the diversity routing calculation, grump.
>>>>>>>>
>>>>>>>> There is a possibility a logfile got really big. This one
>>>>>>>> generally doesn't, but I should turn off logging in some
>>>>>>>> future release...
>>>>>>>
>>>>>>> I believe I've tracked down part of what's going on. It looks like
>>>>>>> my tmpfs is filling up to 100% and then the device enters a bad state:
>>>>>>>
>>>>>>> After 24 hours, with tmpfs at 50%, babeld.log is the largest file
>>>>>>> by far in tmpfs and the only file that appears to be growing (based
>>>>>>> on `du`). It takes about 48 hours from reboot to fill up tmpfs on my
>>>>>>> device.
>>>>>>>
>>>>>>> # sort babeld.log | uniq -c | sort -rn | head
>>>>>>>
>>>>>>>  503236 Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>>
>>>>>>>    1376 netlink_read: recvmsg(): No buffer space available
>>>>>>>
>>>>>>>       3 send: Cannot assign requested address
>>>>>>>
>>>>>>> # wc -l babeld.log
>>>>>>>
>>>>>>> 504617 babeld.log
>>>>>>>
>>>>>>> I sped up system failure by using `dd` to fill up tmpfs and the
>>>>>>> system became immediately unusable.
>>>>>>>
>>>>>>> This also explains the luci session store errors as sessions are
>>>>>>> stored in tmpfs.
>>>>>>>
>>>>>>> The other buffer issues may or may not be related to this.
>>>>>>>
>>>>>>> Best,
>>>>>>> Steve
>>>>>
>>>>> --
>>>>> Dave Täht
>>>>>
>>>>> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
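
The two ideas from the thread, tallying babeld.log the way `sort | uniq -c | sort -rn` does and rate limiting a message that repeats over and over, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not babeld's code, the function names `count_messages` and `collapse_repeats` are hypothetical, and the sample lines are a tiny stand-in for the real log.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def count_messages(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Tally identical log lines, most frequent first.

    Roughly what `sort babeld.log | uniq -c | sort -rn` reports.
    """
    counts = Counter(line.rstrip("\n") for line in lines if line.strip())
    return counts.most_common()


def collapse_repeats(lines: Iterable[str]) -> Iterator[str]:
    """Yield each run of identical consecutive lines once.

    A run longer than one line is followed by a single summary line,
    which is one simple way to keep a repeating error from filling tmpfs.
    """
    previous = None
    repeats = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if line == previous:
            repeats += 1
            continue
        if previous is not None:
            yield previous
            if repeats:
                yield f"(last message repeated {repeats} more time(s))"
        previous, repeats = line, 0
    # Flush the final run once the input is exhausted.
    if previous is not None:
        yield previous
        if repeats:
            yield f"(last message repeated {repeats} more time(s))"


if __name__ == "__main__":
    # Hypothetical sample modelled on the messages quoted in the thread.
    channel = "Couldn't determine channel of interface sw00: Invalid argument."
    netlink = "netlink_read: recvmsg(): No buffer space available"
    sample = [channel] * 4 + [netlink] + [channel] * 2

    for message, count in count_messages(sample):
        print(f"{count:7d} {message}")
    for line in collapse_repeats(sample):
        print(line)
```

To run it against a real log, pass an open file object instead of the sample list, for example `count_messages(open("/var/log/babeld.log"))`. Collapsing only consecutive duplicates keeps the output in chronological order, at the cost of not merging runs that are interrupted by another message.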