On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht <dave.taht@gmail.com> wrote:
On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson <stevej@fruitless.org> wrote:Yea! I appreciate caution before putting alpha software on your gw.
> Hi everybody,
>
> I've been using cerowrt as a secondary wifi network (just a single AP for
> now) for a few weeks now. Recently, my wndr3800 got stuck in a bad state and
> eventually rebooted. I've had this happen a few times now and am looking for
> ways to debug the issue. I'm new to cerowrt and openwrt so any advice is
> appreciated.
>
> Since I use it as a secondary network, this is no way critical.
Sure, let me know your wiki account. I have been lax about granting
> I'm not
> looking for free tech support but I couldn't find anything on the wiki about
> troubleshooting. I'd love to start a page and write some shell scripts to
> diagnose and report issues. I know that a cerowrt router is meant to be a
> research project rather a consumer device but these things seem helpful
> regardless.
access of late as the signup process is overrun by spammers.My username is stevej on the wiki. Thanks!> Please let me know if you'd prefer I not email the list with these issues orThe list is where most stuff happens. Also in the irc channel.
> if you'd rather I used trac or a different forum.
>
If it gets to where it needs to be tracked we have a bugtracker at
http://www.bufferbloat.net/projects/cerowrt/issues
The first question I have is: Are you on comcast? Cerowrt
had a dhcpv6-pd implementation that "just worked" from feburary through
december. Regrettably they changed the RA announcement interval
to a really low number around then... and this triggers a firewall reload
every minute on everything prior to the release I point to below.
If there is a memory leak somewhere that would have triggered it.I am on AT&T ADSL2+ with a Motorola NVG510 modem.How are you assigning your ipv6 addresses?> In this state, I can connect to the cerowrt base station via wifi but am
> unable to route packets to the internet. I can connect to :81 and see the
> login page but logging in results in a lua error at `/cgi-bin/luci`
>
>
> /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function
> dispatcher target for entry '/'.
> The called action terminated with an exception:
> /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
> stack traceback:
> [C]: in function 'assert'
> /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
> /usr/lib/lua/luci/dispatcher.lua:195: in function
> </usr/lib/lua/luci/dispatcher.lua:194>
>
> I can ssh into the device and cat various log files until the router hangs
> and reboots. here's a few relevant lines from my terminal history before the
> device rebooted (I'm assuming a watchdog kicked in and rebooted it).
>
> root@buffy2-1:~# ping google.com
> ping: bad address 'google.com'
> root@buffy2-1:~# free
> total used free shared buffers
> Mem: 126336 110332 16004 0 5616
> -/+ buffers: 104716 21620
> Swap: 0 0 0
> root@buffy2-1:~# uptime
> 02:08:54 up 2 days, 1:26, load average: 0.10, 0.21, 0.17
> root@buffy2-1:~# dmesg
> [ 0.000000] Linux version 3.10.24 (cero2@snapon) (gcc version 4.6.4
> (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24
> 10:50:15 PST 2013
> [skipping some lines]
>
> [ 13.156250] Error: Driver 'gpio-keys-polled' is already registered,
> aborting...
> [ 19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready
> [ 19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
> [ 19.429687] se00: link up (1000Mbps/Full duplex)
> [ 22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready
> [ 23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready
> [ 23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
> [ 23.757812] ge00: link up (1000Mbps/Full duplex)
> [ 23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready
>
> root@buffy2-1:~# ifconfig
> ge00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B1
> inet addr:192.168.1.138 Bcast:192.168.1.255 Mask:255.255.255.0
> inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
> inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0
> TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:229243410 (218.6 MiB) TX bytes:57304808 (54.6 MiB)
> Interrupt:5
>
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> inet6 addr: ::1/128 Scope:Host
> UP LOOPBACK RUNNING MTU:65536 Metric:1
> RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
> TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:2612713 (2.4 MiB) TX bytes:2612713 (2.4 MiB)
>
> pimreg Link encap:UNSPEC HWaddr
> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> UP RUNNING NOARP MTU:1472 Metric:1
> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>
> se00 Link encap:Ethernet HWaddr 2E:B0:5D:A0:C5:B0
> inet addr:172.30.42.1 Bcast:172.30.42.31 Mask:255.255.255.224
> inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
It's been a while since I messed with this but I think IPv6 is assigned thanks to 6relayd? My modem has IPv6 enabled but no DHCPv6 options that I can find. Here's how cerowrt is configured.
root@buffy2-1:/overlay/etc/config# cat 6relayd
config server 'default'
option fallback_relay 'rd dhcpv6 ndp'
list network 'ge00'
list network 'ge01'
list network 'gw00'
list network 'gw01'
list network 'gw10'
list network 'gw11'
list network 'se00'
list network 'sw00'
list network 'sw10'
option rd 'relay'
option dhcpv6 'relay'
option ndp 'relay'
option master 'ge00'This is a problem in babel detecting the channel on a "normal"
> inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:0 (0.0 B) TX bytes:42184988 (40.2 MiB)
> Interrupt:4
>
> sw00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B0
> inet addr:172.30.42.65 Bcast:172.30.42.95 Mask:255.255.255.224
> inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
> inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
> TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:15590189 (14.8 MiB) TX bytes:127357293 (121.4 MiB)
>
> root@buffy2-1:~# less /var/log/babeld.log
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> send: Cannot assign requested address
> send: Cannot assign requested address
> send: Cannot assign requested address
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> netlink_read: recvmsg(): No buffer space available
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
> Couldn't determine channel of interface sw00: Invalid argument.
rather than a mesh interface. It's bugged me a long while, but
haven't got around to finding what triggers it. Might "fix" it by
acquiring the channel at babel start time from /etc/config/wireless.
It messes up the diversity routing calculation, grump.
There is a possibility a logfile got really big, but this one
generally doesn't, but I should turn off logging in some
future release...I believe I've tracked down part of what's going on. It looks like my tmpfs is filling up 100% and then the device enters a bad state:After 24 hours, with tmpfs at 50%, babeld.log is the largest file by far in tmpfs and the only file that appears to be growing (based on `du`). It takes about 48 hours from reboot to fill up tmpfs on my device.# sort babeld.log | uniq -c |sort -rn |head
503236 Couldn't determine channel of interface sw00: Invalid argument.
1376 netlink_read: recvmsg(): No buffer space available
3 send: Cannot assign requested address
# wc -l babeld.log
504617 babeld.log
I sped up system failure by using `dd` to fill up tmpfs and the system became immediately unusable.This also explains the luci session store errors as sessions are stored in tmpfs.The other buffer issues may or may not be related to this.Best,Steve