Hi everybody, I've been using cerowrt as a secondary wifi network (just a single AP for now) for a few weeks now. Recently, my wndr3800 got stuck in a bad state and eventually rebooted. I've had this happen a few times now and am looking for ways to debug the issue. I'm new to cerowrt and openwrt so any advice is appreciated. Since I use it as a secondary network, this is no way critical. I'm not looking for free tech support but I couldn't find anything on the wiki about troubleshooting. I'd love to start a page and write some shell scripts to diagnose and report issues. I know that a cerowrt router is meant to be a research project rather a consumer device but these things seem helpful regardless. Please let me know if you'd prefer I not email the list with these issues or if you'd rather I used trac or a different forum. -- In this state, I can connect to the cerowrt base station via wifi but am unable to route packets to the internet. I can connect to :81 and see the login page but logging in results in a lua error at `/cgi-bin/luci` /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function dispatcher target for entry '/'. The called action terminated with an exception: /usr/lib/lua/luci/sauth.lua:87: Session data invalid! stack traceback: [C]: in function 'assert' /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch' /usr/lib/lua/luci/dispatcher.lua:195: in function I can ssh into the device and cat various log files until the router hangs and reboots. here's a few relevant lines from my terminal history before the device rebooted (I'm assuming a watchdog kicked in and rebooted it). root@buffy2-1:~# ping google.com ping: bad address 'google.com' root@buffy2-1:~# free total used free shared buffers Mem: 126336 110332 16004 0 5616 -/+ buffers: 104716 21620 Swap: 0 0 0 root@buffy2-1:~# uptime 02:08:54 up 2 days, 1:26, load average: 0.10, 0.21, 0.17 root@buffy2-1:~# dmesg [ 0.000000] Linux version 3.10.24 (cero2@snapon) (gcc version 4.6.4 (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24 10:50:15 PST 2013 [skipping some lines] [ 13.156250] Error: Driver 'gpio-keys-polled' is already registered, aborting... [ 19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready [ 19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000 [ 19.429687] se00: link up (1000Mbps/Full duplex) [ 22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready [ 23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready [ 23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000 [ 23.757812] ge00: link up (1000Mbps/Full duplex) [ 23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready root@buffy2-1:~# ifconfig ge00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B1 inet addr:192.168.1.138 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0 TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:229243410 (218.6 MiB) TX bytes:57304808 (54.6 MiB) Interrupt:5 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:23689 errors:0 dropped:0 overruns:0 frame:0 TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2612713 (2.4 MiB) TX bytes:2612713 (2.4 MiB) pimreg Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 UP RUNNING NOARP MTU:1472 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) se00 Link encap:Ethernet HWaddr 2E:B0:5D:A0:C5:B0 inet addr:172.30.42.1 Bcast:172.30.42.31 Mask:255.255.255.224 inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:42184988 (40.2 MiB) Interrupt:4 sw00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B0 inet addr:172.30.42.65 Bcast:172.30.42.95 Mask:255.255.255.224 inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:70239 errors:0 dropped:0 overruns:0 frame:0 TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:15590189 (14.8 MiB) TX bytes:127357293 (121.4 MiB) root@buffy2-1:~# less /var/log/babeld.log Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. send: Cannot assign requested address send: Cannot assign requested address send: Cannot assign requested address Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. netlink_read: recvmsg(): No buffer space available Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. Couldn't determine channel of interface sw00: Invalid argument. and then the system hang. Note: the first time I ran 'less /var/log/babeld.log', it was Killed. I would assume that's an OOM killer? I'm running cerowrt 3.10.24-8 on a wndr3800. I have a spare wndr3800 that I'm getting ready to flash with the same version and I can see if I get the same issue. What's the best way to clone a configuration? --- Here are some log lines from a separate instance of the same problem, I recovered these log lines from the web server (I was able to login to the web interface for another hour until I got the same lua exception as in the beginning of this email) This time I was completely unable to connect to wifi from my iPhone and you can see where hostapd disassociates the iPhone soon after it connects. the kernel log: [158995.550781] icmp6_send: no reply to icmp error [160930.527343] ath: phy0: Failed to stop TX DMA, queues=0x004! [160930.542968] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020 DMADBG_7=0x000084c0 [160930.554687] ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up and in the system log: Fri Jan 24 15:51:15 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 15:51:15 2014 daemon.err avahi-daemon[1273]: netlink.c: recvmsg() failed: No buffer space available Fri Jan 24 15:52:18 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available root@buffy2-1:/tmp/log# free total used free shared buffers Mem: 126336 110128 16208 0 5616 -/+ buffers: 104512 21824 Swap: 0 0 0 Fri Jan 24 17:14:53 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 17:15:53 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 17:16:18 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: authenticated Fri Jan 24 17:16:18 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: associated (aid 1) Fri Jan 24 17:16:18 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 RADIUS: starting accounting session 52DC70A7-00000010 Fri Jan 24 17:16:18 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 WPA: pairwise key handshake completed (RSN) Fri Jan 24 17:16:18 2014 daemon.info dnsmasq-dhcp[4999]: DHCPREQUEST(sw00) 192.168.1.114 90:72:40:e9:9c:b6 Fri Jan 24 17:16:18 2014 daemon.info dnsmasq-dhcp[4999]: DHCPNAK(sw00) 192.168.1.114 90:72:40:e9:9c:b6 wrong network Fri Jan 24 17:16:21 2014 daemon.info dnsmasq-dhcp[4999]: DHCPDISCOVER(sw00) 90:72:40:e9:9c:b6 Fri Jan 24 17:16:21 2014 daemon.info dnsmasq-dhcp[4999]: DHCPOFFER(sw00) 172.30.42.82 90:72:40:e9:9c:b6 Fri Jan 24 17:16:21 2014 daemon.info dnsmasq-dhcp[4999]: DHCPDISCOVER(sw00) 90:72:40:e9:9c:b6 Fri Jan 24 17:16:21 2014 daemon.info dnsmasq-dhcp[4999]: DHCPOFFER(sw00) 172.30.42.82 90:72:40:e9:9c:b6 Fri Jan 24 17:16:22 2014 daemon.info dnsmasq-dhcp[4999]: DHCPREQUEST(sw00) 172.30.42.82 90:72:40:e9:9c:b6 Fri Jan 24 17:16:22 2014 daemon.info dnsmasq-dhcp[4999]: DHCPACK(sw00) 172.30.42.82 90:72:40:e9:9c:b6 stevejs-iPhone Fri Jan 24 17:16:35 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: disassociated Fri Jan 24 17:16:36 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE) Fri Jan 24 17:16:57 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 17:17:59 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 17:19:04 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 17:19:04 2014 daemon.err avahi-daemon[1273]: netlink.c: recvmsg() failed: No buffer space available Fri Jan 24 17:19:45 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: authenticated Fri Jan 24 17:19:45 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: associated (aid 1) Fri Jan 24 17:19:45 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 RADIUS: starting accounting session 52DC70A7-00000011 Fri Jan 24 17:19:45 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 WPA: pairwise key handshake completed (RSN) Fri Jan 24 17:19:45 2014 daemon.info dnsmasq-dhcp[4999]: DHCPDISCOVER(sw00) 90:72:40:e9:9c:b6 Fri Jan 24 17:19:45 2014 daemon.info dnsmasq-dhcp[4999]: DHCPOFFER(sw00) 172.30.42.82 90:72:40:e9:9c:b6 Fri Jan 24 17:19:47 2014 daemon.info dnsmasq-dhcp[4999]: DHCPREQUEST(sw00) 172.30.42.82 90:72:40:e9:9c:b6 Fri Jan 24 17:19:47 2014 daemon.info dnsmasq-dhcp[4999]: DHCPACK(sw00) 172.30.42.82 90:72:40:e9:9c:b6 stevejs-iPhone Fri Jan 24 17:20:01 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: disassociated Fri Jan 24 17:20:02 2014 daemon.info hostapd: sw00: STA 90:72:40:e9:9c:b6 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE) Fri Jan 24 17:20:09 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available Fri Jan 24 17:21:10 2014 daemon.err minissdpd[4933]: recvmsg(s, &hdr, 0): No buffer space available and then it hung and rebooted. I bought this particular wndr3800 used so it's possible I have a bad piece of hardware. Are there any hardware testing scripts or anything like memtest86 for these devices? Are there entries in /sys that I should be looking at? I've noticed about 9k unaligned_instructions on boot but that never grows. Thanks! Steve Jenson @stevej