From: Steve Jenson <stevej@fruitless.org>
To: Dave Taht <dave.taht@gmail.com>
Cc: "cerowrt-devel@lists.bufferbloat.net"
<cerowrt-devel@lists.bufferbloat.net>
Subject: Re: [Cerowrt-devel] cerowrt issues (3.10.24-8)
Date: Mon, 27 Jan 2014 13:06:55 -0800
Message-ID: <CABiHZ_XyhwQpJgU1soNXZ-4E+578aP0qsDAg1sZ0UxpSY=x5_w@mail.gmail.com>
In-Reply-To: <CAA93jw7oXDAiVoPU-jokBpg3VbwbB2VOgMXz7bHY6tgNSZG7Ow@mail.gmail.com>
On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht <dave.taht@gmail.com> wrote:
> On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson <stevej@fruitless.org>
> wrote:
> > Hi everybody,
> >
> > I've been using cerowrt as a secondary wifi network (just a single AP for
> > now) for a few weeks. Recently, my wndr3800 got stuck in a bad state and
> > eventually rebooted. I've had this happen a few times now and am looking
> > for ways to debug the issue. I'm new to cerowrt and openwrt so any advice
> > is appreciated.
> >
> > Since I use it as a secondary network, this is in no way critical.
>
> Yea! I appreciate caution before putting alpha software on your gw.
>
> > I'm not looking for free tech support but I couldn't find anything on the
> > wiki about troubleshooting. I'd love to start a page and write some shell
> > scripts to diagnose and report issues. I know that a cerowrt router is
> > meant to be a research project rather than a consumer device but these
> > things seem helpful regardless.
>
> Sure, let me know your wiki account. I have been lax about granting
> access of late as the signup process is overrun by spammers.
My username is stevej on the wiki. Thanks!
> > Please let me know if you'd prefer I not email the list with these issues
> > or if you'd rather I used trac or a different forum.
> >
>
> The list is where most stuff happens. Also in the irc channel.
>
> If it gets to where it needs to be tracked we have a bugtracker at
>
> http://www.bufferbloat.net/projects/cerowrt/issues
>
> The first question I have is: Are you on Comcast? Cerowrt
> had a dhcpv6-pd implementation that "just worked" from February through
> December. Regrettably they changed the RA announcement interval
> to a really low number around then... and this triggers a firewall reload
> every minute on everything prior to the release I point to below.
>
> If there is a memory leak somewhere that would have triggered it.
I am on AT&T ADSL2+ with a Motorola NVG510 modem.
> > In this state, I can connect to the cerowrt base station via wifi but am
> > unable to route packets to the internet. I can connect to :81 and see the
> > login page but logging in results in a lua error at `/cgi-bin/luci`
> >
> >
> > /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function
> > dispatcher target for entry '/'.
> > The called action terminated with an exception:
> > /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
> > stack traceback:
> > [C]: in function 'assert'
> > /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
> > /usr/lib/lua/luci/dispatcher.lua:195: in function
> > </usr/lib/lua/luci/dispatcher.lua:194>
> >
> > I can ssh into the device and cat various log files until the router hangs
> > and reboots. Here are a few relevant lines from my terminal history before
> > the device rebooted (I'm assuming a watchdog kicked in and rebooted it).
> >
> > root@buffy2-1:~# ping google.com
> > ping: bad address 'google.com'
> > root@buffy2-1:~# free
> >              total         used         free       shared      buffers
> > Mem:        126336       110332        16004            0         5616
> > -/+ buffers:             104716        21620
> > Swap:            0            0            0
> > root@buffy2-1:~# uptime
> > 02:08:54 up 2 days, 1:26, load average: 0.10, 0.21, 0.17
> > root@buffy2-1:~# dmesg
> > [ 0.000000] Linux version 3.10.24 (cero2@snapon) (gcc version 4.6.4
> > (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24
> > 10:50:15 PST 2013
> > [skipping some lines]
> >
> > [ 13.156250] Error: Driver 'gpio-keys-polled' is already registered,
> > aborting...
> > [ 19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready
> > [ 19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
> > [ 19.429687] se00: link up (1000Mbps/Full duplex)
> > [ 22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready
> > [ 23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready
> > [ 23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
> > [ 23.757812] ge00: link up (1000Mbps/Full duplex)
> > [ 23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready
> >
> > root@buffy2-1:~# ifconfig
> > ge00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B1
> > inet addr:192.168.1.138 Bcast:192.168.1.255 Mask:255.255.255.0
> > inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
> > inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> > RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0
> > TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:1000
> > RX bytes:229243410 (218.6 MiB) TX bytes:57304808 (54.6 MiB)
> > Interrupt:5
> >
> > lo Link encap:Local Loopback
> > inet addr:127.0.0.1 Mask:255.0.0.0
> > inet6 addr: ::1/128 Scope:Host
> > UP LOOPBACK RUNNING MTU:65536 Metric:1
> > RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:0
> > RX bytes:2612713 (2.4 MiB) TX bytes:2612713 (2.4 MiB)
> >
> > pimreg Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> > UP RUNNING NOARP MTU:1472 Metric:1
> > RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:0
> > RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> >
> > se00 Link encap:Ethernet HWaddr 2E:B0:5D:A0:C5:B0
> > inet addr:172.30.42.1 Bcast:172.30.42.31 Mask:255.255.255.224
> > inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
>
> How are you assigning your ipv6 addresses?
>
It's been a while since I set this up, but I think the IPv6 addresses are
assigned via 6relayd. My modem has IPv6 enabled but no DHCPv6 options that I
can find. Here's how cerowrt is configured:
root@buffy2-1:/overlay/etc/config# cat 6relayd
config server 'default'
        option fallback_relay 'rd dhcpv6 ndp'
        list network 'ge00'
        list network 'ge01'
        list network 'gw00'
        list network 'gw01'
        list network 'gw10'
        list network 'gw11'
        list network 'se00'
        list network 'sw00'
        list network 'sw10'
        option rd 'relay'
        option dhcpv6 'relay'
        option ndp 'relay'
        option master 'ge00'
> > inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> > RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:1000
> > RX bytes:0 (0.0 B) TX bytes:42184988 (40.2 MiB)
> > Interrupt:4
> >
> > sw00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B0
> > inet addr:172.30.42.65 Bcast:172.30.42.95 Mask:255.255.255.224
> > inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64 Scope:Global
> > inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> > RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:1000
> > RX bytes:15590189 (14.8 MiB) TX bytes:127357293 (121.4 MiB)
> >
> > root@buffy2-1:~# less /var/log/babeld.log
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > send: Cannot assign requested address
> > send: Cannot assign requested address
> > send: Cannot assign requested address
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > netlink_read: recvmsg(): No buffer space available
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
> > Couldn't determine channel of interface sw00: Invalid argument.
>
> This is a problem in babel detecting the channel on a "normal"
> rather than a mesh interface. It's bugged me a long while, but
> haven't got around to finding what triggers it. Might "fix" it by
> acquiring the channel at babel start time from /etc/config/wireless.
>
> It messes up the diversity routing calculation, grump.
>
> There is a possibility a logfile got really big, but this one
> generally doesn't, but I should turn off logging in some
> future release...
I believe I've tracked down part of what's going on. It looks like my tmpfs
fills up to 100% and then the device enters a bad state.

After 24 hours, with tmpfs at 50%, babeld.log is by far the largest file in
tmpfs and the only file that appears to be growing (based on `du`). It
takes about 48 hours from reboot to fill up tmpfs on my device.
# sort babeld.log | uniq -c | sort -rn | head
 503236 Couldn't determine channel of interface sw00: Invalid argument.
   1376 netlink_read: recvmsg(): No buffer space available
      3 send: Cannot assign requested address
# wc -l babeld.log
504617 babeld.log
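
As a rough sanity check on those counts, here is a small sketch that estimates how much space the dominant message alone takes up. It assumes each log line is stored as the plain ASCII message plus a single newline; the variable names are just for illustration.

```shell
#!/bin/sh
# Rough size estimate for the dominant babeld.log message.
# Assumption: each log line is the ASCII message text plus one newline byte.
msg="Couldn't determine channel of interface sw00: Invalid argument."
count=503236    # occurrences reported by `uniq -c` above

# Bytes per line = message length + 1 for the trailing newline
line_bytes=$(( ${#msg} + 1 ))

# Total bytes for this one message, and the same figure in whole MiB
total_bytes=$(( count * line_bytes ))
total_mib=$(( total_bytes / 1048576 ))

echo "bytes per line: $line_bytes"
echo "total: $total_bytes bytes (about $total_mib MiB)"
```

That works out to roughly 30 MiB for this one message. If the tmpfs is sized at about half of the 126336 KB of RAM reported by `free` (a common default, but not confirmed here), that would be in the same range as the 50% usage I saw.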
To confirm this, I hastened the failure by using `dd` to fill up tmpfs, and
the system immediately became unusable.

This also explains the luci session store errors, as sessions are stored in
tmpfs.
The other buffer issues may or may not be related to this.
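
For the troubleshooting page, a check along these lines might catch this earlier. This is only a minimal sketch: the function names and the threshold are hypothetical, nothing here ships with cerowrt, and it assumes `df -P`, `du -k` and `find -xdev` are available in the router's busybox build.

```shell
#!/bin/sh
# Hypothetical helpers (not part of cerowrt): report how full a filesystem is
# and which files under it are largest.

# usage_percent DIR
# Print the capacity figure for the filesystem holding DIR, without the "%".
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# largest_files DIR [N]
# Print the N (default 5) largest regular files under DIR as "KiB path",
# largest first, without crossing into other filesystems.
largest_files() {
    find "$1" -xdev -type f -exec du -k {} \; 2>/dev/null |
        sort -rn | head -n "${2:-5}"
}

# warn_if_full DIR LIMIT
# Print a warning and return 1 if DIR's filesystem is at or above LIMIT percent.
warn_if_full() {
    pct=$(usage_percent "$1")
    if [ "$pct" -ge "$2" ]; then
        echo "WARNING: $1 is ${pct}% full; largest files:"
        largest_files "$1"
        return 1
    fi
    return 0
}
```

Something like `warn_if_full /tmp 80` could then be run by hand or from cron; in my case I would expect it to point at babeld.log well before the router wedges.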
Best,
Steve