From: Steve Jenson <stevej@fruitless.org>
To: Dave Taht <dave.taht@gmail.com>
Cc: "cerowrt-devel@lists.bufferbloat.net"
<cerowrt-devel@lists.bufferbloat.net>
Subject: Re: [Cerowrt-devel] cerowrt issues (3.10.24-8)
Date: Mon, 27 Jan 2014 13:10:39 -0800
Message-ID: <CABiHZ_Uv8Tge9Qoh+hcgjmBw_3eUTA2JTLZv7e1YUWg-zS1-VA@mail.gmail.com>
In-Reply-To: <CABiHZ_XyhwQpJgU1soNXZ-4E+578aP0qsDAg1sZ0UxpSY=x5_w@mail.gmail.com>
Looking further, the buffer errors show up in syslog well before tmpfs
fills up. Is the memtester OpenWrt package available for CeroWrt? I don't
see it under `Available packages`.
Thanks,
Steve
On Mon, Jan 27, 2014 at 1:06 PM, Steve Jenson <stevej@fruitless.org> wrote:
> On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht <dave.taht@gmail.com> wrote:
>
>> On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson <stevej@fruitless.org>
>> wrote:
>> > Hi everybody,
>> >
>> > I've been using cerowrt as a secondary wifi network (just a single AP
>> for
>> > now) for a few weeks now. Recently, my wndr3800 got stuck in a bad
>> state and
>> > eventually rebooted. I've had this happen a few times now and am
>> looking for
>> > ways to debug the issue. I'm new to cerowrt and openwrt so any advice is
>> > appreciated.
>> >
>> > Since I use it as a secondary network, this is in no way critical.
>>
>> Yea! I appreciate caution before putting alpha software on your gw.
>>
>> > I'm not
>> > looking for free tech support but I couldn't find anything on the wiki
>> about
>> > troubleshooting. I'd love to start a page and write some shell scripts
>> to
>> > diagnose and report issues. I know that a cerowrt router is meant to be
>> a
>> > research project rather than a consumer device but these things seem helpful
>> > regardless.
>>
>> Sure, let me know your wiki account. I have been lax about granting
>> access of late as the signup process is overrun by spammers.
>
>
> My username is stevej on the wiki. Thanks!
>
>
>
>> > Please let me know if you'd prefer I not email the list with these
>> issues or
>> > if you'd rather I used trac or a different forum.
>> >
>>
>> The list is where most stuff happens. Also in the irc channel.
>>
>> If it gets to where it needs to be tracked we have a bugtracker at
>>
>> http://www.bufferbloat.net/projects/cerowrt/issues
>>
>> The first question I have is: Are you on Comcast? Cerowrt
>> had a dhcpv6-pd implementation that "just worked" from February through
>> December. Regrettably they changed the RA announcement interval
>> to a really low number around then... and this triggers a firewall reload
>> every minute on everything prior to the release I point to below.
>>
>> If there is a memory leak somewhere that would have triggered it.
>
>
> I am on AT&T ADSL2+ with a Motorola NVG510 modem.
>
>
>
>> > In this state, I can connect to the cerowrt base station via wifi but am
>> > unable to route packets to the internet. I can connect to :81 and see
>> the
>> > login page but logging in results in a lua error at `/cgi-bin/luci`
>> >
>> >
>> > /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function
>> > dispatcher target for entry '/'.
>> > The called action terminated with an exception:
>> > /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
>> > stack traceback:
>> > [C]: in function 'assert'
>> > /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
>> > /usr/lib/lua/luci/dispatcher.lua:195: in function
>> > </usr/lib/lua/luci/dispatcher.lua:194>
>> >
>> > I can ssh into the device and cat various log files until the router
>> hangs
>> > and reboots. Here are a few relevant lines from my terminal history
>> before the
>> > device rebooted (I'm assuming a watchdog kicked in and rebooted it).
>> >
>> > root@buffy2-1:~# ping google.com
>> > ping: bad address 'google.com'
>> > root@buffy2-1:~# free
>> >              total         used         free       shared      buffers
>> > Mem:        126336       110332        16004            0         5616
>> > -/+ buffers:             104716        21620
>> > Swap:            0            0            0
>> > root@buffy2-1:~# uptime
>> > 02:08:54 up 2 days, 1:26, load average: 0.10, 0.21, 0.17
>> > root@buffy2-1:~# dmesg
>> > [ 0.000000] Linux version 3.10.24 (cero2@snapon) (gcc version 4.6.4
>> > (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24
>> > 10:50:15 PST 2013
>> > [skipping some lines]
>> >
>> > [ 13.156250] Error: Driver 'gpio-keys-polled' is already registered,
>> > aborting...
>> > [ 19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready
>> > [ 19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
>> > [ 19.429687] se00: link up (1000Mbps/Full duplex)
>> > [ 22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready
>> > [ 23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready
>> > [ 23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
>> > [ 23.757812] ge00: link up (1000Mbps/Full duplex)
>> > [ 23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready
>> >
>> > root@buffy2-1:~# ifconfig
>> > ge00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B1
>> > inet addr:192.168.1.138 Bcast:192.168.1.255
>> Mask:255.255.255.0
>> > inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
>> > inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>> Scope:Global
>> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> > RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0
>> > TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0
>> > collisions:0 txqueuelen:1000
>> > RX bytes:229243410 (218.6 MiB) TX bytes:57304808 (54.6 MiB)
>> > Interrupt:5
>> >
>> > lo Link encap:Local Loopback
>> > inet addr:127.0.0.1 Mask:255.0.0.0
>> > inet6 addr: ::1/128 Scope:Host
>> > UP LOOPBACK RUNNING MTU:65536 Metric:1
>> > RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
>> > TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0
>> > collisions:0 txqueuelen:0
>> > RX bytes:2612713 (2.4 MiB) TX bytes:2612713 (2.4 MiB)
>> >
>> > pimreg Link encap:UNSPEC HWaddr
>> > 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>> > UP RUNNING NOARP MTU:1472 Metric:1
>> > RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> > TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>> > collisions:0 txqueuelen:0
>> > RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
>> >
>> > se00 Link encap:Ethernet HWaddr 2E:B0:5D:A0:C5:B0
>> > inet addr:172.30.42.1 Bcast:172.30.42.31
>> Mask:255.255.255.224
>> > inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>> Scope:Global
>>
>> How are you assigning your ipv6 addresses?
>>
>
> It's been a while since I messed with this but I think IPv6 is assigned
> thanks to 6relayd? My modem has IPv6 enabled but no DHCPv6 options that I
> can find. Here's how cerowrt is configured.
>
> root@buffy2-1:/overlay/etc/config# cat 6relayd
> config server 'default'
>     option fallback_relay 'rd dhcpv6 ndp'
>     list network 'ge00'
>     list network 'ge01'
>     list network 'gw00'
>     list network 'gw01'
>     list network 'gw10'
>     list network 'gw11'
>     list network 'se00'
>     list network 'sw00'
>     list network 'sw10'
>     option rd 'relay'
>     option dhcpv6 'relay'
>     option ndp 'relay'
>     option master 'ge00'
>
>
>> > inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
>> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> > RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> > TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0
>> > collisions:0 txqueuelen:1000
>> > RX bytes:0 (0.0 B) TX bytes:42184988 (40.2 MiB)
>> > Interrupt:4
>> >
>> > sw00 Link encap:Ethernet HWaddr 2C:B0:5D:A0:C5:B0
>> > inet addr:172.30.42.65 Bcast:172.30.42.95
>> Mask:255.255.255.224
>> > inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>> Scope:Global
>> > inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
>> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> > RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
>> > TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0
>> > collisions:0 txqueuelen:1000
>> > RX bytes:15590189 (14.8 MiB) TX bytes:127357293 (121.4 MiB)
>> >
>> > root@buffy2-1:~# less /var/log/babeld.log
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > send: Cannot assign requested address
>> > send: Cannot assign requested address
>> > send: Cannot assign requested address
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > netlink_read: recvmsg(): No buffer space available
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>> > Couldn't determine channel of interface sw00: Invalid argument.
>>
>> This is a problem in babel detecting the channel on a "normal"
>> rather than a mesh interface. It's bugged me for a long while, but I
>> haven't gotten around to finding what triggers it. Might "fix" it by
>> acquiring the channel at babel start time from /etc/config/wireless.
>>
>> It messes up the diversity routing calculation, grump.
>>
>> There is a possibility that a logfile got really big. This one
>> generally doesn't, but I should turn off logging in some
>> future release...
>
>
> I believe I've tracked down part of what's going on. It looks like my
> tmpfs is filling up to 100% and then the device enters a bad state:
>
> After 24 hours, with tmpfs at 50%, babeld.log is the largest file by far
> in tmpfs and the only file that appears to be growing (based on `du`). It
> takes about 48 hours from reboot to fill up tmpfs on my device.
>
> # sort babeld.log | uniq -c | sort -rn | head
> 503236 Couldn't determine channel of interface sw00: Invalid argument.
>   1376 netlink_read: recvmsg(): No buffer space available
>      3 send: Cannot assign requested address
>
> # wc -l babeld.log
> 504617 babeld.log
>
> I sped up system failure by using `dd` to fill up tmpfs and the system
> became immediately unusable.
>
> This also explains the luci session store errors as sessions are stored in
> tmpfs.
>
> The other buffer issues may or may not be related to this.
>
> Best,
> Steve
>
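The diagnosis quoted above (one log file growing until tmpfs is full) can
be turned into a small repeatable check. What follows is a minimal sketch,
not something shipped with cerowrt: the function names, the default of 10
lines, and the 80% threshold are assumptions made for illustration, and it
assumes a `df` that accepts `-P` plus the usual sort/uniq/awk tools.

```shell
#!/bin/sh
# Sketch only: log_histogram, fs_use_percent and check_tmpfs are
# hypothetical helper names, not part of cerowrt or OpenWrt.

# Print the most frequent distinct lines of a log file, most common first.
# $1 = log file, $2 = number of lines to show (assumed default: 10).
log_histogram() {
    sort "$1" | uniq -c | sort -rn | head -n "${2:-10}"
}

# Print the use percentage (digits only) of the filesystem holding $1.
# -P keeps each filesystem on one line, so the figure is field 5 of line 2.
fs_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report how full the filesystem holding $1 is; return non-zero once
# usage reaches the limit in $2 (assumed default: 80 percent).
check_tmpfs() {
    path=$1
    limit=${2:-80}
    used=$(fs_use_percent "$path")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full"
        return 1
    fi
    echo "OK: $path is ${used}% full"
}
```

With these defined, `log_histogram /var/log/babeld.log` reproduces the
count shown in the quoted message, and `check_tmpfs /tmp 80` could be run
periodically, assuming /tmp is where tmpfs is mounted on the device.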