[Cerowrt-devel] cerowrt issues (3.10.24-8)

Sebastian Moeller moeller0 at gmx.de
Wed Jan 29 04:45:50 PST 2014


Hi Dave,

quick question: how does one turn off logging for babeld? It seems that, when daemonized, it defaults to logging to /var/log/babeld.log (or similar). Is setting the log file to /dev/null really the answer?
	(Since I have not yet resolved the IPv6 issue, I assume babeld is unhappy.) I resorted to stopping babeld completely, but that feels like a crutch…
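
If /dev/null turns out not to be the right answer, one stopgap might be to cap the log instead of disabling it. Below is a minimal sketch, assuming the log really lives at /var/log/babeld.log and that truncating it under a running babeld is harmless (I have verified neither; it depends on how babeld opens the file). truncate_if_large is a made-up helper name, not something that ships with cerowrt, and the byte limit is arbitrary.

```shell
# Hypothetical helper: empty a log file once it grows past a byte limit.
# Usage: truncate_if_large <file> <limit-in-bytes>
truncate_if_large() {
    log="$1"
    limit="$2"

    # Nothing to do if the file does not exist.
    [ -f "$log" ] || return 0

    # Arithmetic expansion normalizes any padding in wc's output.
    size=$(($(wc -c < "$log")))

    if [ "$size" -gt "$limit" ]; then
        # Truncate in place rather than delete, so the path and inode stay the same.
        : > "$log"
    fi
    return 0
}
```

Something like `truncate_if_large /var/log/babeld.log 65536` could then be run from cron every few minutes, which should keep tmpfs from filling up while still leaving the most recent messages around for debugging.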

Best
	Sebastian



On Jan 27, 2014, at 22:14 , Dave Taht <dave.taht at gmail.com> wrote:

> certainly turn off the babeld log! I will leave it off in the next release.
> 
> On Mon, Jan 27, 2014 at 4:10 PM, Steve Jenson <stevej at fruitless.org> wrote:
>> Looking more, the buffer errors are showing up in syslog well before tmpfs
>> fills up. Is the memtester openwrt package available for cerowrt? I don't
>> see it under `Available packages`.
>> 
>> Thanks,
>> Steve
>> 
>> 
>> On Mon, Jan 27, 2014 at 1:06 PM, Steve Jenson <stevej at fruitless.org> wrote:
>>> 
>>> On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht <dave.taht at gmail.com> wrote:
>>>> 
>>>> On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson <stevej at fruitless.org> wrote:
>>>>> Hi everybody,
>>>>> 
>>>>> I've been using cerowrt as a secondary wifi network (just a single AP for
>>>>> now) for a few weeks now. Recently, my wndr3800 got stuck in a bad state and
>>>>> eventually rebooted. I've had this happen a few times now and am looking for
>>>>> ways to debug the issue. I'm new to cerowrt and openwrt so any advice is
>>>>> appreciated.
>>>>> 
>>>>> Since I use it as a secondary network, this is in no way critical.
>>>> 
>>>> Yea! I appreciate caution before putting alpha software on your gw.
>>>> 
>>>>> I'm not looking for free tech support but I couldn't find anything on the wiki
>>>>> about troubleshooting. I'd love to start a page and write some shell scripts to
>>>>> diagnose and report issues. I know that a cerowrt router is meant to be a
>>>>> research project rather than a consumer device but these things seem helpful
>>>>> regardless.
>>>> 
>>>> Sure, let me know your wiki account. I have been lax about granting
>>>> access of late as the signup process is overrun by spammers.
>>> 
>>> 
>>> My username is stevej on the wiki. Thanks!
>>> 
>>> 
>>>> 
>>>>> Please let me know if you'd prefer I not email the list with these issues or
>>>>> if you'd rather I used trac or a different forum.
>>>>> 
>>>> 
>>>> The list is where most stuff happens. Also in the irc channel.
>>>> 
>>>> If it gets to where it needs to be tracked we have a bugtracker at
>>>> 
>>>> http://www.bufferbloat.net/projects/cerowrt/issues
>>>> 
>>>> The first question I have is: Are you on Comcast? Cerowrt
>>>> had a dhcpv6-pd implementation that "just worked" from February through
>>>> December. Regrettably they changed the RA announcement interval
>>>> to a really low number around then... and this triggers a firewall reload
>>>> every minute on everything prior to the release I point to below.
>>>> 
>>>> If there is a memory leak somewhere that would have triggered it.
>>> 
>>> 
>>> I am on AT&T ADSL2+ with a Motorola NVG510 modem.
>>> 
>>> 
>>>> 
>>>>> In this state, I can connect to the cerowrt base station via wifi but am
>>>>> unable to route packets to the internet. I can connect to :81 and see the
>>>>> login page but logging in results in a lua error at `/cgi-bin/luci`
>>>>> 
>>>>> 
>>>>>  /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute function
>>>>> dispatcher target for entry '/'.
>>>>>  The called action terminated with an exception:
>>>>>  /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
>>>>>  stack traceback:
>>>>>  [C]: in function 'assert'
>>>>>  /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
>>>>>  /usr/lib/lua/luci/dispatcher.lua:195: in function
>>>>> </usr/lib/lua/luci/dispatcher.lua:194>
>>>>> 
>>>>> I can ssh into the device and cat various log files until the router hangs
>>>>> and reboots. Here are a few relevant lines from my terminal history before the
>>>>> device rebooted (I'm assuming a watchdog kicked in and rebooted it).
>>>>> 
>>>>> root at buffy2-1:~# ping google.com
>>>>> ping: bad address 'google.com'
>>>>> root at buffy2-1:~# free
>>>>>             total         used         free       shared      buffers
>>>>> Mem:        126336       110332        16004            0         5616
>>>>> -/+ buffers:             104716        21620
>>>>> Swap:            0            0            0
>>>>> root at buffy2-1:~# uptime
>>>>> 02:08:54 up 2 days,  1:26,  load average: 0.10, 0.21, 0.17
>>>>> root at buffy2-1:~# dmesg
>>>>> [    0.000000] Linux version 3.10.24 (cero2 at snapon) (gcc version 4.6.4
>>>>> (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24
>>>>> 10:50:15 PST 2013
>>>>> [skipping some lines]
>>>>> 
>>>>> [   13.156250] Error: Driver 'gpio-keys-polled' is already registered,
>>>>> aborting...
>>>>> [   19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not ready
>>>>> [   19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
>>>>> [   19.429687] se00: link up (1000Mbps/Full duplex)
>>>>> [   22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not ready
>>>>> [   23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes ready
>>>>> [   23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
>>>>> [   23.757812] ge00: link up (1000Mbps/Full duplex)
>>>>> [   23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes ready
>>>>> 
>>>>> root at buffy2-1:~# ifconfig
>>>>> ge00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B1
>>>>>          inet addr:192.168.1.138  Bcast:192.168.1.255
>>>>> Mask:255.255.255.0
>>>>>          inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
>>>>>          inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>> Scope:Global
>>>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>          RX packets:1469670 errors:0 dropped:8 overruns:0 frame:0
>>>>>          TX packets:547733 errors:0 dropped:0 overruns:0 carrier:0
>>>>>          collisions:0 txqueuelen:1000
>>>>>          RX bytes:229243410 (218.6 MiB)  TX bytes:57304808 (54.6 MiB)
>>>>>          Interrupt:5
>>>>> 
>>>>> lo        Link encap:Local Loopback
>>>>>          inet addr:127.0.0.1  Mask:255.0.0.0
>>>>>          inet6 addr: ::1/128 Scope:Host
>>>>>          UP LOOPBACK RUNNING  MTU:65536  Metric:1
>>>>>          RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
>>>>>          TX packets:23689 errors:0 dropped:0 overruns:0 carrier:0
>>>>>          collisions:0 txqueuelen:0
>>>>>          RX bytes:2612713 (2.4 MiB)  TX bytes:2612713 (2.4 MiB)
>>>>> 
>>>>> pimreg    Link encap:UNSPEC  HWaddr
>>>>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>          UP RUNNING NOARP  MTU:1472  Metric:1
>>>>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>          collisions:0 txqueuelen:0
>>>>>          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>>>>> 
>>>>> se00      Link encap:Ethernet  HWaddr 2E:B0:5D:A0:C5:B0
>>>>>          inet addr:172.30.42.1  Bcast:172.30.42.31
>>>>> Mask:255.255.255.224
>>>>>          inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>> Scope:Global
>>>> 
>>>> How are you assigning your ipv6 addresses?
>>> 
>>> 
>>> It's been a while since I messed with this but I think IPv6 is assigned
>>> thanks to 6relayd? My modem has IPv6 enabled but no DHCPv6 options that I
>>> can find. Here's how cerowrt is configured.
>>> 
>>> root at buffy2-1:/overlay/etc/config# cat 6relayd
>>> config server 'default'
>>>  option fallback_relay 'rd dhcpv6 ndp'
>>>  list network 'ge00'
>>>  list network 'ge01'
>>>  list network 'gw00'
>>>  list network 'gw01'
>>>  list network 'gw10'
>>>  list network 'gw11'
>>>  list network 'se00'
>>>  list network 'sw00'
>>>  list network 'sw10'
>>>  option rd 'relay'
>>>  option dhcpv6 'relay'
>>>  option ndp 'relay'
>>>  option master 'ge00'
>>> 
>>>> 
>>>>>          inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>          TX packets:191740 errors:0 dropped:0 overruns:0 carrier:0
>>>>>          collisions:0 txqueuelen:1000
>>>>>          RX bytes:0 (0.0 B)  TX bytes:42184988 (40.2 MiB)
>>>>>          Interrupt:4
>>>>> 
>>>>> sw00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B0
>>>>>          inet addr:172.30.42.65  Bcast:172.30.42.95
>>>>> Mask:255.255.255.224
>>>>>          inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>> Scope:Global
>>>>>          inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>          RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
>>>>>          TX packets:286967 errors:0 dropped:0 overruns:0 carrier:0
>>>>>          collisions:0 txqueuelen:1000
>>>>>          RX bytes:15590189 (14.8 MiB)  TX bytes:127357293 (121.4 MiB)
>>>>> 
>>>>> root at buffy2-1:~# less /var/log/babeld.log
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> send: Cannot assign requested address
>>>>> send: Cannot assign requested address
>>>>> send: Cannot assign requested address
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> netlink_read: recvmsg(): No buffer space available
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>> 
>>>> This is a problem in babel detecting the channel on a "normal"
>>>> rather than a mesh interface. It's bugged me a long while, but
>>>> haven't got around to finding what triggers it. Might "fix" it by
>>>> acquiring the channel at babel start time from /etc/config/wireless.
>>>> 
>>>> It messes up the diversity routing calculation, grump.
>>>> 
>>>> There is a possibility a logfile got really big, but this one
>>>> generally doesn't, but I should turn off logging in some
>>>> future release...
>>> 
>>> 
>>> I believe I've tracked down part of what's going on. It looks like my
>>> tmpfs is filling up 100% and then the device enters a bad state:
>>> 
>>> After 24 hours, with tmpfs at 50%, babeld.log is the largest file by far
>>> in tmpfs and the only file that appears to be growing (based on `du`). It
>>> takes about 48 hours from reboot to fill up tmpfs on my device.
>>> 
>>> #  sort babeld.log | uniq -c |sort -rn |head
>>> 
>>> 503236 Couldn't determine channel of interface sw00: Invalid argument.
>>> 
>>>   1376 netlink_read: recvmsg(): No buffer space available
>>> 
>>>      3 send: Cannot assign requested address
>>> 
>>> # wc -l babeld.log
>>> 
>>> 504617 babeld.log
>>> 
>>> I sped up system failure by using `dd` to fill up tmpfs and the system
>>> became immediately unusable.
>>> 
>>> This also explains the luci session store errors as sessions are stored in
>>> tmpfs.
>>> 
>>> The other buffer issues may or may not be related to this.
>>> 
>>> Best,
>>> Steve
>> 
>> 
> 
> 
> 
> -- 
> Dave Täht
> 
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel


