[Cerowrt-devel] cerowrt issues (3.10.24-8)

Sebastian Moeller moeller0 at gmx.de
Wed Jan 29 12:44:52 EST 2014


On January 29, 2014 5:10:18 PM CET, Dave Taht <dave.taht at gmail.com> wrote:
>On Wed, Jan 29, 2014 at 4:45 AM, Sebastian Moeller <moeller0 at gmx.de>
>wrote:
>> Hi Dave,
>>
>> quick question, how does one turn of logging for babeld? It seems
>that if daemonized it defaults to logging to /var/log/babeld.log (or
>similar). Is setting the log file to /dev/null really the answer?
>
>seems so.     

      Okay, I guess I will try that then...


>
>>         (Since I have no the IPv6 issue not yet resolved, I assume
>babeld is unhappy) I resorted to stopping babeld completely, but that
>feels like a crutch...
>
>no daemon in an embedded system should ever write to flash in an
>uncontrollable manner.

     What bugs me is that it basically keeps repeating the same error over and over again. If it would rate limit and:or push messages to the system log it would be nicer.


>
>I will also argue that not being able to find the channel is a bug
>that messes with diversity routing
>in particular.

      I have actually not yet understood what it wants to tell me ;), since I got your attention, is there an easy way to run a babel client under macosx?


Best Regards
       Sebastian


>
>>
>> Best
>>         Sebastian
>>
>>
>>
>> On Jan 27, 2014, at 22:14 , Dave Taht <dave.taht at gmail.com> wrote:
>>
>>> certainly turn off the babeld log! I will leave it off in the next
>release.
>>>
>>> On Mon, Jan 27, 2014 at 4:10 PM, Steve Jenson <stevej at fruitless.org>
>wrote:
>>>> Looking more, the buffer errors are showing up in syslog well
>before tmpfs
>>>> fills up. Is the memtester openwrt package available for cerowrt? I
>don't
>>>> see it under `Available packages`.
>>>>
>>>> Thanks,
>>>> Steve
>>>>
>>>>
>>>> On Mon, Jan 27, 2014 at 1:06 PM, Steve Jenson
><stevej at fruitless.org> wrote:
>>>>>
>>>>> On Fri, Jan 24, 2014 at 3:23 PM, Dave Taht <dave.taht at gmail.com>
>wrote:
>>>>>>
>>>>>> On Fri, Jan 24, 2014 at 6:08 PM, Steve Jenson
><stevej at fruitless.org>
>>>>>> wrote:
>>>>>>> Hi everybody,
>>>>>>>
>>>>>>> I've been using cerowrt as a secondary wifi network (just a
>single AP
>>>>>>> for
>>>>>>> now) for a few weeks now. Recently, my wndr3800 got stuck in a
>bad
>>>>>>> state and
>>>>>>> eventually rebooted. I've had this happen a few times now and am
>>>>>>> looking for
>>>>>>> ways to debug the issue. I'm new to cerowrt and openwrt so any
>advice
>>>>>>> is
>>>>>>> appreciated.
>>>>>>>
>>>>>>> Since I use it as a secondary network, this is no way critical.
>>>>>>
>>>>>> Yea! I appreciate caution before putting alpha software on your
>gw.
>>>>>>
>>>>>>> I'm not
>>>>>>> looking for free tech support but I couldn't find anything on
>the wiki
>>>>>>> about
>>>>>>> troubleshooting. I'd love to start a page and write some shell
>scripts
>>>>>>> to
>>>>>>> diagnose and report issues. I know that a cerowrt router is
>meant to be
>>>>>>> a
>>>>>>> research project rather a consumer device but these things seem
>helpful
>>>>>>> regardless.
>>>>>>
>>>>>> Sure, let me know your wiki account. I have been lax about
>granting
>>>>>> access of late as the signup process is overrun by spammers.
>>>>>
>>>>>
>>>>> My username is stevej on the wiki. Thanks!
>>>>>
>>>>>
>>>>>>
>>>>>>> Please let me know if you'd prefer I not email the list with
>these
>>>>>>> issues or
>>>>>>> if you'd rather I used trac or a different forum.
>>>>>>>
>>>>>>
>>>>>> The list is where most stuff happens. Also in the irc channel.
>>>>>>
>>>>>> If it gets to where it needs to be tracked we have a bugtracker
>at
>>>>>>
>>>>>> http://www.bufferbloat.net/projects/cerowrt/issues
>>>>>>
>>>>>> The first question I have is: Are you on comcast? Cerowrt
>>>>>> had a dhcpv6-pd implementation that "just worked" from feburary
>through
>>>>>> december. Regrettably they changed the RA announcement interval
>>>>>> to a really low number around then... and this triggers a
>firewall reload
>>>>>> every minute on everything prior to the release I point to below.
>>>>>>
>>>>>> If there is a memory leak somewhere that would have triggered it.
>>>>>
>>>>>
>>>>> I am on AT&T ADSL2+ with a Motorola NVG510 modem.
>>>>>
>>>>>
>>>>>>
>>>>>>> In this state, I can connect to the cerowrt base station via
>wifi but
>>>>>>> am
>>>>>>> unable to route packets to the internet. I can connect to :81
>and see
>>>>>>> the
>>>>>>> login page but logging in results in a lua error at
>`/cgi-bin/luci`
>>>>>>>
>>>>>>>
>>>>>>>  /usr/lib/lua/luci/dispatcher.lua:448: Failed to execute
>function
>>>>>>> dispatcher target for entry '/'.
>>>>>>>  The called action terminated with an exception:
>>>>>>>  /usr/lib/lua/luci/sauth.lua:87: Session data invalid!
>>>>>>>  stack traceback:
>>>>>>>  [C]: in function 'assert'
>>>>>>>  /usr/lib/lua/luci/dispatcher.lua:448: in function 'dispatch'
>>>>>>>  /usr/lib/lua/luci/dispatcher.lua:195: in function
>>>>>>> </usr/lib/lua/luci/dispatcher.lua:194>
>>>>>>>
>>>>>>> I can ssh into the device and cat various log files until the
>router
>>>>>>> hangs
>>>>>>> and reboots. here's a few relevant lines from my terminal
>history
>>>>>>> before the
>>>>>>> device rebooted (I'm assuming a watchdog kicked in and rebooted
>it).
>>>>>>>
>>>>>>> root at buffy2-1:~# ping google.com
>>>>>>> ping: bad address 'google.com'
>>>>>>> root at buffy2-1:~# free
>>>>>>>             total         used         free       shared     
>buffers
>>>>>>> Mem:        126336       110332        16004            0       
> 5616
>>>>>>> -/+ buffers:             104716        21620
>>>>>>> Swap:            0            0            0
>>>>>>> root at buffy2-1:~# uptime
>>>>>>> 02:08:54 up 2 days,  1:26,  load average: 0.10, 0.21, 0.17
>>>>>>> root at buffy2-1:~# dmesg
>>>>>>> [    0.000000] Linux version 3.10.24 (cero2 at snapon) (gcc version
>4.6.4
>>>>>>> (OpenWrt/Linaro GCC 4.6-2013.05 r38226) ) #1 Tue Dec 24
>>>>>>> 10:50:15 PST 2013
>>>>>>> [skipping some lines]
>>>>>>>
>>>>>>> [   13.156250] Error: Driver 'gpio-keys-polled' is already
>registered,
>>>>>>> aborting...
>>>>>>> [   19.414062] IPv6: ADDRCONF(NETDEV_UP): ge00: link is not
>ready
>>>>>>> [   19.421875] ar71xx: pll_reg 0xb8050010: 0x11110000
>>>>>>> [   19.429687] se00: link up (1000Mbps/Full duplex)
>>>>>>> [   22.140625] IPv6: ADDRCONF(NETDEV_UP): sw00: link is not
>ready
>>>>>>> [   23.351562] IPv6: ADDRCONF(NETDEV_CHANGE): sw00: link becomes
>ready
>>>>>>> [   23.757812] ar71xx: pll_reg 0xb8050014: 0x11110000
>>>>>>> [   23.757812] ge00: link up (1000Mbps/Full duplex)
>>>>>>> [   23.773437] IPv6: ADDRCONF(NETDEV_CHANGE): ge00: link becomes
>ready
>>>>>>>
>>>>>>> root at buffy2-1:~# ifconfig
>>>>>>> ge00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B1
>>>>>>>          inet addr:192.168.1.138  Bcast:192.168.1.255
>>>>>>> Mask:255.255.255.0
>>>>>>>          inet6 addr: fe80::2eb0:5dff:fea0:c5b1/64 Scope:Link
>>>>>>>          inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>>>> Scope:Global
>>>>>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>>>          RX packets:1469670 errors:0 dropped:8 overruns:0
>frame:0
>>>>>>>          TX packets:547733 errors:0 dropped:0 overruns:0
>carrier:0
>>>>>>>          collisions:0 txqueuelen:1000
>>>>>>>          RX bytes:229243410 (218.6 MiB)  TX bytes:57304808 (54.6
>MiB)
>>>>>>>          Interrupt:5
>>>>>>>
>>>>>>> lo        Link encap:Local Loopback
>>>>>>>          inet addr:127.0.0.1  Mask:255.0.0.0
>>>>>>>          inet6 addr: ::1/128 Scope:Host
>>>>>>>          UP LOOPBACK RUNNING  MTU:65536  Metric:1
>>>>>>>          RX packets:23689 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>          TX packets:23689 errors:0 dropped:0 overruns:0
>carrier:0
>>>>>>>          collisions:0 txqueuelen:0
>>>>>>>          RX bytes:2612713 (2.4 MiB)  TX bytes:2612713 (2.4 MiB)
>>>>>>>
>>>>>>> pimreg    Link encap:UNSPEC  HWaddr
>>>>>>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>>>>>>>          UP RUNNING NOARP  MTU:1472  Metric:1
>>>>>>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>>>>>          collisions:0 txqueuelen:0
>>>>>>>          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
>>>>>>>
>>>>>>> se00      Link encap:Ethernet  HWaddr 2E:B0:5D:A0:C5:B0
>>>>>>>          inet addr:172.30.42.1  Bcast:172.30.42.31
>>>>>>> Mask:255.255.255.224
>>>>>>>          inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>>>> Scope:Global
>>>>>>
>>>>>> How are you assigning your ipv6 addresses?
>>>>>
>>>>>
>>>>> It's been a while since I messed with this but I think IPv6 is
>assigned
>>>>> thanks to 6relayd? My modem has IPv6 enabled but no DHCPv6 options
>that I
>>>>> can find. Here's how cerowrt is configured.
>>>>>
>>>>> root at buffy2-1:/overlay/etc/config# cat 6relayd
>>>>> config server 'default'
>>>>>  option fallback_relay 'rd dhcpv6 ndp'
>>>>>  list network 'ge00'
>>>>>  list network 'ge01'
>>>>>  list network 'gw00'
>>>>>  list network 'gw01'
>>>>>  list network 'gw10'
>>>>>  list network 'gw11'
>>>>>  list network 'se00'
>>>>>  list network 'sw00'
>>>>>  list network 'sw10'
>>>>>  option rd 'relay'
>>>>>  option dhcpv6 'relay'
>>>>>  option ndp 'relay'
>>>>>  option master 'ge00'
>>>>>
>>>>>>
>>>>>>>          inet6 addr: fe80::2cb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>>>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>          TX packets:191740 errors:0 dropped:0 overruns:0
>carrier:0
>>>>>>>          collisions:0 txqueuelen:1000
>>>>>>>          RX bytes:0 (0.0 B)  TX bytes:42184988 (40.2 MiB)
>>>>>>>          Interrupt:4
>>>>>>>
>>>>>>> sw00      Link encap:Ethernet  HWaddr 2C:B0:5D:A0:C5:B0
>>>>>>>          inet addr:172.30.42.65  Bcast:172.30.42.95
>>>>>>> Mask:255.255.255.224
>>>>>>>          inet6 addr: 2602:30a:2cdb:330:2eb0:5dff:fea0:c5b1/64
>>>>>>> Scope:Global
>>>>>>>          inet6 addr: fe80::2eb0:5dff:fea0:c5b0/64 Scope:Link
>>>>>>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>>>          RX packets:70239 errors:0 dropped:0 overruns:0 frame:0
>>>>>>>          TX packets:286967 errors:0 dropped:0 overruns:0
>carrier:0
>>>>>>>          collisions:0 txqueuelen:1000
>>>>>>>          RX bytes:15590189 (14.8 MiB)  TX bytes:127357293 (121.4
>MiB)
>>>>>>>
>>>>>>> root at buffy2-1:~# less /var/log/babeld.log
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> send: Cannot assign requested address
>>>>>>> send: Cannot assign requested address
>>>>>>> send: Cannot assign requested address
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> netlink_read: recvmsg(): No buffer space available
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>> Couldn't determine channel of interface sw00: Invalid argument.
>>>>>>
>>>>>> This is a problem in babel detecting the channel on a "normal"
>>>>>> rather than a mesh interface. It's bugged me a long while, but
>>>>>> haven't got around to finding what triggers it. Might "fix" it by
>>>>>> acquiring the channel at babel start time from
>/etc/config/wireless.
>>>>>>
>>>>>> It messes up the diversity routing calculation, grump.
>>>>>>
>>>>>> There is a possibility a logfile got really big, but this one
>>>>>> generally doesn't, but I should turn off logging in some
>>>>>> future release...
>>>>>
>>>>>
>>>>> I believe I've tracked down part of what's going on. It looks like
>my
>>>>> tmpfs is filling up 100% and then the device enters a bad state:
>>>>>
>>>>> After 24 hours, with tmpfs at 50%, babeld.log is the largest file
>by far
>>>>> in tmpfs and the only file that appears to be growing (based on
>`du`). It
>>>>> takes about 48 hours from reboot to fill up tmpfs on my device.
>>>>>
>>>>> #  sort babeld.log | uniq -c |sort -rn |head
>>>>>
>>>>> 503236 Couldn't determine channel of interface sw00: Invalid
>argument.
>>>>>
>>>>>   1376 netlink_read: recvmsg(): No buffer space available
>>>>>
>>>>>      3 send: Cannot assign requested address
>>>>>
>>>>> # wc -l babeld.log
>>>>>
>>>>> 504617 babeld.log
>>>>>
>>>>> I sped up system failure by using `dd` to fill up tmpfs and the
>system
>>>>> became immediately unusable.
>>>>>
>>>>> This also explains the luci session store errors as sessions are
>stored in
>>>>> tmpfs.
>>>>>
>>>>> The other buffer issues may or may not be related to this.
>>>>>
>>>>> Best,
>>>>> Steve
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Dave Täht
>>>
>>> Fixing bufferbloat with cerowrt:
>http://www.teklibre.com/cerowrt/subscribe.html
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel at lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>

Hi Dave,
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.



More information about the Cerowrt-devel mailing list