Development issues regarding the cerowrt test router project
* [Cerowrt-devel] cerowrt_stability?
@ 2014-06-06 14:46 Dave Taht
  2014-06-07 12:38 ` Török Edwin
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Taht @ 2014-06-06 14:46 UTC (permalink / raw)
  To: cerowrt-devel

1) How many are encountering bug 442 regularly?

Getting it to occur is hard for me. I've only seen it once in the last
several weeks; jg can have it happen inside of two days. It mostly
seems to occur in conditions of poor signal strength, as near as I can tell.

2) aside from that, how are things?

3) I am looking over progress on the homenet and homewrt front and
it is looking close to time to drop avahi in favor of mdns proxy,
and get hnetd up and running for ipv6 prefix distribution on
secondary routers. The code is still kind of raw - and requires things like
using a different dhcp server to work right - but as a way of configuring
interior routers it is the present path of the homenet working group,
and it would be good to get more eyeballs on how to make it work
right with cero.

http://tools.ietf.org/html/draft-stenberg-homenet-hncp-00

http://www.homewrt.org/doku.php?id=run-conf#configuring_and_starting_hnet


-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article


* Re: [Cerowrt-devel] cerowrt_stability?
  2014-06-06 14:46 [Cerowrt-devel] cerowrt_stability? Dave Taht
@ 2014-06-07 12:38 ` Török Edwin
  2014-06-07 17:55   ` Dave Taht
  0 siblings, 1 reply; 5+ messages in thread
From: Török Edwin @ 2014-06-07 12:38 UTC (permalink / raw)
  To: cerowrt-devel

[-- Attachment #1: Type: text/plain, Size: 2450 bytes --]

On 06/06/2014 05:46 PM, Dave Taht wrote:
> 1) How many are encountering bug 442 regularly?
> 
> Getting it to occur is hard for me. I've only seen it once in the last
> several weeks; jg can have it happen inside of two days. It mostly
> seems to occur in conditions of poor signal strength, as near as I can tell.
> 
> 2) aside from that, how are things?

I am quite happy with 3.10.40-5, although I might try -6 soon as apparently that has a new dnsmasq.
I don't use wireless that often these days, so I can't say anything about that bug.
I didn't have any trouble with DNSSEC with the default config.
IPv6 works reliably too, in fact it is too reliable :) It happened once that DHCP/IPv4 was broken, but IPv6 still worked.

There are just 2 strange things that weren't reproducible [1], sorry that I can't give you more than anecdotal evidence:
1.  I booted the router, the interface went up, I read my email, and tried to search on startpage.com, but it was down (or so I thought),
so I searched Google, but then none of the links in Google worked ... ah, there was no IPv4 address ...
  Running DHCP didn't give me anything, and manually setting an IP address on eth0 didn't help either, as I wasn't able to ping / ssh the router over IPv4
(and apparently ssh doesn't work over IPv6, though that might be my fault).
  I was able to open the web interface over IPv6 and tried restarting dnsmasq, but that didn't fix things, so I just rebooted the router and then it worked.

2. At some point my internet speed and latency became very bad. I don't know whether this was my ISP's fault or not, but a reboot fixed it.

A ping looked like this (over an ethernet connection):
PING www.google.com (173.194.44.52) 56(84) bytes of data.
64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=1 ttl=55 time=2467 ms
64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=2 ttl=55 time=2502 ms
64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=3 ttl=55 time=2349 ms
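
For anyone who wants to watch for this kind of lag automatically, here is a minimal sketch that pulls the round-trip times out of ping output like the lines above and flags a suspiciously high average. The 500 ms threshold and the function names are assumptions chosen for illustration, not anything cerowrt ships.

```python
import re
from statistics import mean

# The three reply lines from the ping output quoted above.
PING_OUTPUT = """\
64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=1 ttl=55 time=2467 ms
64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=2 ttl=55 time=2502 ms
64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=3 ttl=55 time=2349 ms
"""

# Matches the "time=<number> ms" field of each reply line.
RTT_RE = re.compile(r"time=([\d.]+) ms")


def parse_rtts(text):
    """Return the round-trip times (in ms) found in ping reply lines."""
    return [float(m.group(1)) for m in RTT_RE.finditer(text)]


def is_lagged(rtts, threshold_ms=500.0):
    """Hypothetical check: True when the mean RTT exceeds threshold_ms."""
    return bool(rtts) and mean(rtts) > threshold_ms


if __name__ == "__main__":
    rtts = parse_rtts(PING_OUTPUT)
    print("RTTs (ms):", rtts)
    print("mean RTT (ms):", round(mean(rtts), 1))
    print("lagged:", is_lagged(rtts))
```

Run against a longer capture, the same two functions give a quick yes/no on whether the line is in the state described here.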

The good news is that I had a VoIP call running at the time, and I could still understand everyone; in fact I only noticed
something was wrong when I finished the call and people told me I was very lagged with my replies.
I think that's a success for cerowrt's SQM, that VoIP was still able to work even on a very lagged and slow line :)

P.S. I use a variation of the attached script to configure my router, which is based on the script from the wiki.

Best regards,
--Edwin

[-- Attachment #2: script.sh --]
[-- Type: application/x-shellscript, Size: 7001 bytes --]


* Re: [Cerowrt-devel] cerowrt_stability?
  2014-06-07 12:38 ` Török Edwin
@ 2014-06-07 17:55   ` Dave Taht
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Taht @ 2014-06-07 17:55 UTC (permalink / raw)
  To: Török Edwin; +Cc: cerowrt-devel

On Sat, Jun 7, 2014 at 5:38 AM, Török Edwin <edwin+ml-cerowrt@etorok.net> wrote:
> On 06/06/2014 05:46 PM, Dave Taht wrote:
>> 1) How many are encountering bug 442 regularly?
>>
>> Getting it to occur is hard for me. I've only seen it once in the last
>> several weeks; jg can have it happen inside of two days. It mostly
>> seems to occur in conditions of poor signal strength, as near as I can tell.
>>
>> 2) aside from that, how are things?
>
> I am quite happy with 3.10.40-5, although I might try -6 soon as apparently that has a new dnsmasq.
> I don't use wireless that often these days, so I can't say anything about that bug.
> I didn't have any trouble with DNSSEC with the default config.
> IPv6 works reliably too, in fact it is too reliable :) It happened once that DHCP/IPv4 was broken, but IPv6 still worked.
>
> There are just 2 strange things that weren't reproducible [1], sorry that I can't give you more than anecdotal evidence:
> 1.  I booted the router, the interface went up, I read my email, and tried to search on startpage.com, but it was down (or so I thought),
> so I searched Google, but then none of the links in Google worked ... ah, there was no IPv4 address ...
>   Running DHCP didn't give me anything, and manually setting an IP address on eth0 didn't help either, as I wasn't able to ping / ssh the router over IPv4
> (and apparently ssh doesn't work over IPv6, though that might be my fault).
>   I was able to open the web interface over IPv6 and tried restarting dnsmasq, but that didn't fix things, so I just rebooted the router and then it worked.

One of the problems is that openwrt is moving away from using dnsmasq
as a dhcp server, in favor of their tightly integrated odhcpd server.
This is the opposite of the direction I'd prefer: I'd like addressing
and naming to be more closely tied together than they are (and dnsmasq's
dhcp and dhcpv6 implementations are more mature). But the size and
complexity of the dnsmasq code base is intimidating, the functionality
needed is tied to ubus, and dhcp is a fairly simple protocol to
implement, so there we are.

I have been in cases where odhcpd AND dnsmasq both get enabled for one
reason or another, and bad things happen. Currently the new hnetd code
relies on odhcpd, not dnsmasq, for dhcp service.
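
A quick way to spot the "both daemons up" situation is to look at the process list. The sketch below is only an illustration: the sample `ps` text is made up rather than captured from a router, the function names are hypothetical, and seeing both processes does not by itself prove that both are answering dhcp, since dnsmasq may be running for DNS only.

```python
# Hypothetical sample of ps-style output; NOT captured from a real router.
SAMPLE_PS = """\
  PID USER       VSZ STAT COMMAND
 1201 root      1500 S    /usr/sbin/odhcpd
 2442 nobody    1100 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf
 2500 root      1400 S    /usr/sbin/dropbear -p 22
"""

# The two daemons named in the discussion above.
DHCP_DAEMONS = ("dnsmasq", "odhcpd")


def running_daemons(ps_output, names=DHCP_DAEMONS):
    """Return the subset of `names` that show up as a command in ps-style text."""
    found = set()
    for line in ps_output.splitlines():
        for field in line.split():
            # Compare only the last path component, e.g. /usr/sbin/odhcpd -> odhcpd.
            base = field.rsplit("/", 1)[-1]
            if base in names:
                found.add(base)
    return found


def both_dhcp_daemons_running(ps_output):
    """True when dnsmasq and odhcpd both appear in the process list."""
    return running_daemons(ps_output) == set(DHCP_DAEMONS)


if __name__ == "__main__":
    print(sorted(running_daemons(SAMPLE_PS)))
    print(both_dhcp_daemons_running(SAMPLE_PS))
```

On a real box the text would come from the router's own `ps` output, and a hit is only a hint to go and check which of the two is configured to serve dhcp.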

There is also an open bug in dnsmasq (a race condition) that results in
it running away and giving you the result you had. There's a fix for it
in 2.72beta2.

>
> 2. At some point my internet speed and latency became very bad. I don't know whether this was my ISP's fault or not, but a reboot fixed it.
>
> A ping looked like this (over an ethernet connection):
> PING www.google.com (173.194.44.52) 56(84) bytes of data.
> 64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=1 ttl=55 time=2467 ms
> 64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=2 ttl=55 time=2502 ms
> 64 bytes from muc03s08-in-f20.1e100.net (173.194.44.52): icmp_seq=3 ttl=55 time=2349 ms
>

I have seen this in a specific situation: a ping flood, or (in my
case) an accidentally implemented version of tcp that behaved more like
tcp relentless. I ended up pouring packets at 1gigE into the router,
which can only handle 300mbit forwarding at best. In this case
simple.qos was enabled, so only 20mbit was egressing.

So although htb and fq_codel kept working, the cpu was overloaded and
packet buffers remained full, leading to really huge lag in the range
you mention, especially in simple.qos (which deprioritizes ping).
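
As a back-of-the-envelope check on those numbers, the sketch below relates queued bytes to delay at a given drain rate. It assumes, purely for illustration, that the whole ~2.4 s of delay sits in a queue draining at the 20mbit shaped rate; in the overload case described above, cpu starvation and full driver buffers contribute too, so treat the result as an order-of-magnitude figure. The function and constant names are made up for this example.

```python
# Rates mentioned above, in bits per second.
INGRESS_BPS = 1_000_000_000   # 1gigE arriving at the router
FORWARD_BPS = 300_000_000     # best-case forwarding rate
SHAPED_BPS = 20_000_000       # egress rate with simple.qos enabled


def queue_delay_s(backlog_bytes, drain_rate_bps):
    """Seconds needed to drain backlog_bytes at drain_rate_bps."""
    return backlog_bytes * 8 / drain_rate_bps


def backlog_for_delay(delay_s, drain_rate_bps):
    """Bytes that must be queued to cause delay_s at drain_rate_bps."""
    return delay_s * drain_rate_bps / 8


if __name__ == "__main__":
    # How far the offered load exceeds what the router can forward or shape.
    print("overload vs forwarding:", INGRESS_BPS / FORWARD_BPS)
    print("overload vs shaper:", INGRESS_BPS / SHAPED_BPS)
    # Backlog implied by ~2.4 s of delay at the shaped rate (assumption, see above).
    print("implied backlog (bytes):", backlog_for_delay(2.4, SHAPED_BPS))
```

Under that assumption the ping times reported earlier correspond to several megabytes of standing data, which is at least consistent with "packet buffers remained full".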

I've thought about improving or removing the overload search of the
flow space in fq_codel for this reason. What happens when you exceed
the packet limit is that it does a complete search of the flow space
to find the biggest flow (scanning 4k of data each time), and drops
packets from that flow. Dropping tail in that case would be simpler
and nearly as effective. Bumping up codel's drop rate faster in that
sort of situation might be good too. Still, the root of that problem
to me was that we can take gigE in but only put 300mbit out, so I
don't know if it would have done any good in that case.
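
To make the overload path concrete, here is a toy model of it. This is not the kernel code: the class, its names, the tiny limit, and the choice to drop from the head of the fattest flow are all simplifications for illustration. It keeps one backlog counter per flow, and when the shared packet limit is exceeded it scans every counter to find the biggest flow and drops from it; the tail-drop variant simply refuses the arriving packet.

```python
from collections import deque


class ToyFlowQueue:
    """Toy model of per-flow queues sharing one packet limit (not kernel code)."""

    def __init__(self, num_flows=1024, limit=4):
        self.flows = [deque() for _ in range(num_flows)]
        self.backlog = [0] * num_flows  # bytes queued per flow
        self.limit = limit              # total packets allowed across all flows
        self.count = 0                  # packets currently queued

    def enqueue(self, flow_id, size):
        """Queue a packet; on overflow drop from the fattest flow.

        Returns (flow_index, size) of the dropped packet, or None.
        """
        idx = flow_id % len(self.flows)
        self.flows[idx].append(size)
        self.backlog[idx] += size
        self.count += 1
        if self.count > self.limit:
            return self.drop_from_fattest()
        return None

    def drop_from_fattest(self):
        """Scan every flow's backlog, then drop one packet from the largest."""
        # This linear scan over all flows is the per-drop cost discussed above.
        fattest = max(range(len(self.backlog)), key=self.backlog.__getitem__)
        size = self.flows[fattest].popleft()
        self.backlog[fattest] -= size
        self.count -= 1
        return fattest, size

    def enqueue_tail_drop(self, flow_id, size):
        """Simpler alternative: refuse the arriving packet when full, no scan."""
        idx = flow_id % len(self.flows)
        if self.count >= self.limit:
            return idx, size
        self.flows[idx].append(size)
        self.backlog[idx] += size
        self.count += 1
        return None


if __name__ == "__main__":
    q = ToyFlowQueue(num_flows=8, limit=4)
    for _ in range(4):
        q.enqueue(1, 1500)        # a bulk flow fills the queue
    print(q.enqueue(2, 100))      # the overflow is charged to the bulk flow
    print(q.enqueue_tail_drop(2, 100))  # tail drop penalizes the newcomer instead
```

The demo shows the trade-off: the scan protects the small flow at the cost of touching every flow on each overflow, while tail drop is cheap but hits whoever arrives last.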


> The good news is that I had a VoIP call running at the time, and I could still understand everyone; in fact I only noticed
> something was wrong when I finished the call and people told me I was very lagged with my replies.
> I think that's a success for cerowrt's SQM, that VoIP was still able to work even on a very lagged and slow line :)
>
> P.S. I use a variation of the attached script to configure my router, which is based on the script from the wiki.
>
> Best regards,
> --Edwin



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article


* Re: [Cerowrt-devel] cerowrt_stability?
  2014-06-07  3:28 R.
@ 2014-06-10  0:01 ` R.
  0 siblings, 0 replies; 5+ messages in thread
From: R. @ 2014-06-10  0:01 UTC (permalink / raw)
  Cc: cerowrt-devel

WiFi just hung. Had to physically reboot. Besides a bunch of DMA error
messages a few minutes earlier, nothing of particular interest is to
be found in the syslog.

Though I did notice that upon booting, the router had the wrong date:

[...]
Sat Jun  7 12:43:30 2014 user.notice dnsmasq-checkntp[2436]: Started checkntp. Date says: Sat Jun  7 12:43:30 EDT 2014. Sleeping for 5 seconds.
Mon Jun  9 19:36:29 2014 daemon.info dnsmasq[2442]: started, version 2.71 cachesize 5000
[...]
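
A small sketch for spotting this kind of clock jump in a saved syslog follows. It assumes every line starts with a 24-character timestamp in the format shown above and that day and month names are in English; the one-hour threshold and the function names are arbitrary choices for illustration. A long gap can also simply mean the router logged nothing for a while, so a hit is a hint, not proof.

```python
from datetime import datetime, timedelta

# The two log lines from the excerpt above, with the mail line wrapping undone.
LOG_LINES = [
    "Sat Jun  7 12:43:30 2014 user.notice dnsmasq-checkntp[2436]: Started "
    "checkntp. Date says: Sat Jun  7 12:43:30 EDT 2014. Sleeping for 5 seconds.",
    "Mon Jun  9 19:36:29 2014 daemon.info dnsmasq[2442]: started, version "
    "2.71 cachesize 5000",
]

STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"
STAMP_LEN = len("Sat Jun  7 12:43:30 2014")  # fixed-width timestamp prefix


def parse_stamp(line):
    """Parse the leading timestamp of one syslog line."""
    return datetime.strptime(line[:STAMP_LEN], STAMP_FORMAT)


def clock_jumps(lines, max_gap=timedelta(hours=1)):
    """Return (before, after) pairs where consecutive stamps differ by more than max_gap."""
    jumps = []
    prev = None
    for line in lines:
        cur = parse_stamp(line)
        if prev is not None and abs(cur - prev) > max_gap:
            jumps.append((prev, cur))
        prev = cur
    return jumps


if __name__ == "__main__":
    for before, after in clock_jumps(LOG_LINES):
        print("clock jumped from", before, "to", after, "delta", after - before)
```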

On Fri, Jun 6, 2014 at 11:28 PM, R. <redag2@gmail.com> wrote:
> My WNDR3800 on 3.10.40-6 crashed yesterday. APs disappeared for about
> 2 minutes and came back by themselves.
> Lots of clients on my router.
>
> Unfortunately, I didn't have logging enabled when that happened. From
> now on, I'll try to log to the USB port (flash disk). Oh, and DMA
> messages are aplenty.
>
> Overall, though, I'm pretty satisfied!


* Re: [Cerowrt-devel] cerowrt_stability?
@ 2014-06-07  3:28 R.
  2014-06-10  0:01 ` R.
  0 siblings, 1 reply; 5+ messages in thread
From: R. @ 2014-06-07  3:28 UTC (permalink / raw)
  To: Dave Taht; +Cc: cerowrt-devel

My WNDR3800 on 3.10.40-6 crashed yesterday. APs disappeared for about
2 minutes and came back by themselves.
Lots of clients on my router.

Unfortunately, I didn't have logging enabled when that happened. From
now on, I'll try to log to the USB port (flash disk). Oh, and DMA
messages are aplenty.

Overall, though, I'm pretty satisfied!

