From: Sebastian Moeller <moeller0@gmx.de>
To: Dave Taht <dave.taht@gmail.com>
Cc: "cerowrt-devel@lists.bufferbloat.net"
<cerowrt-devel@lists.bufferbloat.net>
Subject: Re: [Cerowrt-devel] Wireless failures 3.10.17-3
Date: Wed, 11 Dec 2013 21:41:30 +0100
Message-ID: <F9A80BD8-AFF6-457C-9C9F-9169B66D90C8@gmx.de>
In-Reply-To: <CAA93jw4V3EiGe7Jg+oNqaxSHJ4GfnJAb8pB+Umq959FU07YaVA@mail.gmail.com>
Hi List, hi Dave,
On Dec 11, 2013, at 19:41 , Dave Taht <dave.taht@gmail.com> wrote:
> I have the regrettable problem of mostly testing the 5ghz channel due
> to interference issues on the 2ghz band.
>
> What I am seeing in the last several releases of the 3.8.x and 3.10
> series is after tons of traffic and multiple days of uptime a DMA tx
> error which you can see via the logread or dmesg tool, and once it
> happens, at least sometimes, that radio can "go away" and not be
> resettable. "cannot stop tx dma" is the error.
I think I can make the error appear "at will" by running netperf-wrapper against my wndr3700v2. I just tested this under 3.10.21-1:
./netperf-wrapper -l 300 -H gw.home.lan rrul -p all -t hms-beagle_cerowrt3.10.21-1_2_nacktmulle
dmesg on the router:
[ 53.007812] IPv6: ADDRCONF(NETDEV_CHANGE): gw11: link becomes ready
[28792.039062] ath: phy1: Failed to stop TX DMA, queues=0x00e!
[28794.078125] ath: phy1: Failed to stop TX DMA, queues=0x00e!
[28807.164062] ath: phy1: Failed to stop TX DMA, queues=0x00e!
[28809.191406] ath: phy1: Failed to stop TX DMA, queues=0x002!
[28823.269531] ath: phy1: Failed to stop TX DMA, queues=0x00e!
dmesg was clean before, so these 5 failures are from the rrul test over the 5GHz radio.
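When the log is not clean beforehand, one way to attribute messages to a single test run is to snapshot dmesg before and after the run and keep only the new lines. The following is a minimal Python sketch of that idea; the function name and the sample strings are illustrative, and capturing the snapshots (for example over ssh from the test client) is assumed to happen elsewhere.

```python
def new_dmesg_lines(before, after):
    """Return the lines of the `after` snapshot that are not in `before`.

    Lines are compared verbatim. Each dmesg line carries its own timestamp,
    so a repeated message still shows up as a new line.
    """
    seen = set(before.splitlines())
    return [line for line in after.splitlines() if line not in seen]


# Illustrative snapshots built from lines quoted in this mail.
before = "[   53.007812] IPv6: ADDRCONF(NETDEV_CHANGE): gw11: link becomes ready\n"
after = before + (
    "[28792.039062] ath: phy1: Failed to stop TX DMA, queues=0x00e!\n"
    "[28794.078125] ath: phy1: Failed to stop TX DMA, queues=0x00e!\n"
)

for line in new_dmesg_lines(before, after):
    print(line)
```

Comparing whole lines keeps the sketch independent of the exact message format, so it works for any kernel message, not only the ath ones.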
Running the same over the 2.4GHz radio adds the following:
[29200.921875] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29206.980468] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29209.019531] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29211.066406] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29215.109375] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29227.195312] ath: phy0: Failed to stop TX DMA, queues=0x006!
[29233.257812] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29238.308593] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29240.351562] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29247.417968] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29251.480468] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29253.515625] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29256.558593] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29262.617187] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29264.652343] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29269.699218] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29273.750000] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29278.804687] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29281.859375] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29291.933593] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29294.972656] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29304.050781] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29312.117187] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29315.167968] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29322.246093] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29325.292968] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29330.355468] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29332.390625] ath: phy0: Failed to stop TX DMA, queues=0x00a!
[29334.445312] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29336.484375] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29337.527343] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29343.617187] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29349.679687] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29358.757812] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29361.816406] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29363.851562] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29364.882812] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29370.937500] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29371.976562] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29376.031250] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29378.062500] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29381.105468] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29388.175781] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29393.230468] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29401.292968] ath: phy0: Failed to stop TX DMA, queues=0x003!
[29403.332031] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29413.429687] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29417.480468] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29422.542968] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29424.582031] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29427.636718] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29429.671875] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29431.718750] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29433.765625] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29445.835937] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29449.898437] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29454.960937] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29461.023437] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29463.062500] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29466.117187] ath: phy0: Failed to stop TX DMA, queues=0x00f!
I have to admit that before today I never tested with 2.4GHz and only saw the 4 to 5 messages in the 5GHz band.
Running the same over the wired interface does not cause these messages…
And running from a 5GHz client through the router to a wired client (both on the internal side) just adds:
[30643.500000] ath: phy1: Failed to stop TX DMA, queues=0x00c!
[30736.898437] ath: phy1: Failed to stop TX DMA, queues=0x00e!
It does not immediately lead to a drop of the radio though...
Maybe this can be helpful in the hands of a real expert?
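To make logs like the ones above easier to compare between radios and releases, here is a minimal Python sketch that tallies the messages per radio and per queues= value. It assumes that queues= is a bitmask with one bit per hardware TX queue; that reading fits the values seen here but is not confirmed anywhere in this thread. The function names and the regular expression are made up for this example and only match the message format quoted in this mail.

```python
import re
from collections import Counter

# Matches the ath message quoted in this mail; written for this format only.
PATTERN = re.compile(
    r"ath: (phy\d+): Failed to stop TX DMA, queues=0x([0-9a-fA-F]+)!"
)


def tally(log_text):
    """Count failures per (radio, queue mask) pair."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        counts[(match.group(1), int(match.group(2), 16))] += 1
    return counts


def set_bits(mask):
    """List the bit positions set in mask (assumption: one bit per TX queue)."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# A few lines copied from the logs above.
sample = """\
[28792.039062] ath: phy1: Failed to stop TX DMA, queues=0x00e!
[28809.191406] ath: phy1: Failed to stop TX DMA, queues=0x002!
[29200.921875] ath: phy0: Failed to stop TX DMA, queues=0x00f!
[29206.980468] ath: phy0: Failed to stop TX DMA, queues=0x00e!
[29209.019531] ath: phy0: Failed to stop TX DMA, queues=0x00e!
"""

for (phy, mask), count in sorted(tally(sample).items()):
    print(f"{phy} queues=0x{mask:03x} bits={set_bits(mask)} count={count}")
```

Feeding it the full dmesg output from before and after a test run would show at a glance whether one radio or one mask dominates.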
> I have seen this error
> many, many times in cerowrt releases for the last 2 years, but this
> time it seems more severe than usual.
>
> There was also a bug in dnsmasq or somewhere in the lower level of the
> stack where it stops responding to multicast dhcp packets.
>
> The upcoming 3.10.23-1 development release has a refresh of mac80211,
> and a bug fix related to multicast, so I have some hope for it.
>
> It has also the latest dnsmasq 2.68 (which fixes a bug in cname
> handling in particular), and also pie v3 but I am (as usual) not in a
> position to test it right now.
>
> It is my hope that now that the bug happens a lot we can track it
> down. Or, that it's fixed. :)
>
> I just put that release up at:
>
> http://snapon.lab.bufferbloat.net/~cero2/cerowrt/wndr/3.10.23-1/
>
> It does not have the updated aqm-scripts code and gui (sorry
> sebastian),
Ah, even better. I have finished and tested the cosmetic changes we discussed, and I will try to send them before Sunday so they might end up in the next cero release. That means you will have to integrate them with your changes to avoid HTB for high bandwidths… (or you just put your version in and I will do the integration after the next release :) )
Also, I still need to figure out how to make it mutually exclusive with the default QoS system...
> nor the pie v4 drop that just got rejected for kernel
> mainline. I'll try to do a respin this weekend with those, and poke
> harder at the dma tx issue after I get back in the lab. Thoughts
> towards being able to isolate the cause and minimize the effect are
> welcomed - it's one of the biggest barriers to declaring a stable
> release at this point!
>
>
> On Wed, Dec 11, 2013 at 8:58 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> Has anyone seen wireless failing after several days with 3.10.17-3?
>>
>> The symptoms are devices fall off the net several days (or a week) after
>> router has been running. I saw the bg AP go away, but the 5 Ghz AP still
>> working. Wired attachment works.
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>
>
> --
> Dave Täht
>
> Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html