Development issues regarding the cerowrt test router project
From: Dave Taht <dave.taht@gmail.com>
To: Neil Shepperd <nshepperd@gmail.com>
Cc: cerowrt@lists.bufferbloat.net,
	"cerowrt-devel@lists.bufferbloat.net"
	<cerowrt-devel@lists.bufferbloat.net>
Subject: Re: [Cerowrt-devel] [Bug #442] smoking gun found for wifi hang
Date: Tue, 8 Apr 2014 12:50:23 -0700
Message-ID: <CAA93jw5-gZhWkoFGKev7F5ENy1e5so-Pc4KA=2rqQcr7mYKGDQ@mail.gmail.com>

Finally found the smoke, from a gun still offstage.

The background wifi queue (qdisc 40:, parent 1:4) gets wedged.

This explains why this only seemed to happen on Comcast (which re-marks
a LOT of traffic as background that it shouldn't; and yes, we should
add an option in sqm to mangle packets back to "be"), and why local
traffic seemed to mostly work when stuff coming back from the internet
didn't.

As to *why* it happens, I don't know. I'm sitting in the #bufferbloat
channel scratching my head over how to explore the problem without
unwedging the interface.

It seems plausible we can now reproduce this MUCH more easily by
flooding the background queues with traffic (netperf can do this).
It's not clear, however, whether you can trigger it with just TCP,
or whether multiple hops are required, etc.
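
For anyone without netperf handy, a rough stand-in for the flood is
sketched below in Python. This is not netperf and not what was run
here; it just sends UDP datagrams with the IP TOS byte set to 0x20
(DSCP CS1, which I am assuming is what ends up in the wifi background
queue). The function names, packet count and payload size are made up
for illustration, and whether plain UDP like this is enough to trigger
the wedge is untested.

```python
import socket

CS1_TOS = 8 << 2  # DSCP CS1 (value 8) shifted into the TOS byte: 0x20

def make_background_socket():
    """Return a UDP socket whose outgoing packets carry the CS1 mark."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, CS1_TOS)
    return sock

def flood(host, port, count=10000, size=1400):
    """Send `count` datagrams of `size` bytes to host:port.

    Returns the total number of payload bytes handed to the kernel.
    """
    payload = b"\0" * size
    sent = 0
    with make_background_socket() as sock:
        for _ in range(count):
            sent += sock.sendto(payload, (host, port))
    return sent

# Example (hypothetical target; substitute a wifi client behind sw00):
#   flood("192.0.2.10", 9)
```

Run it from a wired host so the traffic is forwarded out sw00, and
watch the backlog of the background qdisc while it runs.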

root@cerowrt:/mnt/disk1# tc -s qdisc show dev sw00
qdisc mq 1: root
 Sent 3926131082 bytes 2998293 pkt (dropped 91657, overlimits 0 requeues 70095)
 backlog 77608b 1000p requeues 70095
qdisc fq_codel 10: parent 1:1 limit 800p flows 1024 quantum 500 target 10.0ms interval 100.0ms
 Sent 110555 bytes 771 pkt (dropped 0, overlimits 0 requeues 5)
 backlog 0b 0p requeues 5
  maxpacket 256 drop_overlimit 0 new_flow_count 2 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 20: parent 1:2 limit 800p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 2526448 bytes 17982 pkt (dropped 1, overlimits 0 requeues 31)
 backlog 0b 0p requeues 31
  maxpacket 929 drop_overlimit 0 new_flow_count 71 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 30: parent 1:3 limit 1000p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 15145657 bytes 106290 pkt (dropped 0, overlimits 0 requeues 179)
 backlog 0b 0p requeues 179
  maxpacket 256 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 40: parent 1:4 limit 1000p flows 1024 quantum 300 target 5.0ms interval 100.0ms
 Sent 3908348422 bytes 2873250 pkt (dropped 91656, overlimits 0 requeues 69880)
 backlog 77608b 1000p requeues 69880
                         ^^^^^!!!!!

  maxpacket 1514 drop_overlimit 72128 new_flow_count 85727 ecn_mark 0
  new_flows_len 238 old_flows_len 1
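
The thing to look for in that output is a leaf qdisc whose backlog is
sitting at its packet limit (40: has 1000p queued against limit 1000p).
A small Python sketch that picks this out of `tc -s qdisc show` output
follows. The regexes are only fitted to the fq_codel lines shown above,
the function name is made up, and a full backlog by itself doesn't
prove the queue is wedged rather than merely busy, so you'd want to
sample twice and check that the Sent counters have stopped moving.

```python
import re

# "qdisc fq_codel 40: parent 1:4 limit 1000p ..." -> kind, handle, limit
QDISC_RE = re.compile(r"^qdisc\s+(\S+)\s+(\S+):.*?\blimit\s+(\d+)p")
# " backlog 77608b 1000p requeues 69880" -> bytes, packets
BACKLOG_RE = re.compile(r"^\s*backlog\s+(\d+)b\s+(\d+)p")

def full_queues(tc_output):
    """Return (handle, backlog_pkts, limit_pkts) for every qdisc whose
    packet backlog has reached its configured limit."""
    results = []
    current = None  # (handle, limit) of the qdisc currently being read
    for line in tc_output.splitlines():
        m = QDISC_RE.match(line)
        if m:
            current = (m.group(2), int(m.group(3)))
            continue
        if line.startswith("qdisc"):
            current = None  # e.g. the mq root, which prints no limit
            continue
        m = BACKLOG_RE.match(line)
        if m and current:
            handle, limit = current
            pkts = int(m.group(2))
            if pkts >= limit:
                results.append((handle, pkts, limit))
    return results

# Abbreviated copy of the output above.
SAMPLE = """\
qdisc mq 1: root
 Sent 3926131082 bytes 2998293 pkt (dropped 91657, overlimits 0 requeues 70095)
 backlog 77608b 1000p requeues 70095
qdisc fq_codel 30: parent 1:3 limit 1000p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
 Sent 15145657 bytes 106290 pkt (dropped 0, overlimits 0 requeues 179)
 backlog 0b 0p requeues 179
qdisc fq_codel 40: parent 1:4 limit 1000p flows 1024 quantum 300 target 5.0ms interval 100.0ms
 Sent 3908348422 bytes 2873250 pkt (dropped 91656, overlimits 0 requeues 69880)
 backlog 77608b 1000p requeues 69880
"""

print(full_queues(SAMPLE))
```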

I got the "wedged" interface to work again by re-marking all TCP
traffic as best effort:

iptables -A FORWARD -o sw00 -t mangle -p tcp -m tcp -j DSCP --set-dscp-class be

thus moving traffic into 1:3 above.

(can probably improve on this iptables thing, but it's just a
workaround and for all I know we can also trigger this on the be
queue)
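
As background on why the re-marking moves the traffic: as far as I
know, mac80211 picks the wifi access category from the top three bits
of the DSCP field, so CS1 lands in background and BE (0) lands in best
effort. The sketch below is a Python restatement of that mapping from
memory, not code taken from the kernel, so treat the table as an
assumption and check it against your kernel source. The function and
table names are made up.

```python
# 802.1d user priority (0..7) -> wifi access category.
# Assumed to mirror mac80211's default table; reproduced from memory.
UP_TO_AC = ["BE", "BK", "BK", "BE", "VI", "VI", "VO", "VO"]

# A few DSCP class names and their standard 6-bit values.
DSCP_CLASSES = {"be": 0, "cs1": 8}

def dscp_to_ac(dscp):
    """Map a 6-bit DSCP value to an access category name."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit field")
    return UP_TO_AC[dscp >> 3]  # top three bits act as the user priority

for name, value in DSCP_CLASSES.items():
    tos = value << 2  # DSCP occupies the upper six bits of the TOS byte
    print(f"{name:4} dscp={value:2} tos=0x{tos:02x} -> {dscp_to_ac(value)}")
```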

ICMP replies, however, always seem to go into the background queue
for some reason. (?)

We did see these errors earlier in this run:

[31325.589843] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32380.960937] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32381.035156] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32381.140625] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32381.242187] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32381.343750] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32418.824218] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32445.863281] ath: phy0: Failed to stop TX DMA, queues=0x108!
[32445.960937] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.062500] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.164062] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.265625] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.367187] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.472656] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.574218] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.683593] ath: phy0: Failed to stop TX DMA, queues=0x00c!
[32446.777343] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32446.886718] ath: phy0: Failed to stop TX DMA, queues=0x009!
[34701.062500] ath: phy0: Failed to stop TX DMA, queues=0x008!
[34701.140625] ath: phy0: Failed to stop TX DMA, queues=0x008!
[34701.242187] ath: phy0: Failed to stop TX DMA, queues=0x008!
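
The queues= value in those messages looks like a bitmask with one bit
per TX queue that still had frames pending (that is my reading of the
ath9k driver, so treat it as an assumption). Under that reading, a
quick Python sketch to decode the masks and count which queue indices
keep showing up:

```python
import re
from collections import Counter

LINE_RE = re.compile(r"Failed to stop TX DMA, queues=0x([0-9a-fA-F]+)")

def stuck_queue_counts(dmesg_text):
    """Count, per bit position, how many log lines had that bit set
    in the queues= mask."""
    counts = Counter()
    for m in LINE_RE.finditer(dmesg_text):
        mask = int(m.group(1), 16)
        for bit in range(mask.bit_length()):
            if mask & (1 << bit):
                counts[bit] += 1
    return counts

# One line for each distinct mask seen in the log above.
SAMPLE = """\
[31325.589843] ath: phy0: Failed to stop TX DMA, queues=0x008!
[32445.863281] ath: phy0: Failed to stop TX DMA, queues=0x108!
[32446.683593] ath: phy0: Failed to stop TX DMA, queues=0x00c!
[32446.886718] ath: phy0: Failed to stop TX DMA, queues=0x009!
"""

print(sorted(stuck_queue_counts(SAMPLE).items()))
```

Bit 3 is set in every one of the masks logged here. Which access
category that queue index corresponds to is not something this snippet
can tell you.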
