From: Felix Fietkau
To: Dave Taht, make-wifi-fast@lists.bufferbloat.net
Cc: linux-wireless, Michal Kazior, Toke Høiland-Jørgensen
Subject: Re: [Make-wifi-fast] TCP performance regression in mac80211 triggered by the fq code
Date: Tue, 12 Jul 2016 13:21:25 -0000
X-Original-Date: Tue, 12 Jul 2016 15:21:21 +0200
Message-ID: <097af8e4-5393-8e1b-1748-36233e605867@nbd.name>
References: <11fa6d16-21e2-2169-8d18-940f6dc11dca@nbd.name>

On 2016-07-12 14:13, Dave Taht wrote:
> On Tue, Jul 12, 2016 at 12:09 PM, Felix Fietkau wrote:
>> Hi,
>>
>> With Toke's ath9k txq patch I've noticed a pretty nasty performance
>> regression when running local iperf on an AP (running the txq stuff) to
>> a wireless client.
>
> Your kernel? cpu architecture?
QCA9558, 720 MHz, running Linux 4.4.14

> What happens when going through the AP to a server from the wireless client?
Will test that next.

> Which direction?
AP->STA, iperf running on the AP. Client is a regular MacBook Pro
(Broadcom).

>> Here's some things that I found:
>> - when I use only one TCP stream I get around 90-110 Mbit/s
>
> with how much cpu left over?
~20%

>> - when running multiple TCP streams, I get only 35-40 Mbit/s total
> with how much cpu left over?
~30%

> context switch difference between the two tests?
What's the easiest way to track that?

> tcp_limit_output_bytes is?
262144

> got perf?
Need to make a new build for that.

>> - fairness between TCP streams looks completely fine
>
> A codel will get to long term fairness pretty fast. Packet captures
> from a fq will show much more regular interleaving of packets,
> regardless.
>
>> - there's no big queue buildup, the code never actually drops any packets
>
> A "trick" I have been using to observe codel behavior has been to
> enable ecn on server and client, then checking in wireshark for ect(3)
> marked packets.
I verified this with printk. The same issue already appears if I have
just the fq patch (with the codel patch reverted).

>> - if I put a hack in the fq code to force the hash to a constant value
>
> You could also set "flows" to 1 to keep the hash being generated, but
> not actually use it.
>
>> (effectively disabling fq without disabling codel), the problem
>> disappears and even multiple streams get proper performance.
>
> Meaning you get 90-110Mbits ?
Right.

> Do you have a "before toke" figure for this platform?
It's quite similar.

>> Please let me know if you have any ideas.
>
> I am in berlin, packing hardware...
Nice!

- Felix
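
For readers following the thread, the two ways of "disabling fq" mentioned above (forcing the hash to a constant, or setting "flows" to 1) can be illustrated with a toy model. This is only a sketch and not the mac80211 fq code: `flow_hash`, `classify`, the CRC32 stand-in hash, the queue count of 4096, and the addresses and ports are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def flow_hash(flow):
    """Stand-in for a per-packet flow hash (assumption: CRC32 of the tuple)."""
    return zlib.crc32(repr(flow).encode())


def classify(packets, flows_cnt, forced_hash=None):
    """Distribute packets over flows_cnt queues by flow hash.

    forced_hash models the "force the hash to a constant value" hack;
    flows_cnt=1 models setting "flows" to 1 while still computing the hash.
    """
    queues = defaultdict(list)
    for flow, seq in packets:
        h = flow_hash(flow) if forced_hash is None else forced_hash
        queues[h % flows_cnt].append((flow, seq))
    return dict(queues)


# Four hypothetical TCP streams (addresses and ports are made up), three packets each.
streams = [("10.0.0.1", 5001, "10.0.0.2", port) for port in (40000, 40001, 40002, 40003)]
packets = [(flow, seq) for seq in range(3) for flow in streams]

normal = classify(packets, flows_cnt=4096)                  # per-flow queues
hacked = classify(packets, flows_cnt=4096, forced_hash=0)   # constant hash
single = classify(packets, flows_cnt=1)                     # "flows" set to 1

print(len(normal), len(hacked), len(single))
```

In both the constant-hash and the single-flow case every packet lands in one FIFO queue, so any queue management that runs on that queue still sees all traffic, but there is no per-flow interleaving left.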