From: Toke Høiland-Jørgensen
To: Ryan Hsu, "make-wifi-fast@lists.bufferbloat.net", "linux-wireless@vger.kernel.org"
Date: Wed, 14 Feb 2018 09:18:43 +0100
Subject: Re: [Make-wifi-fast] [PATCH] mac80211: Adjust TSQ pacing shift
Message-ID: <41B51538-B1F5-4611-AAB4-923C585FF3DA@toke.dk>
In-Reply-To: <40f644f6-ecfa-c31b-ce98-3491c954d6b1@qti.qualcomm.com>
References: <20180202151105.30043-1-toke@toke.dk> <40f644f6-ecfa-c31b-ce98-3491c954d6b1@qti.qualcomm.com>

On 14 February 2018 01:43:25 CET, Ryan Hsu wrote:
>On 02/02/2018 07:11 AM, Toke Høiland-Jørgensen wrote:
>
>> Since we now have the convenient helper to do so, actually adjust the
>> TSQ pacing shift for packets going out over a WiFi interface. This
>> significantly improves throughput for locally-originated TCP
>> connections. The default pacing shift of 10 corresponds to ~1ms of
>> queued packet data. Adjusting this to a shift of 8 (i.e. ~4ms) improves
>> 1-hop throughput for ath9k by a factor of 3, whereas increasing it more
>> has diminishing returns.
>>
>> Achieved throughput for different values of sk_pacing_shift (average of
>> 5 iterations of 10-sec netperf runs to a host on the other side of the
>> WiFi hop):
>>
>> sk_pacing_shift 10:  43.21 Mbps (pre-patch)
>> sk_pacing_shift  9:  78.17 Mbps
>> sk_pacing_shift  8: 123.94 Mbps
>> sk_pacing_shift  7: 128.31 Mbps
>>
>> Latency for competing flows increases from ~3 ms to ~10 ms with this
>> change. This is about the same magnitude of queueing latency induced by
>> flows that are not originated on the WiFi device itself (and so are not
>> limited by TSQ).
>>
>> Signed-off-by: Toke Høiland-Jørgensen
>> ---
>>  net/mac80211/tx.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
>> index 25904af38839..69722504e3e1 100644
>> --- a/net/mac80211/tx.c
>> +++ b/net/mac80211/tx.c
>> @@ -3574,6 +3574,14 @@ void __ieee80211_subif_start_xmit(struct sk_buff *skb,
>>  	if (!IS_ERR_OR_NULL(sta)) {
>>  		struct ieee80211_fast_tx *fast_tx;
>>
>> +		/* We need a bit of data queued to build aggregates properly, so
>> +		 * instruct the TCP stack to allow more than a single ms of data
>> +		 * to be queued in the stack. The value is a bit-shift of 1
>> +		 * second, so 8 is ~4ms of queued data. Only affects local TCP
>> +		 * sockets.
>> +		 */
>> +		sk_pacing_shift_update(skb->sk, 8);
>> +
>>  		fast_tx = rcu_dereference(sta->fast_tx);
>>
>>  		if (fast_tx &&
>
>I knew increasing the value doesn't help much after 8 for ath9k, but I ran a
>testing on ath10k that 6 or 7 is having optimal number.
>Since ath10k/11ac device has higher bandwidth than ath9k/11n, can we consider
>to use to 6 or 7 to accommodate that effect?
>
>      tx (mbps)   cpu usage (%)
>5     404         28.5
>6     398         13.8
>7     401         8
>8     378         5
>9     230         4.5
>10    79.6        2

Why does the CPU usage go up >7? Also, what is the latency impact of each
of those values?

-Toke
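
For reference when comparing the shift values discussed above: the patch
comment describes the value as a bit-shift of 1 second, so each step down
roughly doubles the amount of queued data allowed. Below is a minimal
userspace sketch of that arithmetic only. It is not kernel code, and the
helper name pacing_shift_to_usec is made up for this illustration.

```c
/* Hypothetical helper (not a kernel function): converts a pacing shift
 * into the approximate amount of queued data it allows, expressed as
 * time in microseconds. Follows the patch comment: the value is a
 * bit-shift of 1 second.
 */
static unsigned long pacing_shift_to_usec(unsigned int shift)
{
	return 1000000UL >> shift;
}
```

Under this reading, shift 10 gives 976 us (~1ms), shift 8 gives 3906 us
(~4ms), and the values 7 and 6 suggested for ath10k correspond to roughly
7.8ms and 15.6ms respectively.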