Lets make wifi fast again!
From: Dave Taht <dave.taht@gmail.com>
To: Abhishek Kumar <kuabhs@chromium.org>
Cc: kvalo@kernel.org, ath10k@lists.infradead.org,
	 linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org,
	 netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	 Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	 Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>
Subject: Re: [Make-wifi-fast] [PATCH] ath10k: snoc: enable threaded napi on WCN3990
Date: Tue, 20 Dec 2022 07:10:23 -0800
Message-ID: <CAA93jw7Qi1rfBRxaG=5ARshDwepO=b_Qg3BXFi2AHSG7cO44uw@mail.gmail.com>
In-Reply-To: <20221220075215.1.Ic12e347e0d61a618124b742614e82bbd5d770173@changeid>

I am always interested in flent.org tcp_nup, tcp_ndown, and rrul_be
tests on wifi hardware, especially in AP mode against a few clients,
and in the rtt_fair tests from the "ending the anomaly" test suite at
the bottom of this link: https://www.cs.kau.se/tohojo/airtime-fairness/ .
Of these, the one that increasingly bothers me the most is sharing
bandwidth more fairly and keeping latencies low when 4 or more
stations are trying to transmit (in a world with 16 or more stations
online). I'm seeing 5+ seconds of latency on some rtt_fair-like tests
nowadays.
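
For anyone who wants to run the single-host tests named above, here is a
minimal sketch that only builds the flent command lines and prints them.
The server name netperf.example.net, the 60 second length, and the title
string are placeholders I made up for illustration; the target host is
assumed to be running a netperf server. rtt_fair is left out because it
needs several hosts.

```python
import shlex

# Test names taken from the text above.
TESTS = ("tcp_nup", "tcp_ndown", "rrul_be")


def flent_argv(test, host, length_s=60, title=None):
    """Build (but do not run) a flent command line for one test.

    -H selects the server to test against, -l the test length in seconds,
    and -t adds a free-form title to the result metadata.
    """
    argv = ["flent", test, "-H", host, "-l", str(length_s)]
    if title:
        argv += ["-t", title]
    return argv


if __name__ == "__main__":
    # Hypothetical server name and title, for illustration only.
    for test in TESTS:
        argv = flent_argv(test, "netperf.example.net",
                          title="wcn3990-threaded-napi")
        print(shlex.join(argv))
```

Running each printed command once with the patch and once without, under
otherwise identical conditions, would give comparable data sets.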

I was also seeing huge disparities between simultaneous uploads and
downloads on the latest kernels, discussed in various threads over here:
https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002 and
more recently here:
https://forum.openwrt.org/t/reducing-multiplexing-latencies-still-further-in-wifi/133605

I don't understand why napi with the default budget (64) is even
needed on the ath10k, as a single txop takes a minimum of ~200us, but
perhaps your patch will help. Still, it would be nice to see the TCP
statistics measured in-band. Some new tools are appearing that can do
this, such as Apple's goresponsiveness and crusader, and they are
simpler to use than flent.
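
To put rough numbers on the budget-versus-txop point, here is a
back-of-the-envelope sketch. The budget (64) and the ~200us txop come
from the paragraph above; the PHY rates and the 1500-byte frame size are
my assumptions for illustration, not measurements, and the calculation
ignores aggregation and MAC overhead entirely. It is one way to read the
argument, not a model of the driver.

```python
NAPI_BUDGET = 64   # default NAPI poll budget (from the text above)
TXOP_US = 200      # approximate minimum txop duration in microseconds


def txops_per_second(txop_us=TXOP_US):
    """Upper bound on back-to-back txops per second on the air."""
    return 1_000_000 / txop_us


def frames_per_txop(phy_rate_mbps, frame_bytes, txop_us=TXOP_US):
    """Upper bound on whole frames carried in one txop, ignoring overhead.

    Mbit/s multiplied by microseconds gives bits.
    """
    bits_in_txop = phy_rate_mbps * txop_us
    return int(bits_in_txop // (frame_bytes * 8))


if __name__ == "__main__":
    print(f"at most {txops_per_second():.0f} txops per second")
    # ASSUMPTION: hypothetical PHY rates (Mbit/s) and 1500-byte frames.
    for rate in (433, 866):
        n = frames_per_txop(rate, 1500)
        print(f"{rate} Mbit/s: at most {n} frames per {TXOP_US}us txop, "
              f"NAPI budget is {NAPI_BUDGET}")
```

Under these assumptions a minimum-length txop delivers far fewer frames
than the budget allows per poll. Longer txops carry proportionally more,
so this only bounds the shortest case.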

On Tue, Dec 20, 2022 at 12:17 AM Abhishek Kumar <kuabhs@chromium.org> wrote:
>
> NAPI poll can be done in threaded context along with soft irq
> context. Threaded context can be scheduled efficiently, thus
> creating less of bottleneck during Rx processing. This patch is
> to enable threaded NAPI on ath10k driver.
>
> Based on testing, it was observed that on WCN3990, the CPU0 reaches
> 100% utilization when napi runs in softirq context. At the same
> time the other CPUs are at low consumption percentage. This
> does not allow device to reach its maximum throughput potential.
> After enabling threaded napi, CPU load is balanced across all CPUs
> and following improvements were observed:
> - UDP_RX increase by ~22-25%
> - TCP_RX increase by ~15%
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2-00696-QCAHLSWMTPL-1
> Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
> ---
>
>  drivers/net/wireless/ath/ath10k/core.c | 16 ++++++++++++++++
>  drivers/net/wireless/ath/ath10k/hw.h   |  2 ++
>  drivers/net/wireless/ath/ath10k/snoc.c |  3 +++
>  3 files changed, 21 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
> index 5eb131ab916fd..ee4b6ba508c81 100644
> --- a/drivers/net/wireless/ath/ath10k/core.c
> +++ b/drivers/net/wireless/ath/ath10k/core.c
> @@ -100,6 +100,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA988X_HW_2_0_VERSION,
> @@ -140,6 +141,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA9887_HW_1_0_VERSION,
> @@ -181,6 +183,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA6174_HW_3_2_VERSION,
> @@ -217,6 +220,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA6174_HW_2_1_VERSION,
> @@ -257,6 +261,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA6174_HW_2_1_VERSION,
> @@ -297,6 +302,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA6174_HW_3_0_VERSION,
> @@ -337,6 +343,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA6174_HW_3_2_VERSION,
> @@ -381,6 +388,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA99X0_HW_2_0_DEV_VERSION,
> @@ -427,6 +435,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA9984_HW_1_0_DEV_VERSION,
> @@ -480,6 +489,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA9888_HW_2_0_DEV_VERSION,
> @@ -530,6 +540,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA9377_HW_1_0_DEV_VERSION,
> @@ -570,6 +581,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA9377_HW_1_1_DEV_VERSION,
> @@ -612,6 +624,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA9377_HW_1_1_DEV_VERSION,
> @@ -645,6 +658,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = QCA4019_HW_1_0_DEV_VERSION,
> @@ -692,6 +706,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = false,
>                 .use_fw_tx_credits = true,
>                 .delay_unmap_buffer = false,
> +               .enable_threaded_napi = false,
>         },
>         {
>                 .id = WCN3990_HW_1_0_DEV_VERSION,
> @@ -725,6 +740,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
>                 .hw_restart_disconnect = true,
>                 .use_fw_tx_credits = false,
>                 .delay_unmap_buffer = true,
> +               .enable_threaded_napi = true,
>         },
>  };
>
> diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h
> index 9643031a4427a..adf3076b96503 100644
> --- a/drivers/net/wireless/ath/ath10k/hw.h
> +++ b/drivers/net/wireless/ath/ath10k/hw.h
> @@ -639,6 +639,8 @@ struct ath10k_hw_params {
>         bool use_fw_tx_credits;
>
>         bool delay_unmap_buffer;
> +
> +       bool enable_threaded_napi;
>  };
>
>  struct htt_resp;
> diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
> index cfcb759a87dea..b94150fb6ef06 100644
> --- a/drivers/net/wireless/ath/ath10k/snoc.c
> +++ b/drivers/net/wireless/ath/ath10k/snoc.c
> @@ -927,6 +927,9 @@ static int ath10k_snoc_hif_start(struct ath10k *ar)
>
>         bitmap_clear(ar_snoc->pending_ce_irqs, 0, CE_COUNT_MAX);
>
> +       if (ar->hw_params.enable_threaded_napi)
> +               dev_set_threaded(&ar->napi_dev, true);
> +
>         ath10k_core_napi_enable(ar);
>         ath10k_snoc_irq_enable(ar);
>         ath10k_snoc_rx_post(ar);
> --
> 2.39.0.314.g84b9a713c41-goog
>


-- 
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC