From: Dave Taht <dave.taht@gmail.com>
To: libreqos <libreqos@lists.bufferbloat.net>
Subject: [LibreQoS] Fwd: mlx5 XDP redirect leaking memory on kernel 6.3
Date: Tue, 23 May 2023 11:00:41 -0600
Message-ID: <CAA93jw4T2uaSH=n8xP1HbAoX3sFr=Q3s9t9H67QLUqd5VDibqw@mail.gmail.com>
In-Reply-To: <00ca7beb7fe054a3ba1a36c61c1e3b1314369f11.camel@nvidia.com>
---------- Forwarded message ---------
From: Dragos Tatulea <dtatulea@nvidia.com>
Date: Tue, May 23, 2023, 10:36 AM
Subject: Re: mlx5 XDP redirect leaking memory on kernel 6.3
To: Tariq Toukan <tariqt@nvidia.com>, ttoukan.linux@gmail.com <ttoukan.linux@gmail.com>,
jbrouer@redhat.com <jbrouer@redhat.com>, Saeed Mahameed <saeedm@nvidia.com>,
saeed@kernel.org <saeed@kernel.org>, linyunsheng@huawei.com <linyunsheng@huawei.com>,
netdev@vger.kernel.org <netdev@vger.kernel.org>
Cc: maxtram95@gmail.com <maxtram95@gmail.com>, lorenzo@kernel.org <lorenzo@kernel.org>,
alexander.duyck@gmail.com <alexander.duyck@gmail.com>, kheib@redhat.com <kheib@redhat.com>,
ilias.apalodimas@linaro.org <ilias.apalodimas@linaro.org>, mkabat@redhat.com <mkabat@redhat.com>,
brouer@redhat.com <brouer@redhat.com>, atzin@redhat.com <atzin@redhat.com>,
fmaurer@redhat.com <fmaurer@redhat.com>, bpf@vger.kernel.org <bpf@vger.kernel.org>,
jbenc@redhat.com <jbenc@redhat.com>
On Tue, 2023-05-23 at 17:55 +0200, Jesper Dangaard Brouer wrote:
>
> When the mlx5 driver runs an XDP program doing XDP_REDIRECT, memory
> is leaked. Other XDP actions, like XDP_DROP, XDP_PASS and XDP_TX,
> work correctly. I tested both redirecting back out the same mlx5 device
> and cpumap redirect (with XDP_PASS); both leak.
>
> After removing the XDP prog, which also causes the page_pool to be
> released by mlx5, the leaks are visible via the page_pool periodic
> inflight reports. I have this bpftrace[1] tool that I also use to detect
> the problem faster (not waiting 60 sec for a report).
>
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/page_pool_track_shutdown01.bt
>
> I've been debugging and reading through the code for a couple of days,
> but I've not found the root cause yet. I would appreciate new ideas on
> where to look and fresh eyes on the issue.
>
>
> To Lin: it looks like mlx5 uses PP_FLAG_PAGE_FRAG, and my current
> suspicion is that the mlx5 driver doesn't fully release the bias count
> (hint: see MLX5E_PAGECNT_BIAS_MAX).
>
Thanks for the report Jesper. Incidentally I've just picked up this issue
today as well.
On XDP redirect and tx, the page is set to skip the bias counter release
with the expectation that page_pool_put_defragged_page will be called
from [1]. But, as I found out now, during XDP redirect only one fragment
of the page is released in xdp core [2]. This is where the leak is coming
from.
We'll provide a fix soon.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c#n665
[2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/core/xdp.c#n390
Thanks,
Dragos
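The accounting described in the reply above can be illustrated with a toy
model. This is a sketch under stated assumptions, not kernel code: the
classes ToyPool and ToyPage, the scenario functions, and the BIAS_MAX value
are all hypothetical names chosen for illustration. It only shows why
releasing a single fragment from a page that still carries a large bias
leaves the page counted as in flight, while the other two paths return it.

```python
# Toy model of page_pool fragment accounting. Not kernel code: the class
# and function names and the BIAS_MAX value are illustrative assumptions.

BIAS_MAX = 1 << 15  # stand-in for a large up-front fragment bias (value assumed)


class ToyPool:
    """Counts pages handed out and returned, like an inflight report."""

    def __init__(self):
        self.hold = 0      # pages allocated from the pool
        self.released = 0  # pages returned to the pool

    def alloc_fragmented(self, bias):
        self.hold += 1
        return ToyPage(self, bias)

    def inflight(self):
        return self.hold - self.released


class ToyPage:
    def __init__(self, pool, bias):
        self.pool = pool
        self.frag_count = bias  # fragment references still outstanding

    def defrag(self, nr):
        """Drop nr fragment references and return how many remain."""
        self.frag_count -= nr
        return self.frag_count

    def put_one_frag(self):
        """Release a single fragment; the page returns only at zero."""
        if self.defrag(1) == 0:
            self.pool.released += 1

    def put_defragged(self):
        """Return the page outright; the caller vouches no frags remain."""
        self.frag_count = 0
        self.pool.released += 1


def normal_rx(pool, frags_used):
    page = pool.alloc_fragmented(BIAS_MAX)
    for _ in range(frags_used):
        page.put_one_frag()              # consumers release their fragments
    if page.defrag(BIAS_MAX - frags_used) == 0:
        pool.released += 1               # driver drains the unused bias
    return page


def xdp_tx(pool):
    page = pool.alloc_fragmented(BIAS_MAX)  # bias release is skipped
    page.put_defragged()                    # completion returns the whole page
    return page


def xdp_redirect(pool):
    page = pool.alloc_fragmented(BIAS_MAX)  # bias release is skipped
    page.put_one_frag()                     # return path drops only one fragment
    return page


def run_scenarios():
    results = {}
    for name, scenario in [("normal_rx", lambda p: normal_rx(p, 3)),
                           ("xdp_tx", xdp_tx),
                           ("xdp_redirect", xdp_redirect)]:
        pool = ToyPool()
        page = scenario(pool)
        results[name] = (pool.inflight(), page.frag_count)
    return results


if __name__ == "__main__":
    for name, (inflight, frags_left) in run_scenarios().items():
        print(f"{name}: inflight={inflight} frags_left={frags_left}")
```

In this model only the redirect scenario ends with a page still in flight,
because one decrement from a large bias never reaches zero. The thread does
not say which side of the mismatch the eventual fix changes, so the model
makes no claim about that.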