From: Dave Taht <dave.taht@gmail.com>
To: libreqos <libreqos@lists.bufferbloat.net>
Subject: [LibreQoS] Fwd: mlx5 XDP redirect leaking memory on kernel 6.3
Date: Tue, 23 May 2023 11:00:19 -0600
Message-ID: <CAA93jw5iVjKn1iHfarFkAyhbkJfcS7a64=kFO-t17v+Mh4XLHQ@mail.gmail.com>
In-Reply-To: <d862a131-5e31-bd26-84f7-fd8764ca9d48@redhat.com>
Not sure what driver our friends in NZ are using...
---------- Forwarded message ---------
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Tue, May 23, 2023, 9:55 AM
Subject: mlx5 XDP redirect leaking memory on kernel 6.3
To: Dragos Tatulea <dtatulea@nvidia.com>, Saeed Mahameed <saeed@kernel.org>,
    Saeed Mahameed <saeedm@nvidia.com>, Tariq Toukan <tariqt@nvidia.com>,
    Tariq Toukan <ttoukan.linux@gmail.com>, Netdev <netdev@vger.kernel.org>,
    Yunsheng Lin <linyunsheng@huawei.com>
Cc: <brouer@redhat.com>, <atzin@redhat.com>, <mkabat@redhat.com>,
    <kheib@redhat.com>, Jiri Benc <jbenc@redhat.com>, bpf <bpf@vger.kernel.org>,
    Felix Maurer <fmaurer@redhat.com>,
    Alexander Duyck <alexander.duyck@gmail.com>,
    Ilias Apalodimas <ilias.apalodimas@linaro.org>,
    Lorenzo Bianconi <lorenzo@kernel.org>,
    Maxim Mikityanskiy <maxtram95@gmail.com>
When the mlx5 driver runs an XDP program that does XDP_REDIRECT, memory
is leaked. Other XDP actions, like XDP_DROP, XDP_PASS and XDP_TX,
work correctly. I tested both redirecting back out the same mlx5 device and
cpumap redirect (with XDP_PASS); both cause the leak.
After removing the XDP prog, which also causes the page_pool to be
released by mlx5, the leaks are visible via the page_pool periodic
inflight reports. I have this bpftrace[1] tool that I also use to detect
the problem faster (without waiting 60 sec for a report).
[1]
https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/page_pool_track_shutdown01.bt
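As a rough illustration of what an "inflight" report measures, here is a minimal sketch. It assumes the pool keeps two wrapping 32-bit counters, one for pages handed out and one for pages returned, and reports their signed difference. The class and function names are hypothetical and this is a toy model, not kernel code.

```python
U32_MASK = 0xFFFFFFFF


def distance(a: int, b: int) -> int:
    """Signed 32-bit distance a - b between two wrapping u32 counters."""
    d = (a - b) & U32_MASK
    # Reinterpret the low 32 bits as a signed value.
    return d - (1 << 32) if d & 0x80000000 else d


class PoolCounters:
    """Toy stand-in for a page pool's hold/release accounting (hypothetical)."""

    def __init__(self) -> None:
        self.hold_cnt = 0      # pages handed out by the pool
        self.release_cnt = 0   # pages returned to the pool

    def hold(self, n: int = 1) -> None:
        self.hold_cnt = (self.hold_cnt + n) & U32_MASK

    def release(self, n: int = 1) -> None:
        self.release_cnt = (self.release_cnt + n) & U32_MASK

    def inflight(self) -> int:
        # Pages still outstanding; a value that stays above zero after
        # teardown is what a leak looks like in this model.
        return distance(self.hold_cnt, self.release_cnt)


pool = PoolCounters()
pool.hold(5)
pool.release(3)
print("inflight after teardown:", pool.inflight())
```

In this model a pool can only be freed once inflight() reaches zero, so pages that are never returned keep it alive indefinitely.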
I've been debugging and reading through the code for a couple of days,
but I've not found the root cause yet. I would appreciate new ideas on
where to look and fresh eyes on the issue.
To Lin: it looks like mlx5 uses PP_FLAG_PAGE_FRAG, and my current
suspicion is that the mlx5 driver doesn't fully release the bias count
(hint: see MLX5E_PAGECNT_BIAS_MAX).
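To make the suspicion concrete, here is a toy model of biased fragment counting. It assumes a driver takes a large block of fragment references on a page up front, each consumer of a fragment drops one reference, and at release time the driver must drain exactly the unused remainder. BIAS_MAX and remaining_refs are hypothetical names with an illustrative value; this is not the mlx5 code and not a confirmed root cause.

```python
BIAS_MAX = 64  # hypothetical bias, for illustration only (not the driver's value)


def remaining_refs(used: int, drained: int, bias: int = BIAS_MAX) -> int:
    """Fragment references left on a page after consumers and driver are done.

    used:    fragments handed out, each dropped once by its consumer
    drained: references the driver drops itself when releasing the page
    """
    refs = bias
    refs -= used      # consumers return their fragments
    refs -= drained   # driver gives back the unused part of the bias
    return refs


used = 3
# Correct accounting: the driver drains everything it did not hand out.
print("correct drain:", remaining_refs(used, BIAS_MAX - used))
# Off-by-one accounting: one reference is never dropped.
print("short drain:  ", remaining_refs(used, BIAS_MAX - used - 1))
```

In this model any nonzero remainder means the page's fragment count never reaches zero, so the page is never returned to the pool and shows up as inflight.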
--Jesper
Extra info about my device. I'm providing this because the mlx5 driver can
use different allocation modes depending on HW and device priv-flags setup.
$ ethtool --show-priv-flags mlx5p1
Private flags for mlx5p1:
rx_cqe_moder : on
tx_cqe_moder : off
rx_cqe_compress : off
rx_striding_rq : on
rx_no_csum_complete: off
xdp_tx_mpwqe : on
skb_tx_mpwqe : on
tx_port_ts : off
$ ethtool -i mlx5p1
driver: mlx5_core
version: 6.4.0-rc2-net-next-vm-lock-dbg+
firmware-version: 16.23.1020 (MT_0000000009)
expansion-rom-version:
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
$ lspci -v | grep 03:00.0
03:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]