From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 2C4EA3B2A4 for ; Tue, 23 May 2023 13:00:56 -0400 (EDT) Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-3f6094cb26eso714345e9.2 for ; Tue, 23 May 2023 10:00:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684861255; x=1687453255; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=LPwQ+cn4ZLrAW/UcA8OzH6lh8EF2nY6uMArWQ7mxOoU=; b=YcdvVZpuJx8c4whTMWkB6hcgNQF8XqK+lod4sYQkVMDkP5JX8uZfvWIxzXNEJe0hqg lPeoKnP4IQli/RLw+XFJj7ogMlpsNLzjzi/si1h0Dfg7D8V0/6IvpYC1mxLEr0K+CrG5 YJtsV/kW7TwwzNcF411eOl+ktB5SqSiDt0T5EuG2nWYs8dDSxFOdwCgOZLN/ZFCBL6OI YcgybwHgfBoVC8dsRUgglgqjp8Kc0+/0UUHQUEO37tPsUI0jBR44ST9EwGlGngvbjLUk Tjr+OpGbNc2ChhoXspeSeoVAFx3NEvEfmLECu9CzrQQdB6S5p0sMTp23HX2dZLxEweah D7gg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684861255; x=1687453255; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=LPwQ+cn4ZLrAW/UcA8OzH6lh8EF2nY6uMArWQ7mxOoU=; b=QMjFaYKTkH7K7+kYwLENOTuZj+O7Fqy+Q828B2Zwq6mum5J3Ny+1tMGf95SJafeXNp cWEPa0zKRH4wWQfzXVKtz3IFJ+uzHiamWALqauMXr3ZWdNwtecqovBQSknBgZnOYZw6X fvtV5TiydZsXyLjtesyZtg22sCbdMomY7byzMy+aFoG1Qo6wK8ZlobH7r3y1KpVVf0G7 7WCD+nEZOtOa6T6/JtBZhJY/3A6EjSQsKRVsaK6Yo+RHSdGH4EQH1ufZmZoYEzWna5x1 EVXTjixIVp5dKJmjIir4q2kewZrvl+yzKyokLm6c3p/UT2Z9p/okA+pIqyCgZpiiXoBW 22jQ== X-Gm-Message-State: AC+VfDwAhc72RHlqKRBBgVuqur0fquPz1GW2s2PgedPWuSCxqcKhFNYr /YJA71BktyN8DQRnrOYYG+jloeYW5p99+2LtaMySXlxU X-Google-Smtp-Source: 
ACHHUZ4RP/0Y4Kh4cr5n3HbzZD1EdVI4DOGGGWX/5awAqZMyWBl2Xsx20+x6TzFwo5YQoUNHISd97adpl1OyEwoNzXM= X-Received: by 2002:a05:600c:20d:b0:3f6:6da:3ad1 with SMTP id 13-20020a05600c020d00b003f606da3ad1mr4839138wmi.34.1684861254573; Tue, 23 May 2023 10:00:54 -0700 (PDT) MIME-Version: 1.0 References: <00ca7beb7fe054a3ba1a36c61c1e3b1314369f11.camel@nvidia.com> In-Reply-To: <00ca7beb7fe054a3ba1a36c61c1e3b1314369f11.camel@nvidia.com> From: Dave Taht Date: Tue, 23 May 2023 11:00:41 -0600 Message-ID: To: libreqos Content-Type: multipart/alternative; boundary="00000000000093231605fc5f5496" Subject: [LibreQoS] Fwd: mlx5 XDP redirect leaking memory on kernel 6.3 X-BeenThere: libreqos@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: Many ISPs need the kinds of quality shaping cake can do List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 May 2023 17:00:56 -0000 --00000000000093231605fc5f5496 Content-Type: text/plain; charset="UTF-8" ---------- Forwarded message --------- From: Dragos Tatulea Date: Tue, May 23, 2023, 10:36 AM Subject: Re: mlx5 XDP redirect leaking memory on kernel 6.3 To: Tariq Toukan , ttoukan.linux@gmail.com < ttoukan.linux@gmail.com>, jbrouer@redhat.com , Saeed Mahameed , saeed@kernel.org , linyunsheng@huawei.com , netdev@vger.kernel.org < netdev@vger.kernel.org> Cc: maxtram95@gmail.com , lorenzo@kernel.org < lorenzo@kernel.org>, alexander.duyck@gmail.com , kheib@redhat.com , ilias.apalodimas@linaro.org < ilias.apalodimas@linaro.org>, mkabat@redhat.com , brouer@redhat.com , atzin@redhat.com , fmaurer@redhat.com , bpf@vger.kernel.org < bpf@vger.kernel.org>, jbenc@redhat.com On Tue, 2023-05-23 at 17:55 +0200, Jesper Dangaard Brouer wrote: > > When the mlx5 driver runs an XDP program doing XDP_REDIRECT, then memory > is getting leaked. Other XDP actions, like XDP_DROP, XDP_PASS and XDP_TX > works correctly. 
I tested both redirecting back out same mlx5 device and
> cpumap redirect (with XDP_PASS), which both cause leaking.
>
> After removing the XDP prog, which also cause the page_pool to be
> released by mlx5, then the leaks are visible via the page_pool periodic
> inflight reports. I have this bpftrace[1] tool that I also use to detect
> the problem faster (not waiting 60 sec for a report).
>
> [1]
> https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/page_pool_track_shutdown01.bt
>
> I've been debugging and reading through the code for a couple of days,
> but I've not found the root-cause, yet. I would appreciate new ideas
> where to look and fresh eyes on the issue.
>
>
> To Lin, it looks like mlx5 uses PP_FLAG_PAGE_FRAG, and my current
> suspicion is that mlx5 driver doesn't fully release the bias count (hint
> see MLX5E_PAGECNT_BIAS_MAX).
>

Thanks for the report Jesper. Incidentally I've just picked up this issue today
as well.

On XDP redirect and tx, the page is set to skip the bias counter release with
the expectation that page_pool_put_defragged_page will be called from [1]. But,
as I found out now, during XDP redirect only one fragment of the page is
released in xdp core [2]. This is where the leak is coming from.

We'll provide a fix soon.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c#n665

[2]
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/core/xdp.c#n390

Thanks,
Dragos

--00000000000093231605fc5f5496
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

---------- Forwarded message ---------
From: Dragos Tatulea <dtatulea@nvidia.com>
Date: Tue, May 23, 2023, 10:36 AM
Subject: Re: mlx5 XDP redirect leaking memory on kernel 6.3
To: Tariq Toukan <tariqt@nvidia.com>, ttoukan.linux@gmail.com <ttoukan.linux@gmail.com>, jbrouer@redhat.com <jbrouer@redhat.com>, Saeed Mahameed <saeedm@nvidia.com>, saeed@kernel.org <saeed@kernel.org>, linyunsheng@huawei.com <linyunsheng@huawei.com>, netdev@vger.kernel.org <netdev@vger.kernel.org>
Cc: maxtram95@gmail.com <maxtram95@gmail.com>, lorenzo@kernel.org <lorenzo@kernel.org>, alexander.duyck@gmail.com <alexander.duyck@gmail.com>, kheib@redhat.com <kheib@redhat.com>, ilias.apalodimas@linaro.org <ilias.apalodimas@linaro.org>, mkabat@redhat.com <mkabat@redhat.com>, brouer@redhat.com <brouer@redhat.com>, atzin@redhat.com <atzin@redhat.com>, fmaurer@redhat.com <fmaurer@redhat.com>, bpf@vger.kernel.org <bpf@vger.kernel.org>, jbenc@redhat.com <jbenc@redhat.com>



On Tue, 2023-05-23 at 17:55 +0200, Jesper Dangaard Brouer wrote:
>
> When the mlx5 driver runs an XDP program doing XDP_REDIRECT, then memory
> is getting leaked. Other XDP actions, like XDP_DROP, XDP_PASS and XDP_TX
> works correctly. I tested both redirecting back out same mlx5 device and
> cpumap redirect (with XDP_PASS), which both cause leaking.
>
> After removing the XDP prog, which also cause the page_pool to be
> released by mlx5, then the leaks are visible via the page_pool periodic
> inflight reports. I have this bpftrace[1] tool that I also use to detect
> the problem faster (not waiting 60 sec for a report).
>
> [1]
> https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/page_pool_track_shutdown01.bt
>
> I've been debugging and reading through the code for a couple of days,
> but I've not found the root-cause, yet. I would appreciate new ideas
> where to look and fresh eyes on the issue.
>
>
> To Lin, it looks like mlx5 uses PP_FLAG_PAGE_FRAG, and my current
> suspicion is that mlx5 driver doesn't fully release the bias count (hint
> see MLX5E_PAGECNT_BIAS_MAX).
>

Thanks for the report Jesper. Incidentally I've just picked up this issue today
as well.

On XDP redirect and tx, the page is set to skip the bias counter release with
the expectation that page_pool_put_defragged_page will be called from [1]. But,
as I found out now, during XDP redirect only one fragment of the page is
released in xdp core [2]. This is where the leak is coming from.
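
For readers who want to see the accounting in miniature, here is a toy
sketch of the mismatch described above. It is not kernel code: ToyPool,
BIAS_MAX and the two put_* helpers are hypothetical names made up for
illustration, and the bias value of 8 is arbitrary. It only assumes that a
page is pre-charged with a bias of fragment references and goes back to the
pool once that count reaches zero.

```python
# Toy model only: every name here is hypothetical and does not mirror
# the real page_pool or mlx5 data structures.
BIAS_MAX = 8  # arbitrary stand-in for a driver-side bias such as MLX5E_PAGECNT_BIAS_MAX


class ToyPool:
    """Counts pages that were handed out and not yet returned."""

    def __init__(self):
        self.inflight = 0

    def alloc_page(self):
        self.inflight += 1
        # The driver pre-charges the page with BIAS_MAX fragment references.
        return {"frag_count": BIAS_MAX}

    def put_one_fragment(self, page):
        """Drop a single fragment reference; return the page only at zero."""
        page["frag_count"] -= 1
        if page["frag_count"] == 0:
            self.inflight -= 1

    def put_whole_page(self, page):
        """Return the page regardless of the remaining fragment count."""
        page["frag_count"] = 0
        self.inflight -= 1


def run(release):
    """Allocate one page, skip the driver's bias release, then call `release`."""
    pool = ToyPool()
    page = pool.alloc_page()
    release(pool, page)
    return pool.inflight


leaked = run(ToyPool.put_one_fragment)  # redirect path: one fragment released
clean = run(ToyPool.put_whole_page)     # what the driver expected instead
print("inflight after single-fragment release:", leaked)
print("inflight after whole-page release:", clean)
```

In this model the single-fragment release leaves the page in flight because
the remaining bias is never dropped, while the whole-page release brings the
in-flight count back to zero.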

We'll provide a fix soon.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c#n665

[2]
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/core/xdp.c#n390

Thanks,
Dragos


--00000000000093231605fc5f5496--