From mboxrd@z Thu Jan 1 00:00:00 1970
From: Toke Høiland-Jørgensen <toke@toke.dk>
To: Xiang Mei
Cc: security@kernel.org, netdev@vger.kernel.org, cake@lists.bufferbloat.net, bestswngs@gmail.com
References: <20251113035303.51165-1-xmei5@asu.edu> <87346ijbs9.fsf@toke.dk>
Date: Mon, 17 Nov 2025 14:23:45 +0100
Message-ID: <87a50kokri.fsf@toke.dk>
Subject: [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
List-Id: Cake - FQ_codel the next generation

> will not run because the parent scheduler stops enqueueing after seeing
> NET_XMIT_CN.
> For normal packets (non-GSO), it's easy to fix: just do
> qdisc_tree_reduce_backlog(sch, 1, len). However, GSO splitting makes this
> difficult because we may have already added multiple segments into the
> flow, and we don't know how many of them were dequeued.

Huh, dequeued? This is all running under the qdisc lock; nothing gets
dequeued in the meantime.

Besides, the ACK thinning is irrelevant to the drop compensation. Here's
an example:

Without ACK splitting - we enqueue 1 packet of 100 bytes, then drop 1
packet of 200 bytes, so we should end up with the same qlen, but 100
fewer bytes in the queue:

start: parent qlen = X, parent backlog = Y

len = 100;

cake_drop() drops 1 pkt / 200 bytes

if (same_flow) {
  qdisc_reduce_backlog(0, 100); // parent qlen == X, parent backlog == Y - 100
  return NET_XMIT_CN;
  // no change in parent, so parent qlen == X, parent backlog == Y - 100
} else {
  qdisc_reduce_backlog(1, 200); // parent qlen == X - 1, backlog == Y - 200
  return NET_XMIT_SUCCESS;
  // parent does qlen += 1, backlog += 100, so parent qlen == X, parent backlog == Y - 100
}

With ACK splitting - we enqueue 10 segments totalling 110 bytes, then
drop 1 packet of 200 bytes, so we should end up with 9 packets more in
the queue, but 90 bytes less:

start: parent qlen = X, parent backlog = Y

len = 100;
/* split ack: slen == 110, numsegs == 10 */
qdisc_tree_reduce_backlog(-9, -10); // parent qlen == X + 9, backlog == Y + 10

cake_drop() drops 1 pkt / 200 bytes

if (same_flow) {
  qdisc_reduce_backlog(0, 100); // parent qlen == X + 9, backlog == Y - 90
  return NET_XMIT_CN;
  // no change in parent, so parent qlen == X + 9, backlog == Y - 90
} else {
  qdisc_reduce_backlog(1, 200); // parent qlen == X + 8, backlog == Y - 190
  return NET_XMIT_SUCCESS;
  // parent does qlen += 1, backlog += 100, so parent qlen == X + 9, backlog == Y - 90
}

In both cases, what happens is that we drop
one or more packets, and reduce the backlog by the number of packets/bytes
dropped *while compensating for what the parent qdisc does on return*. So
the adjustments made by the segmentation make no difference to the final
outcome.

However, we do have one problem with the ACK thinning code: in the 'if
(ack)' branch, we currently adjust 'len' if we drop an ACK. Meaning that
if we use that value later to adjust for what the parent qdisc does, the
value will no longer agree with what the parent does. So we'll have to
introduce a new variable for the length used in the ACK thinning
calculation.

> The number of dequeued segments can be anywhere in [0, numsegs], and the
> dequeued length in [0, slen]. We cannot know the exact number without
> checking the tin/flow index of each dropped packet. Therefore, we should
> check inside the loop (as v1 did):
>
> ```
> cake_drop(...)
> {
> 	...
> 	if (likely(current_flow != idx + (tin << 16)))
> 		qdisc_tree_reduce_backlog(sch, 1, len);
> 	...
> }
> ```

No, this is not needed - the calculation involving prev_qlen and
prev_backlog will correctly give us the total number of packets/bytes
dropped.

> This solution also has a problem, as you mentioned:
> if the flow already contains packets, dropping those packets should
> trigger backlog reduction, but our check would incorrectly skip that. One
> possible solution is to track the number of packets/segments enqueued
> in the current cake_enqueue (numsegs or 1), and then avoid calling
> `qdisc_tree_reduce_backlog(sch, 1, len)` for the 1 or numsegs dropped
> packets. If that makes sense, I'll make the patch and test it.

It does not - see above.

> -----
>
> Besides, I have a question about the condition for returning NET_XMIT_CN.
> Do we return NET_XMIT_CN when:
>
> The incoming packet itself is dropped? (makes more sense to me)
> or
> The same flow dequeued once? (This is the current logic)

The same flow. The current logic is correct.

-Toke