From: Xiang Mei <xmei5@asu.edu>
To: "Toke Høiland-Jørgensen" <toke@toke.dk>
Cc: security@kernel.org, netdev@vger.kernel.org,
cake@lists.bufferbloat.net, bestswngs@gmail.com
Subject: [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
Date: Fri, 21 Nov 2025 13:33:25 -0700 [thread overview]
Message-ID: <7guebhjv734hjkgtnmloyj7lwaaxj6nz5as4bjruo24t3vs72r@54ryrobt6tdo> (raw)
In-Reply-To: <87a50kokri.fsf@toke.dk>
On Mon, Nov 17, 2025 at 02:23:45PM +0100, Toke Høiland-Jørgensen wrote:
> > will not run because the parent scheduler stops enqueueing after seeing
> > NET_XMIT_CN. For normal packets (non-GSO), it's easy to fix: just do
> > qdisc_tree_reduce_backlog(sch, 1, len). However, GSO splitting makes this
> > difficult because we may have already added multiple segments into the
> > flow, and we don’t know how many of them were dequeued.
>
> Huh, dequeued? This is all running under the qdisc lock, nothing gets
> dequeued in the meantime.
Sorry for using wrong word. The "dequeued" should be "dropped".
>
> Besides, the ACK thinning is irrelevant to the drop compensation. Here's
> an example:
>
> Without ACK splitting - we enqueue 1 packet of 100 bytes, then drop 1
> packet of 200 bytes, so we should end up with the same qlen, but 100
> fewer bytes in the queue:
>
> start: parent qlen = X, parent backlog = Y
>
> len = 100;
> cake_drop() drops 1 pkt / 200 bytes
>
> if (same_flow) {
> qdisc_reduce_backlog(0, 100) // parent qlen == X, parent backlog == Y - 100
> return NET_XMIT_CN;
> // no change in parent, so parent qlen == X, parent backlog == Y - 100
> } else {
> qdisc_reduce_backlog(1, 200); // parent qlen == X - 1, backlog == Y - 200
> return NET_XMIT_SUCCESS;
> // parent does qlen +=1, backlog += 100, so parent qlen == x, parent backlog == Y - 100
> }
>
> With ACK splitting - we enqueue 10 segments totalling 110 bytes, then
> drop 1 packet of 200 bytes, so we should end up with 9 packets more in
> the queue, but 90 bytes less:
>
> start: parent qlen = X, parent backlog = Y
>
> len = 100;
> /* split ack: slen == 110, numsegs == 10 */
> qdisc_tree_reduce_backlog(-9, -10); // parent qlen == X + 9, backlog == Y + 10
>
> cake_drop() drops 1 pkt / 200 bytes
>
> if (same_flow) {
> qdisc_reduce_backlog(0, 100) // parent qlen == X + 9, backlog == Y - 90
> return NET_XMIT_CN;
> // no change in parent, so parent qlen == X + 9, backlog == Y - 90
>
> } else {
> qdisc_reduce_backlog(1, 200); // parent qlen == X + 8, backlog == Y - 190
> return NET_XMIT_SUCCESS;
> // parent does qlen +=1, backlog += 100, so parent qlen == X + 9, backlog == Y - 90
> }
>
>
> In both cases, what happens is that we drop one or more packets, reduce
> the backlog by the number of packets/bytes dropped *while compensating
> for what the parent qdisc does on return*. So the adjustments made by
> the segmentation makes no difference to the final outcome.
Thanks for the detailed explanations. You are right. The current patch
logic is correct to handle these cases.
>
> However, we do have one problem with the ACK thinning code: in the 'if
> (ack)' branch, we currently adjust 'len' if we drop an ACK. Meaning that
> if we use that value later to adjust for what the parent qdisc, the
> value will no longer agree with what the parent does. So we'll have to
> introduce a new variable for the length used in the ACK thinning
> calculation.
>
I see the issue. It will be resolved in the new patch.
> > The number of dequeued segments can be anywhere in [0, numsegs], and the
> > dequeued length in [0, slen]. We cannot know the exact number without
> > checking the tin/flow index of each dropped packet. Therefore, we should
> > check inside the loop (as v1 did):
> >
> > ```
> > cake_drop(...)
> > {
> > ...
> > if (likely(current_flow != idx + (tin << 16)))
> > qdisc_tree_reduce_backlog(sch, 1, len);
> > ...
> > }
> > ```
>
> No, this is not needed - the calculation involving prev_qlen and
> prev_backlog will correctly give us the total number of packets/bytes
> dropped.
> >
> > This solution also has a problem, as you mentioned:
> > if the flow already contains packets, dropping those packets should
> > trigger backlog reduction, but our check would incorrectly skip that. One
> > possible solution is to track the number of packets/segments enqueued
> > in the current cake_enqueue (numsegs or 1), and then avoid calling
> > `qdisc_tree_reduce_backlog(sch, 1, len)` for the 1 or numsegs dropped
> > packets. If that makes sense, I'll make the patch and test it.
>
> It does not - see above.
>
> > -----
> >
> > Besides, I have a question about the condition for returning NET_XMIT_CN.
> > Do we return NET_XMIT_CN when:
> >
> > The incoming packet itself is dropped? (makes more sense to me)
> > or
> > The same flow dequeued once? (This is the current logic)
>
> The same flow. The current logic it correct.
>
> -Toke
Thanks for the explanations. I have understood NET_XMIT_CN wrong. A patch
(removing var droppped and handling ACK branch correctly) will be tested
and sent.
Thanks,
Xiang
next prev parent reply other threads:[~2025-11-21 20:33 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-13 3:53 [Cake] " Xiang Mei
2025-11-13 4:05 ` [Cake] " Xiang Mei
2025-11-13 13:35 ` Toke Høiland-Jørgensen
2025-11-15 9:23 ` Xiang Mei
2025-11-17 13:23 ` Toke Høiland-Jørgensen
2025-11-21 20:33 ` Xiang Mei [this message]
2025-11-13 13:21 ` Toke Høiland-Jørgensen
2025-11-15 9:03 ` Xiang Mei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://lists.bufferbloat.net/postorius/lists/cake.lists.bufferbloat.net/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7guebhjv734hjkgtnmloyj7lwaaxj6nz5as4bjruo24t3vs72r@54ryrobt6tdo \
--to=xmei5@asu.edu \
--cc=bestswngs@gmail.com \
--cc=cake@lists.bufferbloat.net \
--cc=netdev@vger.kernel.org \
--cc=security@kernel.org \
--cc=toke@toke.dk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox