Cake - FQ_codel the next generation
 help / color / mirror / Atom feed
From: Xiang Mei <xmei5@asu.edu>
To: "Toke Høiland-Jørgensen" <toke@toke.dk>
Cc: security@kernel.org, netdev@vger.kernel.org,
	cake@lists.bufferbloat.net, bestswngs@gmail.com
Subject: [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
Date: Fri, 21 Nov 2025 13:33:25 -0700	[thread overview]
Message-ID: <7guebhjv734hjkgtnmloyj7lwaaxj6nz5as4bjruo24t3vs72r@54ryrobt6tdo> (raw)
In-Reply-To: <87a50kokri.fsf@toke.dk>

On Mon, Nov 17, 2025 at 02:23:45PM +0100, Toke Høiland-Jørgensen wrote:
> > will not run because the parent scheduler stops enqueueing after seeing
> > NET_XMIT_CN. For normal packets (non-GSO), it's easy to fix: just do
> > qdisc_tree_reduce_backlog(sch, 1, len). However, GSO splitting makes this
> > difficult because we may have already added multiple segments into the
> > flow, and we don’t know how many of them were dequeued.
> 
> Huh, dequeued? This is all running under the qdisc lock, nothing gets
> dequeued in the meantime.

Sorry for using wrong word. The "dequeued" should be "dropped".

> 
> Besides, the ACK thinning is irrelevant to the drop compensation. Here's
> an example:
> 
> Without ACK splitting - we enqueue 1 packet of 100 bytes, then drop 1
> packet of 200 bytes, so we should end up with the same qlen, but 100
> fewer bytes in the queue:
> 
> start: parent qlen = X, parent backlog = Y
> 
> len = 100;
> cake_drop() drops 1 pkt / 200 bytes
> 
> if (same_flow) {
>   qdisc_reduce_backlog(0, 100) // parent qlen == X, parent backlog == Y - 100
>   return NET_XMIT_CN;
>   // no change in parent, so parent qlen == X, parent backlog == Y - 100
> } else {
>   qdisc_reduce_backlog(1, 200); // parent qlen == X - 1, backlog == Y - 200
>   return NET_XMIT_SUCCESS;
>   // parent does qlen +=1, backlog += 100, so parent qlen == x, parent backlog == Y - 100
> }
> 
> With ACK splitting - we enqueue 10 segments totalling 110 bytes, then
> drop 1 packet of 200 bytes, so we should end up with 9 packets more in
> the queue, but 90 bytes less:
> 
> start: parent qlen = X, parent backlog = Y
> 
> len = 100;
> /* split ack: slen == 110, numsegs == 10 */
> qdisc_tree_reduce_backlog(-9, -10); // parent qlen == X + 9, backlog == Y + 10
> 
> cake_drop() drops 1 pkt / 200 bytes
> 
> if (same_flow) {
>   qdisc_reduce_backlog(0, 100)   // parent qlen == X + 9, backlog == Y - 90
>   return NET_XMIT_CN;
>   // no change in parent, so parent qlen == X + 9, backlog == Y - 90
> 
> } else {
>   qdisc_reduce_backlog(1, 200); // parent qlen == X + 8, backlog == Y - 190
>   return NET_XMIT_SUCCESS;
>   // parent does qlen +=1, backlog += 100, so parent qlen == X + 9, backlog == Y - 90
> }
> 
> 
> In both cases, what happens is that we drop one or more packets, reduce
> the backlog by the number of packets/bytes dropped *while compensating
> for what the parent qdisc does on return*. So the adjustments made by
> the segmentation makes no difference to the final outcome.

Thanks for the detailed explanations. You are right. The current patch
logic is correct to handle these cases.

> 
> However, we do have one problem with the ACK thinning code: in the 'if
> (ack)' branch, we currently adjust 'len' if we drop an ACK. Meaning that
> if we use that value later to adjust for what the parent qdisc, the
> value will no longer agree with what the parent does. So we'll have to
> introduce a new variable for the length used in the ACK thinning
> calculation.
> 

I see the issue. It will be resolved in the new patch.

> > The number of dequeued segments can be anywhere in [0, numsegs], and the
> > dequeued length in [0, slen]. We cannot know the exact number without 
> > checking the tin/flow index of each dropped packet. Therefore, we should
> > check inside the loop (as v1 did):
> >
> > ```
> > cake_drop(...)
> > {
> >     ...
> >     if (likely(current_flow != idx + (tin << 16)))
> >         qdisc_tree_reduce_backlog(sch, 1, len);
> >     ...
> > }
> > ```
> 
> No, this is not needed - the calculation involving prev_qlen and
> prev_backlog will correctly give us the total number of packets/bytes
> dropped.
> >
> > This solution also has a problem, as you mentioned:
> > if the flow already contains packets, dropping those packets should
> > trigger backlog reduction, but our check would incorrectly skip that. One
> > possible solution is to track the number of packets/segments enqueued
> > in the current cake_enqueue (numsegs or 1), and then avoid calling
> > `qdisc_tree_reduce_backlog(sch, 1, len)` for the 1 or numsegs dropped
> > packets. If that makes sense, I'll make the patch and test it.
> 
> It does not - see above.
> 
> > -----
> >
> > Besides, I have a question about the condition for returning NET_XMIT_CN.
> > Do we return NET_XMIT_CN when:
> >
> > The incoming packet itself is dropped? (makes more sense to me)
> > or
> > The same flow dequeued once? (This is the current logic)
> 
> The same flow. The current logic it correct.
> 
> -Toke


Thanks for the explanations. I have understood NET_XMIT_CN wrong. A patch
(removing var droppped and handling ACK branch correctly) will be tested
and sent.

Thanks,
Xiang

  reply	other threads:[~2025-11-21 20:33 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-13  3:53 [Cake] " Xiang Mei
2025-11-13  4:05 ` [Cake] " Xiang Mei
2025-11-13 13:35   ` Toke Høiland-Jørgensen
2025-11-15  9:23     ` Xiang Mei
2025-11-17 13:23       ` Toke Høiland-Jørgensen
2025-11-21 20:33         ` Xiang Mei [this message]
2025-11-13 13:21 ` Toke Høiland-Jørgensen
2025-11-15  9:03   ` Xiang Mei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.bufferbloat.net/postorius/lists/cake.lists.bufferbloat.net/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7guebhjv734hjkgtnmloyj7lwaaxj6nz5as4bjruo24t3vs72r@54ryrobt6tdo \
    --to=xmei5@asu.edu \
    --cc=bestswngs@gmail.com \
    --cc=cake@lists.bufferbloat.net \
    --cc=netdev@vger.kernel.org \
    --cc=security@kernel.org \
    --cc=toke@toke.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox