* [Cake] [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
@ 2025-11-13 3:53 Xiang Mei
2025-11-13 4:05 ` [Cake] " Xiang Mei
2025-11-13 13:21 ` Toke Høiland-Jørgensen
0 siblings, 2 replies; 8+ messages in thread
From: Xiang Mei @ 2025-11-13 3:53 UTC (permalink / raw)
To: security; +Cc: netdev, toke, cake, bestswngs, Xiang Mei
In cake_drop(), qdisc_tree_reduce_backlog() is called to decrement
the qlen of the qdisc hierarchy. However, this can incorrectly reduce
qlen when the dropped packet was never enqueued, leading to a possible
NULL dereference (e.g., when QFQ is the parent qdisc).
This happens when cake_enqueue() returns NET_XMIT_CN: the parent
qdisc does not enqueue the skb, but cake_drop() still reduces backlog.
This patch avoids the extra reduction by checking whether the packet
was actually enqueued. It also moves qdisc_tree_reduce_backlog()
out of cake_drop() to keep backlog accounting consistent.
Fixes: 15de71d06a40 ("net/sched: Make cake_enqueue return NET_XMIT_CN when past buffer_limit")
Signed-off-by: Xiang Mei <xmei5@asu.edu>
---
v2: add missing cc
v3: move qdisc_tree_reduce_backlog out of cake_drop
net/sched/sch_cake.c | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index 32bacfc314c2..179cafe05085 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -1597,7 +1597,6 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_OVERLIMIT);
sch->q.qlen--;
- qdisc_tree_reduce_backlog(sch, 1, len);
cake_heapify(q, 0);
@@ -1750,7 +1749,9 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
ktime_t now = ktime_get();
struct cake_tin_data *b;
struct cake_flow *flow;
- u32 idx, tin;
+ u32 dropped = 0;
+ u32 idx, tin, prev_qlen, prev_backlog, drop_id;
+ bool same_flow = false;
/* choose flow to insert into */
idx = cake_classify(sch, &b, skb, q->flow_mode, &ret);
@@ -1927,24 +1928,31 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
if (q->buffer_used > q->buffer_max_used)
q->buffer_max_used = q->buffer_used;
- if (q->buffer_used > q->buffer_limit) {
- bool same_flow = false;
- u32 dropped = 0;
- u32 drop_id;
+ if (q->buffer_used <= q->buffer_limit)
+ return NET_XMIT_SUCCESS;
- while (q->buffer_used > q->buffer_limit) {
- dropped++;
- drop_id = cake_drop(sch, to_free);
+ prev_qlen = sch->q.qlen;
+ prev_backlog = sch->qstats.backlog;
- if ((drop_id >> 16) == tin &&
- (drop_id & 0xFFFF) == idx)
- same_flow = true;
- }
- b->drop_overlimit += dropped;
+ while (q->buffer_used > q->buffer_limit) {
+ dropped++;
+ drop_id = cake_drop(sch, to_free);
+ if ((drop_id >> 16) == tin &&
+ (drop_id & 0xFFFF) == idx)
+ same_flow = true;
+ }
+ b->drop_overlimit += dropped;
+
+ /* Compute the dropped qlen and pkt length */
+ prev_qlen -= sch->q.qlen;
+ prev_backlog -= sch->qstats.backlog;
- if (same_flow)
- return NET_XMIT_CN;
+ if (same_flow) {
+ qdisc_tree_reduce_backlog(sch, prev_qlen - 1,
+ prev_backlog - len);
+ return NET_XMIT_CN;
}
+ qdisc_tree_reduce_backlog(sch, prev_qlen, prev_backlog);
return NET_XMIT_SUCCESS;
}
--
2.43.0
^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-13 3:53 [Cake] [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop Xiang Mei
@ 2025-11-13 4:05 ` Xiang Mei
  2025-11-13 13:35 ` Toke Høiland-Jørgensen
  2025-11-13 13:21 ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 8+ messages in thread
From: Xiang Mei @ 2025-11-13 4:05 UTC (permalink / raw)
To: security; +Cc: netdev, toke, cake, bestswngs

On Wed, Nov 12, 2025 at 08:53:03PM -0700, Xiang Mei wrote:
> In cake_drop(), qdisc_tree_reduce_backlog() is called to decrement
> the qlen of the qdisc hierarchy. However, this can incorrectly reduce
> qlen when the dropped packet was never enqueued, leading to a possible
> NULL dereference (e.g., when QFQ is the parent qdisc).
>
> This happens when cake_enqueue() returns NET_XMIT_CN: the parent
> qdisc does not enqueue the skb, but cake_drop() still reduces backlog.
>
> This patch avoids the extra reduction by checking whether the packet
> was actually enqueued. It also moves qdisc_tree_reduce_backlog()
> out of cake_drop() to keep backlog accounting consistent.
>
> Fixes: 15de71d06a40 ("net/sched: Make cake_enqueue return NET_XMIT_CN when past buffer_limit")
> Signed-off-by: Xiang Mei <xmei5@asu.edu>
>
> [...]

Thanks to Toke for the suggestion to move qdisc_tree_reduce_backlog out
of cake_drop; it makes the logic cleaner.

The patch passed CAKE's self-test:

```log
ok 1 1212 - Create CAKE with default setting
ok 2 3281 - Create CAKE with bandwidth limit
ok 3 c940 - Create CAKE with autorate-ingress flag
ok 4 2310 - Create CAKE with rtt time
ok 5 2385 - Create CAKE with besteffort flag
ok 6 a032 - Create CAKE with diffserv8 flag
ok 7 2349 - Create CAKE with diffserv4 flag
ok 8 8472 - Create CAKE with flowblind flag
ok 9 2341 - Create CAKE with dsthost and nat flag
ok 10 5134 - Create CAKE with wash flag
ok 11 2302 - Create CAKE with flowblind and no-split-gso flag
ok 12 0768 - Create CAKE with dual-srchost and ack-filter flag
ok 13 0238 - Create CAKE with dual-dsthost and ack-filter-aggressive flag
ok 14 6572 - Create CAKE with memlimit and ptm flag
ok 15 2436 - Create CAKE with fwmark and atm flag
ok 16 3984 - Create CAKE with overhead and mpu
ok 17 5421 - Create CAKE with conservative and ingress flag
ok 18 6854 - Delete CAKE with conservative and ingress flag
ok 19 2342 - Replace CAKE with mpu
ok 20 2313 - Change CAKE with mpu
ok 21 4365 - Show CAKE class
```

There is still one problem I am not sure about, since I am not very
experienced with cake and GSO. It concerns the GSO branch [1]. slen is
the length added to the cake sch, and that branch uses
`qdisc_tree_reduce_backlog(sch, 1 - numsegs, len - slen);` to inform the
parent sched. However, when we drop the packet, it could be problematic,
since we should reduce slen instead of len. Is this a potential problem?

[1] https://elixir.bootlin.com/linux/v6.6.116/source/net/sched/sch_cake.c#L1803

Thanks,
Xiang
* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-13 4:05 ` [Cake] " Xiang Mei
@ 2025-11-13 13:35 ` Toke Høiland-Jørgensen
  2025-11-15 9:23 ` Xiang Mei
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-11-13 13:35 UTC (permalink / raw)
To: Xiang Mei, security; +Cc: netdev, cake, bestswngs

Xiang Mei <xmei5@asu.edu> writes:

> There is still one problem I am not sure about, since I am not very
> experienced with cake and GSO. It concerns the GSO branch [1]. slen is
> the length added to the cake sch, and that branch uses
> `qdisc_tree_reduce_backlog(sch, 1 - numsegs, len - slen);` to inform
> the parent sched. However, when we drop the packet, it could be
> problematic, since we should reduce slen instead of len. Is this a
> potential problem?

Hmm, no I think it's fine? The qdisc_tree_reduce_backlog(sch, 1-numsegs,
len-slen) *increases* the backlog with the difference between the
original length and the number of new segments. And then we *decrease*
the backlog with the number of bytes we dropped.

The compensation we're doing is for the backlog update of the parent,
which is still using the original packet length regardless of any
splitting, so that doesn't change the compensation value.

-Toke
* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-13 13:35 ` Toke Høiland-Jørgensen
@ 2025-11-15 9:23 ` Xiang Mei
  2025-11-17 13:23 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Xiang Mei @ 2025-11-15 9:23 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: security, netdev, cake, bestswngs

On Thu, Nov 13, 2025 at 02:35:18PM +0100, Toke Høiland-Jørgensen wrote:
> Hmm, no I think it's fine? The qdisc_tree_reduce_backlog(sch, 1-numsegs,
> len-slen) *increases* the backlog with the difference between the
> original length and the number of new segments. And then we *decrease*
> the backlog with the number of bytes we dropped.
>
> The compensation we're doing is for the backlog update of the parent,
> which is still using the original packet length regardless of any
> splitting, so that doesn't change the compensation value.
>
> -Toke

I still think the current method of reducing the backlog may be
problematic. What you said holds for the GSO branch when cake_enqueue()
returns NET_XMIT_SUCCESS, but it may lead to issues when it returns
NET_XMIT_CN.

For the normal case where no dropping happens, the current
implementation is correct. We can see how qlen and backlog change as
follows:

backlog:
  -(len - slen)   Reason: qdisc_tree_reduce_backlog(sch, 1 - numsegs, len - slen)
  +len            Reason: parent enqueue
  Total: slen

qlen:
  -(1 - numsegs)  Reason: qdisc_tree_reduce_backlog(sch, 1 - numsegs, len - slen)
  +1              Reason: parent enqueue
  Total: numsegs

This makes sense because we split one packet into numsegs packets of
total length slen and enqueue them all.

When a drop happens, we must fix both qlen and backlog. In the unpatched
code, cake_drop() calls qdisc_tree_reduce_backlog() for dropped packets.
This works in most cases but ignores the scenario where we drop (parts
of) the incoming packet, meaning the expected

```
backlog += len
qlen += 1
```

will not run because the parent scheduler stops enqueueing after seeing
NET_XMIT_CN. For normal packets (non-GSO), it's easy to fix: just do
qdisc_tree_reduce_backlog(sch, 1, len). However, GSO splitting makes
this difficult because we may have already added multiple segments into
the flow, and we don't know how many of them were dequeued.

The number of dequeued segments can be anywhere in [0, numsegs], and the
dequeued length in [0, slen]. We cannot know the exact number without
checking the tin/flow index of each dropped packet. Therefore, we should
check inside the loop (as v1 did):

```
cake_drop(...)
{
	...
	if (likely(current_flow != idx + (tin << 16)))
		qdisc_tree_reduce_backlog(sch, 1, len);
	...
}
```

This solution also has a problem, as you mentioned: if the flow already
contains packets, dropping those packets should trigger backlog
reduction, but our check would incorrectly skip that. One possible
solution is to track the number of packets/segments enqueued in the
current cake_enqueue() (numsegs or 1), and then avoid calling
`qdisc_tree_reduce_backlog(sch, 1, len)` for those 1 or numsegs dropped
packets. If that makes sense, I'll make the patch and test it.

-----

Besides, I have a question about the condition for returning NET_XMIT_CN.
Do we return NET_XMIT_CN when:

The incoming packet itself is dropped? (makes more sense to me)
or
The same flow dequeued once? (This is the current logic)

If we keep the current logic, the above patch approach works. If not, we
need additional checks because we append the incoming packet to the tail
but drops occur at the head.

Thanks,
Xiang
* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-15 9:23 ` Xiang Mei
@ 2025-11-17 13:23 ` Toke Høiland-Jørgensen
  2025-11-21 20:33 ` Xiang Mei
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-11-17 13:23 UTC (permalink / raw)
To: Xiang Mei; +Cc: security, netdev, cake, bestswngs

> will not run because the parent scheduler stops enqueueing after seeing
> NET_XMIT_CN. For normal packets (non-GSO), it's easy to fix: just do
> qdisc_tree_reduce_backlog(sch, 1, len). However, GSO splitting makes
> this difficult because we may have already added multiple segments into
> the flow, and we don't know how many of them were dequeued.

Huh, dequeued? This is all running under the qdisc lock, nothing gets
dequeued in the meantime.

Besides, the ACK thinning is irrelevant to the drop compensation. Here's
an example:

Without ACK splitting - we enqueue 1 packet of 100 bytes, then drop 1
packet of 200 bytes, so we should end up with the same qlen, but 100
fewer bytes in the queue:

start: parent qlen = X, parent backlog = Y

len = 100;
cake_drop() drops 1 pkt / 200 bytes

if (same_flow) {
	qdisc_reduce_backlog(0, 100); // parent qlen == X, parent backlog == Y - 100
	return NET_XMIT_CN;
	// no change in parent, so parent qlen == X, parent backlog == Y - 100
} else {
	qdisc_reduce_backlog(1, 200); // parent qlen == X - 1, backlog == Y - 200
	return NET_XMIT_SUCCESS;
	// parent does qlen += 1, backlog += 100, so parent qlen == X, parent backlog == Y - 100
}

With ACK splitting - we enqueue 10 segments totalling 110 bytes, then
drop 1 packet of 200 bytes, so we should end up with 9 packets more in
the queue, but 90 bytes less:

start: parent qlen = X, parent backlog = Y

len = 100;
/* split ack: slen == 110, numsegs == 10 */
qdisc_tree_reduce_backlog(-9, -10); // parent qlen == X + 9, backlog == Y + 10

cake_drop() drops 1 pkt / 200 bytes

if (same_flow) {
	qdisc_reduce_backlog(0, 100); // parent qlen == X + 9, backlog == Y - 90
	return NET_XMIT_CN;
	// no change in parent, so parent qlen == X + 9, backlog == Y - 90
} else {
	qdisc_reduce_backlog(1, 200); // parent qlen == X + 8, backlog == Y - 190
	return NET_XMIT_SUCCESS;
	// parent does qlen += 1, backlog += 100, so parent qlen == X + 9, backlog == Y - 90
}

In both cases, what happens is that we drop one or more packets, reduce
the backlog by the number of packets/bytes dropped *while compensating
for what the parent qdisc does on return*. So the adjustments made by
the segmentation make no difference to the final outcome.

However, we do have one problem with the ACK thinning code: in the
'if (ack)' branch, we currently adjust 'len' if we drop an ACK. Meaning
that if we use that value later to adjust for what the parent qdisc
does, the value will no longer agree with what the parent does. So we'll
have to introduce a new variable for the length used in the ACK thinning
calculation.

> The number of dequeued segments can be anywhere in [0, numsegs], and the
> dequeued length in [0, slen]. We cannot know the exact number without
> checking the tin/flow index of each dropped packet. Therefore, we should
> check inside the loop (as v1 did):
>
> ```
> cake_drop(...)
> {
> 	...
> 	if (likely(current_flow != idx + (tin << 16)))
> 		qdisc_tree_reduce_backlog(sch, 1, len);
> 	...
> }
> ```

No, this is not needed - the calculation involving prev_qlen and
prev_backlog will correctly give us the total number of packets/bytes
dropped.

> This solution also has a problem, as you mentioned: if the flow already
> contains packets, dropping those packets should trigger backlog
> reduction, but our check would incorrectly skip that. One possible
> solution is to track the number of packets/segments enqueued in the
> current cake_enqueue (numsegs or 1), and then avoid calling
> `qdisc_tree_reduce_backlog(sch, 1, len)` for those 1 or numsegs dropped
> packets. If that makes sense, I'll make the patch and test it.

It does not - see above.

> -----
>
> Besides, I have a question about the condition for returning NET_XMIT_CN.
> Do we return NET_XMIT_CN when:
>
> The incoming packet itself is dropped? (makes more sense to me)
> or
> The same flow dequeued once? (This is the current logic)

The same flow. The current logic is correct.

-Toke
* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-17 13:23 ` Toke Høiland-Jørgensen
@ 2025-11-21 20:33 ` Xiang Mei
  0 siblings, 0 replies; 8+ messages in thread
From: Xiang Mei @ 2025-11-21 20:33 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: security, netdev, cake, bestswngs

On Mon, Nov 17, 2025 at 02:23:45PM +0100, Toke Høiland-Jørgensen wrote:
> Huh, dequeued? This is all running under the qdisc lock, nothing gets
> dequeued in the meantime.

Sorry for using the wrong word. The "dequeued" should be "dropped".

> Besides, the ACK thinning is irrelevant to the drop compensation.
> Here's an example:
>
> [...]
>
> In both cases, what happens is that we drop one or more packets, reduce
> the backlog by the number of packets/bytes dropped *while compensating
> for what the parent qdisc does on return*. So the adjustments made by
> the segmentation make no difference to the final outcome.

Thanks for the detailed explanations. You are right: the current patch
logic handles these cases correctly.

> However, we do have one problem with the ACK thinning code: in the
> 'if (ack)' branch, we currently adjust 'len' if we drop an ACK. Meaning
> that if we use that value later to adjust for what the parent qdisc
> does, the value will no longer agree with what the parent does. So
> we'll have to introduce a new variable for the length used in the ACK
> thinning calculation.

I see the issue. It will be resolved in the new patch.

> [...]
>
> > Besides, I have a question about the condition for returning NET_XMIT_CN.
> > Do we return NET_XMIT_CN when:
> >
> > The incoming packet itself is dropped? (makes more sense to me)
> > or
> > The same flow dequeued once? (This is the current logic)
>
> The same flow. The current logic is correct.

Thanks for the explanations; I had misunderstood NET_XMIT_CN. A patch
(removing the dropped variable and handling the ACK branch correctly)
will be tested and sent.

Thanks,
Xiang
* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-13 3:53 [Cake] [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop Xiang Mei
  2025-11-13 4:05 ` [Cake] " Xiang Mei
@ 2025-11-13 13:21 ` Toke Høiland-Jørgensen
  2025-11-15 9:03 ` Xiang Mei
  1 sibling, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-11-13 13:21 UTC (permalink / raw)
To: Xiang Mei, security; +Cc: netdev, cake, bestswngs, Xiang Mei

Xiang Mei <xmei5@asu.edu> writes:

> In cake_drop(), qdisc_tree_reduce_backlog() is called to decrement
> the qlen of the qdisc hierarchy. However, this can incorrectly reduce
> qlen when the dropped packet was never enqueued, leading to a possible
> NULL dereference (e.g., when QFQ is the parent qdisc).
>
> [...]
>
> - if (q->buffer_used > q->buffer_limit) {
> - bool same_flow = false;
> - u32 dropped = 0;
> - u32 drop_id;
> + if (q->buffer_used <= q->buffer_limit)
> + return NET_XMIT_SUCCESS;

While this does reduce indentation, I don't think it's an overall
improvement; the overflow condition is the exceptional case, so it's
clearer to keep it inside the if statement (which also keeps the
variable scope smaller).

> - while (q->buffer_used > q->buffer_limit) {
> - dropped++;
> - drop_id = cake_drop(sch, to_free);
> + prev_qlen = sch->q.qlen;
> + prev_backlog = sch->qstats.backlog;
>
> - if ((drop_id >> 16) == tin &&
> - (drop_id & 0xFFFF) == idx)
> - same_flow = true;
> - }
> - b->drop_overlimit += dropped;
> + while (q->buffer_used > q->buffer_limit) {
> + dropped++;
> + drop_id = cake_drop(sch, to_free);
> + if ((drop_id >> 16) == tin &&
> + (drop_id & 0xFFFF) == idx)
> + same_flow = true;
> + }
> + b->drop_overlimit += dropped;

The 'dropped' variable is wholly redundant after this change, so let's
just get rid of it and use the prev_qlen for this statistic instead.

-Toke
* [Cake] Re: [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
  2025-11-13 13:21 ` Toke Høiland-Jørgensen
@ 2025-11-15 9:03 ` Xiang Mei
  0 siblings, 0 replies; 8+ messages in thread
From: Xiang Mei @ 2025-11-15 9:03 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: security, netdev, cake, bestswngs

On Thu, Nov 13, 2025 at 02:21:54PM +0100, Toke Høiland-Jørgensen wrote:
> [...]
>
> While this does reduce indentation, I don't think it's an overall
> improvement; the overflow condition is the exceptional case, so it's
> clearer to keep it inside the if statement (which also keeps the
> variable scope smaller).
>
> [...]
>
> The 'dropped' variable is wholly redundant after this change, so let's
> just get rid of it and use the prev_qlen for this statistic instead.
>
> -Toke

Thanks for the reminder, and sorry for the delayed reply. You are right
that we should remove the unused variable.

Xiang
end of thread, other threads:[~2025-11-21 20:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-13 3:53 [Cake] [PATCH net v3] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop Xiang Mei
2025-11-13 4:05 ` [Cake] " Xiang Mei
2025-11-13 13:35 ` Toke Høiland-Jørgensen
2025-11-15 9:23 ` Xiang Mei
2025-11-17 13:23 ` Toke Høiland-Jørgensen
2025-11-21 20:33 ` Xiang Mei
2025-11-13 13:21 ` Toke Høiland-Jørgensen
2025-11-15 9:03 ` Xiang Mei
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox