Cake - FQ_codel the next generation
* [Cake] [PATCH RFC net-next 0/4] Multi-queue aware sch_cake
@ 2025-09-24 12:16 Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 1/4] net/sched: sch_cake: Factor out config variables into separate struct Toke Høiland-Jørgensen
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-24 12:16 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: Jonas Köppeler, cake, netdev, Toke Høiland-Jørgensen

This series adds a multi-queue aware variant of the sch_cake scheduler,
called 'cake_mq'. Using this makes it possible to scale the rate shaper
of sch_cake across multiple CPUs, while still enforcing a single global
rate on the interface.

The approach taken in this patch series is to implement a separate qdisc
called 'cake_mq', which is based on the existing 'mq' qdisc, but differs
in a couple of aspects:

- It will always install a cake instance on each hardware queue (instead
  of using the default qdisc for each queue like 'mq' does).

- The cake instances on the queues will share their configuration, which
  can only be modified through the parent cake_mq instance.

Doing things this way does incur a bit of code duplication (copying
parts of the 'mq' qdisc code), but it simplifies user configuration by
centralising
all configuration through the cake_mq qdisc (which also serves as an
obvious way of opting into the multi-queue aware behaviour).

The cake_mq qdisc takes all the same configuration parameters as the
cake qdisc, plus one additional parameter to control the sync time
between the individual cake instances.
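
As a usage sketch (device name, handle and parameter values are
examples only; the 'sync_time' keyword is added by the accompanying
iproute2 patch):

    tc qdisc replace dev eth0 root handle 1: cake_mq bandwidth 1gbit sync_time 200us

This installs a cake instance on each hardware queue while enforcing
the 1 Gbit rate globally across all of them.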

We are posting this series to solicit feedback on the API, as well as
wider testing of the multi-core shaper.

An earlier version of this work was presented at this year's Netdevconf:
https://netdevconf.info/0x19/sessions/talk/mq-cake-scaling-software-rate-limiting-across-cpu-cores.html

The patch series is structured as follows:

- Patch 1 factors out the sch_cake configuration variables into a
  separate struct that can be shared between instances.

- Patch 2 adds the basic cake_mq qdisc, based on the mq code.

- Patch 3 adds configuration sharing across the cake instances installed
  under cake_mq.

- Patch 4 adds the shared shaper state that enables the multi-core rate
  shaping.

A patch making iproute2 aware of the cake_mq qdisc is included
separately as part of this series.

---
Jonas Köppeler (1):
      net/sched: sch_cake: share shaper state across sub-instances of cake_mq

Toke Høiland-Jørgensen (3):
      net/sched: sch_cake: Factor out config variables into separate struct
      net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices
      net/sched: sch_cake: Share config across cake_mq sub-qdiscs

 include/uapi/linux/pkt_sched.h |   2 +
 net/sched/sch_cake.c           | 635 +++++++++++++++++++++++++++++++++--------
 2 files changed, 514 insertions(+), 123 deletions(-)
---
base-commit: dc1dea796b197aba2c3cae25bfef45f4b3ad46fe
change-id: 20250902-mq-cake-sub-qdisc-cdf0b59d2fe5



* [Cake] [PATCH RFC net-next 1/4] net/sched: sch_cake: Factor out config variables into separate struct
  2025-09-24 12:16 [Cake] [PATCH RFC net-next 0/4] Multi-queue aware sch_cake Toke Høiland-Jørgensen
@ 2025-09-24 12:16 ` Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 2/4] net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices Toke Høiland-Jørgensen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-24 12:16 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: Jonas Köppeler, cake, netdev, Toke Høiland-Jørgensen

Factor out all the user-configurable variables into a separate struct
and embed it into struct cake_sched_data. This is done in preparation
for sharing the configuration across multiple instances of cake in an mq
setup.

No functional change is intended with this patch.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/sched/sch_cake.c | 245 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 133 insertions(+), 112 deletions(-)

diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index 32bacfc314c260dccf94178d309ccb2be22d69e4..a02f3cfcb09b50bda6ee66dfc8a8df584ae6a365 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -197,40 +197,42 @@ struct cake_tin_data {
 	u32	way_collisions;
 }; /* number of tins is small, so size of this struct doesn't matter much */
 
+struct cake_sched_config {
+	u64		rate_bps;
+	u64		interval;
+	u64		target;
+	u32		buffer_config_limit;
+	u32		fwmark_mask;
+	u16		fwmark_shft;
+	s16		rate_overhead;
+	u16		rate_mpu;
+	u16		rate_flags;
+	u8		tin_mode;
+	u8		flow_mode;
+	u8		atm_mode;
+	u8		ack_filter;
+};
+
 struct cake_sched_data {
 	struct tcf_proto __rcu *filter_list; /* optional external classifier */
 	struct tcf_block *block;
 	struct cake_tin_data *tins;
+	struct cake_sched_config *config;
 
 	struct cake_heap_entry overflow_heap[CAKE_QUEUES * CAKE_MAX_TINS];
-	u16		overflow_timeout;
-
-	u16		tin_cnt;
-	u8		tin_mode;
-	u8		flow_mode;
-	u8		ack_filter;
-	u8		atm_mode;
-
-	u32		fwmark_mask;
-	u16		fwmark_shft;
 
 	/* time_next = time_this + ((len * rate_ns) >> rate_shft) */
-	u16		rate_shft;
 	ktime_t		time_next_packet;
 	ktime_t		failsafe_next_packet;
 	u64		rate_ns;
-	u64		rate_bps;
-	u16		rate_flags;
-	s16		rate_overhead;
-	u16		rate_mpu;
-	u64		interval;
-	u64		target;
+	u16		rate_shft;
+	u16		overflow_timeout;
+	u16		tin_cnt;
 
 	/* resource tracking */
 	u32		buffer_used;
 	u32		buffer_max_used;
 	u32		buffer_limit;
-	u32		buffer_config_limit;
 
 	/* indices for dequeue */
 	u16		cur_tin;
@@ -1198,7 +1200,7 @@ static bool cake_tcph_may_drop(const struct tcphdr *tcph,
 static struct sk_buff *cake_ack_filter(struct cake_sched_data *q,
 				       struct cake_flow *flow)
 {
-	bool aggressive = q->ack_filter == CAKE_ACK_AGGRESSIVE;
+	bool aggressive = q->config->ack_filter == CAKE_ACK_AGGRESSIVE;
 	struct sk_buff *elig_ack = NULL, *elig_ack_prev = NULL;
 	struct sk_buff *skb_check, *skb_prev = NULL;
 	const struct ipv6hdr *ipv6h, *ipv6h_check;
@@ -1358,15 +1360,17 @@ static u64 cake_ewma(u64 avg, u64 sample, u32 shift)
 	return avg;
 }
 
-static u32 cake_calc_overhead(struct cake_sched_data *q, u32 len, u32 off)
+static u32 cake_calc_overhead(struct cake_sched_data *qd, u32 len, u32 off)
 {
+	struct cake_sched_config *q = qd->config;
+
 	if (q->rate_flags & CAKE_FLAG_OVERHEAD)
 		len -= off;
 
-	if (q->max_netlen < len)
-		q->max_netlen = len;
-	if (q->min_netlen > len)
-		q->min_netlen = len;
+	if (qd->max_netlen < len)
+		qd->max_netlen = len;
+	if (qd->min_netlen > len)
+		qd->min_netlen = len;
 
 	len += q->rate_overhead;
 
@@ -1385,10 +1389,10 @@ static u32 cake_calc_overhead(struct cake_sched_data *q, u32 len, u32 off)
 		len += (len + 63) / 64;
 	}
 
-	if (q->max_adjlen < len)
-		q->max_adjlen = len;
-	if (q->min_adjlen > len)
-		q->min_adjlen = len;
+	if (qd->max_adjlen < len)
+		qd->max_adjlen = len;
+	if (qd->min_adjlen > len)
+		qd->min_adjlen = len;
 
 	return len;
 }
@@ -1592,7 +1596,7 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
 	flow->dropped++;
 	b->tin_dropped++;
 
-	if (q->rate_flags & CAKE_FLAG_INGRESS)
+	if (q->config->rate_flags & CAKE_FLAG_INGRESS)
 		cake_advance_shaper(q, b, skb, now, true);
 
 	qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_OVERLIMIT);
@@ -1663,7 +1667,8 @@ static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash)
 static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
 					     struct sk_buff *skb)
 {
-	struct cake_sched_data *q = qdisc_priv(sch);
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q = qd->config;
 	u32 tin, mark;
 	bool wash;
 	u8 dscp;
@@ -1680,24 +1685,24 @@ static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
 	if (q->tin_mode == CAKE_DIFFSERV_BESTEFFORT)
 		tin = 0;
 
-	else if (mark && mark <= q->tin_cnt)
-		tin = q->tin_order[mark - 1];
+	else if (mark && mark <= qd->tin_cnt)
+		tin = qd->tin_order[mark - 1];
 
 	else if (TC_H_MAJ(skb->priority) == sch->handle &&
 		 TC_H_MIN(skb->priority) > 0 &&
-		 TC_H_MIN(skb->priority) <= q->tin_cnt)
-		tin = q->tin_order[TC_H_MIN(skb->priority) - 1];
+		 TC_H_MIN(skb->priority) <= qd->tin_cnt)
+		tin = qd->tin_order[TC_H_MIN(skb->priority) - 1];
 
 	else {
 		if (!wash)
 			dscp = cake_handle_diffserv(skb, wash);
-		tin = q->tin_index[dscp];
+		tin = qd->tin_index[dscp];
 
-		if (unlikely(tin >= q->tin_cnt))
+		if (unlikely(tin >= qd->tin_cnt))
 			tin = 0;
 	}
 
-	return &q->tins[tin];
+	return &qd->tins[tin];
 }
 
 static u32 cake_classify(struct Qdisc *sch, struct cake_tin_data **t,
@@ -1753,7 +1758,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	u32 idx, tin;
 
 	/* choose flow to insert into */
-	idx = cake_classify(sch, &b, skb, q->flow_mode, &ret);
+	idx = cake_classify(sch, &b, skb, q->config->flow_mode, &ret);
 	if (idx == 0) {
 		if (ret & __NET_XMIT_BYPASS)
 			qdisc_qstats_drop(sch);
@@ -1788,7 +1793,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	if (unlikely(len > b->max_skblen))
 		b->max_skblen = len;
 
-	if (skb_is_gso(skb) && q->rate_flags & CAKE_FLAG_SPLIT_GSO) {
+	if (skb_is_gso(skb) && q->config->rate_flags & CAKE_FLAG_SPLIT_GSO) {
 		struct sk_buff *segs, *nskb;
 		netdev_features_t features = netif_skb_features(skb);
 		unsigned int slen = 0, numsegs = 0;
@@ -1827,7 +1832,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		get_cobalt_cb(skb)->adjusted_len = cake_overhead(q, skb);
 		flow_queue_add(flow, skb);
 
-		if (q->ack_filter)
+		if (q->config->ack_filter)
 			ack = cake_ack_filter(q, flow);
 
 		if (ack) {
@@ -1836,7 +1841,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 			b->bytes += qdisc_pkt_len(ack);
 			len -= qdisc_pkt_len(ack);
 			q->buffer_used += skb->truesize - ack->truesize;
-			if (q->rate_flags & CAKE_FLAG_INGRESS)
+			if (q->config->rate_flags & CAKE_FLAG_INGRESS)
 				cake_advance_shaper(q, b, ack, now, true);
 
 			qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(ack));
@@ -1859,7 +1864,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		cake_heapify_up(q, b->overflow_idx[idx]);
 
 	/* incoming bandwidth capacity estimate */
-	if (q->rate_flags & CAKE_FLAG_AUTORATE_INGRESS) {
+	if (q->config->rate_flags & CAKE_FLAG_AUTORATE_INGRESS) {
 		u64 packet_interval = \
 			ktime_to_ns(ktime_sub(now, q->last_packet_time));
 
@@ -1891,7 +1896,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 			if (ktime_after(now,
 					ktime_add_ms(q->last_reconfig_time,
 						     250))) {
-				q->rate_bps = (q->avg_peak_bandwidth * 15) >> 4;
+				q->config->rate_bps = (q->avg_peak_bandwidth * 15) >> 4;
 				cake_reconfigure(sch);
 			}
 		}
@@ -1911,7 +1916,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		flow->set = CAKE_SET_SPARSE;
 		b->sparse_flow_count++;
 
-		flow->deficit = cake_get_flow_quantum(b, flow, q->flow_mode);
+		flow->deficit = cake_get_flow_quantum(b, flow, q->config->flow_mode);
 	} else if (flow->set == CAKE_SET_SPARSE_WAIT) {
 		/* this flow was empty, accounted as a sparse flow, but actually
 		 * in the bulk rotation.
@@ -1920,8 +1925,8 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		b->sparse_flow_count--;
 		b->bulk_flow_count++;
 
-		cake_inc_srchost_bulk_flow_count(b, flow, q->flow_mode);
-		cake_inc_dsthost_bulk_flow_count(b, flow, q->flow_mode);
+		cake_inc_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
+		cake_inc_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
 	}
 
 	if (q->buffer_used > q->buffer_max_used)
@@ -2103,8 +2108,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 				b->sparse_flow_count--;
 				b->bulk_flow_count++;
 
-				cake_inc_srchost_bulk_flow_count(b, flow, q->flow_mode);
-				cake_inc_dsthost_bulk_flow_count(b, flow, q->flow_mode);
+				cake_inc_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
+				cake_inc_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
 
 				flow->set = CAKE_SET_BULK;
 			} else {
@@ -2116,7 +2121,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 			}
 		}
 
-		flow->deficit += cake_get_flow_quantum(b, flow, q->flow_mode);
+		flow->deficit += cake_get_flow_quantum(b, flow, q->config->flow_mode);
 		list_move_tail(&flow->flowchain, &b->old_flows);
 
 		goto retry;
@@ -2140,8 +2145,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 				if (flow->set == CAKE_SET_BULK) {
 					b->bulk_flow_count--;
 
-					cake_dec_srchost_bulk_flow_count(b, flow, q->flow_mode);
-					cake_dec_dsthost_bulk_flow_count(b, flow, q->flow_mode);
+					cake_dec_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
+					cake_dec_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
 
 					b->decaying_flow_count++;
 				} else if (flow->set == CAKE_SET_SPARSE ||
@@ -2159,8 +2164,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 				else if (flow->set == CAKE_SET_BULK) {
 					b->bulk_flow_count--;
 
-					cake_dec_srchost_bulk_flow_count(b, flow, q->flow_mode);
-					cake_dec_dsthost_bulk_flow_count(b, flow, q->flow_mode);
+					cake_dec_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
+					cake_dec_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
 				} else
 					b->decaying_flow_count--;
 
@@ -2171,14 +2176,14 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 
 		reason = cobalt_should_drop(&flow->cvars, &b->cparams, now, skb,
 					    (b->bulk_flow_count *
-					     !!(q->rate_flags &
+					     !!(q->config->rate_flags &
 						CAKE_FLAG_INGRESS)));
 		/* Last packet in queue may be marked, shouldn't be dropped */
 		if (reason == SKB_NOT_DROPPED_YET || !flow->head)
 			break;
 
 		/* drop this packet, get another one */
-		if (q->rate_flags & CAKE_FLAG_INGRESS) {
+		if (q->config->rate_flags & CAKE_FLAG_INGRESS) {
 			len = cake_advance_shaper(q, b, skb,
 						  now, true);
 			flow->deficit -= len;
@@ -2189,7 +2194,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 		qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb));
 		qdisc_qstats_drop(sch);
 		kfree_skb_reason(skb, reason);
-		if (q->rate_flags & CAKE_FLAG_INGRESS)
+		if (q->config->rate_flags & CAKE_FLAG_INGRESS)
 			goto retry;
 	}
 
@@ -2311,7 +2316,7 @@ static int cake_config_besteffort(struct Qdisc *sch)
 	struct cake_sched_data *q = qdisc_priv(sch);
 	struct cake_tin_data *b = &q->tins[0];
 	u32 mtu = psched_mtu(qdisc_dev(sch));
-	u64 rate = q->rate_bps;
+	u64 rate = q->config->rate_bps;
 
 	q->tin_cnt = 1;
 
@@ -2319,7 +2324,7 @@ static int cake_config_besteffort(struct Qdisc *sch)
 	q->tin_order = normal_order;
 
 	cake_set_rate(b, rate, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 	b->tin_quantum = 65535;
 
 	return 0;
@@ -2330,7 +2335,7 @@ static int cake_config_precedence(struct Qdisc *sch)
 	/* convert high-level (user visible) parameters into internal format */
 	struct cake_sched_data *q = qdisc_priv(sch);
 	u32 mtu = psched_mtu(qdisc_dev(sch));
-	u64 rate = q->rate_bps;
+	u64 rate = q->config->rate_bps;
 	u32 quantum = 256;
 	u32 i;
 
@@ -2341,8 +2346,8 @@ static int cake_config_precedence(struct Qdisc *sch)
 	for (i = 0; i < q->tin_cnt; i++) {
 		struct cake_tin_data *b = &q->tins[i];
 
-		cake_set_rate(b, rate, mtu, us_to_ns(q->target),
-			      us_to_ns(q->interval));
+		cake_set_rate(b, rate, mtu, us_to_ns(q->config->target),
+			      us_to_ns(q->config->interval));
 
 		b->tin_quantum = max_t(u16, 1U, quantum);
 
@@ -2419,7 +2424,7 @@ static int cake_config_diffserv8(struct Qdisc *sch)
 
 	struct cake_sched_data *q = qdisc_priv(sch);
 	u32 mtu = psched_mtu(qdisc_dev(sch));
-	u64 rate = q->rate_bps;
+	u64 rate = q->config->rate_bps;
 	u32 quantum = 256;
 	u32 i;
 
@@ -2433,8 +2438,8 @@ static int cake_config_diffserv8(struct Qdisc *sch)
 	for (i = 0; i < q->tin_cnt; i++) {
 		struct cake_tin_data *b = &q->tins[i];
 
-		cake_set_rate(b, rate, mtu, us_to_ns(q->target),
-			      us_to_ns(q->interval));
+		cake_set_rate(b, rate, mtu, us_to_ns(q->config->target),
+			      us_to_ns(q->config->interval));
 
 		b->tin_quantum = max_t(u16, 1U, quantum);
 
@@ -2463,7 +2468,7 @@ static int cake_config_diffserv4(struct Qdisc *sch)
 
 	struct cake_sched_data *q = qdisc_priv(sch);
 	u32 mtu = psched_mtu(qdisc_dev(sch));
-	u64 rate = q->rate_bps;
+	u64 rate = q->config->rate_bps;
 	u32 quantum = 1024;
 
 	q->tin_cnt = 4;
@@ -2474,13 +2479,13 @@ static int cake_config_diffserv4(struct Qdisc *sch)
 
 	/* class characteristics */
 	cake_set_rate(&q->tins[0], rate, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 	cake_set_rate(&q->tins[1], rate >> 4, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 	cake_set_rate(&q->tins[2], rate >> 1, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 	cake_set_rate(&q->tins[3], rate >> 2, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 
 	/* bandwidth-sharing weights */
 	q->tins[0].tin_quantum = quantum;
@@ -2500,7 +2505,7 @@ static int cake_config_diffserv3(struct Qdisc *sch)
  */
 	struct cake_sched_data *q = qdisc_priv(sch);
 	u32 mtu = psched_mtu(qdisc_dev(sch));
-	u64 rate = q->rate_bps;
+	u64 rate = q->config->rate_bps;
 	u32 quantum = 1024;
 
 	q->tin_cnt = 3;
@@ -2511,11 +2516,11 @@ static int cake_config_diffserv3(struct Qdisc *sch)
 
 	/* class characteristics */
 	cake_set_rate(&q->tins[0], rate, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 	cake_set_rate(&q->tins[1], rate >> 4, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 	cake_set_rate(&q->tins[2], rate >> 2, mtu,
-		      us_to_ns(q->target), us_to_ns(q->interval));
+		      us_to_ns(q->config->target), us_to_ns(q->config->interval));
 
 	/* bandwidth-sharing weights */
 	q->tins[0].tin_quantum = quantum;
@@ -2527,7 +2532,8 @@ static int cake_config_diffserv3(struct Qdisc *sch)
 
 static void cake_reconfigure(struct Qdisc *sch)
 {
-	struct cake_sched_data *q = qdisc_priv(sch);
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q = qd->config;
 	int c, ft;
 
 	switch (q->tin_mode) {
@@ -2553,36 +2559,37 @@ static void cake_reconfigure(struct Qdisc *sch)
 		break;
 	}
 
-	for (c = q->tin_cnt; c < CAKE_MAX_TINS; c++) {
+	for (c = qd->tin_cnt; c < CAKE_MAX_TINS; c++) {
 		cake_clear_tin(sch, c);
-		q->tins[c].cparams.mtu_time = q->tins[ft].cparams.mtu_time;
+		qd->tins[c].cparams.mtu_time = qd->tins[ft].cparams.mtu_time;
 	}
 
-	q->rate_ns   = q->tins[ft].tin_rate_ns;
-	q->rate_shft = q->tins[ft].tin_rate_shft;
+	qd->rate_ns   = qd->tins[ft].tin_rate_ns;
+	qd->rate_shft = qd->tins[ft].tin_rate_shft;
 
 	if (q->buffer_config_limit) {
-		q->buffer_limit = q->buffer_config_limit;
+		qd->buffer_limit = q->buffer_config_limit;
 	} else if (q->rate_bps) {
 		u64 t = q->rate_bps * q->interval;
 
 		do_div(t, USEC_PER_SEC / 4);
-		q->buffer_limit = max_t(u32, t, 4U << 20);
+		qd->buffer_limit = max_t(u32, t, 4U << 20);
 	} else {
-		q->buffer_limit = ~0;
+		qd->buffer_limit = ~0;
 	}
 
 	sch->flags &= ~TCQ_F_CAN_BYPASS;
 
-	q->buffer_limit = min(q->buffer_limit,
-			      max(sch->limit * psched_mtu(qdisc_dev(sch)),
-				  q->buffer_config_limit));
+	qd->buffer_limit = min(qd->buffer_limit,
+			       max(sch->limit * psched_mtu(qdisc_dev(sch)),
+				   q->buffer_config_limit));
 }
 
 static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 		       struct netlink_ext_ack *extack)
 {
-	struct cake_sched_data *q = qdisc_priv(sch);
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q = qd->config;
 	struct nlattr *tb[TCA_CAKE_MAX + 1];
 	u16 rate_flags;
 	u8 flow_mode;
@@ -2636,19 +2643,19 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 			   nla_get_s32(tb[TCA_CAKE_OVERHEAD]));
 		rate_flags |= CAKE_FLAG_OVERHEAD;
 
-		q->max_netlen = 0;
-		q->max_adjlen = 0;
-		q->min_netlen = ~0;
-		q->min_adjlen = ~0;
+		qd->max_netlen = 0;
+		qd->max_adjlen = 0;
+		qd->min_netlen = ~0;
+		qd->min_adjlen = ~0;
 	}
 
 	if (tb[TCA_CAKE_RAW]) {
 		rate_flags &= ~CAKE_FLAG_OVERHEAD;
 
-		q->max_netlen = 0;
-		q->max_adjlen = 0;
-		q->min_netlen = ~0;
-		q->min_adjlen = ~0;
+		qd->max_netlen = 0;
+		qd->max_adjlen = 0;
+		qd->min_netlen = ~0;
+		qd->min_adjlen = ~0;
 	}
 
 	if (tb[TCA_CAKE_MPU])
@@ -2704,7 +2711,7 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 
 	WRITE_ONCE(q->rate_flags, rate_flags);
 	WRITE_ONCE(q->flow_mode, flow_mode);
-	if (q->tins) {
+	if (qd->tins) {
 		sch_tree_lock(sch);
 		cake_reconfigure(sch);
 		sch_tree_unlock(sch);
@@ -2720,14 +2727,20 @@ static void cake_destroy(struct Qdisc *sch)
 	qdisc_watchdog_cancel(&q->watchdog);
 	tcf_block_put(q->block);
 	kvfree(q->tins);
+	kvfree(q->config);
 }
 
 static int cake_init(struct Qdisc *sch, struct nlattr *opt,
 		     struct netlink_ext_ack *extack)
 {
-	struct cake_sched_data *q = qdisc_priv(sch);
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q;
 	int i, j, err;
 
+	q = kvcalloc(1, sizeof(struct cake_sched_config), GFP_KERNEL);
+	if (!q)
+		return -ENOMEM;
+
 	sch->limit = 10240;
 	q->tin_mode = CAKE_DIFFSERV_DIFFSERV3;
 	q->flow_mode  = CAKE_FLOW_TRIPLE;
@@ -2739,33 +2752,36 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt,
 			       * for 5 to 10% of interval
 			       */
 	q->rate_flags |= CAKE_FLAG_SPLIT_GSO;
-	q->cur_tin = 0;
-	q->cur_flow  = 0;
+	qd->cur_tin = 0;
+	qd->cur_flow  = 0;
+	qd->config = q;
 
-	qdisc_watchdog_init(&q->watchdog, sch);
+	qdisc_watchdog_init(&qd->watchdog, sch);
 
 	if (opt) {
 		err = cake_change(sch, opt, extack);
 
 		if (err)
-			return err;
+			goto err;
 	}
 
-	err = tcf_block_get(&q->block, &q->filter_list, sch, extack);
+	err = tcf_block_get(&qd->block, &qd->filter_list, sch, extack);
 	if (err)
-		return err;
+		goto err;
 
 	quantum_div[0] = ~0;
 	for (i = 1; i <= CAKE_QUEUES; i++)
 		quantum_div[i] = 65535 / i;
 
-	q->tins = kvcalloc(CAKE_MAX_TINS, sizeof(struct cake_tin_data),
-			   GFP_KERNEL);
-	if (!q->tins)
-		return -ENOMEM;
+	qd->tins = kvcalloc(CAKE_MAX_TINS, sizeof(struct cake_tin_data),
+			    GFP_KERNEL);
+	if (!qd->tins) {
+		err = -ENOMEM;
+		goto err;
+	}
 
 	for (i = 0; i < CAKE_MAX_TINS; i++) {
-		struct cake_tin_data *b = q->tins + i;
+		struct cake_tin_data *b = qd->tins + i;
 
 		INIT_LIST_HEAD(&b->new_flows);
 		INIT_LIST_HEAD(&b->old_flows);
@@ -2781,22 +2797,27 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt,
 			INIT_LIST_HEAD(&flow->flowchain);
 			cobalt_vars_init(&flow->cvars);
 
-			q->overflow_heap[k].t = i;
-			q->overflow_heap[k].b = j;
+			qd->overflow_heap[k].t = i;
+			qd->overflow_heap[k].b = j;
 			b->overflow_idx[j] = k;
 		}
 	}
 
 	cake_reconfigure(sch);
-	q->avg_peak_bandwidth = q->rate_bps;
-	q->min_netlen = ~0;
-	q->min_adjlen = ~0;
+	qd->avg_peak_bandwidth = q->rate_bps;
+	qd->min_netlen = ~0;
+	qd->min_adjlen = ~0;
 	return 0;
+err:
+	kvfree(qd->config);
+	qd->config = NULL;
+	return err;
 }
 
 static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
-	struct cake_sched_data *q = qdisc_priv(sch);
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q = qd->config;
 	struct nlattr *opts;
 	u16 rate_flags;
 	u8 flow_mode;

-- 
2.51.0



* [Cake] [PATCH RFC net-next 2/4] net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices
  2025-09-24 12:16 [Cake] [PATCH RFC net-next 0/4] Multi-queue aware sch_cake Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 1/4] net/sched: sch_cake: Factor out config variables into separate struct Toke Høiland-Jørgensen
@ 2025-09-24 12:16 ` Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 3/4] net/sched: sch_cake: Share config across cake_mq sub-qdiscs Toke Høiland-Jørgensen
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-24 12:16 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: Jonas Köppeler, cake, netdev, Toke Høiland-Jørgensen

Add a cake_mq qdisc which installs cake instances on each hardware
queue on a multi-queue device.

This is just a copy of sch_mq that installs cake instead of the default
qdisc on each queue. Subsequent commits will add sharing of the cake
config between cake instances, as well as a multi-queue aware
shaper algorithm.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/sched/sch_cake.c | 214 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 213 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index a02f3cfcb09b50bda6ee66dfc8a8df584ae6a365..d17d7669de389bb21ca6ce3b209e0272cfaa5112 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -3154,14 +3154,226 @@ static struct Qdisc_ops cake_qdisc_ops __read_mostly = {
 };
 MODULE_ALIAS_NET_SCH("cake");
 
+struct cake_mq_sched {
+	struct Qdisc		**qdiscs;
+};
+
+static void cake_mq_destroy(struct Qdisc *sch)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	struct cake_mq_sched *priv = qdisc_priv(sch);
+	unsigned int ntx;
+
+	if (!priv->qdiscs)
+		return;
+	for (ntx = 0; ntx < dev->num_tx_queues && priv->qdiscs[ntx]; ntx++)
+		qdisc_put(priv->qdiscs[ntx]);
+	kfree(priv->qdiscs);
+}
+
+static int cake_mq_init(struct Qdisc *sch, struct nlattr *opt,
+			struct netlink_ext_ack *extack)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	struct cake_mq_sched *priv = qdisc_priv(sch);
+	struct netdev_queue *dev_queue;
+	struct Qdisc *qdisc;
+	unsigned int ntx;
+
+	if (sch->parent != TC_H_ROOT)
+		return -EOPNOTSUPP;
+
+	if (!netif_is_multiqueue(dev))
+		return -EOPNOTSUPP;
+
+	/* pre-allocate qdiscs, attachment can't fail */
+	priv->qdiscs = kcalloc(dev->num_tx_queues, sizeof(priv->qdiscs[0]),
+			       GFP_KERNEL);
+	if (!priv->qdiscs)
+		return -ENOMEM;
+
+	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+		dev_queue = netdev_get_tx_queue(dev, ntx);
+		qdisc = qdisc_create_dflt(dev_queue, &cake_qdisc_ops,
+					  TC_H_MAKE(TC_H_MAJ(sch->handle),
+						    TC_H_MIN(ntx + 1)),
+					  extack);
+		if (!qdisc) {
+			kfree(priv->qdiscs);
+			return -ENOMEM;
+		}
+		priv->qdiscs[ntx] = qdisc;
+		qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
+	}
+
+	sch->flags |= TCQ_F_MQROOT;
+
+	return 0;
+}
+
+static void cake_mq_attach(struct Qdisc *sch)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	struct cake_mq_sched *priv = qdisc_priv(sch);
+	struct Qdisc *qdisc, *old;
+	unsigned int ntx;
+
+	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+		qdisc = priv->qdiscs[ntx];
+		old = dev_graft_qdisc(qdisc->dev_queue, qdisc);
+		if (old)
+			qdisc_put(old);
+#ifdef CONFIG_NET_SCHED
+		if (ntx < dev->real_num_tx_queues)
+			qdisc_hash_add(qdisc, false);
+#endif
+	}
+	kfree(priv->qdiscs);
+	priv->qdiscs = NULL;
+}
+
+static int cake_mq_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	struct Qdisc *qdisc;
+	unsigned int ntx;
+
+	sch->q.qlen = 0;
+	gnet_stats_basic_sync_init(&sch->bstats);
+	memset(&sch->qstats, 0, sizeof(sch->qstats));
+
+	/* MQ supports lockless qdiscs. However, statistics accounting needs
+	 * to account for all, none, or a mix of locked and unlocked child
+	 * qdiscs. Percpu stats are added to counters in-band and locking
+	 * qdisc totals are added at end.
+	 */
+	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+		qdisc = rtnl_dereference(netdev_get_tx_queue(dev, ntx)->qdisc_sleeping);
+		spin_lock_bh(qdisc_lock(qdisc));
+
+		gnet_stats_add_basic(&sch->bstats, qdisc->cpu_bstats,
+				     &qdisc->bstats, false);
+		gnet_stats_add_queue(&sch->qstats, qdisc->cpu_qstats,
+				     &qdisc->qstats);
+		sch->q.qlen += qdisc_qlen(qdisc);
+
+		spin_unlock_bh(qdisc_lock(qdisc));
+	}
+
+	return 0;
+}
+
+static struct netdev_queue *cake_mq_queue_get(struct Qdisc *sch, unsigned long cl)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	unsigned long ntx = cl - 1;
+
+	if (ntx >= dev->num_tx_queues)
+		return NULL;
+	return netdev_get_tx_queue(dev, ntx);
+}
+
+static struct netdev_queue *cake_mq_select_queue(struct Qdisc *sch,
+						 struct tcmsg *tcm)
+{
+	return cake_mq_queue_get(sch, TC_H_MIN(tcm->tcm_parent));
+}
+
+static int cake_mq_graft(struct Qdisc *sch, unsigned long cl, struct Qdisc *new,
+			 struct Qdisc **old, struct netlink_ext_ack *extack)
+{
+	NL_SET_ERR_MSG(extack, "can't replace cake_mq sub-qdiscs");
+	return -EOPNOTSUPP;
+}
+
+static struct Qdisc *cake_mq_leaf(struct Qdisc *sch, unsigned long cl)
+{
+	struct netdev_queue *dev_queue = cake_mq_queue_get(sch, cl);
+
+	return rtnl_dereference(dev_queue->qdisc_sleeping);
+}
+
+static unsigned long cake_mq_find(struct Qdisc *sch, u32 classid)
+{
+	unsigned int ntx = TC_H_MIN(classid);
+
+	if (!cake_mq_queue_get(sch, ntx))
+		return 0;
+	return ntx;
+}
+
+static int cake_mq_dump_class(struct Qdisc *sch, unsigned long cl,
+			      struct sk_buff *skb, struct tcmsg *tcm)
+{
+	struct netdev_queue *dev_queue = cake_mq_queue_get(sch, cl);
+
+	tcm->tcm_parent = TC_H_ROOT;
+	tcm->tcm_handle |= TC_H_MIN(cl);
+	tcm->tcm_info = rtnl_dereference(dev_queue->qdisc_sleeping)->handle;
+	return 0;
+}
+
+static int cake_mq_dump_class_stats(struct Qdisc *sch, unsigned long cl,
+				    struct gnet_dump *d)
+{
+	struct netdev_queue *dev_queue = cake_mq_queue_get(sch, cl);
+
+	sch = rtnl_dereference(dev_queue->qdisc_sleeping);
+	if (gnet_stats_copy_basic(d, sch->cpu_bstats, &sch->bstats, true) < 0 ||
+	    qdisc_qstats_copy(d, sch) < 0)
+		return -1;
+	return 0;
+}
+
+static void cake_mq_walk(struct Qdisc *sch, struct qdisc_walker *arg)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	unsigned int ntx;
+
+	if (arg->stop)
+		return;
+
+	arg->count = arg->skip;
+	for (ntx = arg->skip; ntx < dev->num_tx_queues; ntx++) {
+		if (!tc_qdisc_stats_dump(sch, ntx + 1, arg))
+			break;
+	}
+}
+
+static const struct Qdisc_class_ops cake_mq_class_ops = {
+	.select_queue	= cake_mq_select_queue,
+	.graft		= cake_mq_graft,
+	.leaf		= cake_mq_leaf,
+	.find		= cake_mq_find,
+	.walk		= cake_mq_walk,
+	.dump		= cake_mq_dump_class,
+	.dump_stats	= cake_mq_dump_class_stats,
+};
+
+static struct Qdisc_ops cake_mq_qdisc_ops __read_mostly = {
+	.cl_ops		=	&cake_mq_class_ops,
+	.id		=	"cake_mq",
+	.priv_size	=	sizeof(struct cake_mq_sched),
+	.init		=	cake_mq_init,
+	.destroy	=	cake_mq_destroy,
+	.attach		= cake_mq_attach,
+	//	.change		=	cake_mq_change,
+	.change_real_num_tx = mq_change_real_num_tx,
+	.dump		=	cake_mq_dump,
+	.owner		=	THIS_MODULE,
+};
+MODULE_ALIAS_NET_SCH("cake_mq");
+
 static int __init cake_module_init(void)
 {
-	return register_qdisc(&cake_qdisc_ops);
+	return register_qdisc(&cake_qdisc_ops) ?:
+		register_qdisc(&cake_mq_qdisc_ops);
 }
 
 static void __exit cake_module_exit(void)
 {
 	unregister_qdisc(&cake_qdisc_ops);
+	unregister_qdisc(&cake_mq_qdisc_ops);
 }
 
 module_init(cake_module_init)

-- 
2.51.0



* [Cake] [PATCH RFC net-next 3/4] net/sched: sch_cake: Share config across cake_mq sub-qdiscs
  2025-09-24 12:16 [Cake] [PATCH RFC net-next 0/4] Multi-queue aware sch_cake Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 1/4] net/sched: sch_cake: Factor out config variables into separate struct Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 2/4] net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices Toke Høiland-Jørgensen
@ 2025-09-24 12:16 ` Toke Høiland-Jørgensen
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 4/4] net/sched: sch_cake: share shaper state across sub-instances of cake_mq Toke Høiland-Jørgensen
  2025-09-24 12:18 ` [Cake] [PATCH RFC net-next] tc: cake: add cake_mq support Toke Høiland-Jørgensen
  4 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-24 12:16 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: Jonas Köppeler, cake, netdev, Toke Høiland-Jørgensen

This adds support for configuring the cake_mq instance directly, sharing
the config across the cake sub-qdiscs.
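
For example (hypothetical device name and handles), a change on the
parent cake_mq instance applies to all per-queue cake instances, while
changing a sub-qdisc directly is rejected with "can't reconfigure
cake_mq sub-qdiscs":

    tc qdisc change dev eth0 root handle 1: cake_mq bandwidth 500mbit
    tc qdisc change dev eth0 parent 1:1 cake bandwidth 100mbit  # rejected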

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/sched/sch_cake.c | 147 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 118 insertions(+), 29 deletions(-)

diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index d17d7669de389bb21ca6ce3b209e0272cfaa5112..7ceccbfaa9b6b4cdeaf17b416c2b65709c22a60a 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -211,6 +211,7 @@ struct cake_sched_config {
 	u8		flow_mode;
 	u8		atm_mode;
 	u8		ack_filter;
+	u8		is_shared;
 };
 
 struct cake_sched_data {
@@ -2585,11 +2586,9 @@ static void cake_reconfigure(struct Qdisc *sch)
 				   q->buffer_config_limit));
 }
 
-static int cake_change(struct Qdisc *sch, struct nlattr *opt,
-		       struct netlink_ext_ack *extack)
+static int cake_config_change(struct cake_sched_config *q, struct nlattr *opt,
+			      struct netlink_ext_ack *extack, bool *overhead_changed)
 {
-	struct cake_sched_data *qd = qdisc_priv(sch);
-	struct cake_sched_config *q = qd->config;
 	struct nlattr *tb[TCA_CAKE_MAX + 1];
 	u16 rate_flags;
 	u8 flow_mode;
@@ -2642,20 +2641,12 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 		WRITE_ONCE(q->rate_overhead,
 			   nla_get_s32(tb[TCA_CAKE_OVERHEAD]));
 		rate_flags |= CAKE_FLAG_OVERHEAD;
-
-		qd->max_netlen = 0;
-		qd->max_adjlen = 0;
-		qd->min_netlen = ~0;
-		qd->min_adjlen = ~0;
+		*overhead_changed = true;
 	}
 
 	if (tb[TCA_CAKE_RAW]) {
 		rate_flags &= ~CAKE_FLAG_OVERHEAD;
-
-		qd->max_netlen = 0;
-		qd->max_adjlen = 0;
-		qd->min_netlen = ~0;
-		qd->min_adjlen = ~0;
+		*overhead_changed = true;
 	}
 
 	if (tb[TCA_CAKE_MPU])
@@ -2711,6 +2702,34 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 
 	WRITE_ONCE(q->rate_flags, rate_flags);
 	WRITE_ONCE(q->flow_mode, flow_mode);
+
+	return 0;
+}
+
+static int cake_change(struct Qdisc *sch, struct nlattr *opt,
+		       struct netlink_ext_ack *extack)
+{
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q = qd->config;
+	bool overhead_changed = false;
+	int ret;
+
+	if (q->is_shared) {
+		NL_SET_ERR_MSG(extack, "can't reconfigure cake_mq sub-qdiscs");
+		return -EOPNOTSUPP;
+	}
+
+	ret = cake_config_change(q, opt, extack, &overhead_changed);
+	if (ret)
+		return ret;
+
+	if (overhead_changed) {
+		qd->max_netlen = 0;
+		qd->max_adjlen = 0;
+		qd->min_netlen = ~0;
+		qd->min_adjlen = ~0;
+	}
+
 	if (qd->tins) {
 		sch_tree_lock(sch);
 		cake_reconfigure(sch);
@@ -2727,7 +2746,23 @@ static void cake_destroy(struct Qdisc *sch)
 	qdisc_watchdog_cancel(&q->watchdog);
 	tcf_block_put(q->block);
 	kvfree(q->tins);
-	kvfree(q->config);
+	if (!q->config->is_shared)
+		kvfree(q->config);
+}
+
+static void cake_config_init(struct cake_sched_config *q, bool is_shared)
+{
+	q->tin_mode = CAKE_DIFFSERV_DIFFSERV3;
+	q->flow_mode  = CAKE_FLOW_TRIPLE;
+
+	q->rate_bps = 0; /* unlimited by default */
+
+	q->interval = 100000; /* 100ms default */
+	q->target   =   5000; /* 5ms: codel RFC argues
+			       * for 5 to 10% of interval
+			       */
+	q->rate_flags |= CAKE_FLAG_SPLIT_GSO;
+	q->is_shared = is_shared;
 }
 
 static int cake_init(struct Qdisc *sch, struct nlattr *opt,
@@ -2741,17 +2776,9 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt,
 	if (!q)
 		return -ENOMEM;
 
+	cake_config_init(q, false);
+
 	sch->limit = 10240;
-	q->tin_mode = CAKE_DIFFSERV_DIFFSERV3;
-	q->flow_mode  = CAKE_FLOW_TRIPLE;
-
-	q->rate_bps = 0; /* unlimited by default */
-
-	q->interval = 100000; /* 100ms default */
-	q->target   =   5000; /* 5ms: codel RFC argues
-			       * for 5 to 10% of interval
-			       */
-	q->rate_flags |= CAKE_FLAG_SPLIT_GSO;
 	qd->cur_tin = 0;
 	qd->cur_flow  = 0;
 	qd->config = q;
@@ -2814,10 +2841,21 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt,
 	return err;
 }
 
-static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
+static void cake_config_replace(struct Qdisc *sch, struct cake_sched_config *cfg)
 {
 	struct cake_sched_data *qd = qdisc_priv(sch);
 	struct cake_sched_config *q = qd->config;
+
+	qd->config = cfg;
+
+	if (!q->is_shared)
+		kvfree(q);
+
+	cake_reconfigure(sch);
+}
+
+static int cake_config_dump(struct cake_sched_config *q, struct sk_buff *skb)
+{
 	struct nlattr *opts;
 	u16 rate_flags;
 	u8 flow_mode;
@@ -2893,6 +2931,14 @@ static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
 	return -1;
 }
 
+static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct cake_sched_data *qd = qdisc_priv(sch);
+	struct cake_sched_config *q = qd->config;
+
+	return cake_config_dump(q, skb);
+}
+
 static int cake_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
 {
 	struct nlattr *stats = nla_nest_start_noflag(d->skb, TCA_STATS_APP);
@@ -3155,7 +3201,8 @@ static struct Qdisc_ops cake_qdisc_ops __read_mostly = {
 MODULE_ALIAS_NET_SCH("cake");
 
 struct cake_mq_sched {
-	struct Qdisc		**qdiscs;
+	struct Qdisc **qdiscs;
+	struct cake_sched_config cake_config;
 };
 
 static void cake_mq_destroy(struct Qdisc *sch)
@@ -3179,6 +3226,8 @@ static int cake_mq_init(struct Qdisc *sch, struct nlattr *opt,
 	struct netdev_queue *dev_queue;
 	struct Qdisc *qdisc;
 	unsigned int ntx;
+	bool _unused;
+	int ret;
 
 	if (sch->parent != TC_H_ROOT)
 		return -EOPNOTSUPP;
@@ -3186,6 +3235,11 @@ static int cake_mq_init(struct Qdisc *sch, struct nlattr *opt,
 	if (!netif_is_multiqueue(dev))
 		return -EOPNOTSUPP;
 
+	cake_config_init(&priv->cake_config, true);
+	ret = cake_config_change(&priv->cake_config, opt, extack, &_unused);
+	if (ret)
+		return ret;
+
 	/* pre-allocate qdiscs, attachment can't fail */
 	priv->qdiscs = kcalloc(dev->num_tx_queues, sizeof(priv->qdiscs[0]),
 			       GFP_KERNEL);
@@ -3204,6 +3258,7 @@ static int cake_mq_init(struct Qdisc *sch, struct nlattr *opt,
 		}
 		priv->qdiscs[ntx] = qdisc;
 		qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
+		cake_config_replace(qdisc, &priv->cake_config);
 	}
 
 	sch->flags |= TCQ_F_MQROOT;
@@ -3232,8 +3287,42 @@ static void cake_mq_attach(struct Qdisc *sch)
 	priv->qdiscs = NULL;
 }
 
+static int cake_mq_change(struct Qdisc *sch, struct nlattr *opt,
+			  struct netlink_ext_ack *extack)
+{
+	struct net_device *dev = qdisc_dev(sch);
+	struct cake_mq_sched *priv = qdisc_priv(sch);
+	bool overhead_changed = false;
+	unsigned int ntx;
+	int ret;
+
+	ret = cake_config_change(&priv->cake_config, opt, extack, &overhead_changed);
+	if (ret)
+		return ret;
+
+	sch_tree_lock(sch);
+	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+		struct Qdisc *chld = rtnl_dereference(netdev_get_tx_queue(dev, ntx)->qdisc_sleeping);
+		struct cake_sched_data *qd = qdisc_priv(chld);
+
+		if (overhead_changed) {
+			qd->max_netlen = 0;
+			qd->max_adjlen = 0;
+			qd->min_netlen = ~0;
+			qd->min_adjlen = ~0;
+		}
+
+		if (qd->tins)
+			cake_reconfigure(chld);
+	}
+	sch_tree_unlock(sch);
+
+	return 0;
+}
+
 static int cake_mq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
+	struct cake_mq_sched *qd = qdisc_priv(sch);
 	struct net_device *dev = qdisc_dev(sch);
 	struct Qdisc *qdisc;
 	unsigned int ntx;
@@ -3260,7 +3349,7 @@ static int cake_mq_dump(struct Qdisc *sch, struct sk_buff *skb)
 		spin_unlock_bh(qdisc_lock(qdisc));
 	}
 
-	return 0;
+	return cake_config_dump(&qd->cake_config, skb);
 }
 
 static struct netdev_queue *cake_mq_queue_get(struct Qdisc *sch, unsigned long cl)
@@ -3357,7 +3446,7 @@ static struct Qdisc_ops cake_mq_qdisc_ops __read_mostly = {
 	.init		=	cake_mq_init,
 	.destroy	=	cake_mq_destroy,
 	.attach		= cake_mq_attach,
-	//	.change		=	cake_mq_change,
+	.change		=	cake_mq_change,
 	.change_real_num_tx = mq_change_real_num_tx,
 	.dump		=	cake_mq_dump,
 	.owner		=	THIS_MODULE,

-- 
2.51.0



* [Cake] [PATCH RFC net-next 4/4] net/sched: sch_cake: share shaper state across sub-instances of cake_mq
  2025-09-24 12:16 [Cake] [PATCH RFC net-next 0/4] Multi-queue aware sch_cake Toke Høiland-Jørgensen
                   ` (2 preceding siblings ...)
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 3/4] net/sched: sch_cake: Share config across cake_mq sub-qdiscs Toke Høiland-Jørgensen
@ 2025-09-24 12:16 ` Toke Høiland-Jørgensen
  2025-09-25  9:04   ` [Cake] " Donald Hunter
  2025-09-24 12:18 ` [Cake] [PATCH RFC net-next] tc: cake: add cake_mq support Toke Høiland-Jørgensen
  4 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-24 12:16 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: Jonas Köppeler, cake, netdev, Toke Høiland-Jørgensen

From: Jonas Köppeler <j.koeppeler@tu-berlin.de>

This commit adds shared shaper state across the cake instances beneath a
cake_mq qdisc. It works by periodically tracking the number of active
instances, and scaling the configured rate by the number of active
queues.

The scan is lockless and simply reads the qlen and the last_active state
variable of each of the instances configured beneath the parent cake_mq
instance. Locking is not required since the values are only updated by
the owning instance, and eventual consistency is sufficient for the
purpose of estimating the number of active queues.

The interval for scanning the number of active queues is configurable
and defaults to 200 us. We found this to be a good tradeoff between
overhead and response time. For a detailed analysis of this aspect see
the Netdevconf talk:

https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf
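
As an illustration, with a configured rate of 1 Gbit/s and four active
queues, each instance shapes to 250 Mbit/s until the next scan. With
the accompanying iproute2 patch, the scan interval can be adjusted and
the estimate inspected via the 'active queues' counter in the qdisc
statistics (hypothetical device name and values):

    tc qdisc replace dev eth0 root cake_mq bandwidth 1gbit sync_time 500us
    tc -s qdisc show dev eth0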

Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/uapi/linux/pkt_sched.h |  2 ++
 net/sched/sch_cake.c           | 67 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index c2da76e78badebbdf7d5482cef1a3132aec99fe1..a4aa812bfbe86424c502de5bb2e5b1429b440088 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1014,6 +1014,7 @@ enum {
 	TCA_CAKE_ACK_FILTER,
 	TCA_CAKE_SPLIT_GSO,
 	TCA_CAKE_FWMARK,
+	TCA_CAKE_SYNC_TIME,
 	__TCA_CAKE_MAX
 };
 #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
@@ -1036,6 +1037,7 @@ enum {
 	TCA_CAKE_STATS_DROP_NEXT_US,
 	TCA_CAKE_STATS_P_DROP,
 	TCA_CAKE_STATS_BLUE_TIMER_US,
+	TCA_CAKE_STATS_ACTIVE_QUEUES,
 	__TCA_CAKE_STATS_MAX
 };
 #define TCA_CAKE_STATS_MAX (__TCA_CAKE_STATS_MAX - 1)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index 7ceccbfaa9b6b4cdeaf17b416c2b65709c22a60a..014a399bac30ed4a40ca4ede313db5abfbbb5243 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -201,6 +201,7 @@ struct cake_sched_config {
 	u64		rate_bps;
 	u64		interval;
 	u64		target;
+	u64		sync_time;
 	u32		buffer_config_limit;
 	u32		fwmark_mask;
 	u16		fwmark_shft;
@@ -257,6 +258,10 @@ struct cake_sched_data {
 	u16		max_adjlen;
 	u16		min_netlen;
 	u16		min_adjlen;
+
+	u64 last_checked_active;
+	u64 last_active;
+	u32 active_queues;
 };
 
 enum {
@@ -383,6 +388,8 @@ static const u32 inv_sqrt_cache[REC_INV_SQRT_CACHE] = {
 	1239850263, 1191209601, 1147878294, 1108955788
 };
 
+static void cake_set_rate(struct cake_tin_data *b, u64 rate, u32 mtu,
+			  u64 target_ns, u64 rtt_est_ns);
 /* http://en.wikipedia.org/wiki/Methods_of_computing_square_roots
  * new_invsqrt = (invsqrt / 2) * (3 - count * invsqrt^2)
  *
@@ -2002,6 +2009,40 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 	u64 delay;
 	u32 len;
 
+	if (q->config->is_shared &&
+	    now - q->last_checked_active >= q->config->sync_time) { /* scan interval; defaults to 200 us */
+		struct net_device *dev = qdisc_dev(sch);
+		struct cake_sched_data *other_priv;
+		u64 new_rate = q->config->rate_bps;
+		u64 other_qlen, other_last_active;
+		struct Qdisc *other_sch;
+		u32 num_active_qs = 1;
+		unsigned int ntx;
+
+		for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+			other_sch = rtnl_dereference(netdev_get_tx_queue(dev, ntx)->qdisc_sleeping);
+			other_priv = qdisc_priv(other_sch);
+
+			if (other_priv == q)
+				continue;
+
+			other_qlen = READ_ONCE(other_sch->q.qlen);
+			other_last_active = READ_ONCE(other_priv->last_active);
+
+			if (other_qlen || other_last_active > q->last_checked_active)
+				num_active_qs++;
+		}
+
+		if (num_active_qs > 1)
+			new_rate = div64_u64(q->config->rate_bps, num_active_qs);
+
+		/* mtu == 0 only updates the rate, leaving the cobalt params untouched */
+		cake_set_rate(b, new_rate, 0, 0, 0);
+		q->last_checked_active = now;
+		q->rate_ns = b->tin_rate_ns;
+		q->rate_shft = b->tin_rate_shft;
+	}
+
 begin:
 	if (!sch->q.qlen)
 		return NULL;
@@ -2201,6 +2242,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 
 	b->tin_ecn_mark += !!flow->cvars.ecn_marked;
 	qdisc_bstats_update(sch, skb);
+	q->last_active = now;
 
 	/* collect delay stats */
 	delay = ktime_to_ns(ktime_sub(now, cobalt_get_enqueue_time(skb)));
@@ -2271,6 +2313,7 @@ static const struct nla_policy cake_policy[TCA_CAKE_MAX + 1] = {
 	[TCA_CAKE_ACK_FILTER]	 = { .type = NLA_U32 },
 	[TCA_CAKE_SPLIT_GSO]	 = { .type = NLA_U32 },
 	[TCA_CAKE_FWMARK]	 = { .type = NLA_U32 },
+	[TCA_CAKE_SYNC_TIME]	 = { .type = NLA_U32 },
 };
 
 static void cake_set_rate(struct cake_tin_data *b, u64 rate, u32 mtu,
@@ -2301,6 +2344,9 @@ static void cake_set_rate(struct cake_tin_data *b, u64 rate, u32 mtu,
 	b->tin_rate_ns   = rate_ns;
 	b->tin_rate_shft = rate_shft;
 
+	if (mtu == 0)
+		return;
+
 	byte_target_ns = (byte_target * rate_ns) >> rate_shft;
 
 	b->cparams.target = max((byte_target_ns * 3) / 2, target_ns);
@@ -2703,6 +2749,17 @@ static int cake_config_change(struct cake_sched_config *q, struct nlattr *opt,
 	WRITE_ONCE(q->rate_flags, rate_flags);
 	WRITE_ONCE(q->flow_mode, flow_mode);
 
+	if (tb[TCA_CAKE_SYNC_TIME]) {
+		u32 sync_us = nla_get_u32(tb[TCA_CAKE_SYNC_TIME]);
+
+		if (!q->is_shared) {
+			NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_SYNC_TIME],
+					    "Sync time can only be set for cake_mq variant");
+			return -EOPNOTSUPP;
+		}
+		q->sync_time = (u64)sync_us * 1000; /* us to ns */
+	}
+
 	return 0;
 }
 
@@ -2763,6 +2820,7 @@ static void cake_config_init(struct cake_sched_config *q, bool is_shared)
 			       */
 	q->rate_flags |= CAKE_FLAG_SPLIT_GSO;
 	q->is_shared = is_shared;
+	q->sync_time = 200 * NSEC_PER_USEC;
 }
 
 static int cake_init(struct Qdisc *sch, struct nlattr *opt,
@@ -2834,6 +2892,9 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt,
 	qd->avg_peak_bandwidth = q->rate_bps;
 	qd->min_netlen = ~0;
 	qd->min_adjlen = ~0;
+	qd->active_queues = 0;
+	qd->last_checked_active = 0;
+
 	return 0;
 err:
 	kvfree(qd->config);
@@ -2925,6 +2986,11 @@ static int cake_config_dump(struct cake_sched_config *q, struct sk_buff *skb)
 	if (nla_put_u32(skb, TCA_CAKE_FWMARK, READ_ONCE(q->fwmark_mask)))
 		goto nla_put_failure;
 
+	if (q->is_shared) {
+		if (nla_put_u32(skb, TCA_CAKE_SYNC_TIME, READ_ONCE(q->sync_time) / 1000))
+			goto nla_put_failure;
+	}
+
 	return nla_nest_end(skb, opts);
 
 nla_put_failure:
@@ -2967,6 +3033,7 @@ static int cake_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
 	PUT_STAT_U32(MAX_ADJLEN, q->max_adjlen);
 	PUT_STAT_U32(MIN_NETLEN, q->min_netlen);
 	PUT_STAT_U32(MIN_ADJLEN, q->min_adjlen);
+	PUT_STAT_U32(ACTIVE_QUEUES, q->active_queues);
 
 #undef PUT_STAT_U32
 #undef PUT_STAT_U64

-- 
2.51.0



* [Cake] [PATCH RFC net-next] tc: cake: add cake_mq support
  2025-09-24 12:16 [Cake] [PATCH RFC net-next 0/4] Multi-queue aware sch_cake Toke Høiland-Jørgensen
                   ` (3 preceding siblings ...)
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 4/4] net/sched: sch_cake: share shaper state across sub-instances of cake_mq Toke Høiland-Jørgensen
@ 2025-09-24 12:18 ` Toke Høiland-Jørgensen
  4 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-24 12:18 UTC (permalink / raw)
  To: David Ahern, Stephen Hemminger; +Cc: cake, netdev, Jonas Köppeler

From: Jonas Köppeler <j.koeppeler@tu-berlin.de>

This adds support for the cake_mq variant of sch_cake to tc.

Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
---
 include/uapi/linux/pkt_sched.h |  2 ++
 tc/q_cake.c                    | 37 +++++++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 15d1a37ac6d8..f85a5316f372 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1014,6 +1014,7 @@ enum {
 	TCA_CAKE_ACK_FILTER,
 	TCA_CAKE_SPLIT_GSO,
 	TCA_CAKE_FWMARK,
+	TCA_CAKE_SYNC_TIME,
 	__TCA_CAKE_MAX
 };
 #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
@@ -1036,6 +1037,7 @@ enum {
 	TCA_CAKE_STATS_DROP_NEXT_US,
 	TCA_CAKE_STATS_P_DROP,
 	TCA_CAKE_STATS_BLUE_TIMER_US,
+	TCA_CAKE_STATS_ACTIVE_QUEUES,
 	__TCA_CAKE_STATS_MAX
 };
 #define TCA_CAKE_STATS_MAX (__TCA_CAKE_STATS_MAX - 1)
diff --git a/tc/q_cake.c b/tc/q_cake.c
index e2b8de55e5a2..60688b3478ec 100644
--- a/tc/q_cake.c
+++ b/tc/q_cake.c
@@ -82,6 +82,7 @@ static void explain(void)
 		"                [ split-gso* | no-split-gso ]\n"
 		"                [ ack-filter | ack-filter-aggressive | no-ack-filter* ]\n"
 		"                [ memlimit LIMIT ]\n"
+		"                [ sync_time TIME ]\n"
 		"                [ fwmark MASK ]\n"
 		"                [ ptm | atm | noatm* ] [ overhead N | conservative | raw* ]\n"
 		"                [ mpu N ] [ ingress | egress* ]\n"
@@ -93,6 +94,8 @@ static int cake_parse_opt(const struct qdisc_util *qu, int argc, char **argv,
 {
 	struct cake_preset *preset, *preset_set = NULL;
 	bool overhead_override = false;
+	unsigned int sync_time = 0;
+	bool set_sync_time = false;
 	bool overhead_set = false;
 	unsigned int interval = 0;
 	int diffserv = -1;
@@ -340,6 +343,13 @@ static int cake_parse_opt(const struct qdisc_util *qu, int argc, char **argv,
 					"Illegal value for \"fwmark\": \"%s\"\n", *argv);
 				return -1;
 			}
+		} else if (strcmp(*argv, "sync_time") == 0) {
+			NEXT_ARG();
+			if (get_time(&sync_time, *argv)) {
+				fprintf(stderr, "Illegal sync time\n");
+				return -1;
+			}
+			set_sync_time = true;
 		} else if (strcmp(*argv, "help") == 0) {
 			explain();
 			return -1;
@@ -399,6 +409,8 @@ static int cake_parse_opt(const struct qdisc_util *qu, int argc, char **argv,
 	if (ack_filter != -1)
 		addattr_l(n, 1024, TCA_CAKE_ACK_FILTER, &ack_filter,
 			  sizeof(ack_filter));
+	if (set_sync_time)
+		addattr_l(n, 1024, TCA_CAKE_SYNC_TIME, &sync_time, sizeof(sync_time));
 
 	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
@@ -421,6 +433,7 @@ static int cake_print_opt(const struct qdisc_util *qu, FILE *f, struct rtattr *o
 	unsigned int interval = 0;
 	unsigned int memlimit = 0;
 	unsigned int fwmark = 0;
+	unsigned int sync_time = 0;
 	__u64 bandwidth = 0;
 	int ack_filter = 0;
 	int split_gso = 0;
@@ -525,7 +538,10 @@ static int cake_print_opt(const struct qdisc_util *qu, FILE *f, struct rtattr *o
 	    RTA_PAYLOAD(tb[TCA_CAKE_FWMARK]) >= sizeof(__u32)) {
 		fwmark = rta_getattr_u32(tb[TCA_CAKE_FWMARK]);
 	}
-
+	if (tb[TCA_CAKE_SYNC_TIME] &&
+	    RTA_PAYLOAD(tb[TCA_CAKE_SYNC_TIME]) >= sizeof(__u32)) {
+		sync_time = rta_getattr_u32(tb[TCA_CAKE_SYNC_TIME]);
+	}
 	if (wash)
 		print_string(PRINT_FP, NULL, "wash ", NULL);
 	else
@@ -574,6 +590,13 @@ static int cake_print_opt(const struct qdisc_util *qu, FILE *f, struct rtattr *o
 	if (memlimit)
 		print_size(PRINT_ANY, "memlimit", "memlimit %s ", memlimit);
 
+	if (sync_time) {
+		print_uint(PRINT_JSON, "sync_time", NULL,
+			   sync_time);
+		print_string(PRINT_FP, NULL, "sync_time %s ",
+			     sprint_time(sync_time, b2));
+	}
+
 	if (fwmark)
 		print_uint(PRINT_FP, NULL, "fwmark 0x%x ", fwmark);
 	print_0xhex(PRINT_JSON, "fwmark", NULL, fwmark);
@@ -667,6 +690,11 @@ static int cake_print_xstats(const struct qdisc_util *qu, FILE *f,
 			   " /%8u\n", GET_STAT_U32(MAX_ADJLEN));
 	}
 
+	if (st[TCA_CAKE_STATS_ACTIVE_QUEUES])
+		print_uint(PRINT_ANY, "active_queues",
+			   " active queues: %25u\n",
+			   GET_STAT_U32(ACTIVE_QUEUES));
+
 	if (st[TCA_CAKE_STATS_AVG_NETOFF])
 		print_uint(PRINT_ANY, "avg_hdr_offset",
 			   " average network hdr offset: %12u\n\n",
@@ -827,3 +855,10 @@ struct qdisc_util cake_qdisc_util = {
 	.print_qopt	= cake_print_opt,
 	.print_xstats	= cake_print_xstats,
 };
+
+struct qdisc_util cake_mq_qdisc_util = {
+	.id		= "cake_mq",
+	.parse_qopt	= cake_parse_opt,
+	.print_qopt	= cake_print_opt,
+	.print_xstats	= cake_print_xstats,
+};
-- 
2.51.0



* [Cake] Re: [PATCH RFC net-next 4/4] net/sched: sch_cake: share shaper state across sub-instances of cake_mq
  2025-09-24 12:16 ` [Cake] [PATCH RFC net-next 4/4] net/sched: sch_cake: share shaper state across sub-instances of cake_mq Toke Høiland-Jørgensen
@ 2025-09-25  9:04   ` Donald Hunter
  2025-09-25 13:29     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Donald Hunter @ 2025-09-25  9:04 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Toke Høiland-Jørgensen, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jonas Köppeler, cake, netdev

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> From: Jonas Köppeler <j.koeppeler@tu-berlin.de>
>
> This commit adds shared shaper state across the cake instances beneath a
> cake_mq qdisc. It works by periodically tracking the number of active
> instances, and scaling the configured rate by the number of active
> queues.
>
> The scan is lockless and simply reads the qlen and the last_active state
> variable of each of the instances configured beneath the parent cake_mq
> instance. Locking is not required since the values are only updated by
> the owning instance, and eventual consistency is sufficient for the
> purpose of estimating the number of active queues.
>
> The interval for scanning the number of active queues is configurable
> and defaults to 200 us. We found this to be a good tradeoff between
> overhead and response time. For a detailed analysis of this aspect see
> the Netdevconf talk:
>
> https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf
>
> Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  include/uapi/linux/pkt_sched.h |  2 ++
>  net/sched/sch_cake.c           | 67 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
>
> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index c2da76e78badebbdf7d5482cef1a3132aec99fe1..a4aa812bfbe86424c502de5bb2e5b1429b440088 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
> @@ -1014,6 +1014,7 @@ enum {
>  	TCA_CAKE_ACK_FILTER,
>  	TCA_CAKE_SPLIT_GSO,
>  	TCA_CAKE_FWMARK,
> +	TCA_CAKE_SYNC_TIME,
>  	__TCA_CAKE_MAX
>  };
>  #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
> @@ -1036,6 +1037,7 @@ enum {
>  	TCA_CAKE_STATS_DROP_NEXT_US,
>  	TCA_CAKE_STATS_P_DROP,
>  	TCA_CAKE_STATS_BLUE_TIMER_US,
> +	TCA_CAKE_STATS_ACTIVE_QUEUES,
>  	__TCA_CAKE_STATS_MAX
>  };
>  #define TCA_CAKE_STATS_MAX (__TCA_CAKE_STATS_MAX - 1)

Hi Toke,

Could you include this diff in the patch to keep the ynl spec up to date?

diff --git a/Documentation/netlink/specs/tc.yaml b/Documentation/netlink/specs/tc.yaml
index b398f7a46dae..f0edc84f9613 100644
--- a/Documentation/netlink/specs/tc.yaml
+++ b/Documentation/netlink/specs/tc.yaml
@@ -2153,6 +2153,9 @@ attribute-sets:
       -
         name: fwmark
         type: u32
+      -
+        name: sync-time
+        type: u32
   -
     name: cake-stats-attrs
     name-prefix: tca-cake-stats-
@@ -2207,6 +2210,9 @@ attribute-sets:
       -
         name: blue-timer-us
         type: s32
+      -
+        name: active-queues
+        type: u32
   -
     name: cake-tin-stats-attrs
     name-prefix: tca-cake-tin-stats-


* [Cake] Re: [PATCH RFC net-next 4/4] net/sched: sch_cake: share shaper state across sub-instances of cake_mq
  2025-09-25  9:04   ` [Cake] " Donald Hunter
@ 2025-09-25 13:29     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-09-25 13:29 UTC (permalink / raw)
  To: Donald Hunter
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Jonas Köppeler, cake, netdev

Donald Hunter <donald.hunter@gmail.com> writes:

> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>
>> From: Jonas Köppeler <j.koeppeler@tu-berlin.de>
>>
>> This commit adds shared shaper state across the cake instances beneath a
>> cake_mq qdisc. It works by periodically tracking the number of active
>> instances, and scaling the configured rate by the number of active
>> queues.
>>
>> The scan is lockless and simply reads the qlen and the last_active state
>> variable of each of the instances configured beneath the parent cake_mq
>> instance. Locking is not required since the values are only updated by
>> the owning instance, and eventual consistency is sufficient for the
>> purpose of estimating the number of active queues.
>>
>> The interval for scanning the number of active queues is configurable
>> and defaults to 200 us. We found this to be a good tradeoff between
>> overhead and response time. For a detailed analysis of this aspect see
>> the Netdevconf talk:
>>
>> https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf
>>
>> Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> ---
>>  include/uapi/linux/pkt_sched.h |  2 ++
>>  net/sched/sch_cake.c           | 67 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 69 insertions(+)
>>
>> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
>> index c2da76e78badebbdf7d5482cef1a3132aec99fe1..a4aa812bfbe86424c502de5bb2e5b1429b440088 100644
>> --- a/include/uapi/linux/pkt_sched.h
>> +++ b/include/uapi/linux/pkt_sched.h
>> @@ -1014,6 +1014,7 @@ enum {
>>  	TCA_CAKE_ACK_FILTER,
>>  	TCA_CAKE_SPLIT_GSO,
>>  	TCA_CAKE_FWMARK,
>> +	TCA_CAKE_SYNC_TIME,
>>  	__TCA_CAKE_MAX
>>  };
>>  #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
>> @@ -1036,6 +1037,7 @@ enum {
>>  	TCA_CAKE_STATS_DROP_NEXT_US,
>>  	TCA_CAKE_STATS_P_DROP,
>>  	TCA_CAKE_STATS_BLUE_TIMER_US,
>> +	TCA_CAKE_STATS_ACTIVE_QUEUES,
>>  	__TCA_CAKE_STATS_MAX
>>  };
>>  #define TCA_CAKE_STATS_MAX (__TCA_CAKE_STATS_MAX - 1)
>
> Hi Toke,
>
> Could you include this diff in the patch to keep the ynl spec up to
> date?

Ah yes, will do, thanks! :)

-Toke

