From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mail.toke.dk; spf=pass smtp.mailfrom=; dkim=pass header.d=kernel.org; arc=none (Message is not ARC signed); dmarc=pass (Used From Domain Record) header.from=kernel.org policy.dmarc=quarantine Received: from tor.source.kernel.org (tor.source.kernel.org [IPv6:2600:3c04:e001:324:0:1991:8:25]) by mail.toke.dk (Postfix) with ESMTPS id 92D37E087D0 for ; Sat, 14 Mar 2026 16:00:09 +0100 (CET) Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by tor.source.kernel.org (Postfix) with ESMTP id 84F97600AD; Sat, 14 Mar 2026 15:00:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6F09BC116C6; Sat, 14 Mar 2026 15:00:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773500406; bh=GPPo7UyZhsa34vSl+juXV9cWHndyC5ux+fomdX4hXxk=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=H4WQH8NeAT2PaoMQAqqfNJ0e6dF5J9je+HIQckI7eVx5ReiBa/54cP11AJ0HDYAx2 7/6qkvNGzyvd8B1jm6+VA1a4/ucNTM/CY2sAOKiVNOCbJt/UrsytRYbRdJf7AcIDFh EbInrsUs/+XrMrXla93s/LrPDFfGIc3U63t3qlrnwZaa6FAj7UJlrnrCH4AMt/Z73C bh/hHbBvSCjVrsGvjDFAvT9oHnwO7iVVMLXPcgZjL1/k/S2RIEaHg00Puw+SI/lfMi wzV5sI9aeBrwwbTi20Qbg7L9+Qcuh722fsoVsTXeT2t0rDoH8ujqiwI67+np6uMkaQ Vk6xjVH4cXlZA== Date: Sat, 14 Mar 2026 08:00:04 -0700 From: Jakub Kicinski To: Jamal Hadi Salim Cc: netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com, pabeni@redhat.com, horms@kernel.org, jiri@resnulli.us, toke@toke.dk, vinicius.gomes@intel.com, stephen@networkplumber.org, vladbu@nvidia.com, cake@lists.bufferbloat.net, bpf@vger.kernel.org, ghandatmanas@gmail.com, km.kim1503@gmail.com, security@kernel.org, Victor Nogueira Message-ID: <20260314080004.3e8575d8@kernel.org> In-Reply-To: References: <20260307212058.169511-1-jhs@mojatatu.com> <20260310184713.7e810431@kernel.org> <20260311175249.54abe1b6@kernel.org> <20260312165113.773a5f44@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: KAGX5HEYHUQZJNNZCGILNPU5Z4CJJSWB X-Message-ID-Hash: KAGX5HEYHUQZJNNZCGILNPU5Z4CJJSWB X-MailFrom: kuba@kernel.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list Subject: [Cake] Re: [PATCH net] net/sched: Mark qdisc for deletion if graft cannot delete List-Id: Cake - FQ_codel the next generation Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Fri, 13 Mar 2026 15:36:28 -0400 Jamal Hadi Salim wrote: > In this specific example, the issue is that the classifier code path > can't release the rtnl_lock while the qdisc's refcnt is bigger than 1. > > Does this make more sense? > The reason we went with the "mark for delete" approach is at time x+3 > the "qdisc add" wont be able to find this qdisc. This is the common > observed pattern - for example described in the commit message where > we get have a slightly different flow with "qdisc del" before "filter > add". Maybe a (completely untested) diff will help illustrate my thinking better than words: diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 99ac747b7906..c13c9e8619e4 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -27,6 +27,9 @@ void unregister_tcf_proto_ops(struct tcf_proto_ops *ops); #define NET_CLS_ALIAS_PREFIX "net-cls-" #define MODULE_ALIAS_NET_CLS(kind) MODULE_ALIAS(NET_CLS_ALIAS_PREFIX kind) +extern char *rtnl_load_mod; +void rtnl_load_mod_check(void); + struct tcf_block_ext_info { enum flow_block_binder_type binder_type; tcf_chain_head_change_t *chain_head_change; diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 332fd9695e54..c21dd2e36592 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -1368,11 +1368,15 @@ struct tc_action_ops *tc_action_load_ops(struct nlattr *nla, u32 flags, #ifdef CONFIG_MODULES bool rtnl_held = !(flags & TCA_ACT_FLAGS_NO_RTNL); - if (rtnl_held) - rtnl_unlock(); + if (rtnl_held) { + if (WARN_ON_ONCE(rtnl_load_mod)) + return ERR_PTR(-EINVAL); + rtnl_load_mod = kasprintf(GFP_KERNEL, + NET_ACT_ALIAS_PREFIX "%s", + act_name); + return ERR_PTR(-EAGAIN); + } request_module(NET_ACT_ALIAS_PREFIX "%s", act_name); - if (rtnl_held) - rtnl_lock(); a_o = tc_lookup_action_n(act_name); @@ -2107,6 +2111,9 @@ static int tcf_action_add(struct net *net, struct nlattr *nla, &attr_size, flags, 0, extack); if (ret != -EAGAIN) break; + rtnl_unlock(); + rtnl_load_mod_check(); + rtnl_lock(); } if (ret < 0) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 4829c27446e3..1b0f762d6e4b 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -46,6 +46,19 @@ /* The list of all installed classifier types */ static LIST_HEAD(tcf_proto_base); +char *rtnl_load_mod; + +void rtnl_load_mod_check(void) +{ + char *mod = rtnl_load_mod; + + if (mod) { + rtnl_load_mod = NULL; + request_module("%s", mod); + kfree(mod); + } +} + /* Protects list of registered TC modules. It is pure SMP lock. */ static DEFINE_RWLOCK(cls_mod_lock); @@ -255,17 +268,15 @@ tcf_proto_lookup_ops(const char *kind, bool rtnl_held, if (ops) return ops; #ifdef CONFIG_MODULES - if (rtnl_held) - rtnl_unlock(); + if (rtnl_held) { + if (WARN_ON_ONCE(rtnl_load_mod)) + return ERR_PTR(-EINVAL); + rtnl_load_mod = kasprintf(GFP_KERNEL, + NET_CLS_ALIAS_PREFIX "%s", kind); + return ERR_PTR(-EAGAIN); + } request_module(NET_CLS_ALIAS_PREFIX "%s", kind); - if (rtnl_held) - rtnl_lock(); ops = __tcf_proto_lookup_ops(kind); - /* We dropped the RTNL semaphore in order to perform - * the module load. So, even if we succeeded in loading - * the module we have to replay the request. We indicate - * this using -EAGAIN. - */ if (ops) { module_put(ops->owner); return ERR_PTR(-EAGAIN); @@ -2459,6 +2470,7 @@ static int tc_new_tfilter(struct sk_buff *skb, struct nlmsghdr *n, * of target chain. */ rtnl_held = true; + rtnl_load_mod_check(); /* Replay the request. */ goto replay; } @@ -3230,9 +3242,13 @@ static int tc_ctl_chain(struct sk_buff *skb, struct nlmsghdr *n, tcf_chain_put(chain); errout_block: tcf_block_release(q, block, true); - if (err == -EAGAIN) + if (err == -EAGAIN) { + rtnl_unlock(); + rtnl_load_mod_check(); + rtnl_lock(); /* Replay the request. */ goto replay; + } return err; errout_block_locked: