From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ie0-x236.google.com (mail-ie0-x236.google.com [IPv6:2607:f8b0:4001:c03::236]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority" (verified OK)) by huchra.bufferbloat.net (Postfix) with ESMTPS id 94A7821F107 for ; Mon, 10 Jun 2013 10:20:58 -0700 (PDT) Received: by mail-ie0-f182.google.com with SMTP id s9so2452543iec.27 for ; Mon, 10 Jun 2013 10:20:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=uxLoc49S5M2CRJZv5DZVnbfMxiMqeM525H/x7rAZjPQ=; b=ywlrSFHMlK/H7qTKh9/zRKpR2yFXjHexqC7XLDg5oeDCYuqxFlmVVDMzFdi9s/WZP2 867CHl8TDb6+64P3Ptq0LQMgKPU/i7hfffnV+U7HR1MAf2R0k4R+ZgDQmGWHRP0xRyRO Zcz4Rr6wv94pPCKtcyKpOrIPFfZeNz3PIP3VC+awcA3pb+GuAiJCog5KDsRbx7GFTiyK FmpnMnEl5UdFG1EMdwqLIiAi56NvFey3/XPWFbw0c1owgpkpvKGT2bGdJgkLpDIPyUjD Hq/QFYWEKqo5uBRJEEse0KN3Mjq7NThJruLGyxquoE6E3TKJUykl4zrEyRWf+V4WyhwI fRDg== MIME-Version: 1.0 X-Received: by 10.50.17.166 with SMTP id p6mr4523897igd.12.1370884857747; Mon, 10 Jun 2013 10:20:57 -0700 (PDT) Received: by 10.64.45.137 with HTTP; Mon, 10 Jun 2013 10:20:57 -0700 (PDT) In-Reply-To: <1370867989-7318-2-git-send-email-toke@toke.dk> References: <1370867989-7318-1-git-send-email-toke@toke.dk> <1370867989-7318-2-git-send-email-toke@toke.dk> Date: Mon, 10 Jun 2013 10:20:57 -0700 Message-ID: Subject: Re: [PATCH RFC 1/3] Broadcast qdisc statistics via netlink on packet dequeue. From: Dave Taht To: =?ISO-8859-1?Q?Toke_H=F8iland=2DJ=F8rgensen?= Content-Type: multipart/alternative; boundary=14dae9340cc9f9f68f04ded0006a Cc: bloat-devel@lists.bufferbloat.net X-BeenThere: bloat-devel@lists.bufferbloat.net X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Developers working on AQM, device drivers, and networking stacks" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2013 17:20:58 -0000 --14dae9340cc9f9f68f04ded0006a Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable It is highly unlikely these patches would be accepted by upstream as allocating a skb in the fast path is probably a no-no. However at least part of them seem generically useful - instrumenting pfifo_fast in particular - and certainly helpful for debugging in general. I had hoped it would be possible to do this without mucking with locking, tho. :( In my own case I'd like to have a qdisc_drop_with_stats(skb,sch,"fq_codel",reason) that could be compiled in or out easily, sending the packet headers of the dropped packet to a listener much as you do here. This would be useful for monitoring, have potential as a routing metric, and make it easier to analyze the behavior of various qdisc types, without having to poll and lock via tc. Also as a drop takes place only when the card is producing delays, it would be possible to get a bandwidth estimate over smaller intervals from it easily, and be less invasive overall... So thanks for digging into netlink so much and having a sample program. I almost get it now... A couple more comments below: On Mon, Jun 10, 2013 at 5:39 AM, Toke H=F8iland-J=F8rgensen w= rote: Email syntax for a RFC PATCH is RFC PATCH, not PATCH RFC --- > include/uapi/linux/rtnetlink.h | 15 ++++++--- > net/core/gen_stats.c | 6 ++-- > net/sched/Kconfig | 8 +++++ > net/sched/sch_generic.c | 71 > ++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 93 insertions(+), 7 deletions(-) > > diff --git a/include/uapi/linux/rtnetlink.h > b/include/uapi/linux/rtnetlink.h > index 7a2144e..e9d91c5 100644 > --- a/include/uapi/linux/rtnetlink.h > +++ b/include/uapi/linux/rtnetlink.h > @@ -68,6 +68,9 @@ enum { > RTM_GETQDISC, > #define RTM_GETQDISC RTM_GETQDISC > > + RTM_QDISC_STATS, > +#define RTM_QDISC_STATS RTM_QDISC_STATS > + > > RTM_NEWTCLASS =3D 40, > #define RTM_NEWTCLASS RTM_NEWTCLASS > RTM_DELTCLASS, > @@ -140,7 +143,7 @@ enum { > #define RTM_NR_FAMILIES (RTM_NR_MSGTYPES >> 2) > #define RTM_FAM(cmd) (((cmd) - RTM_BASE) >> 2) > > -/* > +/* > Generic structure for encapsulation of optional route information. > It is reminiscent of sockaddr, but with sa_family replaced > with attribute type. > @@ -180,7 +183,7 @@ struct rtmsg { > > unsigned char rtm_table; /* Routing table id */ > unsigned char rtm_protocol; /* Routing protocol; see > below */ > - unsigned char rtm_scope; /* See below */ > + unsigned char rtm_scope; /* See below */ > unsigned char rtm_type; /* See below */ > > unnecessary change. > unsigned rtm_flags; > @@ -450,7 +453,7 @@ struct ifinfomsg { > }; > > /******************************************************************** > - * prefix information > + * prefix information > ****/ > > unnecessary change. try to touch as little as possible... There is an emacs mode for linux indentation, btw which helps avoid annoyingly messing up existing code. struct prefixmsg { > @@ -464,7 +467,7 @@ struct prefixmsg { > unsigned char prefix_pad3; > }; > > -enum > +enum > { > PREFIX_UNSPEC, > PREFIX_ADDRESS, > @@ -571,6 +574,8 @@ enum rtnetlink_groups { > #define RTNLGRP_NEIGH RTNLGRP_NEIGH > RTNLGRP_TC, > #define RTNLGRP_TC RTNLGRP_TC > + RTNLGRP_TC_STATS, > +#define RTNLGRP_TC_STATS RTNLGRP_TC_STATS > RTNLGRP_IPV4_IFADDR, > #define RTNLGRP_IPV4_IFADDR RTNLGRP_IPV4_IFADDR > RTNLGRP_IPV4_MROUTE, > It's not clear to me where this stuff enters userspace or not, and in general it's safer to append new enums/defines rather than insert them. > @@ -625,7 +630,7 @@ struct tcamsg { > }; > #define TA_RTA(r) ((struct rtattr*)(((char*)(r)) + > NLMSG_ALIGN(sizeof(struct tcamsg)))) > #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg)) > -#define TCA_ACT_TAB 1 /* attr type must be >=3D1 */ > +#define TCA_ACT_TAB 1 /* attr type must be >=3D1 */ > #define TCAA_MAX 1 > Uneeeded change... > > /* New extended info filters for IFLA_EXT_MASK */ > diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c > index ddedf21..68df614 100644 > --- a/net/core/gen_stats.c > +++ b/net/core/gen_stats.c > @@ -61,7 +61,8 @@ gnet_stats_start_copy_compat(struct sk_buff *skb, int > type, int tc_stats_type, > { > memset(d, 0, sizeof(*d)); > > - spin_lock_bh(lock); > + if(lock) > + spin_lock_bh(lock); > So this acted up on a SMP kernel? > d->lock =3D lock; > if (type) > d->tail =3D (struct nlattr *)skb_tail_pointer(skb); > @@ -245,7 +246,8 @@ gnet_stats_finish_copy(struct gnet_dump *d) > return -1; > } > > - spin_unlock_bh(d->lock); > + if(d->lock) > + spin_unlock_bh(d->lock); > return 0; > } > EXPORT_SYMBOL(gnet_stats_finish_copy); > diff --git a/net/sched/Kconfig b/net/sched/Kconfig > index 235e01a..4902812 100644 > --- a/net/sched/Kconfig > +++ b/net/sched/Kconfig > @@ -308,6 +308,14 @@ config NET_SCH_PLUG > To compile this code as a module, choose M here: the > module will be called sch_plug. > > +config NET_SCH_BROADCAST_STATS > + bool "Enable Qdisc statistics broadcast" > + ---help--- > + > + Select this option if you want to enable qdisc stats broadcast > through > + netlink multicast. Broadcast happens on packet dequeue, limite= d > to the > + interval set by the sysctl parameter. > + > comment "Classification" > QDISC_BROADCAST_DEQUEUE_STATS maybe? Elsewhere you've defined the sysctl. Would argue for sysfs... > > config NET_CLS > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c > index eac7e0e..56490a1 100644 > --- a/net/sched/sch_generic.c > +++ b/net/sched/sch_generic.c > @@ -28,6 +28,7 @@ > #include > #include > #include > +#include > > /* Main transmission queue. */ > > @@ -148,6 +149,73 @@ int sch_direct_xmit(struct sk_buff *skb, struct Qdis= c > *q, > return ret; > } > > +#ifdef CONFIG_NET_SCH_BROADCAST_STATS > +int qdisc_broadcast_stats(struct Qdisc *q) > +{ > + struct tcmsg *tcm; > + struct nlmsghdr *nlh; > + struct gnet_dump d; > + struct sk_buff *skb; > + unsigned char *b; > + > + if(!q->dev_queue || !q->dev_queue->dev) > + return 0; > > Is there a way to also merely detect if there is a listener present? > + printk(KERN_DEBUG "Packet dequeue, ifname %s(%d), qdisc %s, qlen > %d\n", > + qdisc_dev(q)->name, > + qdisc_dev(q)->ifindex, > + q->ops->id, > + q->qstats.qlen); > Useful for debugging, but invasive for general use, far more heavyweight than tossing it into netlink. > + skb =3D alloc_skb(NLMSG_SPACE(1024), GFP_KERNEL); > NLMSG_SPACE seems arbitrary, and perhaps this needs to be GFP_ATOMIC? (But I am totally not an expert down here) > + if(!skb) > + return -ENOBUFS; > + b =3D skb_tail_pointer(skb); > + > + nlh =3D nlmsg_put(skb, 0, 0, RTM_QDISC_STATS, sizeof(*tcm), > NLM_F_MULTI); > + if (!nlh) > + goto out_nlmsg_trim; > + > + tcm =3D nlmsg_data(nlh); > + tcm->tcm_family =3D AF_UNSPEC; > + tcm->tcm__pad1 =3D 0; > + tcm->tcm__pad2 =3D 0; > + tcm->tcm_ifindex =3D qdisc_dev(q)->ifindex; > + tcm->tcm_parent =3D q->parent; > + tcm->tcm_handle =3D q->handle; > + tcm->tcm_info =3D atomic_read(&q->refcnt); > + > + if (nla_put_string(skb, TCA_KIND, q->ops->id)) > + goto nla_put_failure; > + > + if (gnet_stats_start_copy(skb, TCA_STATS2, NULL, &d) < 0) > + goto nla_put_failure; > + > + if (q->ops->dump_stats && q->ops->dump_stats(q, &d) < 0) > + goto nla_put_failure; > + > + if (gnet_stats_copy_basic(&d, &q->bstats) < 0 || > + gnet_stats_copy_queue(&d, &q->qstats) < 0) > + goto nla_put_failure; > + > + if (gnet_stats_finish_copy(&d) < 0) > + goto nla_put_failure; > + > + nlh->nlmsg_len =3D skb_tail_pointer(skb) - b; > + > + rtnetlink_send(skb, qdisc_dev(q)->nd_net, 0, RTNLGRP_TC_STATS, 0)= ; > + > + return 0; > + > +nla_put_failure: > +out_nlmsg_trim: > + nlmsg_trim(skb, b); > + return -1; > + > +} > +#endif > + > + > /* > * NOTE: Called under qdisc_lock(q) with locally disabled BH. > * > @@ -173,6 +241,9 @@ static inline int qdisc_restart(struct Qdisc *q) > struct net_device *dev; > spinlock_t *root_lock; > struct sk_buff *skb; > +#ifdef CONFIG_NET_SCH_BROADCAST_STATS > + qdisc_broadcast_stats(q); > +#endif > Am thinking this could be a callback in the Qdisc_ops struct... > /* Dequeue packet */ > skb =3D dequeue_skb(q); > -- > 1.8.3 > > _______________________________________________ > Bloat-devel mailing list > Bloat-devel@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat-devel > --=20 Dave T=E4ht Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html --14dae9340cc9f9f68f04ded0006a Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable It is highly unlikely these patches would be accepted by upstream as alloca= ting a skb in the fast path is probably a no-no. However at least part of t= hem seem generically useful - instrumenting pfifo_fast in particular - and = certainly helpful for debugging in general.

I had hoped it would be possible to do this without mucking with lockin= g, tho. :(

In my own case I'd like to have a qdisc_drop_with_sta= ts(skb,sch,"fq_codel",reason) that could be compiled in or out ea= sily, sending the packet headers of the dropped packet to a listener much a= s you do here. This would be useful for monitoring, have potential as a rou= ting metric, and make it easier to analyze the behavior of various qdisc ty= pes, without having to poll and lock via tc.

Also as a drop takes place only when the card is producing delays, it w= ould be possible to get a bandwidth estimate over smaller intervals from it= easily, and be less invasive overall...

So thanks for digging into = netlink so much and having a sample program. I almost get it now...

A couple more comments below:

On Mon,= Jun 10, 2013 at 5:39 AM, Toke H=F8iland-J=F8rgensen <= toke@toke.dk> wrote:

Email syntax for a RFC PATCH is RFC PATCH, not PATCH RFC

---
=A0include/uapi/linux/rtnetlink.h | 15 ++++++---
=A0net/core/gen_stats.c =A0 =A0 =A0 =A0 =A0 | =A06 ++--
=A0net/sched/Kconfig =A0 =A0 =A0 =A0 =A0 =A0 =A0| =A08 +++++
=A0net/sched/sch_generic.c =A0 =A0 =A0 =A0| 71 ++++++++++++++++++++++++++++= ++++++++++++++
=A04 files changed, 93 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.= h
index 7a2144e..e9d91c5 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -68,6 +68,9 @@ enum {
=A0 =A0 =A0 =A0 RTM_GETQDISC,
=A0#define RTM_GETQDISC =A0 RTM_GETQDISC

+ =A0 =A0 =A0 RTM_QDISC_STATS,
+#define RTM_QDISC_STATS RTM_QDISC_STATS
+
=A0
=A0 =A0 =A0 =A0 RTM_NEWTCLASS =A0 =3D 40,
=A0#define RTM_NEWTCLASS =A0RTM_NEWTCLASS
=A0 =A0 =A0 =A0 RTM_DELTCLASS,
@@ -140,7 +143,7 @@ enum {
=A0#define RTM_NR_FAMILIES =A0 =A0 =A0 =A0(RTM_NR_MSGTYPES >> 2)
=A0#define RTM_FAM(cmd) =A0 (((cmd) - RTM_BASE) >> 2)

-/*
+/*
=A0 =A0 Generic structure for encapsulation of optional route information.<= br> =A0 =A0 It is reminiscent of sockaddr, but with sa_family replaced
=A0 =A0 with attribute type.
@@ -180,7 +183,7 @@ struct rtmsg {

=A0 =A0 =A0 =A0 unsigned char =A0 =A0 =A0 =A0 =A0 rtm_table; =A0 =A0 =A0/* = Routing table id */
=A0 =A0 =A0 =A0 unsigned char =A0 =A0 =A0 =A0 =A0 rtm_protocol; =A0 /* Rout= ing protocol; see below =A0*/
- =A0 =A0 =A0 unsigned char =A0 =A0 =A0 =A0 =A0 rtm_scope; =A0 =A0 =A0/* Se= e below */
+ =A0 =A0 =A0 unsigned char =A0 =A0 =A0 =A0 =A0 rtm_scope; =A0 =A0 =A0/* Se= e below */
=A0 =A0 =A0 =A0 unsigned char =A0 =A0 =A0 =A0 =A0 rtm_type; =A0 =A0 =A0 /* = See below =A0 =A0*/


unnecessary change.
=A0
=A0 =A0 =A0 =A0 unsigned =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0rtm_flags;
@@ -450,7 +453,7 @@ struct ifinfomsg {
=A0};

=A0/******************************************************************** - * =A0 =A0 =A0 =A0 =A0 =A0 prefix information
+ * =A0 =A0 =A0 =A0 =A0 =A0 prefix information
=A0 ****/


unnecessary change. try to touch as little as pos= sible... There is an emacs mode for linux indentation, btw
which helps a= void annoyingly messing up existing code.

=A0struct prefixmsg {
@@ -464,7 +467,7 @@ struct prefixmsg {
=A0 =A0 =A0 =A0 unsigned char =A0 prefix_pad3;
=A0};

-enum
+enum
=A0{
=A0 =A0 =A0 =A0 PREFIX_UNSPEC,
=A0 =A0 =A0 =A0 PREFIX_ADDRESS,
@@ -571,6 +574,8 @@ enum rtnetlink_groups {
=A0#define RTNLGRP_NEIGH =A0 =A0 =A0 =A0 =A0RTNLGRP_NEIGH
=A0 =A0 =A0 =A0 RTNLGRP_TC,
=A0#define RTNLGRP_TC =A0 =A0 =A0 =A0 =A0 =A0 RTNLGRP_TC
+ =A0 =A0 =A0 RTNLGRP_TC_STATS,
+#define RTNLGRP_TC_STATS =A0 =A0 =A0 RTNLGRP_TC_STATS
=A0 =A0 =A0 =A0 RTNLGRP_IPV4_IFADDR,
=A0#define RTNLGRP_IPV4_IFADDR =A0 =A0RTNLGRP_IPV4_IFADDR
=A0 =A0 =A0 =A0 RTNLGRP_IPV4_MROUTE,

It's not = clear to me where this stuff enters userspace or not, and in general it'= ;s
safer to append new enums/defines rather than insert them.
=A0
=
@@ -625,7 +630,7 @@ struct tcamsg {
=A0};
=A0#define TA_RTA(r) =A0((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof= (struct tcamsg))))
=A0#define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=3D1 */
+#define TCA_ACT_TAB 1 /* attr type must be >=3D1 */
=A0#define TCAA_MAX 1

Uneeeded change...
=A0

=A0/* New extended info filters for IFLA_EXT_MASK */
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index ddedf21..68df614 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -61,7 +61,8 @@ gnet_stats_start_copy_compat(struct sk_buff *skb, int typ= e, int tc_stats_type,
=A0{
=A0 =A0 =A0 =A0 memset(d, 0, sizeof(*d));

- =A0 =A0 =A0 spin_lock_bh(lock);
+ =A0 =A0 =A0 if(lock)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 spin_lock_bh(lock);

= So this acted up on a SMP kernel?
=A0
=A0 =A0 =A0 =A0 d->lock =3D lock;
=A0 =A0 =A0 =A0 if (type)
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 d->tail =3D (struct nlattr *)skb_tail_po= inter(skb);
@@ -245,7 +246,8 @@ gnet_stats_finish_copy(struct gnet_dump *d)
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 return -1;
=A0 =A0 =A0 =A0 }

- =A0 =A0 =A0 spin_unlock_bh(d->lock);
+ =A0 =A0 =A0 if(d->lock)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 spin_unlock_bh(d->lock);
=A0 =A0 =A0 =A0 return 0;
=A0}
=A0EXPORT_SYMBOL(gnet_stats_finish_copy);
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 235e01a..4902812 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -308,6 +308,14 @@ config NET_SCH_PLUG
=A0 =A0 =A0 =A0 =A0 To compile this code as a module, choose M here: the =A0 =A0 =A0 =A0 =A0 module will be called sch_plug.

+config NET_SCH_BROADCAST_STATS
+ =A0 =A0 =A0 bool "Enable Qdisc statistics broadcast"
+ =A0 =A0 =A0 ---help---
+
+ =A0 =A0 =A0 =A0 Select this option if you want to enable qdisc stats broa= dcast through
+ =A0 =A0 =A0 =A0 =A0netlink multicast. Broadcast happens on packet dequeue= , limited to the
+ =A0 =A0 =A0 =A0 =A0interval set by the <something> sysctl parameter= .
+
=A0comment "Classification"

QDISC_BROADC= AST_DEQUEUE_STATS maybe?

Elsewhere you've defined the sysctl. W= ould argue for sysfs...
=A0

=A0config NET_CLS
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index eac7e0e..56490a1 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -28,6 +28,7 @@
=A0#include <net/sch_generic.h>
=A0#include <net/pkt_sched.h>
=A0#include <net/dst.h>
+#include <net/netlink.h>

=A0/* Main transmission queue. */

@@ -148,6 +149,73 @@ int sch_direct_xmit(struct sk_buff *skb, struct Qdisc = *q,
=A0 =A0 =A0 =A0 return ret;
=A0}

+#ifdef CONFIG_NET_SCH_BROADCAST_STATS
+int qdisc_broadcast_stats(struct Qdisc *q)
+{
+ =A0 =A0 =A0 struct tcmsg *tcm;
+ =A0 =A0 =A0 struct nlmsghdr *nlh;
+ =A0 =A0 =A0 struct gnet_dump d;
+ =A0 =A0 =A0 struct sk_buff *skb;
+ =A0 =A0 =A0 unsigned char *b;
+
+ =A0 =A0 =A0 if(!q->dev_queue || !q->dev_queue->dev)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 return 0;


Is there a way to also merely detect if there is = a listener present?
=A0
+ =A0 =A0 =A0 printk(KERN_DEBUG "Packet dequeue, ifname %s(%d), qdisc = %s, qlen %d\n",
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 qdisc_dev(q)->name,
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 qdisc_dev(q)->ifindex,
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 q->ops->id,
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 q->qstats.qlen);

= Useful for debugging, but invasive for general use, far more heavyweight th= an tossing
it into netlink.
=A0
+ =A0 =A0 =A0 skb =3D alloc_skb(NLMSG_SPACE(1024), GFP_KERNEL);

NLMSG_SPACE seems arbitrary, and perhaps this needs to be GFP= _ATOMIC?
(But I am totally not an expert down here)
=A0
+ =A0 =A0 =A0 if(!skb)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 return -ENOBUFS;
+ =A0 =A0 =A0 b =3D skb_tail_pointer(skb);
+
+ =A0 =A0 =A0 nlh =3D nlmsg_put(skb, 0, 0, RTM_QDISC_STATS, sizeof(*tcm), N= LM_F_MULTI);
+ =A0 =A0 =A0 if (!nlh)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out_nlmsg_trim;
+
+ =A0 =A0 =A0 tcm =3D nlmsg_data(nlh);
+ =A0 =A0 =A0 tcm->tcm_family =3D AF_UNSPEC;
+ =A0 =A0 =A0 tcm->tcm__pad1 =3D 0;
+ =A0 =A0 =A0 tcm->tcm__pad2 =3D 0;
+ =A0 =A0 =A0 tcm->tcm_ifindex =3D qdisc_dev(q)->ifindex;
+ =A0 =A0 =A0 tcm->tcm_parent =3D q->parent;
+ =A0 =A0 =A0 tcm->tcm_handle =3D q->handle;
+ =A0 =A0 =A0 tcm->tcm_info =3D atomic_read(&q->refcnt);
+
+ =A0 =A0 =A0 if (nla_put_string(skb, TCA_KIND, q->ops->id))
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto nla_put_failure;
+
+ =A0 =A0 =A0 if (gnet_stats_start_copy(skb, TCA_STATS2, NULL, &d) <= 0)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto nla_put_failure;
+
+ =A0 =A0 =A0 if (q->ops->dump_stats && q->ops->dump_st= ats(q, &d) < 0)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto nla_put_failure;
+
+ =A0 =A0 =A0 if (gnet_stats_copy_basic(&d, &q->bstats) < 0 |= |
+ =A0 =A0 =A0 =A0 =A0 gnet_stats_copy_queue(&d, &q->qstats) <= 0)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto nla_put_failure;
+
+ =A0 =A0 =A0 if (gnet_stats_finish_copy(&d) < 0)
+ =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto nla_put_failure;
+
+ =A0 =A0 =A0 nlh->nlmsg_len =3D skb_tail_pointer(skb) - b;
+
+ =A0 =A0 =A0 rtnetlink_send(skb, qdisc_dev(q)->nd_net, 0, RTNLGRP_TC_ST= ATS, 0);
+
+ =A0 =A0 =A0 return 0;
+
+nla_put_failure:
+out_nlmsg_trim:
+ =A0 =A0 =A0 nlmsg_trim(skb, b);
+ =A0 =A0 =A0 return -1;
+
+}
+#endif
+
+
=A0/*
=A0 * NOTE: Called under qdisc_lock(q) with locally disabled BH.
=A0 *
@@ -173,6 +241,9 @@ static inline int qdisc_restart(struct Qdisc *q)
=A0 =A0 =A0 =A0 struct net_device *dev;
=A0 =A0 =A0 =A0 spinlock_t *root_lock;
=A0 =A0 =A0 =A0 struct sk_buff *skb;
+#ifdef CONFIG_NET_SCH_BROADCAST_STATS
+ =A0 =A0 =A0 qdisc_broadcast_stats(q);
+#endif

Am thinking this could be a callback in th= e Qdisc_ops struct...
=A0
=A0 =A0 =A0 =A0 /* Dequeue packet */
=A0 =A0 =A0 =A0 skb =3D dequeue_skb(q);
--
1.8.3

_______________________________________________
Bloat-devel mailing list
Bloat-devel@lists.buff= erbloat.net
https://lists.bufferbloat.net/listinfo/bloat-devel



--
Dave T=E4= ht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowr= t/subscribe.html=20 --14dae9340cc9f9f68f04ded0006a--