From: Dave Taht
To: Eric Dumazet
Cc: codel@lists.bufferbloat.net, Dave Täht
Date: Sat, 5 May 2012 09:11:15 -0700
Subject: Re: [Codel] [PATCH v5] pkt_sched: codel: Controlled Delay AQM
Nice! Nits:

0) I figure you already have an iproute2 patch that you can send? I
thought 5 hours ago I had almost, but not entirely, grokked netlink.
The way you just did it was not at all how I thought it worked. :/
But I will read.

1) I take it that if a limit is not specified or set here, sch->limit
comes from txqueuelen? I do kind of like infinite queues (and angels
dancing on the heads of pins).

2) I woke up with a mod that could do ECN. I'll do an RFC patch.

3) Tom's already on the list.

4) I'd like to play with this a lot (and have others do so too) before
it goes upstream, gain Kathie and Van's blessing, etc. Couple of weeks?
(See 2.) In particular I was hoping to see actual pings under load
match the target setting. I'll get this going on two boxes and see
what happens... play with BQL, HTB, etc.

5) I thought the * 16 could be efficiently implemented by the
compiler, and it saves a mem access.

6) Unless 2) happens, we can kill q->flags.

7) Thx!

On Sat, May 5, 2012 at 7:49 AM, Eric Dumazet wrote:
> From: Dave Taht
>
> A nice changelog here, to tell how nice CoDel is, giving pointers to
> documentation and all credits.

I don't have a link to the web site... nor have I read the paper, yet.
Monday.
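For reference, the drop-scheduling math the patch implements — the next drop time advances by interval/sqrt(count), with the first PRECALC_MAX values precomputed as in codel_fill_cache() — can be modeled in a few lines of Python (a simplified sketch of the arithmetic, not the kernel code; the names are illustrative):

```python
import math

INTERVAL_NS = 100_000_000  # 100 ms default interval, as in codel_init()
PRECALC_MAX = 64

# Precompute interval/sqrt(count) for count = 1..PRECALC_MAX,
# mirroring codel_fill_cache(): table[count - 1] == interval/sqrt(count).
table = [round(INTERVAL_NS / math.sqrt(c)) for c in range(1, PRECALC_MAX + 1)]

def control_law(t_ns, count):
    """Return the next drop time: t + interval/sqrt(count).

    Falls back to direct computation past the precalc table,
    like the patch's control_law()/calc() pair.
    """
    if count > PRECALC_MAX:
        inter = round(INTERVAL_NS / math.sqrt(count))
    else:
        inter = table[count - 1]
    return t_ns + inter

# Drop spacing shrinks as count grows: 100 ms, ~70.7 ms, ~57.7 ms, 50 ms, ...
```

The sqrt pacing is what makes the drop rate ramp up gently while a standing queue persists, then reset once the sojourn time falls back below target.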
>
> Signed-off-by: Dave Taht
> Signed-off-by: Eric Dumazet
> ---
>  include/linux/pkt_sched.h |   13 +
>  net/sched/Kconfig         |   11
>  net/sched/Makefile        |    1
>  net/sched/sch_codel.c     |  425 ++++++++++++++++++++++++++++++++++++++
>  4 files changed, 450 insertions(+)
>
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index ffe975c..62a73bf 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -655,4 +655,17 @@ struct tc_qfq_stats {
>         __u32 lmax;
>  };
>
> +/* CODEL */
> +
> +enum {
> +       TCA_CODEL_UNSPEC,
> +       TCA_CODEL_TARGET,
> +       TCA_CODEL_LIMIT,
> +       TCA_CODEL_MINBYTES,
> +       TCA_CODEL_INTERVAL,
> +       __TCA_CODEL_MAX
> +};
> +
> +#define TCA_CODEL_MAX  (__TCA_CODEL_MAX - 1)
> +
>  #endif
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index 75b58f8..fadd252 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -250,6 +250,17 @@ config NET_SCH_QFQ
>
>           If unsure, say N.
>
> +config NET_SCH_CODEL
> +       tristate "Controlled Delay AQM (CODEL)"
> +       help
> +         Say Y here if you want to use the Controlled Delay (CODEL)
> +         packet scheduling algorithm.
> +
> +         To compile this driver as a module, choose M here: the module
> +         will be called sch_codel.
> +
> +         If unsure, say N.
> +
>  config NET_SCH_INGRESS
>         tristate "Ingress Qdisc"
>         depends on NET_CLS_ACT
> diff --git a/net/sched/Makefile b/net/sched/Makefile
> index 8cdf4e2..30fab03 100644
> --- a/net/sched/Makefile
> +++ b/net/sched/Makefile
> @@ -37,6 +37,7 @@ obj-$(CONFIG_NET_SCH_PLUG)    += sch_plug.o
>  obj-$(CONFIG_NET_SCH_MQPRIO)   += sch_mqprio.o
>  obj-$(CONFIG_NET_SCH_CHOKE)    += sch_choke.o
>  obj-$(CONFIG_NET_SCH_QFQ)      += sch_qfq.o
> +obj-$(CONFIG_NET_SCH_CODEL)    += sch_codel.o
>
>  obj-$(CONFIG_NET_CLS_U32)      += cls_u32.o
>  obj-$(CONFIG_NET_CLS_ROUTE4)   += cls_route.o
> diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c
> new file mode 100644
> index 0000000..a19177f
> --- /dev/null
> +++ b/net/sched/sch_codel.c
> @@ -0,0 +1,425 @@
> +/*
> + * net/sched/sch_codel.c       A Codel implementation
> + *
> + *     This program is free software; you can redistribute it and/or
> + *     modify it under the terms of the GNU General Public License
> + *     as published by the Free Software Foundation; either version
> + *     2 of the License, or (at your option) any later version.
> + *
> + * Codel, the COntrolled DELay Queueing discipline
> + * Based on ns2 simulation code presented by Kathie Nichols
> + *
> + * Authors:    Dave Täht
> + *             Eric Dumazet
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#define MS2TIME(a) (ns_to_ktime((u64) a * NSEC_PER_MSEC))
> +#define DEFAULT_CODEL_LIMIT 1000
> +#define PRECALC_MAX 64
> +
> +/*
> + * Via patch found at:
> + * http://lkml.indiana.edu/hypermail/linux/kernel/0802.0/0659.html
> + * I don't know why this isn't in ktime.h as it seemed sane...
> + */
> +
> +/*
> + * ktime_compare - Compares two ktime_t variables
> + *
> + * Return val:
> + * lhs < rhs: < 0
> + * lhs == rhs: 0
> + * lhs > rhs: > 0
> + */
> +
> +#if (BITS_PER_LONG == 64) || defined(CONFIG_KTIME_SCALAR)
> +static inline int ktime_compare(const ktime_t lhs, const ktime_t rhs)
> +{
> +       if (lhs.tv64 < rhs.tv64)
> +               return -1;
> +       if (lhs.tv64 > rhs.tv64)
> +               return 1;
> +       return 0;
> +}
> +#else
> +static inline int ktime_compare(const ktime_t lhs, const ktime_t rhs)
> +{
> +       if (lhs.tv.sec < rhs.tv.sec)
> +               return -1;
> +       if (lhs.tv.sec > rhs.tv.sec)
> +               return 1;
> +       return lhs.tv.nsec - rhs.tv.nsec;
> +}
> +#endif
> +
> +/* Per-queue state (codel_queue_t instance variables) */
> +
> +struct codel_sched_data {
> +       u32     flags;
> +       u32     minbytes;
> +       u32     count; /* packets dropped since we went into drop state */
> +       u32     drop_count;
> +       bool    dropping;
> +       ktime_t target;
> +       /* time to declare above q->target (0 if below) */
> +       ktime_t first_above_time;
> +       ktime_t drop_next; /* time to drop next packet */
> +       ktime_t interval16;
> +       u32     interval;
> +       u32     q_intervals[PRECALC_MAX];
> +};
> +
> +struct codel_skb_cb {
> +       ktime_t enqueue_time;
> +};
> +
> +static unsigned int state1;
> +static unsigned int state2;
> +static unsigned int state3;
> +static unsigned int states;
> +
> +/*
> + * return interval/sqrt(x) with good precision
> + */
> +static u32 calc(u32 _interval, unsigned long x)
> +{
> +       u64 interval = _interval;
> +
> +       /* scale operands for max precision */
> +       while (x < (1UL << (BITS_PER_LONG - 2))) {
> +               x <<= 2;
> +               interval <<= 1;
> +       }
> +       do_div(interval, int_sqrt(x));
> +       return (u32)interval;
> +}
> +
> +static void codel_fill_cache(struct codel_sched_data *q)
> +{
> +       int i;
> +
> +       q->q_intervals[0] = q->interval;
> +       for (i = 2; i <= PRECALC_MAX; i++)
> +               q->q_intervals[i - 1] = calc(q->interval, i);
> +}
> +
> +static struct codel_skb_cb *get_codel_cb(const struct sk_buff *skb)
> +{
> +       qdisc_cb_private_validate(skb, sizeof(struct codel_skb_cb));
> +       return (struct codel_skb_cb *)qdisc_skb_cb(skb)->data;
> +}
> +
> +static ktime_t get_enqueue_time(const struct sk_buff *skb)
> +{
> +       return get_codel_cb(skb)->enqueue_time;
> +}
> +
> +static void set_enqueue_time(struct sk_buff *skb)
> +{
> +       get_codel_cb(skb)->enqueue_time = ktime_get();
> +}
> +
> +/*
> + *     The original control_law required floating point.
> + *
> + *     return ktime_add_ns(t, q->interval / sqrt(q->count));
> + *
> + */
> +static ktime_t control_law(const struct codel_sched_data *q, ktime_t t)
> +{
> +       u32 inter;
> +
> +       if (q->count > PRECALC_MAX)
> +               inter = calc(q->interval, q->count);
> +       else
> +               inter = q->q_intervals[q->count - 1];
> +       return ktime_add_ns(t, inter);
> +}
> +
> +static bool should_drop(struct sk_buff *skb, struct Qdisc *sch, ktime_t now)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +       ktime_t sojourn_time;
> +       bool drop;
> +
> +       if (!skb) {
> +               q->first_above_time.tv64 = 0;
> +               return false;
> +       }
> +       sojourn_time = ktime_sub(now, get_enqueue_time(skb));
> +
> +       if (ktime_compare(sojourn_time, q->target) < 0 ||
> +           sch->qstats.backlog < q->minbytes) {
> +               /* went below so we'll stay below for at least q->interval */
> +               q->first_above_time.tv64 = 0;
> +               return false;
> +       }
> +       drop = false;
> +       if (q->first_above_time.tv64 == 0) {
> +               /* just went above from below. If we stay above
> +                * for at least q->interval we'll say it's ok to drop
> +                */
> +               q->first_above_time = ktime_add_ns(now, q->interval);
> +       } else if (ktime_compare(now, q->first_above_time) >= 0) {
> +               drop = true;
> +               state1++;
> +       }
> +       return drop;
> +}
> +
> +static void codel_drop(struct Qdisc *sch, struct sk_buff *skb)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +
> +       sch->qstats.backlog -= qdisc_pkt_len(skb);
> +       qdisc_drop(skb, sch);
> +       q->drop_count++;
> +}
> +
> +static struct sk_buff *codel_dequeue(struct Qdisc *sch)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +       struct sk_buff *skb = __skb_dequeue(&sch->q);
> +       ktime_t now;
> +       bool drop;
> +
> +       if (!skb) {
> +               q->dropping = false;
> +               return skb;
> +       }
> +       now = ktime_get();
> +       drop = should_drop(skb, sch, now);
> +       if (q->dropping) {
> +               if (!drop) {
> +                       /* sojourn time below target - leave dropping state */
> +                       q->dropping = false;
> +               } else if (ktime_compare(now, q->drop_next) >= 0) {
> +                       state2++;
> +                       /* It's time for the next drop. Drop the current
> +                        * packet and dequeue the next. The dequeue might
> +                        * take us out of dropping state.
> +                        * If not, schedule the next drop.
> +                        * A large backlog might result in drop rates so high
> +                        * that the next drop should happen now,
> +                        * hence the while loop.
> +                        */
> +                       while (q->dropping &&
> +                              (ktime_compare(now, q->drop_next) >= 0)) {
> +                               codel_drop(sch, skb);
> +                               q->count++;
> +                               skb = __skb_dequeue(&sch->q);
> +                               if (!should_drop(skb, sch, now)) {
> +                                       /* leave dropping state */
> +                                       q->dropping = false;
> +                               } else {
> +                                       /* and schedule the next drop */
> +                                       q->drop_next =
> +                                               control_law(q, q->drop_next);
> +                               }
> +                       }
> +               }
> +       } else if (drop &&
> +                  ((ktime_compare(ktime_sub(now, q->drop_next),
> +                                  q->interval16) < 0) ||
> +                   (ktime_compare(ktime_sub(now, q->first_above_time),
> +                                  ns_to_ktime(2 * q->interval)) >= 0))) {
> +               codel_drop(sch, skb);
> +               skb = __skb_dequeue(&sch->q);
> +               drop = should_drop(skb, sch, now);
> +               q->dropping = true;
> +               state3++;
> +               /*
> +                * if min went above target close to when we last went below it
> +                * assume that the drop rate that controlled the queue on the
> +                * last cycle is a good starting point to control it now.
> +                */
> +               if (ktime_compare(ktime_sub(now, q->drop_next),
> +                                 q->interval16) < 0) {
> +                       q->count = q->count > 1 ? q->count - 1 : 1;
> +               } else {
> +                       q->count = 1;
> +               }
> +               q->drop_next = control_law(q, now);
> +       }
> +       if ((states++ % 64) == 0) {
> +               pr_debug("s1: %u, s2: %u, s3: %u\n",
> +                        state1, state2, state3);
> +       }
> +       /* We can't call qdisc_tree_decrease_qlen() if our qlen is 0,
> +        * or HTB crashes
> +        */
> +       if (q->drop_count && sch->q.qlen) {
> +               qdisc_tree_decrease_qlen(sch, q->drop_count);
> +               q->drop_count = 0;
> +       }
> +       if (skb) {
> +               sch->qstats.backlog -= qdisc_pkt_len(skb);
> +               qdisc_bstats_update(sch, skb);
> +       }
> +       return skb;
> +}
> +
> +static int codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> +{
> +       if (likely(skb_queue_len(&sch->q) < sch->limit)) {
> +               set_enqueue_time(skb);
> +               return qdisc_enqueue_tail(skb, sch);
> +       }
> +       return qdisc_drop(skb, sch);
> +}
> +
> +static const struct nla_policy codel_policy[TCA_CODEL_MAX + 1] = {
> +       [TCA_CODEL_TARGET]      = { .type = NLA_U32 },
> +       [TCA_CODEL_LIMIT]       = { .type = NLA_U32 },
> +       [TCA_CODEL_MINBYTES]    = { .type = NLA_U32 },
> +       [TCA_CODEL_INTERVAL]    = { .type = NLA_U32 },
> +};
> +
> +static int codel_change(struct Qdisc *sch, struct nlattr *opt)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +       struct nlattr *tb[TCA_CODEL_MAX + 1];
> +       unsigned int qlen;
> +       int err;
> +
> +       if (opt == NULL)
> +               return -EINVAL;
> +
> +       err = nla_parse_nested(tb, TCA_CODEL_MAX, opt, codel_policy);
> +       if (err < 0)
> +               return err;
> +
> +       sch_tree_lock(sch);
> +       if (tb[TCA_CODEL_TARGET]) {
> +               u32 target = nla_get_u32(tb[TCA_CODEL_TARGET]);
> +
> +               q->target = ns_to_ktime((u64) target * NSEC_PER_USEC);
> +       }
> +       if (tb[TCA_CODEL_INTERVAL]) {
> +               u32 interval = nla_get_u32(tb[TCA_CODEL_INTERVAL]);
> +
> +               interval = min_t(u32, ~0U / NSEC_PER_USEC, interval);
> +
> +               q->interval = interval * NSEC_PER_USEC;
> +               q->interval16 = ns_to_ktime(16 * (u64)q->interval);
> +               codel_fill_cache(q);
> +       }
> +       if (tb[TCA_CODEL_LIMIT])
> +               sch->limit = nla_get_u32(tb[TCA_CODEL_LIMIT]);
> +
> +       if (tb[TCA_CODEL_MINBYTES])
> +               q->minbytes = nla_get_u32(tb[TCA_CODEL_MINBYTES]);
> +
> +       qlen = sch->q.qlen;
> +       while (sch->q.qlen > sch->limit) {
> +               struct sk_buff *skb = __skb_dequeue(&sch->q);
> +
> +               sch->qstats.backlog -= qdisc_pkt_len(skb);
> +               qdisc_drop(skb, sch);
> +       }
> +       qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen);
> +
> +       q->drop_next.tv64 = q->first_above_time.tv64 = 0;
> +       q->dropping = false;
> +       sch_tree_unlock(sch);
> +       return 0;
> +}
> +
> +static int codel_init(struct Qdisc *sch, struct nlattr *opt)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +
> +       q->target = MS2TIME(5);
> +       /* It should be possible to run with no limit,
> +        * with infinite memory :)
> +        */
> +       sch->limit = DEFAULT_CODEL_LIMIT;
> +       q->minbytes = psched_mtu(qdisc_dev(sch));
> +       q->interval = 100 * NSEC_PER_MSEC;
> +       q->interval16 = ns_to_ktime(16 * (u64)q->interval);
> +       q->drop_next.tv64 = q->first_above_time.tv64 = 0;
> +       q->dropping = false; /* exit dropping state */
> +       q->count = 1;
> +       codel_fill_cache(q);
> +       if (opt) {
> +               int err = codel_change(sch, opt);
> +
> +               if (err)
> +                       return err;
> +       }
> +
> +       if (sch->limit >= 1)
> +               sch->flags |= TCQ_F_CAN_BYPASS;
> +       else
> +               sch->flags &= ~TCQ_F_CAN_BYPASS;
> +
> +       return 0;
> +}
> +
> +static int codel_dump(struct Qdisc *sch, struct sk_buff *skb)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +       struct nlattr *opts;
> +       u32 target = ktime_to_us(q->target);
> +
> +       opts = nla_nest_start(skb, TCA_OPTIONS);
> +       if (opts == NULL)
> +               goto nla_put_failure;
> +       if (nla_put_u32(skb, TCA_CODEL_TARGET, target) ||
> +           nla_put_u32(skb, TCA_CODEL_LIMIT, sch->limit) ||
> +           nla_put_u32(skb, TCA_CODEL_INTERVAL, q->interval / NSEC_PER_USEC) ||
> +           nla_put_u32(skb, TCA_CODEL_MINBYTES, q->minbytes))
> +               goto nla_put_failure;
> +
> +       return nla_nest_end(skb, opts);
> +
> +nla_put_failure:
> +       nla_nest_cancel(skb, opts);
> +       return -1;
> +}
> +
> +static void codel_reset(struct Qdisc *sch)
> +{
> +       struct codel_sched_data *q = qdisc_priv(sch);
> +
> +       qdisc_reset_queue(sch);
> +       sch->q.qlen = 0;
> +       q->dropping = false;
> +       q->count = 1;
> +}
> +
> +static struct Qdisc_ops codel_qdisc_ops __read_mostly = {
> +       .id             =       "codel",
> +       .priv_size      =       sizeof(struct codel_sched_data),
> +
> +       .enqueue        =       codel_enqueue,
> +       .dequeue        =       codel_dequeue,
> +       .peek           =       qdisc_peek_dequeued,
> +       .init           =       codel_init,
> +       .reset          =       codel_reset,
> +       .change         =       codel_change,
> +       .dump           =       codel_dump,
> +       .owner          =       THIS_MODULE,
> +};
> +
> +static int __init codel_module_init(void)
> +{
> +       return register_qdisc(&codel_qdisc_ops);
> +}
> +static void __exit codel_module_exit(void)
> +{
> +       unregister_qdisc(&codel_qdisc_ops);
> +}
> +module_init(codel_module_init)
> +module_exit(codel_module_exit)
> +MODULE_LICENSE("GPL");
> +
>

-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
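The fixed-point trick in the patch's calc() — shifting x up by 2 bits and interval by 1 bit preserves interval/sqrt(x) (each `x <<= 2` doubles sqrt(x)) while giving int_sqrt() enough significant bits — can be checked outside the kernel. A Python model of the same arithmetic (assuming the 64-bit branch; isqrt stands in for int_sqrt, floor division for do_div):

```python
from math import isqrt

BITS_PER_LONG = 64  # modeling the 64-bit case only

def calc(interval, x):
    """Integer interval/sqrt(x), as in the patch's calc().

    Scale operands up first: each x <<= 2 multiplies sqrt(x) by 2,
    so interval <<= 1 compensates and the ratio is unchanged,
    while the integer square root keeps full precision.
    """
    while x < (1 << (BITS_PER_LONG - 2)):
        x <<= 2
        interval <<= 1
    return interval // isqrt(x)

# Compare against floating point for a few drop counts.
for count in (1, 2, 3, 10, 1000):
    approx = calc(100_000_000, count)          # 100 ms interval in ns
    exact = 100_000_000 / count ** 0.5
    assert abs(approx - exact) / exact < 1e-6  # well under 1 ppm error
```

Without the scaling loop, `interval // isqrt(x)` for small x would lose most of its precision (e.g. isqrt(2) == 1), which is exactly what the shifts avoid.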