* Time in Queue, bufferbloat, and... our accidentally interplanetary network @ 2011-12-05 9:05 Dave Taht 2011-12-05 10:59 ` Eric Dumazet 0 siblings, 1 reply; 11+ messages in thread From: Dave Taht @ 2011-12-05 9:05 UTC (permalink / raw) To: bloat, bloat-devel, netdev, linux-wireless I was tickled to see that expiring packets based on 'time in queue' was already a key design feature of the IPN, http://www.science20.com/interwebometry/lessons_nasa_can_learn_internet-84861 and the idea was suggested also independently in saturday's CACM article's comments... http://queue.acm.org/detail.cfm?id=2071893 Making decisions based on time in queue (TiQ), rather than length of queue, would seem to be a win, especially for wireless, but also for that does 'soft' traffic shaping with HTB-like qdiscs and sub qdiscs. Anything that is not running at GigE plus speeds would benefit. ... since basically what's been happening with bufferbloat is a too early implementation of the IPN, with detours between here and the moon! ... and so far I haven't seen any major theoretical holes in with TiQ, except for deciding as to how long is too long as to consider a packet as 'expired', (which can be measured in ms or 10s of ms), and having reasonably monotonic time. I just wish I (or someone) could come up with a way to implement it in Linux without multiple layer violations. skb->queuestamp has a nice ring to it, but when I look at this portion of the stack I freely admit quivering in ignorance and fear. I did a bit of work on a set of timestamping fifos, but realized that it was the entire queue's duration from entrance to exit (through multiple scheduling qdiscs) was what needed to be measured, and the drop/no drop decision needs to be made as late as possible - and preferably, the queue being pulled from needs the next packet pulled forward so as to inform the receiver that congestion is happening... And Eric dumazet also produced a preliminary patch a few weeks back that tied timestamping to before the head of a queue, but that tried to use a reserved field in the skb that appears from points A to Z is not guaranteed to be preserved. Thoughts? -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 FR Tel: 0638645374 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Time in Queue, bufferbloat, and... our accidentally interplanetary network 2011-12-05 9:05 Time in Queue, bufferbloat, and... our accidentally interplanetary network Dave Taht @ 2011-12-05 10:59 ` Eric Dumazet 2011-12-06 2:03 ` Adrian Chadd ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Eric Dumazet @ 2011-12-05 10:59 UTC (permalink / raw) To: Dave Taht; +Cc: linux-wireless, netdev, bloat-devel, bloat Le lundi 05 décembre 2011 à 10:05 +0100, Dave Taht a écrit : > And Eric dumazet also produced a preliminary patch a few weeks back > that tied timestamping to before the head of a queue, but that tried to use a > reserved field in the skb that appears from points A to Z is not guaranteed > to be preserved. It is guaranteed to be preserved, as it is part of skb->cb[], and skb->cb[] content is private to each layer. Here, Qdisc layer. Adding a time limit is possible, all we need is a proper design and implementation :) Here is my suggestion : Design a new tfifo/tred qdisc, with following properties : Adaptative RED, (ECN enabled + head drop), but instead of using bytes/packet qlen average, use time_in_queue average. A single mandatory parameter to specify the tmin value tmax default value would be tmin*3 (so that target avg is 2*tmin) tlimit default value would be tmax*8 Adaptative RED dynamically adjusts maxp between 1% and 50% [Using a timer, every 500ms, and AIMD] tc qdisc add dev eth0 root tfifo tmin 1ms [tmax val] [tlimit val] I volunteer to work on this new tfifo experimental qdisc :) This can be used in replacement of pfifo/bfifo in a complex qdisc setup. As it has RED included, it also can replace RED. Please note that once skb leaves Qdisc and is delivered to device, any time limit feature must be handled in the driver itself (if it can queue packet for a long time period) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Time in Queue, bufferbloat, and... our accidentally interplanetary network 2011-12-05 10:59 ` Eric Dumazet @ 2011-12-06 2:03 ` Adrian Chadd 2011-12-07 9:59 ` Dave Taht 2011-12-07 10:15 ` [Bloat] " Hagen Paul Pfeifer 2011-12-08 16:06 ` [PATCH net-next] sch_red: Adaptative RED AQM Eric Dumazet 2 siblings, 1 reply; 11+ messages in thread From: Adrian Chadd @ 2011-12-06 2:03 UTC (permalink / raw) To: Eric Dumazet; +Cc: linux-wireless, netdev, bloat-devel, bloat Hi, For what it's worth, I've also been tinkering with time-in-queue for the wifi queue management in FreeBSD. I don't have anything public yet. What I have done actually seems to work quite well, when doing TX queue time-in-queue management. (I'm ignoring RX queue management for now.) Yes, I think a weighted random drop with both time-in-queue and queue depth would work well. I haven't sat down to model what it'd look like given some traffic profiles. I'll be sure to post some patches and results when I have them. :) Adrian ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Time in Queue, bufferbloat, and... our accidentally interplanetary network 2011-12-06 2:03 ` Adrian Chadd @ 2011-12-07 9:59 ` Dave Taht 0 siblings, 0 replies; 11+ messages in thread From: Dave Taht @ 2011-12-07 9:59 UTC (permalink / raw) To: Adrian Chadd; +Cc: linux-wireless, netdev, bloat-devel, Eric Dumazet, bloat On Tue, Dec 6, 2011 at 3:03 AM, Adrian Chadd <adrian@freebsd.org> wrote: > Hi, > > For what it's worth, I've also been tinkering with time-in-queue for > the wifi queue management in FreeBSD. I don't have anything public > yet. What I have done actually seems to work quite well, when doing TX > queue time-in-queue management. (I'm ignoring RX queue management for > now.) > > Yes, I think a weighted random drop with both time-in-queue and queue > depth would work well. I haven't sat down to model what it'd look like > given some traffic profiles. > > I'll be sure to post some patches and results when I have them. :) Well, if there is a way to add BQL sanely to the mac80211 or driver layers and then apply some time in queue techniques at some edge of some layer(s) down there in Linux, I'd love to know what and where is good. It might be simpler to discuss design ideas etc, in a more generic and less noisy forum than netdev and linux-wireless, at least for a while, getting to actual patches seems kind of hard at this point. (at least to me. For all I know eric is finished already, and me, I haven't finished grokking the paper he's leveraging some ideas on) The little I understand about Linux's networking stack dwarfs the little I know about BSD's, so I look forward to hearing about your results as you get them, and that said, if you could provide some pointers and insight into BSD's traffic shaping methods over on bloat-devel, we do try to be all-architecture embracing in our attempts to beat the bloat. > > > Adrian -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 FR Tel: 0638645374 http://www.bufferbloat.net ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bloat] Time in Queue, bufferbloat, and... our accidentally interplanetary network 2011-12-05 10:59 ` Eric Dumazet 2011-12-06 2:03 ` Adrian Chadd @ 2011-12-07 10:15 ` Hagen Paul Pfeifer 2011-12-07 10:19 ` Hagen Paul Pfeifer 2011-12-07 11:53 ` Eric Dumazet 2011-12-08 16:06 ` [PATCH net-next] sch_red: Adaptative RED AQM Eric Dumazet 2 siblings, 2 replies; 11+ messages in thread From: Hagen Paul Pfeifer @ 2011-12-07 10:15 UTC (permalink / raw) To: Eric Dumazet; +Cc: bloat-devel, netdev, linux-wireless, bloat On Mon, 05 Dec 2011 11:59:34 +0100, Eric Dumazet wrote: > Adding a time limit is possible, all we need is a proper design and > implementation :) > > Here is my suggestion : > > Design a new tfifo/tred qdisc, with following properties : > > Adaptative RED, (ECN enabled + head drop), but instead of using > bytes/packet qlen average, use time_in_queue average. Question one: is anything wrong with sfb or choke as the basis, instead of RED? Question two: I submitted pfast_head_drop to drop more outdated data instead of new data. Back in time I thought TCP _may_ experience benefits because more up-to-date SACK data packets are saved. Are there any other TCP advantages with head drop policy? Hagen ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bloat] Time in Queue, bufferbloat, and... our accidentally interplanetary network 2011-12-07 10:15 ` [Bloat] " Hagen Paul Pfeifer @ 2011-12-07 10:19 ` Hagen Paul Pfeifer 2011-12-07 11:53 ` Eric Dumazet 1 sibling, 0 replies; 11+ messages in thread From: Hagen Paul Pfeifer @ 2011-12-07 10:19 UTC (permalink / raw) To: Hagen Paul Pfeifer Cc: linux-wireless, netdev, bloat-devel, Eric Dumazet, bloat On Wed, 07 Dec 2011 11:15:38 +0100, Hagen Paul Pfeifer <hagen@jauu.net> wrote: > Question two: I submitted pfast_head_drop to drop more outdated data > instead of new data. Back in time I thought TCP _may_ experience benefits > because more up-to-date SACK data packets are saved. Are there any other > TCP advantages with head drop policy? Small comment: pfast_head_drop was intended for UDP. E.g. OLSR messages, regular GPS reporting and the like. TCP was not the focus at that time. HGN ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bloat] Time in Queue, bufferbloat, and... our accidentally interplanetary network 2011-12-07 10:15 ` [Bloat] " Hagen Paul Pfeifer 2011-12-07 10:19 ` Hagen Paul Pfeifer @ 2011-12-07 11:53 ` Eric Dumazet 1 sibling, 0 replies; 11+ messages in thread From: Eric Dumazet @ 2011-12-07 11:53 UTC (permalink / raw) To: Hagen Paul Pfeifer; +Cc: bloat-devel, netdev, linux-wireless, bloat Le mercredi 07 décembre 2011 à 11:15 +0100, Hagen Paul Pfeifer a écrit : > On Mon, 05 Dec 2011 11:59:34 +0100, Eric Dumazet wrote: > > > > Adding a time limit is possible, all we need is a proper design and > > implementation :) > > > > Here is my suggestion : > > > > Design a new tfifo/tred qdisc, with following properties : > > > > Adaptative RED, (ECN enabled + head drop), but instead of using > > bytes/packet qlen average, use time_in_queue average. > > Question one: is anything wrong with sfb or choke as the basis, instead of > RED? > RED is the module to bring EWMA stuff, it seems natural to start with it. Please note that choke has a RED module too. Then later, we can add time limit stuff to other Qdisc if needed, its a plug anyway. But is there any meaning to compute a global EWMA after SFB/SFQ packet classification ? > Question two: I submitted pfast_head_drop to drop more outdated data > instead of new data. Back in time I thought TCP _may_ experience benefits > because more up-to-date SACK data packets are saved. Are there any other > TCP advantages with head drop policy? > Note that head drop is a consequence of time limit idea on top of FIFO, since only at dequeue time, we compute the delta between current time and enqueue time, and we drop/mark the (head) packet if time exceeds our current limit. In general, being able to drop/mark firsts packets in queue instead of last ones can let TCP sender be notified of congestion much earlier than a tail drop. (We gain the time to transmit whole packets in queue before receiver can report in its ACK the congestion back to sender) ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net-next] sch_red: Adaptative RED AQM 2011-12-05 10:59 ` Eric Dumazet 2011-12-06 2:03 ` Adrian Chadd 2011-12-07 10:15 ` [Bloat] " Hagen Paul Pfeifer @ 2011-12-08 16:06 ` Eric Dumazet 2011-12-08 17:21 ` Stephen Hemminger 2011-12-09 0:55 ` David Miller 2 siblings, 2 replies; 11+ messages in thread From: Eric Dumazet @ 2011-12-08 16:06 UTC (permalink / raw) To: Dave Taht, David Miller Cc: linux-wireless, netdev, Hagen Paul Pfeifer, bloat-devel, bloat Adaptative RED AQM for linux, based on paper from Sally FLoyd, Ramakrishna Gummadi, and Scott Shenker, August 2001 : http://icir.org/floyd/papers/adaptiveRed.pdf Goal of Adaptative RED is to make max_p a dynamic value between 1% and 50% to reach the target average queue : (max_th - min_th) / 2 Every 500 ms: if (avg > target and max_p <= 0.5) increase max_p : max_p += alpha; else if (avg < target and max_p >= 0.01) decrease max_p : max_p *= beta; target :[min_th + 0.4*(min_th - max_th), min_th + 0.6*(min_th - max_th)]. alpha : min(0.01, max_p / 4) beta : 0.9 max_P is a Q0.32 fixed point number (unsigned, with 32 bits mantissa) Changes against our RED implementation are : max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32 fixed point number, to allow full range described in Adatative paper. To deliver a random number, we now use a reciprocal divide (thats really a multiply), but this operation is done once per marked/droped packet when in RED_BETWEEN_TRESH window, so added cost (compared to previous AND operation) is near zero. dump operation gives current max_p value in a new TCA_RED_MAX_P attribute. Example on a 10Mbit link : tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \ limit 400000 min 30000 max 90000 avpkt 1000 \ burst 55 ecn adaptative bandwidth 10Mbit # tc -s -d qdisc show dev eth3 ... qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn adaptative ewma 5 max_p=0.113335 Scell_log 15 Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0) rate 9749Kbit 831pps backlog 72056b 16p requeues 0 marked 1357 early 35 pdrop 0 other 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> --- include/linux/pkt_sched.h | 6 +- include/net/red.h | 101 +++++++++++++++++++++++++++++------- lib/reciprocal_div.c | 2 net/sched/sch_red.c | 21 +++++++ 4 files changed, 111 insertions(+), 19 deletions(-) diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h index fb556dc..e41e0d4 100644 --- a/include/linux/pkt_sched.h +++ b/include/linux/pkt_sched.h @@ -181,6 +181,7 @@ enum { TCA_RED_UNSPEC, TCA_RED_PARMS, TCA_RED_STAB, + TCA_RED_MAX_P, __TCA_RED_MAX, }; @@ -194,8 +195,9 @@ struct tc_red_qopt { unsigned char Plog; /* log(P_max/(qth_max-qth_min)) */ unsigned char Scell_log; /* cell size for idle damping */ unsigned char flags; -#define TC_RED_ECN 1 -#define TC_RED_HARDDROP 2 +#define TC_RED_ECN 1 +#define TC_RED_HARDDROP 2 +#define TC_RED_ADAPTATIVE 4 }; struct tc_red_xstats { diff --git a/include/net/red.h b/include/net/red.h index b72a3b8..f4e9533 100644 --- a/include/net/red.h +++ b/include/net/red.h @@ -5,6 +5,7 @@ #include <net/pkt_sched.h> #include <net/inet_ecn.h> #include <net/dsfield.h> +#include <linux/reciprocal_div.h> /* Random Early Detection (RED) algorithm. ======================================= @@ -87,6 +88,29 @@ etc. */ +/* + * Adaptative RED : An Algorithm for Increasing the Robustness of RED's AQM + * (Sally FLoyd, Ramakrishna Gummadi, and Scott Shenker) August 2001 + * + * Every 500 ms: + * if (avg > target and max_p <= 0.5) + * increase max_p : max_p += alpha; + * else if (avg < target and max_p >= 0.01) + * decrease max_p : max_p *= beta; + * + * target :[qth_min + 0.4*(qth_min - qth_max), + * qth_min + 0.6*(qth_min - qth_max)]. + * alpha : min(0.01, max_p / 4) + * beta : 0.9 + * max_P is a Q0.32 fixed point number (with 32 bits mantissa) + * max_P between 0.01 and 0.5 (1% - 50%) [ Its no longer a negative power of two ] + */ +#define RED_ONE_PERCENT ((u32)DIV_ROUND_CLOSEST(1ULL<<32, 100)) + +#define MAX_P_MIN (1 * RED_ONE_PERCENT) +#define MAX_P_MAX (50 * RED_ONE_PERCENT) +#define MAX_P_ALPHA(val) min(MAX_P_MIN, val / 4) + #define RED_STAB_SIZE 256 #define RED_STAB_MASK (RED_STAB_SIZE - 1) @@ -101,10 +125,14 @@ struct red_stats { struct red_parms { /* Parameters */ - u32 qth_min; /* Min avg length threshold: A scaled */ - u32 qth_max; /* Max avg length threshold: A scaled */ + u32 qth_min; /* Min avg length threshold: Wlog scaled */ + u32 qth_max; /* Max avg length threshold: Wlog scaled */ u32 Scell_max; - u32 Rmask; /* Cached random mask, see red_rmask */ + u32 max_P; /* probability, [0 .. 1.0] 32 scaled */ + u32 max_P_reciprocal; /* reciprocal_value(max_P / qth_delta) */ + u32 qth_delta; /* max_th - min_th */ + u32 target_min; /* min_th + 0.4*(max_th - min_th) */ + u32 target_max; /* min_th + 0.6*(max_th - min_th) */ u8 Scell_log; u8 Wlog; /* log(W) */ u8 Plog; /* random number bits */ @@ -115,19 +143,22 @@ struct red_parms { number generation */ u32 qR; /* Cached random number */ - unsigned long qavg; /* Average queue length: A scaled */ + unsigned long qavg; /* Average queue length: Wlog scaled */ ktime_t qidlestart; /* Start of current idle period */ }; -static inline u32 red_rmask(u8 Plog) +static inline u32 red_maxp(u8 Plog) { - return Plog < 32 ? ((1 << Plog) - 1) : ~0UL; + return Plog < 32 ? (~0U >> Plog) : ~0U; } + static inline void red_set_parms(struct red_parms *p, u32 qth_min, u32 qth_max, u8 Wlog, u8 Plog, u8 Scell_log, u8 *stab) { + int delta = qth_max - qth_min; + /* Reset average queue length, the value is strictly bound * to the parameters below, reseting hurts a bit but leaving * it might result in an unreasonable qavg for a while. --TGR @@ -139,14 +170,29 @@ static inline void red_set_parms(struct red_parms *p, p->qth_max = qth_max << Wlog; p->Wlog = Wlog; p->Plog = Plog; - p->Rmask = red_rmask(Plog); + if (delta < 0) + delta = 1; + p->qth_delta = delta; + p->max_P = red_maxp(Plog); + p->max_P *= delta; /* max_P = (qth_max-qth_min)/2^Plog */ + + p->max_P_reciprocal = reciprocal_value(p->max_P / delta); + + /* RED Adaptative target : + * [min_th + 0.4*(min_th - max_th), + * min_th + 0.6*(min_th - max_th)]. + */ + delta /= 5; + p->target_min = qth_min + 2*delta; + p->target_max = qth_min + 3*delta; + p->Scell_log = Scell_log; p->Scell_max = (255 << Scell_log); memcpy(p->Stab, stab, sizeof(p->Stab)); } -static inline int red_is_idling(struct red_parms *p) +static inline int red_is_idling(const struct red_parms *p) { return p->qidlestart.tv64 != 0; } @@ -168,7 +214,7 @@ static inline void red_restart(struct red_parms *p) p->qcount = -1; } -static inline unsigned long red_calc_qavg_from_idle_time(struct red_parms *p) +static inline unsigned long red_calc_qavg_from_idle_time(const struct red_parms *p) { s64 delta = ktime_us_delta(ktime_get(), p->qidlestart); long us_idle = min_t(s64, delta, p->Scell_max); @@ -215,7 +261,7 @@ static inline unsigned long red_calc_qavg_from_idle_time(struct red_parms *p) } } -static inline unsigned long red_calc_qavg_no_idle_time(struct red_parms *p, +static inline unsigned long red_calc_qavg_no_idle_time(const struct red_parms *p, unsigned int backlog) { /* @@ -230,7 +276,7 @@ static inline unsigned long red_calc_qavg_no_idle_time(struct red_parms *p, return p->qavg + (backlog - (p->qavg >> p->Wlog)); } -static inline unsigned long red_calc_qavg(struct red_parms *p, +static inline unsigned long red_calc_qavg(const struct red_parms *p, unsigned int backlog) { if (!red_is_idling(p)) @@ -239,23 +285,24 @@ static inline unsigned long red_calc_qavg(struct red_parms *p, return red_calc_qavg_from_idle_time(p); } -static inline u32 red_random(struct red_parms *p) + +static inline u32 red_random(const struct red_parms *p) { - return net_random() & p->Rmask; + return reciprocal_divide(net_random(), p->max_P_reciprocal); } -static inline int red_mark_probability(struct red_parms *p, unsigned long qavg) +static inline int red_mark_probability(const struct red_parms *p, unsigned long qavg) { /* The formula used below causes questions. - OK. qR is random number in the interval 0..Rmask + OK. qR is random number in the interval + (0..1/max_P)*(qth_max-qth_min) i.e. 0..(2^Plog). If we used floating point arithmetics, it would be: (2^Plog)*rnd_num, where rnd_num is less 1. Taking into account, that qavg have fixed - point at Wlog, and Plog is related to max_P by - max_P = (qth_max-qth_min)/2^Plog; two lines + point at Wlog, two lines below have the following floating point equivalent: max_P*(qavg - qth_min)/(qth_max-qth_min) < rnd/qcount @@ -315,4 +362,24 @@ static inline int red_action(struct red_parms *p, unsigned long qavg) return RED_DONT_MARK; } +static inline void red_adaptative_algo(struct red_parms *p) +{ + unsigned long qavg; + u32 max_p_delta; + + qavg = p->qavg; + if (red_is_idling(p)) + qavg = red_calc_qavg_from_idle_time(p); + + /* p->qavg is fixed point number with point at Wlog */ + qavg >>= p->Wlog; + + if (qavg > p->target_max && p->max_P <= MAX_P_MAX) + p->max_P += MAX_P_ALPHA(p->max_P); /* maxp = maxp + alpha */ + else if (qavg < p->target_min && p->max_P >= MAX_P_MIN) + p->max_P = (p->max_P/10)*9; /* maxp = maxp * Beta */ + + max_p_delta = DIV_ROUND_CLOSEST(p->max_P, p->qth_delta); + p->max_P_reciprocal = reciprocal_value(max_p_delta); +} #endif diff --git a/lib/reciprocal_div.c b/lib/reciprocal_div.c index 6a3bd48..75510e9 100644 --- a/lib/reciprocal_div.c +++ b/lib/reciprocal_div.c @@ -1,5 +1,6 @@ #include <asm/div64.h> #include <linux/reciprocal_div.h> +#include <linux/export.h> u32 reciprocal_value(u32 k) { @@ -7,3 +8,4 @@ u32 reciprocal_value(u32 k) do_div(val, k); return (u32)val; } +EXPORT_SYMBOL(reciprocal_value); diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c index d617161..8f5a85b 100644 --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -39,6 +39,7 @@ struct red_sched_data { u32 limit; /* HARD maximal queue length */ unsigned char flags; + struct timer_list adapt_timer; struct red_parms parms; struct red_stats stats; struct Qdisc *qdisc; @@ -161,6 +162,8 @@ static void red_reset(struct Qdisc *sch) static void red_destroy(struct Qdisc *sch) { struct red_sched_data *q = qdisc_priv(sch); + + del_timer_sync(&q->adapt_timer); qdisc_destroy(q->qdisc); } @@ -209,6 +212,10 @@ static int red_change(struct Qdisc *sch, struct nlattr *opt) ctl->Plog, ctl->Scell_log, nla_data(tb[TCA_RED_STAB])); + del_timer(&q->adapt_timer); + if (ctl->flags & TC_RED_ADAPTATIVE) + mod_timer(&q->adapt_timer, jiffies + HZ/2); + if (!q->qdisc->q.qlen) red_start_of_idle_period(&q->parms); @@ -216,11 +223,24 @@ static int red_change(struct Qdisc *sch, struct nlattr *opt) return 0; } +static inline void red_adaptative_timer(unsigned long arg) +{ + struct Qdisc *sch = (struct Qdisc *)arg; + struct red_sched_data *q = qdisc_priv(sch); + spinlock_t *root_lock = qdisc_lock(qdisc_root_sleeping(sch)); + + spin_lock(root_lock); + red_adaptative_algo(&q->parms); + mod_timer(&q->adapt_timer, jiffies + HZ/2); + spin_unlock(root_lock); +} + static int red_init(struct Qdisc *sch, struct nlattr *opt) { struct red_sched_data *q = qdisc_priv(sch); q->qdisc = &noop_qdisc; + setup_timer(&q->adapt_timer, red_adaptative_timer, (unsigned long)sch); return red_change(sch, opt); } @@ -243,6 +263,7 @@ static int red_dump(struct Qdisc *sch, struct sk_buff *skb) if (opts == NULL) goto nla_put_failure; NLA_PUT(skb, TCA_RED_PARMS, sizeof(opt), &opt); + NLA_PUT_U32(skb, TCA_RED_MAX_P, q->parms.max_P); return nla_nest_end(skb, opts); nla_put_failure: ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next] sch_red: Adaptative RED AQM 2011-12-08 16:06 ` [PATCH net-next] sch_red: Adaptative RED AQM Eric Dumazet @ 2011-12-08 17:21 ` Stephen Hemminger 2011-12-08 18:16 ` Eric Dumazet 2011-12-09 0:55 ` David Miller 1 sibling, 1 reply; 11+ messages in thread From: Stephen Hemminger @ 2011-12-08 17:21 UTC (permalink / raw) To: Eric Dumazet Cc: bloat-devel, netdev, Hagen Paul Pfeifer, linux-wireless, bloat, David Miller On Thu, 08 Dec 2011 17:06:03 +0100 Eric Dumazet <eric.dumazet@gmail.com> wrote: > Changes against our RED implementation are : > > max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32 > fixed point number, to allow full range described in Adatative paper. > > To deliver a random number, we now use a reciprocal divide (thats really > a multiply), but this operation is done once per marked/droped packet > when in RED_BETWEEN_TRESH window, so added cost (compared to previous > AND operation) is near zero. > > dump operation gives current max_p value in a new TCA_RED_MAX_P > attribute. > > Example on a 10Mbit link : > > tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \ > limit 400000 min 30000 max 90000 avpkt 1000 \ > burst 55 ecn adaptative bandwidth 10Mbit > > # tc -s -d qdisc show dev eth3 > ... > qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn > adaptative ewma 5 max_p=0.113335 Scell_log 15 > Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0) > rate 9749Kbit 831pps backlog 72056b 16p requeues 0 > marked 1357 early 35 pdrop 0 other 0 > > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Is this backward compatible for users that don't specify an adaptive parameter. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next] sch_red: Adaptative RED AQM 2011-12-08 17:21 ` Stephen Hemminger @ 2011-12-08 18:16 ` Eric Dumazet 0 siblings, 0 replies; 11+ messages in thread From: Eric Dumazet @ 2011-12-08 18:16 UTC (permalink / raw) To: Stephen Hemminger Cc: bloat-devel, netdev, Hagen Paul Pfeifer, linux-wireless, bloat, David Miller Le jeudi 08 décembre 2011 à 09:21 -0800, Stephen Hemminger a écrit : > Is this backward compatible for users that don't specify > an adaptive parameter. Yes, completely compatible. I made sure max_p was initted with value provided by tc (Plog based) I also plan the TCA_RED_MAX_P support at configuration time with finer granularity than 1/(2^Plog), so a new tc user can ask for precise probability instead of current few discrete values. [ for non adaptative RED of course, since max_p can change very fast otherwise :) ] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net-next] sch_red: Adaptative RED AQM 2011-12-08 16:06 ` [PATCH net-next] sch_red: Adaptative RED AQM Eric Dumazet 2011-12-08 17:21 ` Stephen Hemminger @ 2011-12-09 0:55 ` David Miller 1 sibling, 0 replies; 11+ messages in thread From: David Miller @ 2011-12-09 0:55 UTC (permalink / raw) To: eric.dumazet; +Cc: bloat-devel, netdev, hagen, linux-wireless, bloat From: Eric Dumazet <eric.dumazet@gmail.com> Date: Thu, 08 Dec 2011 17:06:03 +0100 > Adaptative RED AQM for linux, based on paper from Sally FLoyd, > Ramakrishna Gummadi, and Scott Shenker, August 2001 : Applied. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-12-09 0:57 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-12-05 9:05 Time in Queue, bufferbloat, and... our accidentally interplanetary network Dave Taht 2011-12-05 10:59 ` Eric Dumazet 2011-12-06 2:03 ` Adrian Chadd 2011-12-07 9:59 ` Dave Taht 2011-12-07 10:15 ` [Bloat] " Hagen Paul Pfeifer 2011-12-07 10:19 ` Hagen Paul Pfeifer 2011-12-07 11:53 ` Eric Dumazet 2011-12-08 16:06 ` [PATCH net-next] sch_red: Adaptative RED AQM Eric Dumazet 2011-12-08 17:21 ` Stephen Hemminger 2011-12-08 18:16 ` Eric Dumazet 2011-12-09 0:55 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox