From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 24 Aug 2012 11:29:39 -0400 (EDT)
From: dpreed@reed.com
To: "Dave Taht"
Cc: codel@lists.bufferbloat.net, cerowrt-devel@lists.bufferbloat.net, Dave Täht
Subject: Re: [Codel] [Cerowrt-devel] [PATCH 2/2] codel: reduce count after exiting dropping state after one maxpacket
Message-ID: <1345822179.152622513@apps.rackspace.com>
References: <1345730322-6255-1-git-send-email-dave.taht@bufferbloat.net> <1345730322-6255-3-git-send-email-dave.taht@bufferbloat.net> <1345749433.571112868@apps.rackspace.com>
List-Id: CoDel AQM discussions
One other thing that worries me is this: if you have 100 TCP connections all
trying to get through a congested link, each "drop" signals only one of those
connections that it needs to cut its congestion window (by a factor of 2). So
if each of the 100 connections has a window sized to the bandwidth-delay
product of the channel, you need on the order of 100 × log2(100) ≈ 650 drops,
at minimum, to "fairly" slow all 100 connections down to a window equal to the
1/100 fair share. Because of this, you have to signal all 100 independent
end-to-end control loops enough times to slow their rates.

Avoiding "drops" means that it takes much longer to get to the right,
relatively fair, operating point. And even if you are near the right operating
point already, being selective about drops is pretty "unfair" in a statistical
sense.

Also, remember that "fair" should not mean giving capacity to TCP connections
that are already constrained by other bottleneck links. You can't usefully
give them capacity - doing so creates buffer "inflation" pressure on those
links, which causes unnecessary oscillation in the other links.

Remember that the Internet is never at a "stable equilibrium" - "waves" flow
through it, and the goal is to not amplify those waves. The end-to-end control
loops are the damping functions, while link queue management (with drops and
ECN) merely provides signals and keeps latency short.
The queue managers themselves can't "damp" traffic, and may amplify the waves
if they misguidedly try to damp them out.

-----Original Message-----
From: "Dave Taht" <dave.taht@gmail.com>
Sent: Friday, August 24, 2012 4:08am
To: dpreed@reed.com
Cc: "Dave Täht" <dave.taht@bufferbloat.net>, cerowrt-devel@lists.bufferbloat.net, codel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] [PATCH 2/2] codel: reduce count after exiting dropping state after one maxpacket

On Thu, Aug 23, 2012 at 12:17 PM, <dpreed@reed.com> wrote:
> I would be cautious about cutting down the information rate to TCP.
> Remember that the collection of TCP pairs is already seeking this information.
> You risk driving TCP's control loop crazy, especially if you end up
> "resonating" with it via positive feedback.

We have been wrestling with what we call s3 (re-entering the dropping
state) for a very long time. From K&V's current test ns2 code:

https://github.com/dtaht/ns2/blob/master/queue/codel.cc

...

    r = dodeque();
    dropping_ = 1;  // mdt: thought was to change this to r.ok
                    // and do something mildly different below

    // If min went above target close to when it last went below,
    // assume that the drop rate that controlled the queue on the
    // last cycle is a good starting point to control it now.
    // Unfortunately, this is not easy to get at. "n*interval" is
    // used to indicate "close to" and values from 2 - 16 have been
    // used. A value of 8 has worked well. The count value doesn't
    // decay well enough with this control law.
    // Until a better one
    // is devised, the below is a hack that appears to improve things.

    if (count_ > 2 && now - drop_next_ < 8*interval_) {
        count_ = count_ - 2;
        // kmn decay tests // mdt - floating point is not an option!
        // (note: c = 0.9844*(c + 2) has its fixed point at ~126.2,
        //  so this decay pulls large counts toward the 126 threshold)
        if (count_ > 126) count_ = 0.9844 * (count_ + 2);
    } else
        count_ = 1;
    drop_next_ = control_law(now);
  }
  return (r.p);

>
> Rather than focus on "precision" or "stability" at the contended resource,
> remember the point is to minimize latency on an end-to-end basis so that all
> the TCPs can have tight control loops. Thus there is no "goal" of making
> the contended queue maximal in capacity. The goal is that it should drain
> quickly, which means keeping it small! The goal is (as the codel paper
> says) to use the *minimum* latency as the estimator, not the "average" or
> weighted (delayed) average.
>
> Keep focused on the control-theory principles, NOT on throughput
> maximization. If you push toward maximizing throughput at any cost, even
> with codel, you will end up with *large* excursions in latency that stick
> around for a long time, i.e. bufferbloat.

I might argue from a control theory perspective that we're not banging
on or observing all the right things.

I was dubious about the value of the 2nd patch, given the more
aggressive decline in drop probability I already had vis-à-vis the ns2
code (which was admittedly far less aggressive than the original
linux code). I might retain the idea, flip it around some more, and
certainly plan to scratch my head a lot...
I was glad to figure out that Newton doesn't run in reverse that well,
though.

Anyway, in my limited testing today the 3rd version of the patches
worked fine at short RTTs and at a range of bandwidths from 1 Mbit to
100 Mbit, but that's the easy part, and plenty more fiddling remains
with more streams and longer RTTs.

http://snapon.lab.bufferbloat.net/~d/fq_codel/rtts.png

(It's also somewhat weird getting used to nearly perfect
bi-directional utilization all the time in fq_codel, no matter what
underlying version of codel is there.)

Incidentally, there is a 10-second periodicity in my captures that
bothers me:

http://snapon.lab.bufferbloat.net/~d/fq_codel/alwaysthoughtthese10secondintervalswereaflawofthecapturetoolbut.png

Anyway...

> -----Original Message-----
> From: "Dave Täht" <dave.taht@bufferbloat.net>
> Sent: Thursday, August 23, 2012 9:58am
> To: cerowrt-devel@lists.bufferbloat.net
> Subject: [Cerowrt-devel] [PATCH 2/2] codel: reduce count after exiting
> dropping state after one maxpacket
>
> From: Dave Taht <dave.taht@bufferbloat.net>
>
> At a knife's edge, where we are rapidly entering and exiting
> a dropping state, seek lower to find the optimum.
> ---
>  include/net/codel.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/include/net/codel.h b/include/net/codel.h
> index dbfccb7..5e85632 100644
> --- a/include/net/codel.h
> +++ b/include/net/codel.h
> @@ -342,6 +342,9 @@ static struct sk_buff *codel_dequeue(struct Qdisc *sch,
>  vars->drop_next = codel_control_law(vars->drop_next,
>  params->interval,
>  vars->rec_inv_sqrt);
> + } else { /* we dropped out of the dropping state in 1 pkt */
> + vars->count = vars->count > 1 ?
vars->count - 1 : 1;
> + codel_Newton_step(vars);
> }
> }
> end:
> --
> 1.7.9.5
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel

-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
with fq_codel!"
