From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 24 Aug 2012 11:29:39 -0400 (EDT)
From: dpreed@reed.com
To: "Dave Taht"
Cc: codel@lists.bufferbloat.net, cerowrt-devel@lists.bufferbloat.net, Dave Täht
Subject: Re: [Codel] [Cerowrt-devel] [PATCH 2/2] codel: reduce count after exiting dropping state after one maxpacket
Message-ID: <1345822179.152622513@apps.rackspace.com>
References: <1345730322-6255-1-git-send-email-dave.taht@bufferbloat.net> <1345730322-6255-3-git-send-email-dave.taht@bufferbloat.net> <1345749433.571112868@apps.rackspace.com>
List-Id: CoDel AQM discussions
One other thing that worries me is this: if you have 100 TCP connections all
trying to get through a congested link, each "drop" signals only one of those
connections that it needs to cut its congestion window (by a factor of 2). So
if each of the 100 connections has a window sized to the bandwidth-delay
product of the channel, you need on the order of 100 × log2(100) ≈ 650 drops,
at minimum, to "fairly" slow all 100 connections down to a window equal to the
1/100 fair share. Because of this, you have to signal all 100 independent
end-to-end control loops enough times to slow their rates.

Avoiding "drops" means that it takes much longer to get to the right,
relatively fair, operating point. And even if you are near the right operating
point already, being selective about drops is pretty "unfair" in a statistical
sense.

Also, remember that "fair" should not mean giving capacity to TCP connections
that are already constrained by other bottleneck links. You can't usefully
give them capacity - doing so creates buffer "inflation" pressure on those
links, which causes unnecessary oscillation in the other links.

Remember that the Internet is never at a "stable equilibrium" - "waves" flow
through it, and the goal is to not amplify those waves. The end-to-end control
loops are the damping functions, while link queue management (with drops and
ECN) merely provides signals and keeps latency short.
The queue managers themselves can't "damp" traffic, and may amplify the waves
if they misguidedly try to damp them out.

-----Original Message-----
From: "Dave Taht" <dave.taht@gmail.com>
Sent: Friday, August 24, 2012 4:08am
To: dpreed@reed.com
Cc: "Dave Täht" <dave.taht@bufferbloat.net>, cerowrt-devel@lists.bufferbloat.net, codel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] [PATCH 2/2] codel: reduce count after exiting dropping state after one maxpacket

On Thu, Aug 23, 2012 at 12:17 PM, <dpreed@reed.com> wrote:
> I would be cautious about cutting down the information rate to TCP.
> Remember that the collection of TCP pairs is already seeking this information.
> You risk driving TCP's control loop crazy, especially if you end up
> "resonating" with it via positive feedback.

We have been wrestling with what we call s3 (re-entering the dropping
state) for a very long time. From K&V's current test ns2 code:

https://github.com/dtaht/ns2/blob/master/queue/codel.cc

...

    r = dodeque();
    dropping_ = 1;  // mdt: thought was to change this to r.ok
                    // and do something mildly different below

    // If min went above target close to when it last went below,
    // assume that the drop rate that controlled the queue on the
    // last cycle is a good starting point to control it now.
    // Unfortunately, this is not easy to get at. "n*interval" is
    // used to indicate "close to" and values from 2 - 16 have been
    // used. A value of 8 has worked well. The count value doesn't
    // decay well enough with this control law.
    // Until a better one
    // is devised, the below is a hack that appears to improve things.

    if (count_ > 2 && now - drop_next_ < 8*interval_) {
        count_ = count_ - 2;
        // kmn decay tests // mdt - floating point is not an option!
        // (note: c = 0.9844*(c + 2) has its fixed point at ~126.2,
        //  so this decay pulls large counts toward the 126 threshold)
        if (count_ > 126) count_ = 0.9844 * (count_ + 2);
    } else
        count_ = 1;
    drop_next_ = control_law(now);
  }
  return (r.p);

>
> Rather than focus on "precision" or "stability" at the contended resource,
> remember the point is to minimize latency on an end-to-end basis so that all
> the TCPs can have tight control loops. Thus there is no "goal" of making
> the contended queue maximal in capacity. The goal is that it should drain
> quickly, which means keeping it small! The goal is (as the codel paper
> says) to use the *minimum* latency as the estimator, not the "average" or
> weighted (delayed) average.
>
> Keep focused on the control-theory principles, NOT on throughput
> maximization. If you push toward maximizing throughput at any cost, even
> with codel, you will end up with *large* excursions in latency that stick
> around for a long time, i.e. bufferbloat.

I might argue from a control theory perspective that we're not banging
on or observing all the right things.

I was dubious about the value of the 2nd patch, given the more
aggressive decline in drop probability I already had vis-à-vis the ns2
code (which was admittedly far less aggressive than the original
linux code). I might retain the idea, flip it around some more, and
certainly plan to scratch my head a lot...
I was glad to figure out that Newton doesn't run in reverse that well,
though.

Anyway, in my limited testing today the 3rd version of the patches
worked fine at short RTTs and at a range of bandwidths from 1 Mbit to
100 Mbit, but that's the easy part, and plenty more fiddling remains
with more streams and longer RTTs.

http://snapon.lab.bufferbloat.net/~d/fq_codel/rtts.png

(It's also somewhat weird getting used to nearly perfect
bi-directional utilization all the time in fq_codel, no matter what
underlying version of codel is there.)

Incidentally, there is a 10-second periodicity in my captures that
bothers me:

http://snapon.lab.bufferbloat.net/~d/fq_codel/alwaysthoughtthese10secondintervalswereaflawofthecapturetoolbut.png

Anyway...

> -----Original Message-----
> From: "Dave Täht" <dave.taht@bufferbloat.net>
> Sent: Thursday, August 23, 2012 9:58am
> To: cerowrt-devel@lists.bufferbloat.net
> Subject: [Cerowrt-devel] [PATCH 2/2] codel: reduce count after exiting
> dropping state after one maxpacket
>
> From: Dave Taht <dave.taht@bufferbloat.net>
>
> At a knife's edge, where we are rapidly entering and exiting
> a dropping state, seek lower to find the optimum.
> ---
>  include/net/codel.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/include/net/codel.h b/include/net/codel.h
> index dbfccb7..5e85632 100644
> --- a/include/net/codel.h
> +++ b/include/net/codel.h
> @@ -342,6 +342,9 @@ static struct sk_buff *codel_dequeue(struct Qdisc *sch,
>  vars->drop_next = codel_control_law(vars->drop_next,
>  params->interval,
>  vars->rec_inv_sqrt);
> + } else { /* we dropped out of the dropping state in 1 pkt */
> + vars->count = vars->count > 1 ?
vars->count - 1 : 1;
> + codel_Newton_step(vars);
> }
> }
> end:
> --
> 1.7.9.5
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel

-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
with fq_codel!"
