From: dpreed@reed.com
To: Jim Gettys <jg@freedesktop.org>
Cc: Kathleen Nichols, cerowrt-devel, bloat
Date: Fri, 16 May 2014 16:17:07 -0400 (EDT)
Subject: Re: [Cerowrt-devel] [Bloat] fq_codel is two years old

I agree with you, Jim, about being careful with QoS. That's why Andrew Odlyzko proposed the experiment with exactly two classes, and proposed it as an *experiment*. So many researchers and IETF members seem to think we should just turn on diffserv and everything will work great... I've seen very senior members of the IETF actually propose that diffserv become a provider-wide standard as soon as possible. I suppose they have a couple of ns2 runs that show "nothing can go wrong". :-)

(That's why I'm so impressed by the fq_codel work - it's more than just simulation; it has been tested and more or less stressed in real life, yet it is quite simple.)

I don't agree with the idea that switches can solve global system problems by themselves. That's why the original AIMD algorithms use packet drops as signals but make the endpoints responsible for managing congestion. The switches have nothing to do with the AIMD algorithm; they just create its control inputs.
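
To make that separation of concerns concrete, here is a minimal sketch of the endpoint side in Python (constants and names are mine, purely illustrative - not any particular TCP implementation):

    # AIMD at the endpoint: the network's only contribution is the
    # drop/ack signal; the endpoint owns the entire control law.
    class AIMDSender:
        def __init__(self, mss=1.0):
            self.mss = mss
            self.cwnd = mss          # congestion window, in MSS units

        def on_ack(self):
            # Additive increase: roughly one segment per round trip.
            self.cwnd += self.mss * self.mss / self.cwnd

        def on_drop(self):
            # Multiplicative decrease: the whole response to a drop.
            self.cwnd = max(self.mss, self.cwnd / 2.0)

The switch appears nowhere in this code except as the source of on_drop() events - which is the point.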

So it is kind of telling that Valdis cites a totally "switch-centric" view from NANOG's perspective. It's not the job of switches to manage congestion, just as it is not the job of endpoints to program switches. There's a separation of concerns.

The simpler observation would be: "If you are a switch, there is NOTHING you can do to stop congestion. Even dropping packets doesn't ameliorate congestion. However, if you are a switch, there are some things you can tell the endpoints - in particular the receiving endpoints of flows traveling across the switch - about the local 'collision' of packets trying to get through the switch at the same time."

Since the Internet end-to-end protocols are "receiver controlled" (TCP's receive window is what controls the sender's flow, but it is set by the receiver), the locus of decision making is the collection of receivers.
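
In sketch form (simplified; real TCP has more moving parts), the sender can never outrun a bound the receiver sets:

    # The amount a sender may transmit is capped by min(cwnd, rwnd);
    # rwnd is advertised by the receiver, so the receiver has the
    # final say over the sender's flow. (Simplified sketch.)
    def sendable_bytes(bytes_in_flight, cwnd, rwnd):
        return max(0, min(cwnd, rwnd) - bytes_in_flight)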

Buffering is not the real issue - the issue is the frequency with which the packets of all the flows going through a particular switch "collide". The control problem is to make that frequency of collision quite small.

The nice thing about packet drops is that collisions are remediated immediately, rather than creating sustained bottlenecks that enlarge the "collision cross section" of that switch and dramatically increase the likelihood of further collisions there. Replacing a collided/dropped packet with a much smaller "token" that travels on to the receiver would keep the collision cross section from growing, yet provide better samples of collision information to the receiver. For fairness, you want all packets involved in a collision to carry information, and ideally all "near collisions" to carry information about near collisions as well.
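
No switch does this today - it is the proposal itself - but as a sketch, with every name hypothetical:

    # Hypothetical: on a full queue, forward a tiny token in place of
    # the dropped payload so the collision information still reaches
    # the receiver. Not a real switch API.
    from collections import namedtuple

    Token = namedtuple("Token", ["flow_id", "seq", "queue_depth"])

    def enqueue_or_tokenize(pkt, queue, link):
        if queue.full():
            # Keep the signal, drop the bytes: the token is a better
            # sample of the collision than silence.
            link.send(Token(pkt.flow_id, pkt.seq, queue.qsize()))
        else:
            queue.put(pkt)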

A collision here is simply defined: a packet entering a switch is considered to have collided with any other packets that have not yet completed traversal of the switch when it arrives.
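
In code, the definition is just an interval-overlap test (sketch; times in whatever units the switch keeps):

    # A packet occupies the switch over [arrival, departure).
    # Two packets collide exactly when those intervals overlap.
    def collides(arrival_a, departure_a, arrival_b, departure_b):
        return arrival_a < departure_b and arrival_b < departure_a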

You can expand a packet's virtual time in the switch by thinking of it as "virtually still in the switch" for some number of bit times after it exits. Then a "near collision" happens between a packet and any packets that are still virtually in the switch. Near collisions are signals that can keep the system inside the "ballistic region" of the phase space.
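
The near-collision test is the same overlap test with the departure times padded (sketch; the guard value is illustrative):

    GUARD_BIT_TIMES = 64  # illustrative virtual-residency window

    # A packet is "virtually in the switch" for GUARD_BIT_TIMES after
    # it actually leaves, so near collisions are overlaps of the
    # padded intervals.
    def near_collides(arr_a, dep_a, arr_b, dep_b, guard=GUARD_BIT_TIMES):
        return arr_a < dep_b + guard and arr_b < dep_a + guard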

(You can track near collisions with a little memory on each outbound link's state - and even use Bloom filters to detect collisions quickly - but that is for a different lecture.)
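
For instance, a toy version of that per-link memory (parameters illustrative; a real one would rotate or age the filter):

    import hashlib

    # Toy Bloom filter over packets recently sent on one outbound link:
    # constant memory, no false negatives; a false positive just means
    # a spurious "near collision" report.
    class RecentDepartures:
        def __init__(self, bits=4096, hashes=3):
            self.bits, self.hashes = bits, hashes
            self.field = bytearray(bits // 8)

        def _positions(self, key):
            for i in range(self.hashes):
                digest = hashlib.blake2b(key, salt=bytes([i])).digest()
                yield int.from_bytes(digest[:4], "big") % self.bits

        def add(self, key):            # key: e.g. flow id + sequence bytes
            for p in self._positions(key):
                self.field[p // 8] |= 1 << (p % 8)

        def maybe_present(self, key):
            return all(self.field[p // 8] & (1 << (p % 8))
                       for p in self._positions(key))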

Pl= ease steal this idea and develop it.

On Friday, May 16, 2014 12:06pm, "Jim Gettys" <jg@freedesktop.org> said:

On Fri, May 16, 2014 at 10:52 AM, <Valdis.Kletnieks@vt.edu> wrote:
On Thu, 15 May 2014 16:32:55 -0400, dpreed@reed.com said:

> And in the end of the day, the problem is congestion, which is very
> non-linear. There is almost no congestion at almost all places in the Internet
> at any particular time. You can't fix congestion locally - you have to slow
> down the sources across all of the edge of the Internet, quickly.

There's a second very important point that somebody mentioned on the NANOG list a while ago:

If the local router/net/link/whatever isn't congested, QoS cannot do anything to improve life for anybody.

If there *is* congestion, QoS can only improve your service to the normal uncongested state - and it can *only do so by making somebody else's experience suck more*....
The somebody else might be "you", in which case life is much better. Once you have the concept of flows (at some level of abstraction), you can make more sane choices.
Personally, I've mostly been interested in QOS in the local network: as "hints" that it is worth bidding more aggressively for transmit opportunities in WiFi, for example to ensure that my VOIP, teleconferencing, gaming, music playing, and other actually real-time packets get priority over bulk data (which includes web traffic) and may need access to the medium sooner than routine or scavenger applications.
Whether it should have any use beyond the scope of the network that I control is less than clear to me, for the reasons you state; having my traffic screw up other people's traffic isn't high on my list of "good ideas".

The other danger of QOS is that applications may "game" its use to get preferential treatment, so each network (and potentially each host) needs to be able to control its own policy and detect (and potentially punish) transgressors. Right now, we don't have those detectors or controls in place, and how to inform naive users that their applications are asking for priority service for no good reason is another unanswered question.
This gaming danger (and a UI to enable policy to be set) makes me think it's something we're going to have to work through carefully.
- Jim
_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel
