From: dpreed@reed.com
To: "Jim Gettys" <jg@freedesktop.org>
Cc: cerowrt-devel@lists.bufferbloat.net
Date: Wed, 28 May 2014 11:20:05 -0400 (EDT)
Subject: Re: [Cerowrt-devel] Ubiquiti QOS

I did not mean that "pacing". Sorry I used a generic term. I meant what my longer description described - a specific mechanism for reducing bunching that is essentially "cooperative" among all active flows through a bottlenecked link. That's part of a "closed loop" control system driving each TCP endpoint into a cooperative mode.

The thing you call "pacing" is something quite different. It is disconnected from the TCP control loops involved, which basically means it is flying blind. Introducing that kind of "pacing" almost certainly reduces throughput, because it *delays* packets.

The thing I called "pacing" is in no version of Linux that I know of. Give it a different name: "anti-bunching cooperation" or "timing phase management for congestion reduction". Rather than *delaying* packets, it tries to get packets to avoid bunching only when reducing window size, and it does so by tightening the control loop so that the sender transmits as *soon* as it can - not by dallying and then sending late when it could have sent earlier.

On Tuesday, May 27, 2014 11:23am, "Jim Gettys" <jg@freedesktop.org> said:

On Sun, May 25, 2014 at 4:00 PM, <dpreed@reed.com> wrote:

Not that it is directly relevant, but there is no essential reason to require 50 ms. of buffering. That might be true of some particular QOS-related router algorithm. 50 ms. is about all one can tolerate in any router between source and destination for today's networks - an upper bound rather than a minimum.

The optimum buffer state for throughput is 1-2 packets' worth - in other words, with an MTU of 1500, about 1500-3000 bytes. Only the bottleneck buffer (the input queue to the lowest-speed link along the path) should have this much actually buffered. Buffering more than this increases end-to-end latency beyond its optimal state, and increased end-to-end latency reduces the effectiveness of control loops, creating more congestion.

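A back-of-the-envelope sketch of the two numbers above - the 1500-3000 byte optimum versus the 50 ms. upper bound - with the link rate as an illustrative parameter:

#include <stdio.h>

int main(void)
{
    double gbps = 1.0;                       /* illustrative bottleneck rate */
    double bytes_per_sec = gbps * 1e9 / 8.0; /* 1 Gb/s = 125 MB/s            */
    int mtu = 1500;

    printf("optimum standing queue: %d-%d bytes (1-2 MTU packets)\n",
           mtu, 2 * mtu);
    printf("50 ms upper bound: %.2f MB\n",
           bytes_per_sec * 0.050 / 1e6);     /* 6.25 MB at 1 Gb/s */
    return 0;
}
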
The rationale for having 50 ms. of buffering is probably to avoid disruption of bursty mixed flows where the bursts might persist for 50 ms. and then die. One reason for this is that source nodes run operating systems that tend to release packets in bursts. That's a whole other discussion - in an ideal world, source nodes would avoid bursty packet releases by keeping the receiver-window control loop "tight" timing-wise: that is, by transmitting a packet immediately at the instant an ACK arrives that increases the window. This would pace the flow - current OS's tend (due to scheduling mismatches) to send bursts of packets, "catching up" on sending that could have been spaced out and done earlier if the feedback from the receiver's advancing window were heeded.

That is, endpoint network stacks (TCP implementations) can worsen congestion by "dallying". The ideal end-to-end flows occupying a congested router would have their packets paced so that the packets end up being sent in the least bursty manner that an application can support. The effect of this pacing is to move the "backlog" for each flow quickly into the source node for that flow, which then provides back pressure on the application driving the flow, which ultimately is necessary to stanch congestion. The ideal congestion control mechanism slows the sender part of the application to a pace that can go through the network without contributing to buffering.

Pacing is in Linux 3.12(?). How long it will take to see widespread deployment is another question, and as for other operating systems, who knows.

See: https://lwn.net/Articles/564978/

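A minimal sketch of the per-socket knob that rides on that fq pacing machinery - SO_MAX_PACING_RATE, which landed in Linux alongside the fq qdisc work; the rate here is illustrative:

#include <sys/socket.h>

#ifndef SO_MAX_PACING_RATE
#define SO_MAX_PACING_RATE 47        /* Linux value, for older libc headers */
#endif

/* Ask the kernel (enforced by the fq qdisc) to pace this socket's
 * packets onto the wire at no more than max_rate bytes per second. */
int cap_pacing_rate(int sock, unsigned int max_rate)
{
    return setsockopt(sock, SOL_SOCKET, SO_MAX_PACING_RATE,
                      &max_rate, sizeof(max_rate));
}
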
Current network stacks (including Linux's) don't achieve that goal - their pushback on application sources is minimal - instead they accumulate buffering internal to the network implementation.

This is much, much less true than it once was. There have been substantial changes in the Linux TCP stack in the last year or two to avoid generating packets before necessary. Again, how long it will take for people to deploy this on Linux (and implement it on other OS's) is a question.

This contributes to end-to-end latency as well. But if you think about it, this is almost as bad as switch-level bufferbloat in terms of degrading user experience. The reason I say "almost" is that there are tools, rarely used in practice, that allow an application to specify that buffering should not build up in the network stack (in the kernel or wherever it is). But the default is not to use those APIs, and to buffer way too much.

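A sketch of the sort of rarely-used knobs alluded to above, assuming the Linux ones here and illustrative sizes: a small SO_SNDBUF bounds how much one flow may buffer inside the kernel at all, and TCP_NOTSENT_LOWAT (Linux 3.12+) keeps not-yet-sendable data in the application instead of the stack:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_NOTSENT_LOWAT
#define TCP_NOTSENT_LOWAT 25         /* Linux value, for older libc headers */
#endif

int limit_stack_buffering(int sock)
{
    int sndbuf = 32 * 1024;   /* cap kernel send buffering for this flow   */
    int lowat  = 16 * 1024;   /* block the app while >16 KB remains unsent */

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
        return -1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                      &lowat, sizeof(lowat));
}
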
Remember, the network send stack can act similarly to a congested switch (it is a switch among all the user applications running on that node). If there is a heavy file transfer, the file transfer's buffering acts to increase latency for all other networked communications on that machine.

Traditionally this problem has been thought of only as a within-node fairness issue, but in fact it has a big effect on the switches in between source and destination due to the lack of dispersed pacing of the packets at the source - in other words, the current design does nothing to stem the "burst groups" from a single source mentioned above.

So we do need the source nodes to implement less "bursty" sending stacks. This is especially true for multiplexed source nodes, such as web servers implementing thousands of flows.

A combination of codel-style switch-level buffer management and sender stacks that spread the packets of each TCP flow out over time would improve things a lot. To achieve best throughput, the optimal way to spread packets out on an end-to-end basis is to update the receive window (sending an ACK) at the receiving end as quickly as possible, and to respond to the updated receive window as quickly as possible when it increases.

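A userland sketch of that non-dallying sender discipline (fill_buffer() is a hypothetical application callback; a real stack would do the equivalent internally, per flow): block until the stack can accept data at all, then transmit immediately rather than letting several windows' worth pile up for a later burst.

#include <poll.h>
#include <unistd.h>

extern ssize_t fill_buffer(char *buf, size_t len);   /* hypothetical */

void ack_clocked_send(int sock)
{
    char buf[1460];                 /* one MSS-sized chunk at a time */
    struct pollfd pfd = { .fd = sock, .events = POLLOUT };

    for (;;) {
        /* Sleep until arriving ACKs free send-buffer/window space... */
        if (poll(&pfd, 1, -1) <= 0)
            break;
        /* ...and send the moment they do, instead of dallying until a
         * scheduler tick releases a burst. */
        ssize_t n = fill_buffer(buf, sizeof(buf));
        if (n <= 0 || write(sock, buf, (size_t)n) < 0)
            break;
    }
}
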
Just like the "bufferbloat" issue, the problem is caused by applications like streaming video, file transfers and big web pages that the application programmer sees as having no latency requirement within the flow, so the programmer has no incentive to control pacing. Thus the operating system has to push back on the application's flow somehow, so that the flow ends up paced once it enters the Internet itself. So there's no real problem caused by large buffering in the network stack at the endpoint, as long as the stack's delivery to the Internet is paced by some mechanism, e.g. tight management of receive-window control on an end-to-end basis.

I don't think this can be fixed by cerowrt, so this is out of place here. It's partially ameliorated by cerowrt, if it aggressively drops packets from flows that burst without pacing. fq_codel does this, if the buffer size it aims for is small - but the problem is that the OS stacks don't respond by pacing... they tend to respond by bursting, not because TCP doesn't provide the mechanisms for pacing, but because the OS stack doesn't transmit as soon as it is allowed to - thus building up a burst unnecessarily.

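For reference, a heavily simplified sketch of the CoDel control law that fq_codel applies per flow (the real qdisc is subtler - it keeps state across dropping episodes, among other things - but this is the shape of it; the constants are the published defaults):

#include <math.h>
#include <stdbool.h>
#include <stdint.h>

#define TARGET_US   5000     /* 5 ms acceptable standing queue           */
#define INTERVAL_US 100000   /* 100 ms, on the order of a worst-case RTT */

struct codel {
    bool     dropping;       /* currently in the dropping state?         */
    uint32_t count;          /* drops so far in this episode             */
    uint64_t first_above;    /* deadline set when sojourn exceeds target */
    uint64_t drop_next;      /* next scheduled drop time                 */
};

/* Should the packet dequeued at now_us (enqueued at enq_us) be dropped? */
bool codel_should_drop(struct codel *c, uint64_t enq_us, uint64_t now_us)
{
    uint64_t sojourn_us = now_us - enq_us;   /* time spent in the queue  */

    if (sojourn_us < TARGET_US) {            /* queue drained: all good  */
        c->dropping = false;
        c->first_above = 0;
        return false;
    }
    if (c->first_above == 0)                 /* start the grace period   */
        c->first_above = now_us + INTERVAL_US;
    if (!c->dropping) {
        if (now_us >= c->first_above) {      /* bad for a full interval  */
            c->dropping = true;
            c->count = 1;
            c->drop_next = now_us;           /* drop this packet now     */
            return true;
        }
        return false;
    }
    if (now_us >= c->drop_next) {            /* then drop faster and     */
        c->count++;                          /* faster: interval/sqrt(n) */
        c->drop_next = now_us +
            (uint64_t)(INTERVAL_US / sqrt((double)c->count));
        return true;
    }
    return false;
}
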
Bursts on a flow are thus bad in general. They make congestion happen when it need not.

By far the biggest headache is what the Web does to the network. It has turned the web into a burst generator.

A typical web page may have 10 (or even more) images. See the "connections per page" plot in the link below.

A browser downloads the base page, and then, over N connections, essentially simultaneously downloads those embedded objects. Many/most of them are small in size (4-10 packets). You never even get near slow start.

So you get an IW amount of data per TCP connection, with no pacing, and no congestion avoidance. It is easy to observe 50-100 packets (or more) back to back at the bottleneck.

This is (in practice) the amount you have to buffer today: that burst of packets from a web page. Without flow queuing, you are screwed. With it, it's annoying, but can be tolerated.

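A rough sketch of that burst's size, assuming (purely for illustration) IW10 senders and a page fetched over 10 parallel connections:

#include <stdio.h>

int main(void)
{
    int connections = 10;   /* parallel fetches for one page (illustrative) */
    int iw_segments = 10;   /* IW10: initial window per connection          */
    int mss_bytes   = 1460;

    int packets = connections * iw_segments;
    printf("unpaced burst: ~%d back-to-back packets, ~%d KB\n",
           packets, packets * mss_bytes / 1000);   /* ~100 pkts, ~146 KB */
    return 0;
}
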
I go over this in detail in:
http://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/

So far, I don't believe anyone has tried pacing the IW burst of packets. I'd certainly like to see that, but pacing needs to be across TCP connections (host pairs) to have any chance of outwitting the gaming the web has done to the network.

- Jim

On Sunday, May 25, 2014 11:42am, "Mikael Abrahamsson" <swmike@swm.pp.se> said:

> On Sun, 25 May 2014, Dane Medic wrote:
>
> > Is it true that devices with less than 64 MB can't handle QOS? ->
> > https://lists.chambana.net/pipermail/commotion-dev/2014-May/001816.html
>
> At gig speeds you need around 50ms worth of buffering. 1 gigabit/s =
> 125 megabyte/s meaning for 50ms you need 6.25 megabyte of buffer.
>
> I also don't see why performance and memory size would be relevant, I'd
> say forwarding performance has more to do with CPU speed than anything
> else.
>
> --
> Mikael Abrahamsson    email: swmike@swm.pp.se

_______________________________________________
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel
