From: dpreed@reed.com
To: "David P. Reed" <dpreed@reed.com>
Cc: cerowrt-devel@lists.bufferbloat.net
Date: Thu, 29 May 2014 11:29:30 -0400 (EDT)
Subject: Re: [Cerowrt-devel] Ubiquiti QOS

Note: this is all about "how to achieve and sustain the ballistic phase that is optimal for Internet transport" in an end-to-end based control system like TCP.

I think those who have followed this know that, but I want to make it clear that I'm proposing a significant improvement that requires changes at the OS stacks and changes in the switches' approach to congestion signaling. There are ways to phase it in gradually. In "meshes", etc. it could probably be developed and deployed more quickly - but my thoughts on co-existence with the current TCP stacks and current IP routers are far less precisely worked out.

I am way too busy with my day job to do what needs to be done ... but my sense is that the folks who reduce this to practice will make a HUGE difference to Internet performance. Bigger than getting bloat fixed, and to me that is a major, major potential triumph.

On Thursday, May 29, 2014 8:11am, "David P. Reed" <dpreed@reed.com> said:

ECN-style signaling has the right properties ... just like TTL it can provide valid and current sampling of the packet's environment as it travels. The idea is to sample what is happening at a bottleneck for the packet's flow. The bottleneck is the link with the most likelihood of a collision from flows sharing that link.

A control-theoretic estimator of recent collision likelihood is easy to do at each queue. All active flows would receive that signal, with the busiest ones getting it most quickly. Also it is reasonable to count all potentially colliding flows at all outbound queues, and report that.

The estimator can then provide the signal that each flow responds to.
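
To make the estimator concrete, here is a minimal sketch (in Python, with illustrative names and parameters - this is not any real router's code): a per-queue EWMA of recent collision likelihood, used to ECN-mark packets probabilistically so the busiest flows see the signal soonest.

import random

class QueueCongestionEstimator:
    """Per-queue EWMA estimate of recent collision likelihood (hypothetical sketch)."""

    def __init__(self, gain=0.1):
        self.gain = gain          # EWMA smoothing factor
        self.likelihood = 0.0     # current estimate, in [0, 1]

    def on_packet_arrival(self, queue_depth, active_flows):
        # Treat an arrival that finds the queue occupied while multiple
        # flows share the link as evidence of a "collision".
        collided = 1.0 if queue_depth > 0 and active_flows > 1 else 0.0
        self.likelihood += self.gain * (collided - self.likelihood)

    def should_mark_ecn(self):
        # Mark with probability equal to the estimate: busy flows push
        # more packets through this queue, so they are signaled soonest.
        return random.random() < self.likelihood

Each outbound queue would keep one such estimator, and the marks would return to senders via the receivers' ACKs, as with standard ECN.
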
The problem of "defectors" is best dealt with by punishment... An aggressive packet drop policy that makes causing congestion reduce the cause's throughput and increase its latency is the best kind of answer. Since the router can remember recent flow behavior, it can penalize recent flows.

A Bloom-style filter can remember flow statistics for both of these local policies. A great use for the memory no longer misapplied to buffering...

Simple?
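
As a sketch of that flow memory (hypothetical, nothing deployed): a counting-Bloom-style array of hashed counters per flow 5-tuple, decayed periodically so it remembers only recent behavior, which the drop policy above could consult to penalize repeat offenders.

import hashlib

class FlowMemory:
    """Counting-Bloom-style sketch of recent per-flow activity (illustrative)."""

    def __init__(self, size=4096, hashes=3):
        self.counters = [0] * size
        self.hashes = hashes

    def _slots(self, flow_id):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{flow_id}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % len(self.counters)

    def record(self, flow_id):
        for slot in self._slots(flow_id):
            self.counters[slot] += 1

    def recent_activity(self, flow_id):
        # Bloom-style read: take the minimum counter. Hash collisions can
        # overestimate a flow's activity but never underestimate it.
        return min(self.counters[s] for s in self._slots(flow_id))

    def decay(self):
        # Halve everything periodically so old behavior is forgotten.
        self.counters = [c // 2 for c in self.counters]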

On May 28, 2014, David Lang <david@lang.hm> wrote:

On Wed, 28 May 2014, dpreed@reed.com wrote:

I did not mean that "pacing". Sorry I used a generic term. I meant what my longer description described - a specific mechanism for reducing bunching that is essentially "cooperative" among all active flows through a bottlenecked link. That's part of a "closed loop" control system driving each TCP endpoint into a cooperative mode.

how do you think we can get feedback from the bottleneck node to all the different senders?

what happens to the ones who try to play nice if one doesn't, including what happens if one isn't just ignorant of the new cooperative mode, but actively tries to cheat? (as I understand it, this is the fatal flaw in many of the past buffering improvement proposals)

While the in-house router is the first bottleneck that user's traffic hits, the bigger problems happen when the bottleneck is in the peering between ISPs, many hops away from any sender, with many different senders competing for the available bandwidth.

This is where the new buffering approaches win. If the traffic is below the congestion level, they add very close to zero overhead, but when congestion happens, they manage the resulting buffers in a way that works better for people (allowing short, fast connections to be fast with only a small impact on very long connections)

David Lang

The thing you call "pacing" is something quite different. It is disconnected from the TCP control loops involved, which basically means it is flying blind. Introducing that kind of "pacing" almost certainly reduces throughput, because it *delays* packets.

The thing I called "pacing" is in no version of Linux that I know of. Give it a different name: "anti-bunching cooperation" or "timing phase management for congestion reduction". Rather than *delaying* packets, it tries to get packets to avoid bunching only when reducing window size, and doing so by tightening the control loop so that the sender transmits as *soon* as it can, not by delaying sending after the sender dallies around not sending when it can.






On Tuesday, May 27, 2014 11:23am, "Jim Gettys" <jg@freedesktop.org> said:

On Sun, May 25, 2014 at 4:00 PM, <dpreed@reed.com> wrote:

Not that it is directly relevant, but there is no essential reason to require 50 ms. of buffering. That might be true of some particular QOS-related router algorithm. 50 ms. is about all one can tolerate in any router between source and destination for today's networks - an upper bound rather than a minimum.

The optimum buffer state for throughput is 1-2 packets' worth - in other words, with an MTU of 1500, that is 1500-3000 bytes. Only the bottleneck buffer (the input queue to the lowest-speed link along the path) should have this much actually buffered. Buffering more than this increases end-to-end latency beyond its optimal state. Increased end-to-end latency reduces the effectiveness of control loops, creating more congestion.

The rationale for having 50 ms. of buffering is probably to avoid disruption of bursty mixed flows where the bursts might persist for 50 ms. and then die. One reason for this is that source nodes run operating systems that tend to release packets in bursts. That's a whole other discussion - in an ideal world, source nodes would avoid bursty packet releases by letting the control by the receiver window be "tight" timing-wise. That is, to transmit a packet immediately at the instant an ACK arrives increasing the window. This would pace the flow - current OS's tend (due to scheduling mismatches) to send bursts of packets, "catching up" on sending that could have been spaced out and done earlier if the feedback from the receiver's window advancing were heeded.
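
A toy model of that "tight" receiver-window control (structure and names invented for illustration, not any OS's stack): transmission happens inside the ACK handler, the instant the window opens, so outgoing packets inherit the spacing of arriving ACKs instead of accumulating into a scheduler-driven burst.

class ToySender:
    """Toy ACK-clocked sender: transmit the moment the window opens."""

    MSS = 1500  # segment size in bytes, for illustration

    def __init__(self, cwnd_segments=10):
        self.next_seq = 0                      # next byte to transmit
        self.snd_una = 0                       # oldest unacknowledged byte
        self.cwnd = cwnd_segments * self.MSS   # congestion window, bytes
        self.rcv_wnd = 0                       # receiver's advertised window

    def usable_window(self):
        return self.snd_una + min(self.cwnd, self.rcv_wnd) - self.next_seq

    def on_ack(self, ack_no, advertised_window):
        self.snd_una = max(self.snd_una, ack_no)
        self.rcv_wnd = advertised_window
        # Send as *soon* as the window permits. In steady state each ACK
        # advances snd_una by about one segment, so transmissions inherit
        # the ACK spacing rather than bunching up.
        while self.usable_window() >= self.MSS:
            self.transmit_segment()

    def transmit_segment(self):
        print(f"send bytes {self.next_seq}..{self.next_seq + self.MSS - 1}")
        self.next_seq += self.MSS

if __name__ == "__main__":
    s = ToySender()
    s.on_ack(ack_no=0, advertised_window=3 * ToySender.MSS)     # opens 3 segments
    s.on_ack(ack_no=1500, advertised_window=3 * ToySender.MSS)  # releases ~1 more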


That is, endpoint network stacks (TCP implementations) can worsen congestion by "dallying". The ideal end-to-end flows occupying a congested router would have their packets paced so that the packets end up being sent in the least bursty manner that an application can support. The effect of this pacing is to move the "backlog" for each flow quickly into the source node for that flow, which then provides back pressure on the application driving the flow, which ultimately is necessary to stanch congestion. The ideal congestion control mechanism slows the sender part of the application to a pace that can go through the network without contributing to buffering.

Pacing is in Linux 3.12(?). How long it will take to see widespread deployment is another question, and as for other operating systems, who knows.
See: https://lwn.net/Articles/564978/
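
For anyone who wants to experiment: that pacing rides on the fq qdisc ("tc qdisc replace dev eth0 root fq"), and from Linux 3.13 on a sender can also cap its own per-socket pacing rate. A small sketch, assuming a recent Linux host; the fallback constant is the value from the kernel's socket headers.

import socket

# SO_MAX_PACING_RATE may be absent from older Python socket modules;
# 47 is its value in the Linux UAPI headers.
SO_MAX_PACING_RATE = getattr(socket, "SO_MAX_PACING_RATE", 47)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Ask the kernel to smooth this flow to at most 12.5 MB/s (100 Mbit/s)
# instead of releasing window-sized bursts at line rate.
sock.setsockopt(socket.SOL_SOCKET, SO_MAX_PACING_RATE, 12_500_000)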

Current network stacks (including Linux's) don't achieve that goal - their pushback on application sources is minimal - instead they accumulate buffering internal to the network implementation.

This is much, much less true than it once was. There have been substantial changes in the Linux TCP stack in the last year or two, to avoid generating packets before necessary. Again, how long it will take for people to deploy this on Linux (and implement on other OS's) is a question.

This contributes to end-to-end latency as well. But if you think about it, this is almost as bad as switch-level bufferbloat in terms of degrading user experience. The reason I say "almost" is that there are tools, rarely used in practice, that allow an application to specify that buffering should not build up in the network stack (in the kernel or wherever it is). But the default is not to use those APIs, and to buffer way too much.

Remember, the network send stack can act similarly to a congested switch (it is a switch among all the user applications running on that node). If there is a heavy file transfer, the file transfer's buffering acts to increase latency for all other networked communications on that machine.

Traditionally this problem has been thought of only as a within-node fairness issue, but in fact it has a big effect on the switches in between source and destination due to the lack of dispersed pacing of the packets at the source - in other words, the current design does nothing to stem the "burst groups" from a single source mentioned above.

So we do need the source nodes to implement less "bursty" sending stacks. This is especially true for multiplexed source nodes, such as web servers implementing thousands of flows.

A combination of codel-style switch-level buffer management and the stack at the sender being implemented to spread packets in a particular TCP flow out over time would improve things a lot. To achieve best throughput, the optimal way to spread packets out on an end-to-end basis is to update the receive window (sending ACK) at the receive end as quickly as possible, and to respond to the updated receive window as quickly as possible when it increases.

Just like the "bufferbloat" issue, the problem is caused by applications like streaming video, file transfers and big web pages that the application programmer sees as not having a latency requirement within the flow, so the application programmer does not have an incentive to control pacing. Thus the operating system has got to push back on the applications' flow somehow, so that the flow ends up paced once it enters the Internet itself. So there's no real problem caused by large buffering in the network stack at the endpoint, as long as the stack's delivery to the Internet is paced by some mechanism, e.g. tight management of receive window control on an end-to-end basis.

I don't think this can be fixed by cerowrt, so this is out of place here. It's partially ameliorated by cerowrt, if it aggressively drops packets from flows that burst without pacing. fq_codel does this, if the buffer size it aims for is small - but the problem is that the OS stacks don't respond by pacing... they tend to respond by bursting, not because TCP doesn't provide the mechanisms for pacing, but because the OS stack doesn't transmit as soon as it is allowed to - thus building up a burst unnecessarily.

Bursts on a flow are thus bad in general. They make congestion happen when it need not.

By far the biggest headache is what the Web does to the network. It has turned the web into a burst generator.
A typical web page may have 10 (or even more) images. See the "connections per page" plot in the link below.
A browser downloads the base page, and then, over N connections, essentially simultaneously downloads those embedded objects. Many/most of them are small in size (4-10 packets). You never even get near slow start.
So you get an IW amount of data/TCP connection, with no pacing, and no congestion avoidance. It is easy to observe 50-100 packets (or more) back to back at the bottleneck.
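
Back-of-the-envelope, with illustrative numbers (IW10 per RFC 6928, a typical ~1460-byte MSS, a modestly sharded page load):

connections = 8          # parallel connections for one page load (illustrative)
initial_window = 10      # IW10 segments per connection (RFC 6928)
mss = 1460               # payload bytes per segment

burst_packets = connections * initial_window
burst_bytes = burst_packets * mss
print(burst_packets, "packets,", burst_bytes, "bytes")
# -> 80 packets, 116800 bytes arriving back to back: the burst the
#    bottleneck buffer must absorb before any congestion feedback exists.
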
This is (in practice) the amount you have to buffer today: that burst of packets from a web page. Without flow queuing, you are screwed. With it, it's annoying, but can be tolerated.
I go over this in detail in:

http://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/

So far, I don't believe anyone has tried pacing the IW burst of packets. I'd certainly like to see that, but pacing needs to be across TCP connections (host pairs) to have any chance of outwitting the gaming the web has done to the network.

- Jim

On Sunday, May 25, 2014 11:42am, "Mikael Abrahamsson" <swmike@swm.pp.se> said:



On Sun, 25 May 2014, Dane Medic wrote:

Is it true that devices with less than 64 MB can't handle QOS? ->
https://lists.chambana.net/pipermail/commotion-dev/2014-May/001816.html

At gig speeds you need around 50 ms worth of buffering. 1 gigabit/s = 125 megabytes/s, meaning for 50 ms you need 6.25 megabytes of buffer.
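
Spelled out, since this is just buffer = rate x delay:

rate_bits = 1_000_000_000          # 1 gigabit/s
rate_bytes = rate_bits / 8         # = 125,000,000 bytes/s (125 MB/s)
buffer_bytes = rate_bytes * 0.050  # 50 ms worth
print(buffer_bytes / 1e6, "MB")    # -> 6.25 MB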

I also don't see why performance and memory size would be relevant, I'd say forwarding performance has more to do with CPU speed than anything else.

--
Mikael Abrahamsson    email: swmike@swm.pp.se

