From: dpreed@reed.com
To: "Mikael Abrahamsson" <swmike@swm.pp.se>
Cc: cerowrt-devel@lists.bufferbloat.net
Date: Sun, 7 Jul 2013 14:52:58 -0400 (EDT)
Message-ID: <1373223178.486913695@apps.rackspace.com>
Subject: Re: [Cerowrt-devel] happy 4th!
Wherever the idea came from that you "had to buffer RTT*2" in a midpath node, it is categorically wrong.

What is possibly relevant is that you will have RTT * bottleneck-bit-rate bits "in flight" from end to end in order not to be constrained by the acknowledgement time. That is: TCP's outstanding "window" should be RTT * bottleneck-bit-rate to maximize throughput. Making the window *larger* than that is not helpful.

So when somebody "throws that in your face", just confidently use the words "Bullshit, show me evidence", and ignore the ignorant person who is repeating an urban legend, similar to the one about the size of the crocodiles in New York's sewers that are supposedly there because people throw pet crocodiles down there.

If you need a simplified explanation of why buffering 2 * worst-case-around-the-world-RTT * maximum-path-bit-rate is a bad idea, all you need to think about is what happens when some intermediate huge bottleneck buffer fills up (which it certainly will, very quickly, since by definition the paths feeding it have much higher delivery rates than it can handle).

What will happen? A packet will be silently discarded from the "tail" of the queue. But that packet's loss will not be discovered by the endpoints until about worst-case-RTT * 2 seconds later (or maybe 4, if the reverse path is similarly clogged).
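As a back-of-the-envelope illustration of the "window = RTT * bottleneck-bit-rate" point (the link rate and RTT below are made-up example numbers, not measurements from this thread), the bandwidth-delay product works out as:

```python
# Bandwidth-delay product (BDP): the amount of data that must be "in
# flight" to keep a path busy without stalling on acknowledgements.
# The example values are purely illustrative.

def bdp_bytes(bottleneck_bps: float, rtt_s: float) -> float:
    """Window needed to fill the pipe: RTT * bottleneck bit rate, in bytes."""
    return bottleneck_bps * rtt_s / 8

# A 10 Mbit/s bottleneck with a 300 ms around-the-world RTT:
window = bdp_bytes(10e6, 0.300)
print(f"{window / 1e3:.0f} kB in flight fills the pipe")  # 375 kB
```

A window any larger than this only sits in midpath buffers; it cannot raise throughput past the bottleneck rate.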
Meanwhile the sources would have happily *sustained* the size of the bottleneck's buffer, by putting out that many bits past the lost packet's position (thinking all is well).

And so what will happen? Most of the packets following the lost packet will be retransmitted by the source again. This, of course, *doubles* the packet rate into the bottleneck.

And there is an infinite regression: all the while there is a solidly maintained, extremely long queue of packets waiting for the bottleneck link. Many, many seconds of end-to-end latency on that link, perhaps.

Only if all users "give up and go home" for the day will the bottleneck link's send queue ever drain. New TCP connections will open, and if lucky, they will see a link with earth-to-Pluto delays as its norm on their SYN/SYN-ACK. But they won't get better service than that, while continuing to congest the node.

What you need is a message from the bottleneck link saying "WHOA, I can't process all this traffic". And that happens *only* when that link actually drops packets once about 50 msec or less of traffic is queued.

On Thursday, July 4, 2013 1:57am, "Mikael Abrahamsson" <swmike@swm.pp.se> said:

> On Wed, 3 Jul 2013, Dave Taht wrote:
>
> > Suggestions as to things to test and code to test them welcomed. In
>
> I'm wondering a bit what the shallow buffering depth means to higher-RTT
> connections. When I advocate bufferbloat solutions I usually get thrown in
> my face that shallow buffering means around-the-world TCP connections will
> behave worse than with a lot of buffers (traditional truth being that you
> need to be able to buffer RTT*2).
>
> It would be very interesting to see what an added 100ms
> (<http://stackoverflow.com/questions/614795/simulate-delayed-and-dropped-packets-on-linux>)
> and some packet loss/PDV would result in. If it still works well, at least
> it would mean that people concerned about this could go back to rest.
>
> Also, it would be interesting to see if Google's proposed QUIC interacts well
> with the bufferbloat solutions. I imagine it will, since it in itself
> measures RTT and FQ_CODEL is all about controlling delay, so I imagine
> QUIC will see a quite constant view of the world through FQ_CODEL.
>
> --
> Mikael Abrahamsson    email: swmike@swm.pp.se
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
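The standing-queue failure mode described above (an oversized tail-drop buffer fills, stays full, and sets the latency floor) can be sketched with a toy discrete-time queue. This is not a model from the thread; the arrival rate, drain rate, and buffer sizes are arbitrary illustrative numbers:

```python
# Toy tail-drop bottleneck: packets arrive faster than the link drains,
# so the buffer fills and then stays full. Once the queue is sustained,
# the delay a packet sees is queue_length / drain_rate, so buffer size
# (not offered load) determines the latency floor.

from collections import deque

def simulate(buffer_pkts: int, arrive_per_tick: int,
             drain_per_tick: int, ticks: int):
    queue: deque[int] = deque()
    drops = 0
    for t in range(ticks):
        for _ in range(arrive_per_tick):
            if len(queue) < buffer_pkts:
                queue.append(t)
            else:
                drops += 1  # silent tail drop, invisible for an RTT or more
        for _ in range(min(drain_per_tick, len(queue))):
            queue.popleft()
    # Standing delay in ticks: packets waiting divided by service rate.
    return len(queue), drops, len(queue) / drain_per_tick

# Same overload, two buffer sizes: both drop packets, but the big buffer
# also imposes a delay proportional to its size.
for buf in (10, 1000):
    qlen, drops, delay = simulate(buf, arrive_per_tick=2,
                                  drain_per_tick=1, ticks=5000)
    print(f"buffer={buf:5d}: standing queue={qlen}, "
          f"delay={delay:.0f} ticks, drops={drops}")
```

The small buffer signals congestion just as effectively (it drops packets) while holding queuing delay near zero, which is the point of dropping after ~50 msec of queued traffic instead of buffering multiple RTTs.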
