[Bloat] tcp loss probes in linux 3.10

General list for discussing Bufferbloat

* [Bloat] tcp loss probes in linux 3.10
@ 2013-05-09 20:55 Dave Taht
  2013-05-09 21:39 ` Hagen Paul Pfeifer
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Taht @ 2013-05-09 20:55 UTC (permalink / raw)
  To: bloat

This appears to make a great deal of sense:

http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01

It just landed in:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6ba8a3b19e764b6a65e4030ab0999be50c291e6c

I was intrigued by several of the pieces of data that drive this stuff:

Measurements on Google Web servers show that approximately 70% of
   retransmissions for Web transfers are sent after the RTO timer
   expires, while only 30% are handled by fast recovery.  Even on
   servers exclusively serving YouTube videos, RTO based retransmissions,

96% of the timeout episodes occur without any preceding duplicate ACKs or
other indication of losses at the sender

And especially this, in the context of a post-delay-aware-aqm world.

The key takeaway (with TLP) is: the
   average response time improved up to 7% and the 99th percentile
   improved by 10%.  Nearly all of the improvement for TLP is in the
   tail latency (post-90th percentile).  The varied improvements across
   services are due to different response-size distributions and traffic
   patterns.  For example, TLP helps the most for Images, as these are
   served by multiple concurrently active TCP connections which increase
   the chances of tail segment losses.

   Application        Average   99%
   Google Web Search  -3%       -5%
   Google Maps        -5%       -10%
   Google Images      -7%       -10%

   TLP also improved performance in mobile networks -- by 7.2% for Web
   search and Instant and 7.6% for Images transferred over Verizon
   network.  To see why and where the latency improvements are coming
   from, we measured the retransmission statistics.  We broke down the
   retransmission stats based on nature of retransmission -- timeout
   retransmission or fast recovery.  TLP reduced the number of timeouts
   by 15% compared to the baseline, i.e. (timeouts_tlp -
   timeouts_baseline) / timeouts_baseline = 15%.  Instead, these losses
   were either recovered via fast recovery or by the loss probe
   retransmission itself.  The largest reduction in timeouts is when the
   sender is in the Open state in which it receives only insequence ACKs
   and no duplicate ACKs, likely because of tail losses.
   Correspondingly, the retransmissions occurring in the slow start
   phase after RTO reduced by 46% relative to baseline.  Note that it is
   not always possible for TLP to convert 100% of the timeouts into fast
   recovery episodes because a probe itself may be lost.  Also notable
   in our experiments is a significant decrease in the number of
   spurious timeouts -- the experiment had 61% fewer congestion window
   undo events.  The Linux TCP sender uses either DSACK or timestamps to
   determine if retransmissions are spurious and employs techniques for
   undoing congestion window reductions.  We also note that the total
   number of retransmissions decreased 7% with TLP because of the
   decrease in spurious retransmissions, and because the TLP probe
   itself plugs a hole.
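
As a worked illustration of the relative-change expression quoted above, here is a minimal sketch. The counts are hypothetical, chosen only so the result matches the quoted 15% figure; they are not from the draft. Note that with the expression written as (timeouts_tlp - timeouts_baseline) / timeouts_baseline, a reduction comes out negative.

```python
def relative_change(with_tlp: float, baseline: float) -> float:
    """Relative change of a counter versus its baseline.

    A negative result means the counter went down.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (with_tlp - baseline) / baseline


# Hypothetical timeout counts (assumption, for illustration only).
timeouts_baseline = 2000
timeouts_tlp = 1700

change = relative_change(timeouts_tlp, timeouts_baseline)
print(f"relative change in timeouts: {change:+.0%}")
```

The same helper applies to the other figures in the quoted text (retransmissions in slow start, congestion window undo events), provided the raw counters for both arms of the experiment are available.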

-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html


* Re: [Bloat] tcp loss probes in linux 3.10
  2013-05-09 20:55 [Bloat] tcp loss probes in linux 3.10 Dave Taht
@ 2013-05-09 21:39 ` Hagen Paul Pfeifer
  2013-05-09 23:01   ` Dave Taht
  0 siblings, 1 reply; 4+ messages in thread
From: Hagen Paul Pfeifer @ 2013-05-09 21:39 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat

* Dave Taht | 2013-05-09 13:55:18 [-0700]:

>While this appears to make a great deal of sense
>
>http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01
>
>and just landed in
>
>http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6ba8a3b19e764b6a65e4030ab0999be50c291e6c
>
>I was intrigued by several of the pieces of data that drive this stuff
>
>Measurements on Google Web servers show that approximately 70% of
>   retransmissions for Web transfers are sent after the RTO timer
>   expires, while only 30% are handled by fast recovery.  Even on
>   servers exclusively serving YouTube videos, RTO based retransmissions,
>
>96% of the timeout episodes occur without any preceding duplicate ACKs or
>other indication of losses at the sender

Btw: Nandita introduced a new MIB entry: LINUX_MIB_TCPLOSSPROBES.

$ nstat -a | grep TCPLossProbes

will show fired probes.
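
For reading the same counter from a script, a minimal sketch follows. It assumes the counter is exported under the TcpExt group in /proc/net/netstat, and that the file uses the usual layout of paired lines (a row of names followed by a row of values). The function names and the sample text are made up for illustration.

```python
from pathlib import Path


def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {"Group.Name": value}."""
    counters = {}
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # Lines come in pairs: a header row of counter names, then a row of
    # values, both starting with the same "Group:" token.
    for names, values in zip(rows[0::2], rows[1::2]):
        if names[0] != values[0]:
            continue  # not a matching pair; skip rather than guess
        group = names[0].rstrip(":")
        for name, value in zip(names[1:], values[1:]):
            counters[f"{group}.{name}"] = int(value)
    return counters


def read_loss_probes(path="/proc/net/netstat"):
    """Return the TCPLossProbes counter, or None if it is not available."""
    p = Path(path)
    if not p.exists():
        return None
    return parse_netstat(p.read_text()).get("TcpExt.TCPLossProbes")


# Made-up sample in the assumed layout, so the parser can be tried anywhere.
SAMPLE = (
    "TcpExt: SyncookiesSent TCPLossProbes\n"
    "TcpExt: 0 12\n"
    "IpExt: InOctets OutOctets\n"
    "IpExt: 1000 2000\n"
)

sample_counters = parse_netstat(SAMPLE)
print(sample_counters["TcpExt.TCPLossProbes"])
```

On a kernel that has the patch, calling read_loss_probes() should return the absolute count since boot; on older kernels or other systems it returns None.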



Hagen



* Re: [Bloat] tcp loss probes in linux 3.10
  2013-05-09 21:39 ` Hagen Paul Pfeifer
@ 2013-05-09 23:01   ` Dave Taht
  2013-05-09 23:43     ` Jonathan Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Taht @ 2013-05-09 23:01 UTC (permalink / raw)
  To: Hagen Paul Pfeifer; +Cc: bloat

On Thu, May 9, 2013 at 2:39 PM, Hagen Paul Pfeifer <hagen@jauu.net> wrote:

> * Dave Taht | 2013-05-09 13:55:18 [-0700]:
>
> >While this appears to make a great deal of sense
> >
> >http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01
> >
> >and just landed in
> >
> >http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6ba8a3b19e764b6a65e4030ab0999be50c291e6c
> >
> >I was intrigued by several of the pieces of data that drive this stuff
> >
> >Measurements on Google Web servers show that approximately 70% of
> >   retransmissions for Web transfers are sent after the RTO timer
> >   expires, while only 30% are handled by fast recovery.  Even on
> >   servers exclusively serving YouTube videos, RTO based retransmissions,
> >
> >96% of the timeout episodes occur without any preceding duplicate ACKs or
> >other indication of losses at the sender
>
> Btw: Nandita introduced a new MIB entry: LINUX_MIB_TCPLOSSPROBES.
>
> $ nstat -a | grep TCPLossProbes
>
> will show fired probes.
>
>
>
I have to admit that the 96% figure strongly suggests some degree of
bufferbloat in the tested mix here. I am curious however as to what other
causes there might be, ranging from tcp bugs to glitches in the matrix?

>
> Hagen
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html


* Re: [Bloat] tcp loss probes in linux 3.10
  2013-05-09 23:01   ` Dave Taht
@ 2013-05-09 23:43     ` Jonathan Morton
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Morton @ 2013-05-09 23:43 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat

On 10 May, 2013, at 2:01 am, Dave Taht wrote:

> I have to admit that the 96% figure strongly suggests some degree of bufferbloat in the tested mix here. I am curious however as to what other causes there might be, ranging from tcp bugs to glitches in the matrix? 

For the specific case of YouTube servers, the fact that the video stream is "bottled", so that only a few seconds of buffering-ahead are available to the client, probably plays a role.  Sections of the file are released in bursts, filling any buffers en route.  There is a good chance that the end of a burst is lost to a tail-drop episode.  If the time between release bursts exceeds the RTO, no later segments arrive to trigger duplicate ACKs, so the RTO expiry will be the only information (by default) reaching the server about the loss event.
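
The tail-loss argument can be sketched with a toy model. This is a deliberate simplification and not how any real stack is implemented: it assumes no SACK, no early retransmit, no loss probe, one ACK per delivered segment, and no further data sent until after the RTO. The function name and threshold constant are illustrative.

```python
DUPACK_THRESHOLD = 3  # classic fast-retransmit trigger


def recovery_signal(burst_len, lost):
    """Classify how the sender learns about loss within one burst.

    burst_len: number of segments in the burst (indexed 0..burst_len-1)
    lost:      set of indexes of segments dropped in the network
    Returns "none", "fast_retransmit" or "timeout".
    """
    if not lost:
        return "none"
    first_lost = min(lost)
    # Each segment delivered after the first hole yields one duplicate ACK.
    dupacks = sum(
        1 for i in range(first_lost + 1, burst_len) if i not in lost
    )
    if dupacks >= DUPACK_THRESHOLD:
        return "fast_retransmit"
    return "timeout"


# Loss in the middle of a burst: plenty of later segments to generate dupacks.
print(recovery_signal(10, {2}))
# Loss at the tail of a burst: nothing follows, so only the timer fires.
print(recovery_signal(10, {7, 8, 9}))
```

Under these assumptions, any loss confined to the last three segments of a burst can only be detected by a timer, which is the situation a loss probe is meant to shorten.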

The bursty release is worse for TCP than a free-streaming flow, because the latter would be largely self-timed by the steady return of ACKs, with the congestion window remaining full most of the time - so only the bottleneck queue fills up and overflows.  When a burst is released, however, other intermediate queues can also fill up and overflow, resulting in a larger number of lost packets - and yet a larger congestion window might result.

So it is still bufferbloat, but there can be strange and unintended interactions with some systems.

 - Jonathan Morton
