While this appears to make a great deal of sense<br><br><a href="http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01">http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01</a><br><br>and just landed in<br>
<br><a href="http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6ba8a3b19e764b6a65e4030ab0999be50c291e6c">http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6ba8a3b19e764b6a65e4030ab0999be50c291e6c</a><br clear="all">
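<br>For context: the heart of TLP is a probe timer (PTO) that fires well before the RTO would, so a tail loss can be detected by the probe's ACK and recovered via fast recovery instead of a slow-start-inducing timeout. Here is a rough sketch of the PTO calculation as the draft describes it; the function name and structure are mine, not taken from the Linux commit:<br>

```python
# Sketch of the probe timeout (PTO) computation from the TLP draft
# (draft-dukkipati-tcpm-tcp-loss-probe-01). Illustrative only; the
# actual Linux implementation differs in detail.

WC_DEL_ACK_T = 0.200  # worst-case delayed-ACK timer, seconds
MIN_PTO = 0.010       # 10 ms floor on the probe timeout

def probe_timeout(srtt, rto, flight_size):
    """Return the tail loss probe timeout in seconds.

    srtt        -- smoothed round-trip time (seconds)
    rto         -- current retransmission timeout (seconds)
    flight_size -- number of outstanding (unacked) segments
    """
    if flight_size == 1:
        # With a single segment in flight the receiver may hold the
        # ACK on its delayed-ACK timer, so budget for the worst case.
        pto = max(2 * srtt, 1.5 * srtt + WC_DEL_ACK_T)
    else:
        pto = max(2 * srtt, MIN_PTO)
    # Never schedule the probe later than the RTO itself would fire.
    return min(pto, rto)

# Example: 50 ms RTT, 250 ms RTO, several segments outstanding:
# the probe fires at 100 ms, well before the RTO.
print(probe_timeout(0.050, 0.250, 10))  # -> 0.1
```

If the probe's ACK reveals a hole, the sender enters fast recovery; if the probe itself plugged the hole, the timeout is avoided entirely, which is where the quoted reductions below come from.<br>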
<br>I was intrigued by several of the pieces of data that drive this stuff:<br><pre class="newpage">Measurements on Google Web servers show that approximately 70% of
retransmissions for Web transfers are sent after the RTO timer
expires, while only 30% are handled by fast recovery. Even on
servers exclusively serving YouTube videos, RTO based retransmissions,
[...] 96% of the timeout episodes occur without any preceding
duplicate ACKs or other indication of losses at the sender.</pre><br>And especially this, in the context of a post-delay-aware-aqm world:<br><br><pre class="newpage">The key takeaway (with TLP) is: the
average response time improved up to 7% and the 99th percentile
improved by 10%. Nearly all of the improvement for TLP is in the
tail latency (post-90th percentile). The varied improvements across
services are due to different response-size distributions and traffic
patterns. For example, TLP helps the most for Images, as these are
served by multiple concurrently active TCP connections which increase
the chances of tail segment losses.
</pre><br><pre class="newpage">Application           Average    99%
Google Web Search       -3%      -5%
Google Maps             -5%     -10%
Google Images           -7%     -10%

TLP also improved performance in mobile networks -- by 7.2% for Web
search and Instant and 7.6% for Images transferred over Verizon
network. To see why and where the latency improvements are coming
from, we measured the retransmission statistics. We broke down the
retransmission stats based on nature of retransmission -- timeout
retransmission or fast recovery. TLP reduced the number of timeouts
by 15% compared to the baseline, i.e. (timeouts_tlp -
timeouts_baseline) / timeouts_baseline = -15%. Instead, these losses
were either recovered via fast recovery or by the loss probe
retransmission itself. The largest reduction in timeouts is when the
sender is in the Open state in which it receives only in-sequence ACKs
and no duplicate ACKs, likely because of tail losses.
Correspondingly, the retransmissions occurring in the slow start
phase after RTO reduced by 46% relative to baseline. Note that it is
not always possible for TLP to convert 100% of the timeouts into fast
recovery episodes because a probe itself may be lost. Also notable
in our experiments is a significant decrease in the number of
spurious timeouts -- the experiment had 61% fewer congestion window
undo events. The Linux TCP sender uses either DSACK or timestamps to
determine if retransmissions are spurious and employs techniques for
undoing congestion window reductions. We also note that the total
number of retransmissions decreased 7% with TLP because of the
decrease in spurious retransmissions, and because the TLP probe
itself plugs a hole.</pre>-- <br>Dave Täht<br><br>Fixing bufferbloat with cerowrt: <a href="http://www.teklibre.com/cerowrt/subscribe.html" target="_blank">http://www.teklibre.com/cerowrt/subscribe.html</a>