[Bloat] review: Deployment of RITE mechanisms, in use-case trial testbeds report part 1

Alan Jenkins alan.christopher.jenkins at gmail.com
Wed Mar 2 15:53:55 EST 2016


On 02/03/16 18:09, Fred Baker (fred) wrote:
>> On Feb 27, 2016, at 11:04 AM, Dave Täht <dave at taht.net> wrote:
>>
>> https://reproducingnetworkresearch.wordpress.com/2014/06/03/cs244-14-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard/
>>
>>>    o the results are very poor with a particular popular AQM
>> Define "very poor"?
> Presuming this is Adaptive Bitrate Video, as in Video-in-TCP, we (as
> in Cisco engineers, not me personally; you have met them) have
> observed this as well. Our belief is that this is at least in part a
> self-inflicted wound; when the codec starts transmission on any
> four-second segment except the first, there is no slow-start phase
> because the TCP session is still open (and in the case of some
> services, there are several TCP sessions open and the application
> chooses the one with the highest cwnd value). You can now think of
> the behavior of the line as repeating a four-phase sequence: nobody
> is talking, then one is talking, then both are, and then the other
> is talking. When only one is talking, whichever it is, its cwnd
> value is slowly increasing - especially while cwnd*mss/rtt <
> bottleneck line rate, which keeps the RTT at its minimum. At the
> start of the "both are talking" phase, the one already talking has
> generally found a cwnd value that fills the line, and its RTT is
> slowly increasing. The one starting sends a burst of cwnd packets,
> creating an instant queue and often causing one or both to drop a
> packet - reducing their respective cwnd values. Depending on the TCP
> implementation at the sender, if the induced drop isn't a single
> packet but two or three, the affected session may pause for as many
> RTO timeouts (Reno) or RTTs (New Reno), or at least retransmit the
> lost packets in the subsequent RTT and then reduce cwnd by at least
> that amount (CUBIC) and maybe half (SACK).
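
To put rough, purely illustrative numbers on that burst (a minimal
Python sketch; the MSS, cwnd, and line rate below are assumptions,
not measurements from anyone's trace):

    # Back-of-envelope: a sender resuming with a stale cwnd dumps
    # cwnd * mss bytes into the bottleneck queue all at once.
    MSS = 1448          # payload bytes per segment (assumed)
    CWND = 40           # segments retained from the previous chunk (assumed)
    LINE_RATE = 10e6    # bottleneck rate, bits/s (assumed)

    burst_bytes = CWND * MSS
    drain_ms = burst_bytes * 8 / LINE_RATE * 1000
    print("%d byte burst ~= %.0f ms of instant queue"
          % (burst_bytes, drain_ms))
    # -> 57920 byte burst ~= 46 ms of instant queue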

Interesting!  As Dave reminds us, Google avoid the bursts you 
describe by using pacing.  (See the end of 
https://lists.bufferbloat.net/pipermail/bloat/2016-February/007205.html)
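
(For concreteness: on Linux that pacing can be requested per socket.
A minimal sketch, assuming a kernel with the fq qdisc installed; the
Python socket module doesn't export the option constant, so it is
defined by hand from the kernel headers:)

    import socket

    # SO_MAX_PACING_RATE is Linux-specific; 47 is its value in
    # <asm-generic/socket.h>.
    SO_MAX_PACING_RATE = 47

    def open_paced_connection(host, port, pacing_bytes_per_sec):
        """Connect, then cap the kernel's send pacing rate so a
        cwnd's worth of packets is spread over the RTT rather than
        sent back-to-back.  Takes effect with the fq qdisc installed
        (tc qdisc add dev eth0 root fq)."""
        sock = socket.create_connection((host, port))
        sock.setsockopt(socket.SOL_SOCKET, SO_MAX_PACING_RATE,
                        int(pacing_bytes_per_sec))
        return sock

    # Hypothetical use: pace a segment fetch at ~6 Mbit/s.
    # sock = open_paced_connection("video.example.com", 80, 6000000 // 8)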

You can call the result a disadvantage of FQ in the real world if you 
want.  But you can also say it provides some necessary alignment of 
incentives: incentives for applications to develop more 
network-friendly behaviour :).  I was surprised that a project with 
large ISP involvement seems to take the first point of view.

(Also, the part about connections being chosen by cwnd helps explain 
the fq_codel throughput graph.  You can see the audio and video 
connections switch roles several times, at the same moments the 
bitrate fluctuates, I notice.)

I was just skimming PANDA [1], which does AIMD for adaptive streaming: 
it decrements the interval between chunk fetches until the observed 
throughput _over the full on-off cycle_ is sufficient to sustain the 
next quality level.  <handwave>It could just as easily pace the fetch 
over the full period.  <utopian>No more on-off cycle, no damaging 
bursts of packets?
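
Something like this toy, say (emphatically not the real PANDA
algorithm; the constants and the throughput inputs are made up for
illustration):

    CHUNK_SECONDS = 4.0   # media duration carried by each chunk
    STEP = 0.25           # additive decrease of the fetch interval (s)
    BACKOFF = 1.5         # multiplicative increase on congestion

    def next_interval(interval, observed_tput, next_level_tput):
        """One AIMD step on the inter-fetch interval: probe by
        fetching slightly more often while throughput measured over
        the full on-off cycle would sustain the next quality level;
        otherwise back off multiplicatively."""
        if observed_tput >= next_level_tput:
            return max(CHUNK_SECONDS, interval - STEP)
        return min(4 * CHUNK_SECONDS, interval * BACKOFF)

    # The pacing variant of the handwave: instead of fetching each
    # chunk at line rate and then idling, cap the transfer so it
    # spans the whole interval, e.g.
    #   pacing_rate = chunk_size_bytes / interval
    # (which SO_MAX_PACING_RATE above could enforce per socket).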

Alan

[1] http://arxiv.org/pdf/1305.0510.pdf


