[Bloat] BBR high RTT unfairness: Fifty Shades of Congestion Control: A Performance and Interactions Evaluation

Fri May 31 08:39:07 EDT 2019

Hi Dave,

On 30.05.19 at 15:38 Dave Taht wrote:
> On Thu, May 30, 2019 at 4:31 AM Roland Bless <roland.bless at kit.edu> wrote:
>>
>> Hi Dave,
>>
>> On 29.05.19 at 17:05 Dave Taht wrote:
>>> I have been trying to work through this paper:
>>> https://arxiv.org/pdf/1903.03852.pdf
>>> which is enormous and well worth reading.
>>>
>>> I have a theory, though, about TABLE XII, which contrasts four BBR
>>> flows at different RTTs, in that BBRv1's probe phase makes a 200ms
>>> assumption, thus
>>> not seeing the real RTT at long RTTs, and thus the longest RTT flow
>>> gets the most bandwidth on this test, and the second (testable) theory
>>> is that were these RTTs not exactly on the 100ms boundaries, we would
>>> see more throughput fairness.
>>
>> Nope, the main reason for RTT unfairness in BBRv1 is its
>> CWnd cap at 2*(RTT_min*est_bw) (2*estimated bottleneck BDP share).
> 
> The striking thing about that table was that the 300ms result was the
> ~same as the 100ms result for throughput, while the 200ms and 400ms
> results were 2x and 4x respectively.

Yes, correct.

> My thought was that at extraordinary RTTs (anything > planet girdling
> e.g. > 200ms) that trying a probe of 250ms (or some degree of variance
> periodically - 220ms, 260ms, 180ms) or changing the period of the
> probe itself, would desync things and get closer to the real RTT,
> particularly when BBR was duking it out with itself.
> 
> This would also make up for researchers (which includes myself until I
> trained myself out of it) tending to always start a test with multiple
> flows all at exactly the same time, which could be another flaw in
> this dataset.

You are right; we also ran into this trap at first and
experienced similar effects in our BBR evaluations. Therefore,
we used starting times that are not multiples of 10s (0s, 23s, 31s, 38s,
42s, 47s), because BBR probes every 10s for the min RTT, cf. Section V.D
of our paper below.
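
A minimal sketch of that sanity check, assuming what matters is that no
two flows start a multiple of 10s apart (so their min-RTT probing does
not line up). The function name and the second example list are
hypothetical and not taken from the paper:

```python
from itertools import combinations

# Assumption: 10 s min-RTT probing interval, as stated in the text above.
PROBE_RTT_INTERVAL_S = 10


def synced_pairs(start_times_s, interval_s=PROBE_RTT_INTERVAL_S):
    """Return pairs of start times whose offset is a multiple of interval_s."""
    return [
        (a, b)
        for a, b in combinations(start_times_s, 2)
        if (b - a) % interval_s == 0
    ]


# Start times quoted above: no pair is a multiple of 10 s apart.
print(synced_pairs([0, 23, 31, 38, 42, 47]))  # []

# Hypothetical counter-example: flows at 0 s and 10 s would line up.
print(synced_pairs([0, 10, 25]))  # [(0, 10)]
```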

>> As we showed in http://doc.tm.kit.edu/2017-kit-icnp-bbr-authors-copy.pdf
>> Section III: multiple BBR flows will always increase their CWnd up
>> to this point (except when the buffer capacity is smaller than a BDP).
>> Neal's explanation is in line with our findings.
>> Consequently, each flow will converge towards a share of RTT_min*est_bw
>> at the bottleneck queue, providing a larger bandwidth share for flows
>> with a larger RTT_min. See also Section V.F of our paper that also
>> evaluated RTT unfairness (moreover, the outcome depends also on the
>> bottleneck buffer size).
> 
> I get it, it's my point above about not seeing RTT_min properly with
> synced flows...

I'm not sure; my explanation holds even if all flows see the proper
RTT_min. It seems that the 300ms RTT flow has difficulties in getting
a larger share.
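
To make the arithmetic behind the CWnd cap concrete, here is a rough
sketch that evaluates 2*(RTT_min*est_bw) for the four RTTs in the table.
The bandwidth estimate of 25 Mbit/s per flow is purely an assumed
number for illustration, and the function name is hypothetical. It only
shows that the cap grows linearly with RTT_min; it is not a model of the
measured throughput and does not explain the 300ms result:

```python
def cwnd_cap_bytes(rtt_min_s, est_bw_bps):
    """BBRv1-style inflight cap: 2 * RTT_min * est_bw, converted to bytes."""
    return 2 * rtt_min_s * est_bw_bps / 8


# Assumption: every flow estimates the same bottleneck share (25 Mbit/s).
EST_BW_BPS = 25e6
RTTS_MS = [100, 200, 300, 400]

caps = {rtt: cwnd_cap_bytes(rtt / 1000, EST_BW_BPS) for rtt in RTTS_MS}

for rtt, cap in caps.items():
    ratio = cap / caps[100]
    print(f"{rtt} ms: cap = {cap / 1e3:.0f} kB ({ratio:.1f}x the 100 ms flow)")
```

With equal bandwidth estimates, the flow with the larger RTT_min is
allowed proportionally more data in flight, which is the mechanism
behind the larger queue share described above.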

>> Unfortunately, they didn't test TCP-LoLa in this context, since it is
>> actually able to provide fairness among flows with different RTTs
>> (while still limiting the overall queuing delay).
> 
> I keep hoping people keep their labs set up, so that we could have 54
> shades of congestion control going forward (dctcp, bbrv2, lola, fu)
> and a stable base of data to work from.
> 
> Me being me I'd also like to vary the fq and aqm algorithms using the
> same test setups.
> 
> I'll ping the authors.
> 
>> Moreover, Mario
>> and Felix improved the convergence speed by introducing FFBquick, see:
>> http://doc.tm.kit.edu/Poster/2019-FFBquick_Networking.pdf
>> for a quick glance at the challenges and the solution.
>> This was published as a poster paper at Networking 2019:
>> M. Hock, R. Bless, F. Neumeister, M. Zitterbart: FFBquick: Fast
>> Convergence to Fairness for Delay-bounded Congestion Controls,
>> Networking 2019, Warsaw, Poland, May 20-22.
>>
>> Regards
>>  Roland
>>
> 
> --
> 
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740
>