<div dir="ltr">Mario,<div><br></div><div>putting aside LoLa for a second,</div><div>I'm not quite sure that the theorem you cite is valid.</div><div><br></div><div>According to the model R_i is the sending rate.</div><div>The sum across all flows bottlenecked at the link does not need to be equal to the link capacity.</div><div>The input rate can be instantaneously below or above the link capacity.</div><div>Even If the link is never idle and the output rate is always C for all t.</div><div><br></div><div>If the sum_i R_i = C is false because of what I said, than p_i which is nothing more than</div><div>the shadow price of the link capacity constraint, can be a function of a constant delay d, i.e. p_i = cost * d for all i.</div><div><br></div><div>This theorem can be valid only if the input rate of a queue is instantaneously equal to the output queue.</div><div>We all know that a queue exists just because there is an instantaneous difference of input and output rates </div><div>at the link. So to conclude,  this theorem if valid iff input == output rate then the queue is always zero, i.e.  d=0.</div><div><br></div><div>The theorem is either an artefact of the model or just wrong. Or I'm missing something...</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br><div class="gmail_quote"><div dir="ltr">On Thu, Nov 29, 2018 at 6:07 PM Mario Hock <<a href="mailto:mario.hock@kit.edu">mario.hock@kit.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Luca,<br>

<br>

I'm answering on behalf of Roland, since I am a co-author of the paper.<br>

<br>

This is an excellent question, since it goes right at the heart of how <br>

LoLa works.<br>

<br>

Indeed, the paper is a first of a series. A second one, looking deeper <br>

into the fair flow balancing mechanism, is currently under submission.<br>

<br>

Similar as other delay based congestion controls, LoLa tries to achieve <br>

fairness by allowing each flow to buffer the same amount of data at the <br>

bottleneck. We have this, e.g., in TCP Vegas, and (in a way) also in <br>

Copa (a recently proposed congestion control) and many others. If this <br>

is achieved, we get flow rate fairness independent of a flow's RTT.<br>

<br>

Usually (in other congestion controls) this "allowed amount of data" is <br>

fixed per flow. We presume that this approach does not scale well to <br>

high speed networks. Since the queuing delay resulting from this amount <br>

of data is reduced with increasing bottleneck rate. Thus, it becomes <br>

harder to measure it right. This can easily be seen (and proven) for TCP <br>

Vegas.<br>

<br>

Note: Just using higher fixed values is not an option, since it would <br>

not work at lower speeds anymore and also not with a large number of flows.<br>

<br>

Therefore, LoLa tries to find a suitable value for the "allowed amount <br>

of data" dynamically. This is X(t).<br>

<br>

Our approach is to grow X(t) over time during the Fair Flow Balancing <br>

phase. This phase ends when the queuing delay reaches 5ms. Thus, (in the <br>

ideal case) at the end of Fair Flow Balancing, X(t) is just as large <br>

that all flows at bottleneck create a queuing delay of 5ms, and all <br>

flows contribute equally to this queue. Hence, flow rate fairness is <br>

achieved. (Note that LoLa is designed in a way that t is (almost) <br>

synchronized among the competing flows.)<br>

<br>

Generally, other ways of determining a suitable X(t) are conceivable. In <br>

our approach X(t) is a monotonically increasing function, but it is <br>

regularly reset as LoLa cycles between its states; i.e., after a queuing <br>

delay of 5ms is reached, the queue is drained and everything starts <br>

again. (Thus, the timespan where X(t) is monotonically increased is <br>

called a "round of fair flow balancing".)<br>

<br>

This way we can overcome the constraint given in [1]:<br>

<br>

"""<br>

THEOREM 6 (FAIRNESS/DELAY TRADEOFF). For congestion control mechanisms <br>

that have steady state throughput of the kind R = f(d, p), for some <br>

function f, delay d and feedback p, if the feedback is based on purely <br>

end to end delay measurements, you can either have fairness or a fixed <br>

delay, but not both simultaneously<br>

"""<br>

<br>

[1] "ECN or Delay: Lessons Learnt from Analysis of DCQCN and TIMELY"<br>

Yibo Zhu et al., <a href="https://dl.acm.org/citation.cfm?id=2999593" rel="noreferrer" target="_blank">https://dl.acm.org/citation.cfm?id=2999593</a><br>

<br>

Best, Mario<br>

<br>

<br>

Am 29.11.18 um 17:09 schrieb Luca Muscariello:<br>

> Hi Roland,<br>

> <br>

> It took me quite a lot of time to find this message in the thread...<br>

> I read the paper you sent and I guess this is the first of a series as <br>

> many things stay uncovered.<br>

> <br>

> Just a quick question: why is X(t) always increasing with  t?<br>

> <br>

> <br>

> On Tue, Nov 27, 2018 at 11:26 AM Bless, Roland (TM) <br>

> <<a href="mailto:roland.bless@kit.edu" target="_blank">roland.bless@kit.edu</a> <mailto:<a href="mailto:roland.bless@kit.edu" target="_blank">roland.bless@kit.edu</a>>> wrote:<br>

> <br>

>     Hi Luca,<br>

> <br>

>     Am 27.11.18 um 10:24 schrieb Luca Muscariello:<br>

>      > A congestion controlled protocol such as TCP or others, including<br>

>     QUIC,<br>

>      > LEDBAT and so on<br>

>      > need at least the BDP in the transmission queue to get full link<br>

>      > efficiency, i.e. the queue never empties out.<br>

> <br>

>     This is not true. There are congestion control algorithms<br>

>     (e.g., TCP LoLa [1] or BBRv2) that can fully utilize the bottleneck link<br>

>     capacity without filling the buffer to its maximum capacity. The BDP<br>

>     rule of thumb basically stems from the older loss-based congestion<br>

>     control variants that profit from the standing queue that they built<br>

>     over time when they detect a loss:<br>

>     while they back-off and stop sending, the queue keeps the bottleneck<br>

>     output busy and you'll not see underutilization of the link. Moreover,<br>

>     once you get good loss de-synchronization, the buffer size requirement<br>

>     for multiple long-lived flows decreases.<br>

> <br>

>      > This gives rule of thumbs to size buffers which is also very<br>

>     practical<br>

>      > and thanks to flow isolation becomes very accurate.<br>

> <br>

>     The positive effect of buffers is merely their role to absorb<br>

>     short-term bursts (i.e., mismatch in arrival and departure rates)<br>

>     instead of dropping packets. One does not need a big buffer to<br>

>     fully utilize a link (with perfect knowledge you can keep the link<br>

>     saturated even without a single packet waiting in the buffer).<br>

>     Furthermore, large buffers (e.g., using the BDP rule of thumb)<br>

>     are not useful/practical anymore at very high speed such as 100 Gbit/s:<br>

>     memory is also quite costly at such high speeds...<br>

> <br>

>     Regards,<br>

>       Roland<br>

> <br>

>     [1] M. Hock, F. Neumeister, M. Zitterbart, R. Bless.<br>

>     TCP LoLa: Congestion Control for Low Latencies and High Throughput.<br>

>     Local Computer Networks (LCN), 2017 IEEE 42nd Conference on, pp.<br>

>     215-218, Singapore, Singapore, October 2017<br>

>     <a href="http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf" rel="noreferrer" target="_blank">http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf</a><br>

> <br>

>      > Which is:<br>

>      ><br>

>      > 1) find a way to keep the number of backlogged flows at a<br>

>     reasonable value.<br>

>      > This largely depends on the minimum fair rate an application may<br>

>     need in<br>

>      > the long term.<br>

>      > We discussed a little bit of available mechanisms to achieve that<br>

>     in the<br>

>      > literature.<br>

>      ><br>

>      > 2) fix the largest RTT you want to serve at full utilization and size<br>

>      > the buffer using BDP * N_backlogged.<br>

>      > Or the other way round: check how much memory you can use<br>

>      > in the router/line card/device and for a fixed N, compute the largest<br>

>      > RTT you can serve at full utilization.<br>

>      ><br>

>      > 3) there is still some memory to dimension for sparse flows in<br>

>     addition<br>

>      > to that, but this is not based on BDP.<br>

>      > It is just enough to compute the total utilization of sparse<br>

>     flows and<br>

>      > use the same simple model Toke has used<br>

>      > to compute the (de)prioritization probability.<br>

>      ><br>

>      > This procedure would allow to size FQ_codel but also SFQ.<br>

>      > It would be interesting to compare the two under this buffer sizing.<br>

>      > It would also be interesting to compare another mechanism that we<br>

>     have<br>

>      > mentioned during the defense<br>

>      > which is AFD + a sparse flow queue. Which is, BTW, already<br>

>     available in<br>

>      > Cisco nexus switches for data centres.<br>

>      ><br>

>      > I think that the the codel part would still provide the ECN feature,<br>

>      > that all the others cannot have.<br>

>      > However the others, the last one especially can be implemented in<br>

>      > silicon with reasonable cost.<br>

> <br>

> <br>

> _______________________________________________<br>

> Bloat mailing list<br>

> <a href="mailto:Bloat@lists.bufferbloat.net" target="_blank">Bloat@lists.bufferbloat.net</a><br>

> <a href="https://lists.bufferbloat.net/listinfo/bloat" rel="noreferrer" target="_blank">https://lists.bufferbloat.net/listinfo/bloat</a><br>

> <br>

</blockquote></div></div></div>