<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Mario, </div><div><br></div><div>agreed.</div><div>Two last comments: one should always used fluid approximation with care, because they are approximations, </div><div>the real model is more complex. Nobody considers that the RTT varies during the connection lifetime and that ACK can be delayed.</div><div>So CWD increases in a non simple way.</div><div><br></div><div>This is one paper I wrote some time ago where you can get to an equilibrium with a desired epsilon in the queue.</div><div>This protocol and LoLa seem to have similar objectives.</div><div><br></div><div><a href="https://arxiv.org/pdf/1010.5623.pdf">https://arxiv.org/pdf/1010.5623.pdf</a><br></div><div><br></div><div>Equations and stability conditions are worth a read. Maybe the equations can be reused for LoLa. As you seem doing something similar,</div><div>even if the X(t) introduces non regularities which cannot  be modelled in a simple way. Maybe yes.</div><div><br></div><div>In another work</div><div><br></div><div><span style="color:rgb(0,0,0);font-family:Times;font-size:medium">G. Carofiglio and L. Muscariello, "On the Impact of TCP and Per-Flow Scheduling on Internet Performance," </span></div><div><span style="color:rgb(0,0,0);font-family:Times;font-size:medium">in </span><em style="color:rgb(0,0,0);font-family:Times;font-size:medium">IEEE/ACM Transactions on Networking</em><span style="color:rgb(0,0,0);font-family:Times;font-size:medium">, vol. 20, no. 2, pp. 620-633, April 2012. </span></div><div><font color="#000000" face="Times" size="3"><a href="https://doi.org/10.1109/TNET.2011.2164553">https://doi.org/10.1109/TNET.2011.2164553</a></font></div><div><br></div><div>I, my co-author actually, did the calculations nobody wants to do to obtain the limit cycle which is pretty messy.</div><div>They are all fluid approximations. So in practice the target queuing latency can be larger or smaller.<br></div><div>These models are useful to get a preliminary idea about how the system works and a publication...</div><div><br></div><div><br></div><div><br></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Nov 30, 2018 at 10:55 AM Mario Hock <<a href="mailto:mario.hock@kit.edu">mario.hock@kit.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Luca,<br>

<br>

I'm not that happy with the theorem either, since it stresses a <br>

limitation that can actually be overcome. However, I quoted it to <br>

demonstrate there is a challenge involved.<br>

<br>

In my point of view there is actually a subtle thing that's often lost <br>

when modeling the queue based on input rates, output rates, and, e.g., <br>

queuing theory. The inflight data (e.g., controlled by the CWnds) and <br>

the resulting standing queue. But this standing queue is (probably) the <br>

major reasoning behind LoLa and CoDel.<br>

<br>

Let's simplify the situation to single sender, to ease the explanation. <br>

(Also holds for multiple senders.) Let the sender have a CWnd-based <br>

congestion control, but let's fix this CWnd for now. Also, let's assume <br>

the CWnd is larger than the BDP (without queuing delay); CWnd = BDP + x.<br>

<br>

In this case, the sending rate will be above the bottleneck rate, as <br>

long as less than x is queued at the bottleneck. This increases the <br>

queue. As soon as x is queued at the bottleneck, the sending rate will <br>

(exactly) approach the bottleneck rate. The reason is, that the BDP <br>

(with buffering) will now be exactly equal to the CWnd (since the RTT is <br>

increased).<br>

<br>

This means, a standing queue will build up. But as soon as it's there, <br>

sending rate will (on average) be equal to the bottleneck rate.<br>

<br>

There is also a stability condition here. If the queuing delay (for some <br>

reason) falls below x/rate (i.e., amount of queuing falls below x), the <br>

sending rate increases. If the queuing delay raises above x/rate, <br>

sending rate is reduced. Note: That this is all with a fixed CWnd. No <br>

interaction of the congestion control algorithms involved here.<br>

<br>

Therefore, there can be a long living standing queue, even if input == <br>

output. Of course during the build up process input > output, however <br>

this build up can e.g. happen in one RTT, while the standing queue can <br>

survive seconds or minutes (in reality, and indefinitely in theory). <br>

Though, I found that this correlation is often not modeled in <br>

queue-theoretical models.<br>

<br>

Best, Mario<br>

<br>

<br>

Am 29.11.18 um 23:30 schrieb Luca Muscariello:<br>

> Mario,<br>

> <br>

> putting aside LoLa for a second,<br>

> I'm not quite sure that the theorem you cite is valid.<br>

> <br>

> According to the model R_i is the sending rate.<br>

> The sum across all flows bottlenecked at the link does not need to be <br>

> equal to the link capacity.<br>

> The input rate can be instantaneously below or above the link capacity.<br>

> Even If the link is never idle and the output rate is always C for all t.<br>

> <br>

> If the sum_i R_i = C is false because of what I said, than p_i which is <br>

> nothing more than<br>

> the shadow price of the link capacity constraint, can be a function of a <br>

> constant delay d, i.e. p_i = cost * d for all i.<br>

> <br>

> This theorem can be valid only if the input rate of a queue is <br>

> instantaneously equal to the output queue.<br>

> We all know that a queue exists just because there is an instantaneous <br>

> difference of input and output rates<br>

> at the link. So to conclude,  this theorem if valid iff input == output <br>

> rate then the queue is always zero, i.e.  d=0.<br>

> <br>

> The theorem is either an artefact of the model or just wrong. Or I'm <br>

> missing something...<br>

> <br>

> <br>

> <br>

> <br>

> <br>

> <br>

> On Thu, Nov 29, 2018 at 6:07 PM Mario Hock <<a href="mailto:mario.hock@kit.edu" target="_blank">mario.hock@kit.edu</a> <br>

> <mailto:<a href="mailto:mario.hock@kit.edu" target="_blank">mario.hock@kit.edu</a>>> wrote:<br>

> <br>

>     Hi Luca,<br>

> <br>

>     I'm answering on behalf of Roland, since I am a co-author of the paper.<br>

> <br>

>     This is an excellent question, since it goes right at the heart of how<br>

>     LoLa works.<br>

> <br>

>     Indeed, the paper is a first of a series. A second one, looking deeper<br>

>     into the fair flow balancing mechanism, is currently under submission.<br>

> <br>

>     Similar as other delay based congestion controls, LoLa tries to achieve<br>

>     fairness by allowing each flow to buffer the same amount of data at the<br>

>     bottleneck. We have this, e.g., in TCP Vegas, and (in a way) also in<br>

>     Copa (a recently proposed congestion control) and many others. If this<br>

>     is achieved, we get flow rate fairness independent of a flow's RTT.<br>

> <br>

>     Usually (in other congestion controls) this "allowed amount of data" is<br>

>     fixed per flow. We presume that this approach does not scale well to<br>

>     high speed networks. Since the queuing delay resulting from this amount<br>

>     of data is reduced with increasing bottleneck rate. Thus, it becomes<br>

>     harder to measure it right. This can easily be seen (and proven) for<br>

>     TCP<br>

>     Vegas.<br>

> <br>

>     Note: Just using higher fixed values is not an option, since it would<br>

>     not work at lower speeds anymore and also not with a large number of<br>

>     flows.<br>

> <br>

>     Therefore, LoLa tries to find a suitable value for the "allowed amount<br>

>     of data" dynamically. This is X(t).<br>

> <br>

>     Our approach is to grow X(t) over time during the Fair Flow Balancing<br>

>     phase. This phase ends when the queuing delay reaches 5ms. Thus, (in<br>

>     the<br>

>     ideal case) at the end of Fair Flow Balancing, X(t) is just as large<br>

>     that all flows at bottleneck create a queuing delay of 5ms, and all<br>

>     flows contribute equally to this queue. Hence, flow rate fairness is<br>

>     achieved. (Note that LoLa is designed in a way that t is (almost)<br>

>     synchronized among the competing flows.)<br>

> <br>

>     Generally, other ways of determining a suitable X(t) are<br>

>     conceivable. In<br>

>     our approach X(t) is a monotonically increasing function, but it is<br>

>     regularly reset as LoLa cycles between its states; i.e., after a<br>

>     queuing<br>

>     delay of 5ms is reached, the queue is drained and everything starts<br>

>     again. (Thus, the timespan where X(t) is monotonically increased is<br>

>     called a "round of fair flow balancing".)<br>

> <br>

>     This way we can overcome the constraint given in [1]:<br>

> <br>

>     """<br>

>     THEOREM 6 (FAIRNESS/DELAY TRADEOFF). For congestion control mechanisms<br>

>     that have steady state throughput of the kind R = f(d, p), for some<br>

>     function f, delay d and feedback p, if the feedback is based on purely<br>

>     end to end delay measurements, you can either have fairness or a fixed<br>

>     delay, but not both simultaneously<br>

>     """<br>

> <br>

>     [1] "ECN or Delay: Lessons Learnt from Analysis of DCQCN and TIMELY"<br>

>     Yibo Zhu et al., <a href="https://dl.acm.org/citation.cfm?id=2999593" rel="noreferrer" target="_blank">https://dl.acm.org/citation.cfm?id=2999593</a><br>

> <br>

>     Best, Mario<br>

> <br>

> <br>

>     Am 29.11.18 um 17:09 schrieb Luca Muscariello:<br>

>      > Hi Roland,<br>

>      ><br>

>      > It took me quite a lot of time to find this message in the thread...<br>

>      > I read the paper you sent and I guess this is the first of a<br>

>     series as<br>

>      > many things stay uncovered.<br>

>      ><br>

>      > Just a quick question: why is X(t) always increasing with  t?<br>

>      ><br>

>      ><br>

>      > On Tue, Nov 27, 2018 at 11:26 AM Bless, Roland (TM)<br>

>      > <<a href="mailto:roland.bless@kit.edu" target="_blank">roland.bless@kit.edu</a> <mailto:<a href="mailto:roland.bless@kit.edu" target="_blank">roland.bless@kit.edu</a>><br>

>     <mailto:<a href="mailto:roland.bless@kit.edu" target="_blank">roland.bless@kit.edu</a> <mailto:<a href="mailto:roland.bless@kit.edu" target="_blank">roland.bless@kit.edu</a>>>> wrote:<br>

>      ><br>

>      >     Hi Luca,<br>

>      ><br>

>      >     Am 27.11.18 um 10:24 schrieb Luca Muscariello:<br>

>      >      > A congestion controlled protocol such as TCP or others,<br>

>     including<br>

>      >     QUIC,<br>

>      >      > LEDBAT and so on<br>

>      >      > need at least the BDP in the transmission queue to get<br>

>     full link<br>

>      >      > efficiency, i.e. the queue never empties out.<br>

>      ><br>

>      >     This is not true. There are congestion control algorithms<br>

>      >     (e.g., TCP LoLa [1] or BBRv2) that can fully utilize the<br>

>     bottleneck link<br>

>      >     capacity without filling the buffer to its maximum capacity.<br>

>     The BDP<br>

>      >     rule of thumb basically stems from the older loss-based<br>

>     congestion<br>

>      >     control variants that profit from the standing queue that<br>

>     they built<br>

>      >     over time when they detect a loss:<br>

>      >     while they back-off and stop sending, the queue keeps the<br>

>     bottleneck<br>

>      >     output busy and you'll not see underutilization of the link.<br>

>     Moreover,<br>

>      >     once you get good loss de-synchronization, the buffer size<br>

>     requirement<br>

>      >     for multiple long-lived flows decreases.<br>

>      ><br>

>      >      > This gives rule of thumbs to size buffers which is also very<br>

>      >     practical<br>

>      >      > and thanks to flow isolation becomes very accurate.<br>

>      ><br>

>      >     The positive effect of buffers is merely their role to absorb<br>

>      >     short-term bursts (i.e., mismatch in arrival and departure rates)<br>

>      >     instead of dropping packets. One does not need a big buffer to<br>

>      >     fully utilize a link (with perfect knowledge you can keep the<br>

>     link<br>

>      >     saturated even without a single packet waiting in the buffer).<br>

>      >     Furthermore, large buffers (e.g., using the BDP rule of thumb)<br>

>      >     are not useful/practical anymore at very high speed such as<br>

>     100 Gbit/s:<br>

>      >     memory is also quite costly at such high speeds...<br>

>      ><br>

>      >     Regards,<br>

>      >       Roland<br>

>      ><br>

>      >     [1] M. Hock, F. Neumeister, M. Zitterbart, R. Bless.<br>

>      >     TCP LoLa: Congestion Control for Low Latencies and High<br>

>     Throughput.<br>

>      >     Local Computer Networks (LCN), 2017 IEEE 42nd Conference on, pp.<br>

>      >     215-218, Singapore, Singapore, October 2017<br>

>      > <a href="http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf" rel="noreferrer" target="_blank">http://doc.tm.kit.edu/2017-LCN-lola-paper-authors-copy.pdf</a><br>

>      ><br>

>      >      > Which is:<br>

>      >      ><br>

>      >      > 1) find a way to keep the number of backlogged flows at a<br>

>      >     reasonable value.<br>

>      >      > This largely depends on the minimum fair rate an<br>

>     application may<br>

>      >     need in<br>

>      >      > the long term.<br>

>      >      > We discussed a little bit of available mechanisms to<br>

>     achieve that<br>

>      >     in the<br>

>      >      > literature.<br>

>      >      ><br>

>      >      > 2) fix the largest RTT you want to serve at full<br>

>     utilization and size<br>

>      >      > the buffer using BDP * N_backlogged.<br>

>      >      > Or the other way round: check how much memory you can use<br>

>      >      > in the router/line card/device and for a fixed N, compute<br>

>     the largest<br>

>      >      > RTT you can serve at full utilization.<br>

>      >      ><br>

>      >      > 3) there is still some memory to dimension for sparse flows in<br>

>      >     addition<br>

>      >      > to that, but this is not based on BDP.<br>

>      >      > It is just enough to compute the total utilization of sparse<br>

>      >     flows and<br>

>      >      > use the same simple model Toke has used<br>

>      >      > to compute the (de)prioritization probability.<br>

>      >      ><br>

>      >      > This procedure would allow to size FQ_codel but also SFQ.<br>

>      >      > It would be interesting to compare the two under this<br>

>     buffer sizing.<br>

>      >      > It would also be interesting to compare another mechanism<br>

>     that we<br>

>      >     have<br>

>      >      > mentioned during the defense<br>

>      >      > which is AFD + a sparse flow queue. Which is, BTW, already<br>

>      >     available in<br>

>      >      > Cisco nexus switches for data centres.<br>

>      >      ><br>

>      >      > I think that the the codel part would still provide the<br>

>     ECN feature,<br>

>      >      > that all the others cannot have.<br>

>      >      > However the others, the last one especially can be<br>

>     implemented in<br>

>      >      > silicon with reasonable cost.<br>

>      ><br>

>      ><br>

>      > _______________________________________________<br>

>      > Bloat mailing list<br>

>      > <a href="mailto:Bloat@lists.bufferbloat.net" target="_blank">Bloat@lists.bufferbloat.net</a> <mailto:<a href="mailto:Bloat@lists.bufferbloat.net" target="_blank">Bloat@lists.bufferbloat.net</a>><br>

>      > <a href="https://lists.bufferbloat.net/listinfo/bloat" rel="noreferrer" target="_blank">https://lists.bufferbloat.net/listinfo/bloat</a><br>

>      ><br>

> <br>

</blockquote></div>