From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <roland.bless@kit.edu>
Received: from iramx2.ira.uni-karlsruhe.de (iramx2.ira.uni-karlsruhe.de
 [IPv6:2a00:1398:2::10:81])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 1C7D03B2A4
 for <bloat@lists.bufferbloat.net>; Mon,  6 Jul 2020 14:32:15 -0400 (EDT)
Received: from [2a00:1398:2:4006:617e:e00a:afea:6ea2]
 (helo=i72vorta.tm.kit.edu)
 by iramx2.ira.uni-karlsruhe.de with esmtpsa port 25 
 iface 2a00:1398:2::10:8 id 1jsVuQ-0001aC-EB; Mon, 06 Jul 2020 20:32:06 +0200
Received: from [IPv6:::1] (localhost [127.0.0.1])
 by i72vorta.tm.kit.edu (Postfix) with ESMTPS id 31298421196;
 Mon,  6 Jul 2020 20:32:06 +0200 (CEST)
To: Matt Mathis <mattmathis@google.com>, Dave Taht <dave.taht@gmail.com>
Cc: Carlo Augusto Grazia <carloaugusto.grazia@unimore.it>,
 jamshid@whatsapp.com, bloat <bloat@lists.bufferbloat.net>
References: <CAA93jw5RuBfDA=Yku6+Rm+YEdrUzvZMsoAwVXYduZjBmMVf43Q@mail.gmail.com>
 <CALDN43m=2SzkT4vLeqiFxE6PRd+ZKR1hdeMRwtqbTFuAL7nMLA@mail.gmail.com>
 <CAA93jw40TT08WF9we18gYKPe9xcfhAskw6J5soey=riehV_90Q@mail.gmail.com>
 <mailman.763.1593883755.24343.bloat@lists.bufferbloat.net>
From: Roland Bless <roland.bless@kit.edu>
Autocrypt: addr=roland.bless@kit.edu; prefer-encrypt=mutual; keydata=
 mQINBFi0OxABEACy2VohJ7VhSu/xPCt4/6qCrw4Pw2nSklWPfAYEk1QgrbiwgvLAP9WEhAIU
 w45cojBaDxytIGg8eaYeIKSmsXjHGbV/ZTfo8r11LX8yPYR0WHiMWZpl0SHUd/CZIkv2pChO
 88vF/2FKN95HDcp24pwONF4VhxJoSFk6c0mDNf8Em/Glt9BcWX2AAvizTmpQDshaPje18WH3
 4++KwPZDd/sJ/hHSXiPg1Gdhs/OG/C0CJguOAlqbgSVAe3qKOr1M4K5M+wVpsk373pXRfxd7
 ZAmZ05iBTn+LfgVcz+AfaKKcsWri5CdTT+7JDL6QNQpox+b5FXZFSHnEIST+/qzfG7G2LqqY
 mml6TYY8XbaNyXZP0QKncfSpRx8uTRWReHUa1YbSuOxXYh6bXpcugD25mlC/Lu0g7tz4ijiK
 iIwq9+P2H1KfAAfYyYZh6nOoE6ET0TjOjUSa+mA8cqjPWX99kEEgf1Xo+P9fx9QLCLWIY7zc
 mSM+vjQKgdUFpMSCKcYEKOuwlPuOz8bVECafxaEtJJHjCOK8zowe2eC9OM+G+bmtAO3qYcYZ
 hQ/PV3sztt/PjgdtnFAYPFLc9189rHRxKsWSOb4xPkRw/YQAI9l15OlUEpsyOehxmAmTsesn
 tSViCz++PCdeXrQc1BCgl8nDytrxW+n5w1aaE8aL3hn8M0tonQARAQABtChSb2xhbmQgQmxl
 c3MgKFRNKSA8cm9sYW5kLmJsZXNzQGtpdC5lZHU+iQJABBMBCAAqAhsDBQkSzAMABQsJCAcC
 BhUICQoLAgQWAgMBAh4BAheABQJYtYdHAhkBAAoJEKON2tlkOJXuzWkP+wfjUnDNzRm4r34a
 AMWepcQziTgqf4I1crcL6VD44767HhyFsjcKH31E5G5gTDxbpsM4pmkghKeLrpPo30YK3qb7
 E9ifIkpJTvMu0StSUmcXq0zPyHZ+HxHeMWkosljG3g/4YekCqgWwrB62T7NMYq0ATQe1MGCZ
 TAPwSPGCUZT3ioq50800FMI8okkGTXS3h2U922em7k8rv7E349uydv19YEcS7tI78pggMdap
 ASoP3QWB03tzPKwjqQqSevy64uKDEa0UgvAM3PRbJxOYZlX1c3q/CdWwpwgUiAhMtPWvavWW
 Tcw6Kkk6e0gw4oFlDQ+hZooLv5rlYR3egdV4DPZ1ugL51u0wQCQG9qKIMXslAdmKbRDkEcWG
 Oi2bWAdYyIHhhQF5LSuaaxC2P2vOYRHnE5yv5KTV3V7piFgPFjKDW+giCRd7VGfod6DY2b2y
 zwidCMve1Qsm8+NErH6U+hMpMLeCJDMu1OOvXYbFnTkqjeg5sKipUoSdgXsIo4kl+oArZlpK
 qComSTPhij7rMyeu/1iOwbNCjtiqgb55ZE7Ekd84mr9sbq4Jm/4QGnVI30q4U2vdGSeNbVjo
 d1nqjf3UNzP2ZC+H9xjsCFuKYbCX6Yy4SSuEcubtdmdBqm13pxua4ZqPSI0DQST2CHC7nxL1
 AaRGRYYh5zo2vRg3ipkEuQINBFi0OxABEAC2CJNp0/Ivkv4KOiXxitsMXZeK9fI0NU2JU1rW
 04dMLF63JF8AFiJ6qeSL2mPHoMiL+fG5jlxy050xMdpMKxnhDVdMxwPtMiGxbByfvrXu18/M
 B7h+E1DHYVRdFFPaL2jiw+Bvn6wTT31MiuG9Wh0WAhoW8jY8IXxKQrUn7QUOKsWhzNlvVpOo
 SjMiW4WXksUA0EQVbmlskS/MnFOgCr8q/FqwC81KPy+VLHPB9K/B65uQdpaw78fjAgQVQqpx
 H7gUF1EYpdZWyojN+V8HtLJx+9yWAZjSFO593OF3/r0nDHEycuOjhefCrqr0DDgTYUNthOdU
 KO2CzT7MtweRtAf0n27zbwoYvkTviIbR+1lV1vNkxaUtZ6e1rtOxvonRM1O3ddFIzRp/Qufu
 HfPe0YqhEsrBIGW1aE/pZW8khNQlB6qt20snL9cFDrnB6+8kDG3e//OjK1ICQj9Y/yyrJVaX
 KfPbdHhLpsgh8TMDPoH+XXQlDJljMD0++/o7ckO3Sfa8Zsyh1WabyKQDYXDmDgi9lCoaQ7Lf
 uLUpoMvJV+EWo0jE4RW/wBGQbLJp5usy5i0fhBKuDwsKdLG3qOCf4depIcNuja6ZmZHRT+3R
 FFjvZ/dAhrCWpRTxZANlWlLZz6htToJulAZQJD6lcpVr7EVgDX/y4cNwKF79egWXPDPOvQAR
 AQABiQIlBBgBCAAPBQJYtDsQAhsMBQkSzAMAAAoJEKON2tlkOJXukMoP/jNeiglj8fenH2We
 7SJuyBp8+5L3n8eNwfwY5C5G+etD0E6/lkt/Jj9UddTazxeB154rVFXRzmcN3+hGCOZgGAyV
 1N7d8xM6dBqRtHmRMPu5fUxfSqrM9pmqAw2gmzAe0eztVvaM+x5x5xID2WZOiOq8dx9KOKrp
 Zorekjs3GEA3V1wlZ7Nksx/o8KZ04hLeKcR1r06zEDLN/yA+Fz8IPa0KqpuhrL010bQDgAhe
 9o5TA0/cMJpxpLqHhX2As+5cQAhKDDsWJu3oBzZRkN7Hh/HTpWurmTQRRniLGSeiL0zdtilX
 fowyxGXH6QWi3MZYmpOq+etr7o4EGGbm2inxpVbM+NYmaJs+MAi/z5bsO/rABwdM5ysm8hwb
 CGt+1oEMORyMcUk/uRjclgTZM1NhGoXm1Un67+Rehu04i7DA6b8dd1H8AFgZSO2H4IKi+5yA
 Ldmo+ftCJS83Nf6Wi6hJnKG9aWQjKL+qmZqBEct/D2uRJGWAERU5+D0RwNV/i9lQFCYNjG9X
 Tew0BPYYnBtHFlz9rJTqGhDu4ubulSkbxAK3TIk8XzKdMvef3tV/7mJCmcaVbJ2YoNUtkdKJ
 goOigJTMBXMRu4Ibyq1Ei+d90lxhojKKlf9yguzpxk5KYFGUizp0dtvdNuXRBtYrwzykS6vB
 zTlLqHZ0pvGjNfTSvuuN
Organization: Institute of Telematics, Karlsruhe Institute of Technology (KIT)
Message-ID: <1e04c5f8-4bdc-1835-1070-04b0c2224526@kit.edu>
Date: Mon, 6 Jul 2020 20:32:05 +0200
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.1) Gecko/20060111
 Thunderbird/1.5 Mnenhy/0.7.3.0
MIME-Version: 1.0
In-Reply-To: <mailman.763.1593883755.24343.bloat@lists.bufferbloat.net>
Content-Type: multipart/alternative;
 boundary="------------1E71E0708BC41F7FF6874B53"
Content-Language: en-US
X-ATIS-AV: ClamAV (iramx2.ira.uni-karlsruhe.de)
X-ATIS-Checksum: v3zoCAcc32ckk
X-ATIS-Timestamp: iramx2.ira.uni-karlsruhe.de  esmtpsa 1594060326.505640901
Subject: Re: [Bloat] the future belongs to pacing
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Mon, 06 Jul 2020 18:32:20 -0000

This is a multi-part message in MIME format.
--------------1E71E0708BC41F7FF6874B53
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi Matt and Jamshid,

On 04.07.20 at 19:29 Matt Mathis via Bloat wrote:

> Key takeaway: pacing is inevitable, because it saves large content
> providers money (more efficient use of the most expensive silicon in
> the data center, the switch buffer memory), however to use pacing we
> walk away from 30 years of experience with TCP self clock, which is
> the foundation of all of our CC research....

Thanks for the interesting read. I have a few comments:

  * IMO, many of the mentioned problems are related to using packet loss
    as the congestion signal rather than to self-clocking itself.

  * In principle, one can keep utilization high and queuing delay low
    with a congestion-window-based, ACK-clock-driven approach (see TCP
    LoLa, https://ieeexplore.ieee.org/document/8109356). LoLa currently
    lacks heuristics to deal with stretch/aggregated ACKs, but I think
    it can be extended with them, as has already been done in BBR.

  * Pacing is really useful, and I think it is important to keep sending
    when the ACK clock is distorted by the mentioned problems, but only
    for a limited time. If one's estimate of the correct sending rate is
    too high, the amount of inflight data increases over time, which
    leads to queuing delay and/or loss. So having an inflight cap, as in
    BBRv1, is also an important safety measure.

  * "The maximum receive rate is probed by sending at 125% of max_BW.
    If the network is already full and flows have reached their fair
    share, the observed max_BW won't change."
    This assumption isn't true if several flows are present at the
    bottleneck. If a flow sends at 1.25*max_BW on the saturated link,
    *the observed max_BW will change* (unless all flows are probing at
    the same time), because the probing flow preempts the other flows,
    thereby reducing their current share. Together with the applied
    max-filter, this is why BBRv1 constantly overestimates the available
    capacity and thus persistently increases the amount of inflight data
    until the inflight cap is hit. The math is in [32] (section 3) of
    your references. Luckily, BBRv2 has many more safeguards built in.

  * "The lower queue occupancy indicates that it is not generally taking
    capacity away from other transport protocols..."
    I think that this indication is not very robust; it may hold, e.g.,
    when no significant packet loss is observed. Observing an overall
    lower buffer occupancy does not necessarily tell you anything about
    the individual flow shares. In BBRv1 you could have starving Cubic
    flows, because they were backing off due to loss while BBR kept
    sending.

  * Last but not least, even BBR requires an ACK stream as
    feedback in order to estimate the delivery rate. But it is actually
    not self-clocked and keeps sending "blindly" for a while. This is
    quite useful to deal with the mentioned stretch/aggregated ACKs,
    if done with care.
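
To make the max_BW point above a bit more concrete, here is a rough
toy sketch. It is not the analysis from [32], and it is not BBR code.
It assumes a fluid model in which an overloaded FIFO bottleneck
delivers each flow's traffic in proportion to its sending rate, flows
that take turns probing at 1.25x their estimate, and a max-filter over
the last few delivery-rate samples. The function name, the parameters
and the window length are all made up for illustration.

```python
from collections import deque


def simulate(capacity=100.0, n_flows=2, rounds=40, window=10, probe_gain=1.25):
    """Toy fluid model: flows with a max-filtered bandwidth estimate.

    Returns, per round, the sum of all flows' bandwidth estimates.
    All names and defaults are illustrative assumptions.
    """
    fair_share = capacity / n_flows
    # Each flow remembers its last `window` delivery-rate samples and
    # starts out exactly at its fair share.
    samples = [deque([fair_share], maxlen=window) for _ in range(n_flows)]
    sum_of_estimates = []

    for r in range(rounds):
        # Max-filter: the estimate is the largest recent sample.
        estimates = [max(s) for s in samples]
        prober = r % n_flows  # flows take turns probing (assumption)
        send = [
            est * (probe_gain if i == prober else 1.0)
            for i, est in enumerate(estimates)
        ]
        # Assumed FIFO behaviour under overload: every flow is scaled
        # down by the same factor, so shares follow the sending rates.
        scale = min(1.0, capacity / sum(send))
        for s, rate in zip(samples, send):
            s.append(rate * scale)
        sum_of_estimates.append(sum(max(s) for s in samples))

    return sum_of_estimates


if __name__ == "__main__":
    for r, total in enumerate(simulate()[:6]):
        print(f"round {r}: sum of estimates = {total:.2f} (capacity 100.00)")
```

In this model the probing flow's delivery-rate sample rises at the
expense of the other flow, and the max-filter keeps that higher sample
around, so the estimates add up to more than the bottleneck capacity
even though the link was already full and fairly shared at the start.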

Regards,
Roland


--------------1E71E0708BC41F7FF6874B53--