<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><font size="-1">At work, I recently had a database outage due to
        network saturation and timeouts, which we proposed to address by
        setting up a QoS policy for the machines in question. However,
        judging from the discussion at Ms Drucker's BBR talk, that could
        lead us to do <i>A Bad Thing</i> (;-))</font></p>


    <p><font size="-1">Let's start at the beginning, though.  The talk,
        mentioned earlier on the list [1], was about the interaction of
        BBR and large amounts of buffering, specifically for video
        traffic.  I attended it, and listened with interest to the

        questions from the committee. She subsequently gave me a copy of

        the paper and presentation, which I appreciate: it's very good

        work.<br>

      </font></p>

    <p><font size="-1">She reported how severe the effect of large buffers
        on BBR is. I've attached a screenshot, but the list probably won't
        take it, so I'll describe it. With large buffers, after the first
        few packets RTT rises and throughput plummets, then stays low for
        about 200,000 ms (200 seconds). It then rises to about half the
        initial throughput for about 50,000 ms (50 seconds) as RTT falls,
        after which throughput plummets once more. This pattern repeats
        throughout the test.</font></p>

    <p><font size="-1"><font size="-1">Increasing the buffering in the

          test environment turns perfectly reasonable performance into a

          real disappointment, even though BBR is trying to estimate <i>the

            network’s bandwidth-delay product, BDP, and regulating its </i><i>sending

            rate to maximize throughput while attempting to maintain BDP

            worth of packets in the </i><i>buffer, irrespective of the

            size of the buffer</i>.<br>

        </font></font></p>
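    <p><font size="-1">To put rough numbers on why the buffer size matters,
        here is a minimal Python sketch of the BDP arithmetic. The
        100 Mbit/s link and 20 ms base RTT are made-up figures for
        illustration, not values from the paper or the talk, and the
        function names are my own.</font></p>

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes: bits per second times seconds, over 8."""
    return bandwidth_bps * rtt_s / 8


def queueing_delay_s(queued_bytes, bandwidth_bps):
    """Extra delay a packet sees when it waits behind queued_bytes at the bottleneck."""
    return queued_bytes * 8 / bandwidth_bps


# Hypothetical link, chosen only for illustration.
LINK_BPS = 100_000_000   # 100 Mbit/s bottleneck
BASE_RTT_S = 0.020       # 20 ms RTT with empty queues

bdp = bdp_bytes(LINK_BPS, BASE_RTT_S)
print(f"BDP = {bdp:.0f} bytes")

# A queue holding N x BDP of data adds N x the base RTT in delay.
for multiple in (1, 10, 100):
    delay = queueing_delay_s(multiple * bdp, LINK_BPS)
    print(f"{multiple:>3} x BDP queued -> {delay * 1000:.0f} ms added to RTT")
```

    <p><font size="-1">The point of the sketch is only that a buffer able
        to hold many multiples of the BDP can add many multiples of the
        base RTT in queueing delay if a sender ever fills it.</font></p>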

    <p><font size="-1">One of the interesting questions was about the
        token-bucket algorithm used in the router to limit performance.
        In her paper, she discusses the token bucket filter used by
        OpenWrt 19.07.1 on a Linksys WRT1900ACS router. Allowing more than
        the actual bandwidth of the interface as the <i>burst rate</i> can
        exacerbate the buffering problem, so the questioner was concerned
        that routers "in the wild" might also be contributing to the
        poor performance by using token-bucket algorithms with "excess
        burst size" parameters.</font></p>
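    <p><font size="-1">The concern can be illustrated with a toy,
        tick-based token bucket in Python. This is a sketch under my own
        assumptions, not the OpenWrt filter or any vendor's policer: the
        bucket refills at the same rate the downstream link drains, the
        sender always offers more than that rate, and all names and
        numbers are hypothetical.</font></p>

```python
def peak_backlog(rate, bucket_depth, offered, ticks):
    """Peak bytes queued downstream of a simple token bucket.

    rate         -- bytes per tick; used both as the token refill rate and
                    as the drain rate of the downstream link (an assumption)
    bucket_depth -- maximum number of tokens (bytes) the bucket may hold
    offered      -- bytes per tick the sender offers
    ticks        -- number of ticks to simulate
    """
    tokens = bucket_depth   # start with a full bucket, as after a lull
    backlog = 0             # bytes waiting in the downstream buffer
    peak = 0
    for _ in range(ticks):
        tokens = min(bucket_depth, tokens + rate)   # refill, capped at depth
        passed = min(offered, tokens)               # bytes the bucket lets through
        tokens -= passed
        backlog = max(0, backlog + passed - rate)   # link drains `rate` per tick
        peak = max(peak, backlog)
    return peak


RATE = 1000  # hypothetical: bytes per tick
for depth in (RATE, 2 * RATE, 10 * RATE):
    print(depth, peak_backlog(RATE, depth, offered=3 * RATE, ticks=100))
```

    <p><font size="-1">In this model, any bucket depth beyond one tick's
        worth of link capacity is passed straight through as a burst and
        then sits in the downstream buffer: the peak backlog comes out as
        the bucket depth minus one tick of drain.</font></p>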

    <p><font size="-1">The very first Cisco manual I found in a Google

        search explained how to <b><i>set</i></b> excess burst size (!)</font></p>

    <p><font size="-1"><a class="moz-txt-link-freetext"

href="https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/qos_plcshp/configuration/12-4/qos-plcshp-12-4-book.pdf"

          moz-do-not-send="true">https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/qos_plcshp/configuration/12-4/qos-plcshp-12-4-book.pdf</a>

        defines excess burst size as <i>Traffic that falls between the

          normal burst size and the Excess Burst size</i> and specifies

        it will be sent regardless, <i>with a probability that

          increases as the burst size increases.</i></font></p>

    <p><font size="-1">A little later, it explains that the excess or
        "extended" burst size <i>exists so as to avoid tail-drop behavior,
          and, instead, engage behavior like that of Random Early
          Detection (RED).</i></font></p>

    <p><font size="-1">In order to avoid tail drop, they suggest the

        "extended burst" be set to twice the burst size, where the burst

        size by definition is the capacity of the interface, per unit

        time.</font></p>


    <p><font size="-1">So, folks, am I right in thinking that Cisco's

        recommendation just might be a <i>terrible</i> piece of

        advice?  <br>

      </font></p>

    <p><font size="-1"><font size="-1">As a capacity planner, to me it
          sounds a lot like they're praying for a conveniently timed lull
          after every time they let too many bytes through.</font></font></p>

    <p><font size="-1"><font size="-1">As a follower of the discussion
          here, I find the references to tail drop and RED faintly ...
          antique.</font></font></p>

    <p><font size="-1">--dave c-b</font></p>

    <p><font size="-1">[1] </font> <a
href="https://www.cs.stonybrook.edu/Rebecca-Drucker-Research-Proficiency-Presentation-Investigating-BBR-Bufferbloat-Problem-DASH-Video"
        moz-do-not-send="true">https://www.cs.stonybrook.edu/Rebecca-Drucker-Research-Proficiency-Presentation-Investigating-BBR-Bufferbloat-Problem-DASH-Video</a> </p>

    <pre class="moz-signature" cols="72">-- 

David Collier-Brown,         | Always do right. This will gratify

System Programmer and Author | some people and astonish the rest

<a class="moz-txt-link-abbreviated" href="mailto:davecb@spamcop.net" moz-do-not-send="true">davecb@spamcop.net</a>           |                      -- Mark Twain

</pre>

  </body>

</html>