[Bloat] [Codel] RFC: Realtime Response Under Load (rrul) test specification

Fri Nov 9 05:34:38 EST 2012

On Tue, Nov 6, 2012 at 7:14 PM, Rick Jones <rick.jones2 at hp.com> wrote:
> On 11/06/2012 04:42 AM, Dave Taht wrote:
>>
>> I have been working on developing a specification for testing networks
>> more effectively for various side effects of bufferbloat, notably
>> gaming and voip performance, and especially web performance.... as
>> well as a few other things that concerned me, such as IPv6 behavior,
>> and the effects of packet classification.
>>
>> A key goal is to be able to measure the quality of the user experience
>> while a network is otherwise busy, with complex stuff going on in the
>> background, but with a simple presentation of the results in the end,
>> in under 60 seconds.
>
>
> Would you like fries with that?

and a shake!

>
> Snark aside, I think that being able to capture the state of the user
> experience in only 60 seconds is daunting at best.  Especially if this

Concur.

> testing is going to run over the Big Bad Internet (tm) rather than in a
> controlled test lab.

In my testing of this scheme, on networks ranging in size and
quality from a 4 hop mesh network to the internet, at random hotels
throughout the US and EU with baseline RTTs up to 200ms, and in lab
testing at multiple other locations, I was generally able to
generate a load that had "interesting" side effects within 40
seconds, and usually sooner.

My suggestion, as always, is for you (and others) to simply try
out the prototypes that are in the netperf-wrapper git repo, see
what you can see, learn what you can learn, and send back whatever
feedback you can.

In the longer term I would certainly like the simplest test for
latency under load added directly to netperf. The VOIP test is also
nifty.

>
>
>> While it's not done yet, it escaped into the wild today, and I might
>> as well solicit wider opinions on it, sooo... get the spec at:
>>
>> https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true
>
>
> Github is serving that up as a plain text file, which then has Firefox
> looking to use gedit to look at the file, and gedit does not seem at all
> happy with it.  It was necessary to download the file and open it "manually"
> in LibreOffice.

Sorry. The original was in emacs org mode. Shall I put that up instead?

>
>> MUST run long enough to defeat bursty bandwidth optimizations such as
>> PowerBoost and discard data from that interval.
>
>
> I'll willingly display my ignorance, but for how long does PowerBoost and
> its cousins boost bandwidth?
>
> I wasn't looking for PowerBoost, and given the thing being examined I wasn't
> seeing that, but recently when I was evaluating the network performance of
> something "out there" in the cloud (not my home cloud as it were though) I
> noticed performance spikes repeating at intervals which would require > 60
> seconds to "defeat"

I too have seen oddball spikes - for example, older forms of sfq do
a hash perturbation every 10 seconds and totally wipe out many TCP
connections by doing so.

I regard your problem detailed above as an edge case, compared to the
much more gross effects this benchmark generates.

Certainly being able to run the tests for longer intervals and capture
traffic would be useful for network engineers.
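
As a rough sketch of the "discard data from that interval" requirement:
the snippet below drops throughput samples taken during an initial
burst window and averages the rest. The function name, the sample
format, and the 20 second default are all assumptions made for
illustration; they do not come from the spec or from netperf-wrapper,
and the window is not a measured PowerBoost duration.

```python
from statistics import mean

def steady_state_throughput(samples, skip_seconds=20.0):
    """Mean throughput after dropping the initial burst interval.

    samples: iterable of (seconds_since_start, bits_per_second) pairs.
    skip_seconds: assumed length of the boost interval to discard.
    """
    kept = [rate for t, rate in samples if t >= skip_seconds]
    if not kept:
        # The run ended before the discarded window did.
        raise ValueError("test did not run past the discarded interval")
    return mean(kept)

# Made-up samples: a boosted start followed by the sustained rate.
samples = [(0, 50e6), (10, 50e6), (20, 20e6), (30, 20e6)]
print(steady_state_throughput(samples))
```

A longer run only changes how many samples survive the cut, which is
why a configurable test length would help here.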

>
>> MUST track and sum bi-directional throughput, using estimates for ACK
>> sizes of ipv4, ipv6, and encapsulated ipv6 packets, udp and tcp_rr
>> packets, etc.
>
>
> Estimating the bandwidth consumed by ACKs and/or protocol headers, using
> code operating at user-space, is going to be guessing.  Particularly
> portable user-space.  While those things may indeed affect the user's
> experience, the user doesn't particularly care about ACKs or header sizes.
> She cares how well the page loads or the call sounds.

I feel an "optimum" ACK overhead should be calculated, rather than
the actual overhead (which is impossible to measure).

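
One way such an "optimum" figure might be computed is sketched below.
It assumes full-size data packets, one delayed ACK per two segments,
TCP timestamps enabled, no IP options or extension headers, and 6in4
as the encapsulation (one extra IPv4 header). It ignores link-layer
framing. The function names and defaults are mine, for illustration
only.

```python
# Header sizes in bytes, without IP options or extension headers.
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12  # 10 byte option padded to 12

def ack_size(ip_header, encapsulation=0, timestamps=True):
    """Size in bytes of a pure TCP ACK at the IP layer."""
    options = TCP_TIMESTAMP_OPTION if timestamps else 0
    return encapsulation + ip_header + TCP_HEADER + options

def optimum_ack_rate(data_rate_bps, packet_size, ip_header,
                     encapsulation=0, segments_per_ack=2):
    """Estimated reverse-path bits/s used by ACKs for one bulk flow.

    data_rate_bps: forward rate in bits/s, counted in IP packets.
    packet_size: size in bytes of each forward data packet.
    """
    data_packets_per_sec = data_rate_bps / (packet_size * 8)
    acks_per_sec = data_packets_per_sec / segments_per_ack
    return acks_per_sec * ack_size(ip_header, encapsulation) * 8

# ACK sizes for ipv4, ipv6, and ipv6 encapsulated in ipv4 (6in4).
print(ack_size(IPV4_HEADER))
print(ack_size(IPV6_HEADER))
print(ack_size(IPV6_HEADER, encapsulation=IPV4_HEADER))

# Reverse-path ACK load for a 12 Mbit/s ipv4 download in 1500 byte packets.
print(optimum_ack_rate(12_000_000, 1500, IPV4_HEADER))
```

This gives a lower bound to sum into the bi-directional total; real
stacks will ACK more often during loss or with SACK blocks attached.
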
>> MUST have the test server(s) within 80ms of the testing client
>
>
> Why?  Perhaps there is something stating that some number of nines worth of
> things being accessed are within 80ms of the user.  If there is, that should
> be given in support of the requirement.

Continental US (CONUS) distance. Despite my pushing the test to
200ms, I have a great deal more confidence it will work consistently
at 80ms.

I can make this a "SHOULD" if you like.

>> This portion of the test will take your favorite website as a target
>> and show you how much it will slow down, under load.
>
>
> Under load on the website itself, or under load on one's link.  I ass-u-me
> the latter, but that should be made clear.  And while the chances of the
> additional load on a web site via this testing is likely epsilon, there is
> still the matter of its "optics" if you will - how it looks.  Particularly
> if there is going to be something distributed with a default website coded
> into it.
>
> Further, websites are not going to remain static, so there will be the
> matter of being able to compare results over time.  Perhaps that can be
> finessed with the "unloaded" (again I assume relative to the link of
> interest/test) measurement.

A core portion of the test really is comparing unloaded vs loaded
performance of the same place, in the same test, over the course of
about a minute.

And as these two baseline figures are kept, they can be compared for
any given website from any given location, over time and across
changes in the underlying network.
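
A minimal sketch of that comparison, assuming page load times in
milliseconds have already been collected for the unloaded and loaded
phases of one run. The function name, the use of the median, and the
sample numbers are illustrative choices of mine, not part of the spec.

```python
from statistics import median

def load_slowdown(unloaded_ms, loaded_ms):
    """Compare page load times with and without link load.

    Returns (unloaded median, loaded median, slowdown ratio).
    """
    base = median(unloaded_ms)
    busy = median(loaded_ms)
    return base, busy, busy / base

# Made-up load times for the same site within a single test run.
unloaded = [100, 110, 120]
loaded = [400, 440, 500]
base, busy, ratio = load_slowdown(unloaded, loaded)
print(f"unloaded {base} ms, loaded {busy} ms, slowdown {ratio:.1f}x")
```

Keeping both medians, rather than only the ratio, is what lets later
runs be compared even after the website itself has changed.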

> rick jones

-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html