[Bloat] Progress with latency-under-load tool
Jonathan Morton
chromatix99 at gmail.com
Wed Mar 23 13:40:50 PDT 2011
On 23 Mar, 2011, at 9:27 pm, Otto Solares wrote:
>> Unfortunately that patch will not work - it completely breaks part of the on-wire protocol. It is much better to simply convert the final results for an auxiliary display.
>
> Yeah, after hours of running, some results seem wrong; hopefully that is not
> down to the part I missed about converting a float to network byte order.
Well, it was that incorrect endianness-conversion which finally made me throw my hands up in despair. I didn't bother reading to the end of the patch.
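For readers unfamiliar with the endianness issue, here is a minimal sketch of converting a float to network byte order and back. It assumes a 4-byte IEEE 754 single-precision value; the tool's actual on-wire layout is not described in this message, and the helper names float_to_wire and float_from_wire are hypothetical.

```python
import struct

def float_to_wire(value: float) -> bytes:
    """Pack a float as a 4-byte IEEE 754 single in network (big-endian) order."""
    # "!" selects network byte order regardless of the host CPU.
    return struct.pack("!f", value)

def float_from_wire(data: bytes) -> float:
    """Unpack a 4-byte network-order single back into a Python float."""
    return struct.unpack("!f", data)[0]

wire = float_to_wire(1.5)
print(wire.hex())             # 3fc00000
print(float_from_wire(wire))  # 1.5
```

Both ends must agree on the width and the byte order; if only one side converts, the values decode as nonsense on a little-endian host such as x86.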
By contrast, changing the random-number generator has only a minor impact on the program, namely how fast it can generate traffic and whether a network compression engine might find compressible weaknesses in it (the default UNIX rand() function is potentially weak enough for that). I could quite happily drop in a standalone Mersenne Twister implementation, so long as I could still show it was fast enough on my old Pentium-MMX.
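As a rough illustration of the compressibility concern, the sketch below compares a deliberately weak generator with Python's built-in Mersenne Twister. The weak generator is a hypothetical worst case (the low byte of a power-of-two linear congruential generator), not a model of any particular rand() implementation, and zlib merely stands in for a network compression engine.

```python
import random
import zlib

def weak_bytes(n, seed=1):
    """Low byte of a mod-2**32 LCG: a deliberately weak, hypothetical generator."""
    state = seed
    out = bytearray()
    for _ in range(n):
        state = (state * 1103515245 + 12345) % 2**32
        out.append(state & 0xFF)  # the low 8 bits repeat every 256 steps at most
    return bytes(out)

def mt_bytes(n, seed=1):
    """Bytes drawn from Python's random module, which is a Mersenne Twister."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

def ratio(data):
    """Compressed size divided by original size; near 1.0 means incompressible."""
    return len(zlib.compress(data, 9)) / len(data)

n = 65536
print("weak LCG low byte:", ratio(weak_bytes(n)))
print("Mersenne Twister: ", ratio(mt_bytes(n)))
```

A payload built like the first stream would let a compressing link carry far fewer bytes than the test believes it is sending, which would inflate the measured throughput.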
>> I should also point out that I have very strong reasons for providing the measurements in non-traditional units by default. I'm measuring characteristics as they matter to applications and users, who measure things in bytes and frames per second, not bits and milliseconds. It is also much easier to get nontechnical people (who tend to be in charge of budgets) to respond to bigger-is-better numbers.
>
> Understood your PoV; sadly even my bosses (who lack any degree of
> technicality) know that the Internet is sold to us in Mb/s (decimal)
> and that delay (as they call it) is measured in ms.
I do plan to add the traditional units as a secondary output, but I want to finish proving that high-frequency networks actually do exist first. The existing units will remain primary for the reasons outlined below.
> Good luck getting nontechnical people to use a CLI tool! ;)
That is also a valid point, though I expect the tool to be used by at least moderately technical people, and the results to be usable by less technical people. It's also possible that it might eventually be developed into a GUI tool. At the moment, the length of time it takes to run a test creates a substantial selection for patience - but this is partially deliberate, because it takes time to be sure of exercising the network's worst-case performance modes.
The real point is that applications don't care one jot about bits-per-second, which is universally contaminated by overheads such as packet headers, error correction and retransmissions. What they care about is the bytes in the payload, so that's what I'm measuring. Even if the Internet is *sold* in Mbps, it is *used* in KiB/s, and the usual conversion factors between the two are routinely found to be inaccurate (not least when the weasel-words "up to" are involved).
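To make the gap between the two units concrete, here is a small worked conversion. The framing figures assume full-size TCP/IPv4 segments without options on Ethernet with a 1500-byte MTU; a real link may be sold or framed at a different layer, and retransmissions are not modelled at all. The function names are hypothetical.

```python
def advertised_to_kib_per_s(mbps):
    """Convert an advertised decimal Mb/s rate to KiB/s, ignoring all overhead."""
    return mbps * 1_000_000 / 8 / 1024

# Assumed per-packet byte counts (not taken from the tool itself).
PAYLOAD = 1500 - 20 - 20          # MTU minus IPv4 and TCP headers
ON_WIRE = 1500 + 14 + 4 + 8 + 12  # plus Ethernet header, FCS, preamble, interframe gap

def payload_kib_per_s(mbps):
    """Best-case payload rate in KiB/s after the assumed framing overhead."""
    return advertised_to_kib_per_s(mbps) * PAYLOAD / ON_WIRE

print(round(advertised_to_kib_per_s(10), 1))  # 1220.7
print(round(payload_kib_per_s(10), 1))        # 1158.8
```

Even in this best case, about 5% of a "10 Mb/s" link never reaches the application, before any loss, retransmission, or "up to" shortfall is counted.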
Similarly, games run in frames-per-second, so comparing the Responsiveness number to the performance of your graphics card lets you know how many frames of lag are induced purely by network conditions. Equivalently, the Smoothness number compared to the framerate of a typical video (30fps = 30Hz) tells you how many frames the video player needs to buffer before it can reliably let you see anything. There is not such an obvious link between the network and the application if you measure the network in milliseconds, although the smoothness of current networks is so poor that you can often see it in a Web browser's download progress bar.
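The comparison described above amounts to a simple division. This sketch assumes that both Responsiveness and Smoothness are reported in Hz, i.e. as the reciprocal of a delay in seconds; the function names and the sample figures are hypothetical and not taken from the tool.

```python
import math

def hz_from_ms(delay_ms):
    """Express a delay in milliseconds as a frequency in Hz."""
    return 1000.0 / delay_ms

def frames_of_lag(frame_rate_hz, responsiveness_hz):
    """Frames rendered during one network response interval."""
    return frame_rate_hz / responsiveness_hz

def frames_to_buffer(video_fps, smoothness_hz):
    """Whole frames a player must hold to ride out the worst delivery gap."""
    return math.ceil(video_fps / smoothness_hz)

print(hz_from_ms(250))          # 4.0
print(frames_of_lag(60, 12))    # 5.0
print(frames_to_buffer(30, 4))  # 8
```

With these made-up numbers, a game rendering at 60fps over a 12Hz network is five frames behind, and a 30fps video over a 4Hz network needs eight frames buffered.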
The smoothness and responsiveness numbers I'm getting over Ethernet are both absurdly low, suggesting that packet loss due to queue overstuffing is endemic, even with packet buffers that are already far too large in an attempt to stave it off. In some cases I am even seeing connections dropping due to complete starvation - multiple consecutive retransmissions are being dropped. This is not how people think of a modern network, especially not the wired, full-duplex, switched Ethernet that I'm measuring right now.
- Jonathan