[Bloat] Updated Bufferbloat Test

Wed Feb 24 13:22:05 EST 2021

Hi all,

A couple of months ago my co-founder Sam posted an early beta of the
Bufferbloat test that we’ve been working on, and Dave also linked to
it a couple of weeks ago.

Thank you all so much for your feedback - we almost entirely
redesigned the tool and the UI based on the comments we received.
We’re almost ready to launch the tool officially today at this URL,
but wanted to show it to the list in case anyone finds any last bugs
that we might have overlooked:

https://www.waveform.com/tools/bufferbloat

If you find a bug, please share the "Share Your Results" link with us
along with what happened. We capture some debugging information on the
backend, and having a share link allows us to diagnose any issues.

This is really more of a passion project than anything else for us –
we don’t anticipate we’ll try to commercialize it or anything like
that. We're very thankful for all the work the folks on this list have
done to identify and fix bufferbloat, and hope this is a useful
contribution. I’ve personally been very frustrated by bufferbloat on a
range of devices, and decided it might be helpful to build another
bufferbloat test when the DSLReports test was down at some point last
year.

Our goals with this project were:
  * Build a second solid bufferbloat test in case DSLReports goes down again.
  * Build a test where bufferbloat is front and center as the primary
purpose of the test, rather than just a feature.
  * Try to explain bufferbloat and its effect on a user's connection
as clearly as possible for a lay audience.

A few notes:
  * On the backend, we’re using Cloudflare’s CDN to perform the actual
download and upload speed test. I know John Graham-Cumming has posted
to this list in the past; if he or anyone from Cloudflare sees this,
we’d love some help. Our Cloudflare Workers are being
bandwidth-throttled due to having a non-enterprise grade account.
We’ve worked around this in a kludgy way, but we’d love to get it
resolved.
  * We have lots of ideas for improvements, e.g. simultaneous
upload/downloads, trying different file size chunks, time-series
latency graphs, using WebRTC to test UDP traffic, etc., but in the
interest of getting things launched we're sticking with the current
feature set.
  * There are a lot of browser-specific workarounds that we had to
implement, and latency itself is measured in different ways on
Safari/WebKit vs Chromium/Firefox due to limitations of the
PerformanceTiming APIs. You may notice that latency is different on
different browsers; however, the actual bufferbloat (relative increase
in latency) should be pretty consistent.
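
To illustrate why a browser-specific measurement offset matters less
than it might seem, here is a minimal sketch, assuming bufferbloat is
summarized as the median loaded latency minus the median unloaded
latency. The function name, the sample values, and the constant 15 ms
offset are hypothetical and are not taken from the tool's code.

```python
from statistics import median


def latency_increase_ms(unloaded_ms, loaded_ms):
    """Return the median latency increase under load, in milliseconds."""
    return median(loaded_ms) - median(unloaded_ms)


# Hypothetical samples: browser B reports every latency 15 ms higher
# than browser A because it measures at a different point.
browser_a_unloaded = [20.0, 22.0, 21.0]
browser_a_loaded = [80.0, 85.0, 90.0]
offset = 15.0
browser_b_unloaded = [x + offset for x in browser_a_unloaded]
browser_b_loaded = [x + offset for x in browser_a_loaded]

# The constant offset cancels out of the difference.
increase_a = latency_increase_ms(browser_a_unloaded, browser_a_loaded)
increase_b = latency_increase_ms(browser_b_unloaded, browser_b_loaded)
print(increase_a, increase_b)
```

A real offset is unlikely to be perfectly constant, so the two numbers
would be close rather than identical in practice.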

In terms of some of the changes we made based on the feedback we
received on this list:

Based on Toke’s feedback:
https://lists.bufferbloat.net/pipermail/bloat/2020-November/015960.html
https://lists.bufferbloat.net/pipermail/bloat/2020-November/015976.html
  * We changed the way the speed tests run to show an instantaneous
speed as the test is being run.
  * We moved the bufferbloat grade into the main results box.
  * We tried really hard to get as close to saturating gigabit
connections as possible. We completely redesigned the way we chunk
files, added a “warming up” period, and spent quite a bit of time optimizing
our code to minimize CPU usage, as we found that was often the
limiting factor to our speed test results.
  * We changed the shield grades altogether and went through a few
different iterations of how to show the effect of bufferbloat on
connectivity, and ended up with a “table view” to try to show the
effect that bufferbloat specifically is having on the connection
(compared to when the connection is unloaded).
  * We now link from the results table view to the FAQ where the
conditions for each type of connection are explained.
  * We also changed the way we measure latency and now use the faster
of either Google’s CDN or Cloudflare at any given location. We’re also
using the WebTiming APIs to get a more accurate latency number, though
this does not work on some mobile browsers (e.g. iOS Safari) and as a
result we show a higher latency on mobile devices. Since our test is
less a test of absolute latency and more a test of relative latency
with and without load, we felt this was workable.
  * Our jitter is now an average (was previously RMS).
  * The “before you start” text was rewritten and moved above the start button.
  * We now spell out upload and download instead of having arrows.
  * We hugely reduced the number of cross-site scripts. I was a bit
embarrassed by this if I’m honest - I spent a long time building web
tools for the EFF, where we almost never allowed any cross-site
scripts. Our site is hosted on Shopify, and adding any features via
their app store ends up adding a whole lot of gunk. But we uninstalled
some apps, rewrote our template, and ended up removing a whole lot of
the gunk. There’s still plenty of room for improvement, but it should
be a lot better than before.
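
As a rough illustration of showing an instantaneous speed while a
chunked test is running, here is a minimal sketch that sums the bytes
of chunks completed within a sliding window. The function name, the
one-second window, and the chunk sizes are assumptions made for the
example, not a description of how the tool is implemented.

```python
def instantaneous_mbps(chunks, window_s=1.0):
    """Estimate current throughput in Mbit/s from recent chunk completions.

    chunks: list of (finish_time_s, size_bytes), ordered by finish time.
    Only chunks that finished within the last `window_s` seconds count.
    """
    if not chunks:
        return 0.0
    now = chunks[-1][0]  # treat the latest completion as "now"
    recent_bytes = sum(size for t, size in chunks if now - t < window_s)
    return recent_bytes * 8 / window_s / 1e6


# Hypothetical chunk completions: (seconds since start, bytes).
chunks = [
    (0.2, 1_000_000),
    (0.6, 1_000_000),
    (1.0, 1_500_000),
    (1.4, 1_500_000),
]
print(instantaneous_mbps(chunks))
```

During the first `window_s` seconds this estimate reads low, because
the window is not yet full.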

Based on Dave Collier-Brown’s feedback:
https://lists.bufferbloat.net/pipermail/bloat/2020-November/015966.html
  * We replaced the “unloaded” and “loaded” language with “unloaded”
and then “download active” and “upload active.” In the grade box we
indicate that, for example, “Your latency increased moderately under
load.”
  * We tried to generally make it easier for non-techie folks to
understand by emphasizing the grade and adding the table showing how
bufferbloat affects some commonly-used services.
  * We didn’t really change the candle charts too much - they’re
mostly just to give a basic visual - we focused more on the actual
meat of the results above that.

Based on Sebastian Moeller’s feedback:
https://lists.bufferbloat.net/pipermail/bloat/2020-November/015963.html
  * We considered doing a bidirectional saturating load, but decided
to skip on implementing it for now. It’s definitely something we’d
like to experiment with more in the future.
  * We added a “warming up” period as well as a “draining” period to
help fill and empty the buffer. We haven’t added the option for an
extended test, but have this on our list of backlog changes to make in
the future.
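
One way to picture the warm-up and draining periods is as windows
whose latency samples are left out of the loaded measurement. The
sketch below assumes exactly that; the function name, the timings,
and the sample values are hypothetical, and the tool may handle these
periods differently.

```python
def measured_samples(samples, warmup_s, load_end_s):
    """Keep latency samples taken after warm-up and before the load stops.

    samples: list of (time_s, latency_ms), time measured from load start.
    Samples before `warmup_s` (buffer still filling) and at or after
    `load_end_s` (buffer draining) are dropped.
    """
    return [lat for t, lat in samples if warmup_s <= t < load_end_s]


# Hypothetical run: 2 s warm-up, load stops at 8 s, then the buffer drains.
samples = [
    (0.5, 25.0),   # warm-up: buffer still filling
    (1.5, 60.0),   # warm-up
    (3.0, 95.0),   # measured
    (5.0, 100.0),  # measured
    (8.5, 70.0),   # draining
    (9.5, 30.0),   # draining
]
kept = measured_samples(samples, warmup_s=2.0, load_end_s=8.0)
print(kept)
```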

Based on Y’s feedback:
https://lists.bufferbloat.net/pipermail/bloat/2020-November/015962.html
  * We actually ended up removing the grades, but we explained our
criteria for the new table in the FAQ.

Based on Greg White's feedback (shared privately):
  * We added an FAQ answer explaining jitter and how we measure it.
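
For readers wondering how an averaged jitter differs from an RMS one,
here is a minimal sketch. It assumes jitter is derived from the
absolute differences between consecutive latency samples, which is one
common definition and may not match the FAQ's exact method; the
function names and sample values are hypothetical.

```python
from math import sqrt


def consecutive_deltas(latencies_ms):
    """Absolute differences between neighbouring latency samples."""
    return [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]


def mean_jitter(latencies_ms):
    """Jitter as the plain average of the deltas."""
    deltas = consecutive_deltas(latencies_ms)
    return sum(deltas) / len(deltas) if deltas else 0.0


def rms_jitter(latencies_ms):
    """Jitter as the root mean square of the deltas."""
    deltas = consecutive_deltas(latencies_ms)
    return sqrt(sum(d * d for d in deltas) / len(deltas)) if deltas else 0.0


# Hypothetical samples with one latency spike.
latencies = [20.0, 22.0, 20.0, 30.0, 20.0]
print(mean_jitter(latencies), rms_jitter(latencies))
```

RMS squares each delta before averaging, so a single spike pulls it up
more than it pulls up the plain average.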

We’d love for you all to play with the new version of the tool and
send over any feedback you might have. We’re going to be in a feature
freeze before launch but we'd love to get any bugs sorted out. We'll
likely put this project aside after we iron out a last round of bugs
and launch, and turn back to working on projects that help us pay the
bills, but we definitely hope to revisit and improve the tool over
time.

Best,

Sina, Arshan, and Sam.