For web browsing, average jitter is a legitimate measure, but for
interactive media (two-way VoIP or video conferencing) peak or maximum
jitter is the only valid measurement.

Simon

On February 24, 2021 9:57:13 PM Sina Khanifar wrote:

> Thanks for the feedback, Dave!
>
>> 0) "average" jitter is a meaningless number. In the case of a
>> videoconferencing application, what matters most is max jitter, where
>> the app will choose to ride the top edge of that, rather than follow
>> it. I'd prefer using a 98% number, rather than a 75% number, to
>> weight where the typical delay in a videoconference might end up.
>
> Both DSLReports and Ookla's desktop app report jitter as an average
> rather than as a max number, so I'm a little hesitant to go against
> the norm - users might find it a bit surprising to see much larger
> jitter numbers reported. We're also not taking a whole ton of latency
> tests in each phase, so the 98% will often end up being the max
> number.
>
> With regard to videoconferencing, we actually ran some real-world
> tests of Zoom with various levels of bufferbloat/jitter/latency, and
> calibrated our "real-world results" table on that basis. We used
> average jitter in those tests ... I think if we used 98% or even 95%,
> the allowable number would be quite high.
>
>> 1) The worst case scenario of bloat affecting a user's experience is
>> during a simultaneous up- and download, and I'd rather you did that
>> than test them separately. You also get a more realistic figure for
>> the actual achievable bandwidth under contention, and can expose
>> problems like strict priority queuing in one direction or the other
>> locking out further flows.
>
> We did consider this based on another user's feedback, but didn't
> implement it. Perhaps we can do this next time we revisit, though!
>
>> This points to any number of problems (features!). It's certainly my
>> hope that all the CDN makers at this point have installed bufferbloat
>> mitigations.
>> Testing a CDN's TCP IS a great idea, but as a bufferbloat test,
>> maybe not so much.
>
> We chose to use a CDN because it seemed like the only feasible way to
> saturate gigabit links at least somewhat consistently for a meaningful
> part of the globe, without setting up a whole lot of servers at quite
> high cost.
>
> But we weren't aware that bufferbloat could be abated from the CDN's
> end. This is a bit surprising to me, as our test results indicate that
> bufferbloat is regularly an issue even though we're using a CDN for
> the speed and latency tests. For example, these are the results on my
> own connection here (Cox, in Southern California), showing meaningful
> bufferbloat:
>
> https://www.waveform.com/tools/bufferbloat?test-id=ece467bd-e07a-45ea-9db6-e64d8da2c1d2
>
> I get even larger bufferbloat effects when running the test on a 4G
> LTE network:
>
> https://www.waveform.com/tools/bufferbloat?test-id=e99ae561-88e0-4e1e-bafd-90fe1de298ac
>
> If the CDN were abating bufferbloat, surely I wouldn't see results
> like these?
>
>> 3) Are you tracking any ECN statistics at this point (ecnseen)?
>
> We are not, no. I'd definitely be curious to see if we can add this in
> the future, though!
>
> Best,
>
> On Wed, Feb 24, 2021 at 2:10 PM Dave Taht wrote:
>>
>> So I've taken a tiny amount of time to run a few tests. For starters,
>> thank you very much for your dedication and time in creating such a
>> usable website and FAQ.
>>
>> I have several issues, though I really haven't had time to delve deep
>> into the packet captures. (Others, please try taking 'em, and put
>> them somewhere?)
>>
>> 0) "average" jitter is a meaningless number. In the case of a
>> videoconferencing application, what matters most is max jitter, where
>> the app will choose to ride the top edge of that, rather than follow
>> it. I'd prefer using a 98% number, rather than a 75% number, to
>> weight where the typical delay in a videoconference might end up.
>>
>> 1) The worst case scenario of bloat affecting a user's experience is
>> during a simultaneous up- and download, and I'd rather you did that
>> than test them separately. You also get a more realistic figure for
>> the actual achievable bandwidth under contention, and can expose
>> problems like strict priority queuing in one direction or the other
>> locking out further flows.
>>
>> 2) I get absurdly great results from it, with or without SQM on, on
>> a reasonably modern cable modem (buffercontrol and PIE, and a CMTS
>> doing the right things).
>>
>> This points to any number of problems (features!). It's certainly my
>> hope that all the CDN makers at this point have installed bufferbloat
>> mitigations. Testing a CDN's TCP IS a great idea, but as a
>> bufferbloat test, maybe not so much.
>>
>> The packet capture of the TCP flows DOES show about 60ms jitter...
>> but no loss. Your test shows:
>>
>> https://www.waveform.com/tools/bufferbloat?test-id=6fc7dd95-8bfa-4b76-b141-ed423b6580a9
>>
>> And it is very jittery in the beginning of the test on its estimates.
>> I really should be overjoyed at knowing a CDN is doing more of the
>> right things, but in terms of a test... and Linux also has got a ton
>> of mitigations on the client side.
>>
>> 3) As a side note, ECN actually is negotiated on the upload, if it's
>> enabled on your system. Are you tracking any ECN statistics at this
>> point (ecnseen)? It is not negotiated on the download (which is fine
>> by me).
>>
>> I regrettably at this precise moment am unable to test a native
>> cable modem at the same speed as an SQM box; I hope to get further
>> on this tomorrow.
>>
>> Again, GREAT work so far, and I do think a test tool for all these
>> CDNs - heck, one that tested all of them at the same time - is very,
>> very useful.
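[Editor's note: the difference between the average, 98th-percentile, and
max jitter figures debated in point 0 can be sketched in a few lines.
The sample latencies and the nearest-rank percentile here are
illustrative only, not the method of any of the tests discussed.]

```python
def jitter_stats(latencies_ms):
    """Return (average, 98th-percentile, max) jitter in ms, where
    jitter is the absolute difference between consecutive samples."""
    deltas = sorted(abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:]))
    avg = sum(deltas) / len(deltas)
    # Nearest-rank 98th percentile; with only a handful of samples this
    # lands on the max, as Sina notes above.
    p98 = deltas[min(len(deltas) - 1, int(0.98 * len(deltas)))]
    return avg, p98, deltas[-1]

# Ten illustrative latency samples (ms) with one large spike:
samples = [20, 22, 21, 35, 23, 22, 90, 24, 22, 21]
avg, p98, peak = jitter_stats(samples)
# avg is ~18.6 ms while p98 and peak are both 68 ms: a videoconference
# has to buffer for the peak, which the average badly understates.
```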
>>
>> On Wed, Feb 24, 2021 at 10:22 AM Sina Khanifar wrote:
>> >
>> > Hi all,
>> >
>> > A couple of months ago my co-founder Sam posted an early beta of
>> > the bufferbloat test that we've been working on, and Dave also
>> > linked to it a couple of weeks ago.
>> >
>> > Thank you all so much for your feedback - we almost entirely
>> > redesigned the tool and the UI based on the comments we received.
>> > We're almost ready to launch the tool officially today at this URL,
>> > but wanted to show it to the list in case anyone finds any last
>> > bugs that we might have overlooked:
>> >
>> > https://www.waveform.com/tools/bufferbloat
>> >
>> > If you find a bug, please share the "Share Your Results" link with
>> > us along with what happened. We capture some debugging information
>> > on the backend, and having a share link allows us to diagnose any
>> > issues.
>> >
>> > This is really more of a passion project than anything else for us
>> > - we don't anticipate we'll try to commercialize it or anything
>> > like that. We're very thankful for all the work the folks on this
>> > list have done to identify and fix bufferbloat, and hope this is a
>> > useful contribution. I've personally been very frustrated by
>> > bufferbloat on a range of devices, and decided it might be helpful
>> > to build another bufferbloat test when the DSLReports test was down
>> > at some point last year.
>> >
>> > Our goals with this project were:
>> > * To build a second solid bufferbloat test in case DSLReports goes
>> >   down again.
>> > * To build a test where bufferbloat is front and center as the
>> >   primary purpose of the test, rather than just a feature.
>> > * To try to explain bufferbloat and its effect on a user's
>> >   connection as clearly as possible for a lay audience.
>> >
>> > A few notes:
>> > * On the backend, we're using Cloudflare's CDN to perform the
>> >   actual download and upload speed test.
>> >   I know John Graham-Cumming has posted to this list in the past;
>> >   if he or anyone from Cloudflare sees this, we'd love some help.
>> >   Our Cloudflare Workers are being bandwidth-throttled due to
>> >   having a non-enterprise-grade account. We've worked around this
>> >   in a kludgy way, but we'd love to get it resolved.
>> > * We have lots of ideas for improvements, e.g. simultaneous
>> >   uploads/downloads, trying different file-size chunks, time-series
>> >   latency graphs, using WebRTC to test UDP traffic, etc., but in
>> >   the interest of getting things launched we're sticking with the
>> >   current featureset.
>> > * There are a lot of browser-specific workarounds that we had to
>> >   implement, and latency itself is measured in different ways on
>> >   Safari/WebKit vs. Chromium/Firefox due to limitations of the
>> >   PerformanceTiming APIs. You may notice that latency is different
>> >   on different browsers; however, the actual bufferbloat (relative
>> >   increase in latency) should be pretty consistent.
>> >
>> > In terms of some of the changes we made based on the feedback we
>> > received on this list:
>> >
>> > Based on Toke's feedback:
>> > https://lists.bufferbloat.net/pipermail/bloat/2020-November/015960.html
>> > https://lists.bufferbloat.net/pipermail/bloat/2020-November/015976.html
>> > * We changed the way the speed tests run to show an instantaneous
>> >   speed as the test is being run.
>> > * We moved the bufferbloat grade into the main results box.
>> > * We tried really hard to get as close to saturating gigabit
>> >   connections as possible. We completely redesigned the way we
>> >   chunk files, added a "warming up" period, and spent quite a bit
>> >   of time optimizing our code to minimize CPU usage, as we found
>> >   that was often the limiting factor in our speed test results.
>> > * We changed the shield grades altogether and went through a few
>> >   different iterations of how to show the effect of bufferbloat on
>> >   connectivity, and ended up with a "table view" to try to show the
>> >   effect that bufferbloat specifically is having on the connection
>> >   (compared to when the connection is unloaded).
>> > * We now link from the results table view to the FAQ, where the
>> >   conditions for each type of connection are explained.
>> > * We also changed the way we measure latency and now use the faster
>> >   of either Google's CDN or Cloudflare at any given location. We're
>> >   also using the WebTiming APIs to get a more accurate latency
>> >   number, though this does not work on some mobile browsers (e.g.
>> >   iOS Safari), and as a result we show a higher latency on mobile
>> >   devices. Since our test is less a test of absolute latency and
>> >   more a test of relative latency with and without load, we felt
>> >   this was workable.
>> > * Our jitter is now an average (it was previously RMS).
>> > * The "before you start" text was rewritten and moved above the
>> >   start button.
>> > * We now spell out upload and download instead of having arrows.
>> > * We hugely reduced the number of cross-site scripts. I was a bit
>> >   embarrassed by this if I'm honest - I spent a long time building
>> >   web tools for the EFF, where we almost never allowed any
>> >   cross-site scripts. Our site is hosted on Shopify, and adding any
>> >   features via their app store ends up adding a whole lot of gunk.
>> >   But we uninstalled some apps, rewrote our template, and ended up
>> >   removing a whole lot of the gunk. There's still plenty of room
>> >   for improvement, but it should be a lot better than before.
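[Editor's note: the "instantaneous speed" change above can be sketched
as a trailing-window throughput computation. The function name, event
format, and one-second window are illustrative, not the tool's actual
code.]

```python
def window_mbps(events, now_s, window_s=1.0):
    """Throughput over the trailing window, in Mbps.

    events: list of (timestamp_s, nbytes) chunks received so far.
    Only bytes that arrived within the last window_s seconds count,
    so the reading tracks the current rate rather than a whole-test
    average."""
    recent_bytes = sum(n for t, n in events if t >= now_s - window_s)
    return recent_bytes * 8 / window_s / 1e6  # bytes -> bits -> Mbps

# Three chunks totalling 500 kB land within the last second:
chunks = [(0.0, 125_000), (0.4, 125_000), (0.9, 250_000)]
speed = window_mbps(chunks, now_s=1.0)  # 4.0 Mbps
```

Half a second later, with no new chunks, only the last 250 kB chunk is
still inside the window and the reading drops to 2.0 Mbps - which is
what lets the display react to a stalling transfer.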
>> >
>> > Based on Dave Collier-Brown's feedback:
>> > https://lists.bufferbloat.net/pipermail/bloat/2020-November/015966.html
>> > * We replaced the "unloaded" and "loaded" language with "unloaded"
>> >   and then "download active" and "upload active." In the grade box
>> >   we indicate, for example, "Your latency increased moderately
>> >   under load."
>> > * We tried to generally make it easier for non-techie folks to
>> >   understand by emphasizing the grade and adding the table showing
>> >   how bufferbloat affects some commonly-used services.
>> > * We didn't really change the candle charts too much - they're
>> >   mostly just there to give a basic visual - and focused more on
>> >   the actual meat of the results above them.
>> >
>> > Based on Sebastian Moeller's feedback:
>> > https://lists.bufferbloat.net/pipermail/bloat/2020-November/015963.html
>> > * We considered doing a bidirectional saturating load, but decided
>> >   to skip implementing it for now. It's definitely something we'd
>> >   like to experiment with more in the future.
>> > * We added a "warming up" period as well as a "draining" period to
>> >   help fill and empty the buffer. We haven't added the option for
>> >   an extended test, but have this on our list of backlog changes
>> >   to make in the future.
>> >
>> > Based on Y's feedback:
>> > https://lists.bufferbloat.net/pipermail/bloat/2020-November/015962.html
>> > * We actually ended up removing the grades, but we explained our
>> >   criteria for the new table in the FAQ.
>> >
>> > Based on Greg White's feedback (shared privately):
>> > * We added an FAQ answer explaining jitter and how we measure it.
>> >
>> > We'd love for you all to play with the new version of the tool and
>> > send over any feedback you might have. We're going to be in a
>> > feature freeze before launch, but we'd love to get any bugs sorted
>> > out.
>> > We'll likely put this project aside after we iron out a last round
>> > of bugs and launch, and turn back to working on projects that help
>> > us pay the bills, but we definitely hope to revisit and improve the
>> > tool over time.
>> >
>> > Best,
>> >
>> > Sina, Arshan, and Sam.
>> > _______________________________________________
>> > Bloat mailing list
>> > Bloat@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/bloat
>>
>> --
>> "For a successful technology, reality must take precedence over
>> public relations, for Mother Nature cannot be fooled" - Richard
>> Feynman
>>
>> dave@taht.net  CTO, TekLibre, LLC  Tel: 1-831-435-0729
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
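[Editor's note: the "relative latency" idea the whole thread revolves
around - bufferbloat is the increase in round-trip latency under load,
not the absolute latency - can be sketched as below. The letter-grade
thresholds are illustrative placeholders, not Waveform's published
criteria; those are in the tool's FAQ.]

```python
def latency_increase_ms(unloaded_ms, loaded_ms):
    """Bufferbloat is the extra round-trip latency while the link is
    saturated, so the metric is the delta, not the absolute value."""
    return max(0.0, loaded_ms - unloaded_ms)

def bufferbloat_grade(unloaded_ms, loaded_ms):
    """Map the latency increase to a letter grade. The thresholds are
    illustrative stand-ins, not the tool's real criteria."""
    increase = latency_increase_ms(unloaded_ms, loaded_ms)
    for limit_ms, grade in [(5, "A+"), (30, "A"), (60, "B"), (200, "C")]:
        if increase < limit_ms:
            return grade
    return "D"

# A 20 ms unloaded connection that climbs to 85 ms under load shows a
# 65 ms increase, a "C" under these placeholder thresholds - even
# though 85 ms absolute latency would look fine on an unloaded test.
grade = bufferbloat_grade(20, 85)
```

This is also why latency measured differently on Safari vs. Chromium
(as described in the announcement) is workable: a constant per-browser
offset cancels out of the delta.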