From: Simon Barber <simon@superduper.net>
To: Sina Khanifar <sina@waveform.com>
Cc: Dave Taht, bloat <bloat@lists.bufferbloat.net>
Date: Thu, 25 Feb 2021 05:46:40 -0800
Subject: Re: [Bloat] Updated Bufferbloat Test

And what counts is round-trip, end-to-end total latency. This is fixed
latency plus jitter (variation above the fixed latency) - i.e. peak total
latency.

Peak total latency, or a high percentile (95th/98th), will be a much closer
approximation to the performance of a real-world jitter buffer in a VoIP
system than average jitter. The higher the percentile, the better.

2% of drops distributed evenly over the call is one dropout every second
(at a typical 50 voice packets per second, 2% is one lost 20 ms frame each
second). Most users would notice that, and most jitter buffers would expand
to avoid that high a level of loss. A quarter-second burst of loss every 10
seconds is lower impact but about the same percentage, so the loss pattern
matters. Every jitter buffer is designed slightly differently, so this
measurement is an approximation. But the key is that peak, or close to
peak, latency is the right measure.
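To make that concrete, here is a minimal sketch (TypeScript, with made-up
RTT samples rather than output from any real test) of how far apart
"average jitter" and a high-percentile latency can read on the same data:

```typescript
// Average jitter vs. high-percentile latency on the same (hypothetical) data.

// Mean absolute change between consecutive RTT samples - one common
// definition of "average jitter".
function meanJitter(rttMs: number[]): number {
  let sum = 0;
  for (let i = 1; i < rttMs.length; i++) {
    sum += Math.abs(rttMs[i] - rttMs[i - 1]);
  }
  return sum / (rttMs.length - 1);
}

// Nearest-rank percentile (p in 0..100) over sorted samples.
function percentile(rttMs: number[], p: number): number {
  const sorted = [...rttMs].sort((a, b) => a - b);
  const idx = Math.max(0, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.min(idx, sorted.length - 1)];
}

// A mostly quiet link with occasional queue spikes:
const samples = [22, 24, 23, 25, 24, 180, 26, 23, 25, 210, 24, 23];

console.log(meanJitter(samples).toFixed(1)); // ~63.0 ms "average jitter"
console.log(percentile(samples, 98));        // 210 ms: the peak a buffer must ride
```

An "average jitter" of 63 ms and a buffer requirement of 210 ms are very
different answers from the same connection.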
Simon

On February 24, 2021 11:33:07 PM Sina Khanifar <sina@waveform.com> wrote:

> Thanks for the explanation.
>
> Right now, our criteria for phone/audio is "latency < 300 ms" and
> "jitter < 40 ms".
>
> It seems like something along the lines of "95th percentile latency <
> 300 ms" might be advisable in place of the two existing criteria?
>
> Sina.
>
> On Wed, Feb 24, 2021 at 11:15 PM Simon Barber <simon@superduper.net> wrote:
>>
>> On February 24, 2021 9:57:13 PM Sina Khanifar <sina@waveform.com> wrote:
>>
>>> Thanks for the feedback, Dave!
>>>
>>>> 0) "Average" jitter is a meaningless number. In the case of a
>>>> videoconferencing application, what matters most is max jitter, where
>>>> the app will choose to ride the top edge of that, rather than follow
>>>> it. I'd prefer using a 98% number, rather than a 75% number, to weight
>>>> where the typical delay in a videoconference might end up.
>>>
>>> Both DSLReports and Ookla's desktop app report jitter as an average
>>> rather than as a max number, so I'm a little hesitant to go against
>>> the norm - users might find it a bit surprising to see much larger
>>> jitter numbers reported. We're also not taking a whole ton of latency
>>> tests in each phase, so the 98th percentile will often end up being
>>> the max number.
>>>
>>> With regards to the videoconferencing, we actually ran some real-world
>>> tests of Zoom with various levels of bufferbloat/jitter/latency, and
>>> calibrated our "real-world results" table on that basis. We used
>>> average jitter in those tests ... I think if we used 98% or even 95%
>>> the allowable number would be quite high.
>>
>> Video and audio cannot be played out until the packets have arrived, so
>> late packets are effectively dropped, or the playback buffer must
>> expand to accommodate the most-delayed packets. If the playback buffer
>> expands to accommodate the most-delayed packets, then the whole
>> conversation is delayed by that amount. More than a fraction of a
>> percent of dropped packets results in a very poor video or audio
>> experience; this is why average jitter is irrelevant and peak or
>> maximum latency is the correct measure to use.
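The same point as a sketch (hypothetical one-way delays, not from any real
capture): a fixed playout buffer either drops everything later than its
playout delay, or grows to cover the most-delayed packet.

```typescript
// One-way delays (ms) for voice packets sent every 20 ms - hypothetical.
const delays = [40, 42, 41, 43, 40, 150, 44, 41, 42, 160, 43, 40];

// Packet i is played at sendTime(i) + playoutDelayMs; it arrives at
// sendTime(i) + delay, so it is lost whenever delay > playoutDelayMs.
function lossRate(delaysMs: number[], playoutDelayMs: number): number {
  const late = delaysMs.filter((d) => d > playoutDelayMs).length;
  return late / delaysMs.length;
}

console.log(lossRate(delays, 60));  // ~0.17: a buffer near the average drops ~17%
console.log(lossRate(delays, 160)); // 0: a buffer at the peak drops nothing,
                                    //    but the whole call runs 160 ms late
```

Either outcome is governed by the peak, which is why the percentile chosen
should sit as close to it as the sample count allows.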
>> Yes, humans can tolerate quite a bit of delay. The conversation is
>> significantly less fluid, though.
>>
>> Simon
>>
>>>> 1) The worst-case scenario of bloat affecting a user's experience is
>>>> during a simultaneous upload and download, and I'd rather you did
>>>> that than test them separately. You also get a more realistic figure
>>>> for the actual achievable bandwidth under contention, and can expose
>>>> problems like strict priority queuing in one direction or the other
>>>> locking out further flows.
>>>
>>> We did consider this based on another user's feedback, but didn't
>>> implement it. Perhaps we can do this next time we revisit, though!
>>>
>>>> This points to any number of problems (features!). It's certainly my
>>>> hope that all the CDN makers at this point have installed bufferbloat
>>>> mitigations. Testing a CDN's TCP IS a great idea, but as a
>>>> bufferbloat test, maybe not so much.
>>>
>>> We chose to use a CDN because it seemed like the only feasible way to
>>> saturate gigabit links at least somewhat consistently for a meaningful
>>> part of the globe, without setting up a whole lot of servers at quite
>>> high cost.
>>>
>>> But we weren't aware that bufferbloat could be abated from the CDN's
>>> end. This is a bit surprising to me, as our test results indicate that
>>> bufferbloat is regularly an issue even though we're using a CDN for
>>> the speed and latency tests. For example, these are the results on my
>>> own connection here (Cox, in Southern California), showing meaningful
>>> bufferbloat:
>>>
>>> https://www.waveform.com/tools/bufferbloat?test-id=ece467bd-e07a-45ea-9db6-e64d8da2c1d2
>>>
>>> I get even larger bufferbloat effects when running the test on a 4G
>>> LTE network:
>>>
>>> https://www.waveform.com/tools/bufferbloat?test-id=e99ae561-88e0-4e1e-bafd-90fe1de298ac
>>>
>>> If the CDN were abating bufferbloat, surely I wouldn't see results
>>> like these?
>>>
>>>> 3) Are you tracking any ECN statistics at this point (ecnseen)?
>>>
>>> We are not, no. I'd definitely be curious to see if we can add this in
>>> the future, though!
>>>
>>> Best,
>>>
>>> On Wed, Feb 24, 2021 at 2:10 PM Dave Taht <dave.taht@gmail.com> wrote:
>>>>
>>>> So I've taken a tiny amount of time to run a few tests. For starters,
>>>> thank you very much for your dedication and time in creating such a
>>>> usable website and FAQ.
>>>>
>>>> I have several issues, though I really haven't had time to delve deep
>>>> into the packet captures. (Others, please try taking them, and put
>>>> them somewhere?)
>>>>
>>>> 0) "Average" jitter is a meaningless number. In the case of a
>>>> videoconferencing application, what matters most is max jitter, where
>>>> the app will choose to ride the top edge of that, rather than follow
>>>> it. I'd prefer using a 98% number, rather than a 75% number, to
>>>> weight where the typical delay in a videoconference might end up.
>>>>
>>>> 1) The worst-case scenario of bloat affecting a user's experience is
>>>> during a simultaneous upload and download, and I'd rather you did
>>>> that than test them separately. You also get a more realistic figure
>>>> for the actual achievable bandwidth under contention, and can expose
>>>> problems like strict priority queuing in one direction or the other
>>>> locking out further flows.
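A combined test is not much more code than two separate ones. A hedged
sketch (placeholder URLs, not the Waveform backend; real code would need
CORS-friendly endpoints): run saturating transfers in both directions and
sample latency while both are active.

```typescript
// Sketch of a simultaneous up+download load with latency sampling.
// The endpoint URLs are placeholders, not any real test backend.
async function bidirectionalLoadTest(durationMs = 10_000): Promise<number[]> {
  const controller = new AbortController();
  const rtts: number[] = [];

  // Saturating download: keep fetching a large object.
  const download = (async () => {
    while (!controller.signal.aborted) {
      await fetch("https://example.com/100MB.bin", { signal: controller.signal })
        .then((r) => r.arrayBuffer())
        .catch(() => {});
    }
  })();

  // Saturating upload: keep POSTing a large buffer.
  const payload = new Uint8Array(8 * 1024 * 1024);
  const upload = (async () => {
    while (!controller.signal.aborted) {
      await fetch("https://example.com/upload", {
        method: "POST",
        body: payload,
        signal: controller.signal,
      }).catch(() => {});
    }
  })();

  // Sample latency with tiny requests while both directions are loaded.
  const end = performance.now() + durationMs;
  while (performance.now() < end) {
    const t0 = performance.now();
    await fetch("https://example.com/ping", { cache: "no-store" }).catch(() => {});
    rtts.push(performance.now() - t0);
    await new Promise((r) => setTimeout(r, 200));
  }

  controller.abort();
  await Promise.allSettled([download, upload]);
  return rtts; // feed these into a percentile calculation as above
}
```

Comparing those samples against an unloaded baseline gives the worst-case
figure, and watching each direction's throughput during the run is what
would expose the strict-priority lockout described here.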
>>>> 2) I get absurdly great results from it, with or without SQM, on a
>>>> reasonably modern cable modem (buffercontrol and PIE, and a CMTS
>>>> doing the right things).
>>>>
>>>> This points to any number of problems (features!). It's certainly my
>>>> hope that all the CDN makers at this point have installed bufferbloat
>>>> mitigations. Testing a CDN's TCP IS a great idea, but as a
>>>> bufferbloat test, maybe not so much.
>>>>
>>>> The packet capture of the TCP flows DOES show about 60 ms of
>>>> jitter... but no loss. Your test shows:
>>>>
>>>> https://www.waveform.com/tools/bufferbloat?test-id=6fc7dd95-8bfa-4b76-b141-ed423b6580a9
>>>>
>>>> And it is very jittery at the beginning of the test on its estimates.
>>>> I really should be overjoyed at knowing a CDN is doing more of the
>>>> right things, but in terms of a test... and Linux also has a ton of
>>>> mitigations on the client side.
>>>>
>>>> 3) As a side note, ECN actually is negotiated on the upload, if it's
>>>> enabled on your system. Are you tracking any ECN statistics at this
>>>> point (ecnseen)? It is not negotiated on the download (which is fine
>>>> by me).
>>>>
>>>> I regrettably am unable at this precise moment to test a native cable
>>>> modem at the same speed as an SQM box; I hope to get further on this
>>>> tomorrow.
>>>>
>>>> Again, GREAT work so far, and I do think a test tool for all these
>>>> CDNs - heck, one that tested all of them at the same time - is very,
>>>> very useful.
>>>>
>>>> On Wed, Feb 24, 2021 at 10:22 AM Sina Khanifar <sina@waveform.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> A couple of months ago my co-founder Sam posted an early beta of the
>>>>> bufferbloat test that we've been working on, and Dave also linked to
>>>>> it a couple of weeks ago.
>>>>>
>>>>> Thank you all so much for your feedback - we almost entirely
>>>>> redesigned the tool and the UI based on the comments we received.
>>>>> We're almost ready to launch the tool officially today at this URL,
>>>>> but wanted to show it to the list in case anyone finds any last bugs
>>>>> that we might have overlooked:
>>>>>
>>>>> https://www.waveform.com/tools/bufferbloat
>>>>>
>>>>> If you find a bug, please share the "Share Your Results" link with
>>>>> us along with what happened. We capture some debugging information
>>>>> on the backend, and having a share link allows us to diagnose any
>>>>> issues.
>>>>>
>>>>> This is really more of a passion project than anything else for us -
>>>>> we don't anticipate we'll try to commercialize it or anything like
>>>>> that. We're very thankful for all the work the folks on this list
>>>>> have done to identify and fix bufferbloat, and hope this is a useful
>>>>> contribution. I've personally been very frustrated by bufferbloat on
>>>>> a range of devices, and decided it might be helpful to build another
>>>>> bufferbloat test when the DSLReports test was down at some point
>>>>> last year.
>>>>>
>>>>> Our goals with this project were:
>>>>> * To build a second solid bufferbloat test in case DSLReports goes
>>>>> down again.
>>>>> * To build a test where bufferbloat is front and center as the
>>>>> primary purpose of the test, rather than just a feature.
>>>>> * To try to explain bufferbloat and its effect on a user's
>>>>> connection as clearly as possible for a lay audience.
>>>>>
>>>>> A few notes:
>>>>> * On the backend, we're using Cloudflare's CDN to perform the actual
>>>>> download and upload speed test. I know John Graham-Cumming has
>>>>> posted to this list in the past; if he or anyone from Cloudflare
>>>>> sees this, we'd love some help. Our Cloudflare Workers are being
>>>>> bandwidth-throttled due to having a non-enterprise-grade account.
>>>>> We've worked around this in a kludgy way, but we'd love to get it
>>>>> resolved.
>>>>> * We have lots of ideas for improvements, e.g. simultaneous
>>>>> uploads/downloads, trying different file chunk sizes, time-series
>>>>> latency graphs, using WebRTC to test UDP traffic, etc., but in the
>>>>> interest of getting things launched we're sticking with the current
>>>>> feature set.
>>>>> * There are a lot of browser-specific workarounds that we had to
>>>>> implement, and latency itself is measured in different ways on
>>>>> Safari/WebKit vs. Chromium/Firefox due to limitations of the
>>>>> PerformanceTiming APIs (see the sketch after this list). You may
>>>>> notice that latency is different on different browsers; however, the
>>>>> actual bufferbloat (relative increase in latency) should be pretty
>>>>> consistent.
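A rough sketch of the two measurement paths involved (illustrative only;
the function names and fallback logic here are not Waveform's actual code).
Chromium/Firefox expose per-request timing via the Resource Timing API,
while a plain performance.now() round trip works everywhere but includes
more overhead:

```typescript
// Path 1: Resource Timing (Chromium/Firefox expose fine-grained fields).
async function latencyViaResourceTiming(url: string): Promise<number | null> {
  performance.clearResourceTimings();
  await fetch(url, { cache: "no-store" });
  const entry = performance
    .getEntriesByType("resource")
    .find((e) => e.name.startsWith(url)) as PerformanceResourceTiming | undefined;
  // responseStart - requestStart approximates one server round trip,
  // excluding DNS/TLS setup. Cross-origin URLs need Timing-Allow-Origin,
  // and some WebKit builds leave these fields zeroed.
  if (!entry || entry.responseStart === 0) return null;
  return entry.responseStart - entry.requestStart;
}

// Path 2: wall-clock fallback (works on iOS Safari, but includes
// connection setup and fetch overhead, so it reads higher).
async function latencyViaClock(url: string): Promise<number> {
  const t0 = performance.now();
  await fetch(url, { cache: "no-store" });
  return performance.now() - t0;
}
```

Because both paths see the same relative increase under load, the
bufferbloat delta remains comparable even where only the coarser fallback
is available.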
>>>>> In terms of some of the changes we made based on the feedback we
>>>>> received on this list:
>>>>>
>>>>> Based on Toke's feedback:
>>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015960.html
>>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015976.html
>>>>> * We changed the way the speed tests run to show an instantaneous
>>>>> speed as the test is being run.
>>>>> * We moved the bufferbloat grade into the main results box.
>>>>> * We tried really hard to get as close to saturating gigabit
>>>>> connections as possible. We completely redesigned the way we chunk
>>>>> files, added a "warming up" period, and spent quite a bit of time
>>>>> optimizing our code to minimize CPU usage, as we found that was
>>>>> often the limiting factor in our speed test results.
>>>>> * We changed the shield grades altogether and went through a few
>>>>> different iterations of how to show the effect of bufferbloat on
>>>>> connectivity, and ended up with a "table view" to try to show the
>>>>> effect that bufferbloat specifically is having on the connection
>>>>> (compared to when the connection is unloaded).
>>>>> * We now link from the results table view to the FAQ, where the
>>>>> conditions for each type of connection are explained.
>>>>> * We also changed the way we measure latency and now use the faster
>>>>> of either Google's CDN or Cloudflare at any given location. We're
>>>>> also using the WebTiming APIs to get a more accurate latency number,
>>>>> though this does not work on some mobile browsers (e.g. iOS Safari),
>>>>> and as a result we show a higher latency on mobile devices. Since
>>>>> our test is less a test of absolute latency and more a test of
>>>>> relative latency with and without load, we felt this was workable.
>>>>> * Our jitter is now an average (it was previously RMS; see the
>>>>> sketch after this list).
>>>>> * The "before you start" text was rewritten and moved above the
>>>>> start button.
>>>>> * We now spell out upload and download instead of having arrows.
>>>>> * We hugely reduced the number of cross-site scripts. I was a bit
>>>>> embarrassed by this if I'm honest - I spent a long time building web
>>>>> tools for the EFF, where we almost never allowed any cross-site
>>>>> scripts. Our site is hosted on Shopify, and adding any features via
>>>>> their app store ends up adding a whole lot of gunk. But we
>>>>> uninstalled some apps, rewrote our template, and ended up removing a
>>>>> whole lot of the gunk. There's still plenty of room for improvement,
>>>>> but it should be a lot better than before.
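A quick sketch of what that average-vs-RMS change does to the reported
number (hypothetical deltas; the tool's real sampling will differ). RMS
weights spikes much more heavily than a plain mean:

```typescript
// Mean vs. RMS over latency deviations: RMS punishes spikes harder.
// Sample deltas (ms between consecutive probes) are hypothetical.
const deltas = [2, 1, 3, 2, 150, 2, 1, 2];

const mean = deltas.reduce((a, b) => a + b, 0) / deltas.length;
const rms = Math.sqrt(deltas.reduce((a, b) => a + b * b, 0) / deltas.length);

console.log(mean.toFixed(1)); // ~20.4 ms
console.log(rms.toFixed(1));  // ~53.1 ms - the same data reads ~2.6x worse as RMS
```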
>>>>> Based on Dave Collier-Brown's feedback:
>>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015966.html
>>>>> * We replaced the "unloaded" and "loaded" language with "unloaded"
>>>>> and then "download active" and "upload active." In the grade box we
>>>>> indicate, for example, "Your latency increased moderately under
>>>>> load."
>>>>> * We tried to generally make it easier for non-techie folks to
>>>>> understand by emphasizing the grade and adding the table showing how
>>>>> bufferbloat affects some commonly used services.
>>>>> * We didn't really change the candle charts too much - they're
>>>>> mostly just there to give a basic visual - and focused more on the
>>>>> actual meat of the results above them.
>>>>>
>>>>> Based on Sebastian Moeller's feedback:
>>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015963.html
>>>>> * We considered doing a bidirectional saturating load, but decided
>>>>> to skip implementing it for now. It's definitely something we'd like
>>>>> to experiment with more in the future.
>>>>> * We added a "warming up" period as well as a "draining" period to
>>>>> help fill and empty the buffer. We haven't added the option for an
>>>>> extended test, but have this on our list of backlog changes to make
>>>>> in the future.
>>>>>
>>>>> Based on Y's feedback:
>>>>> https://lists.bufferbloat.net/pipermail/bloat/2020-November/015962.html
>>>>> * We actually ended up removing the grades, but we explained our
>>>>> criteria for the new table in the FAQ.
>>>>>
>>>>> Based on Greg White's feedback (shared privately):
>>>>> * We added an FAQ answer explaining jitter and how we measure it.
>>>>>
>>>>> We'd love for you all to play with the new version of the tool and
>>>>> send over any feedback you might have. We're going to be in a
>>>>> feature freeze before launch, but we'd love to get any bugs sorted
>>>>> out. We'll likely put this project aside after we iron out a last
>>>>> round of bugs and launch, and turn back to working on projects that
>>>>> help us pay the bills, but we definitely hope to revisit and improve
>>>>> the tool over time.
>>>>>
>>>>> Best,
>>>>>
>>>>> Sina, Arshan, and Sam.
>>>>
>>>> --
>>>> "For a successful technology, reality must take precedence over
>>>> public relations, for Mother Nature cannot be fooled" - Richard
>>>> Feynman
>>>>
>>>> dave@taht.net CTO, TekLibre, LLC Tel: 1-831-435-0729

_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat