[Cake] [Make-wifi-fast] [Bloat] dslreports is no longer free

Sergey Fedorov sfedorov at netflix.com
Mon May 4 13:04:19 EDT 2020


>
> Sergey - I wasn't assuming anything about fast.com. The document you
> shared wasn't clear about the methodology's details here. Others sadly,
> have actually used ICMP pings in the way I described. I was making a
> generic comment of concern.
>
> That said, it sounds like what you are doing is really helpful (esp. given
> that your measure is aimed at end user experiential qualities).

David - my apologies, I incorrectly interpreted your statement as being
said in the context of fast.com measurements. The blog post linked indeed
doesn't provide the latency measurement details - it was written before we
added the extra metrics. We'll see if we can publish an update.

> 1) a clear definition of lag under load that is from end-to-end in latency,
> and involves, ideally, independent traffic from multiple sources through
> the bottleneck.

Curious if by multiple sources you mean multiple clients (devices) or
multiple connections sending data?


SERGEY FEDOROV
Director of Engineering
sfedorov at netflix.com
121 Albright Way | Los Gatos, CA 95032

On Sun, May 3, 2020 at 8:07 AM David P. Reed <dpreed at deepplum.com> wrote:

> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off
> the entry device that has the external IP address for the NAT gets most of
> the RTT measure, and if there's no queueing built up in the NAT device,
> that's a reasonable measure. But...
>
>
>
> However, if the router has "taken up the queueing delay" by rate limiting
> its uplink traffic to slightly less than the capacity (as with Cake and
> other TC shaping that isn't as good as cake), then there is a queue in the
> TC layer itself. This is what concerns me as a distortion in the
> measurement that can fool one into thinking the TC shaper is doing a good
> job, when in fact, lag under load may be quite high from inside the routed
> domain (the home).
>
>
>
> As you point out this unmeasured queueing delay can also be a problem with
> WiFi inside the home. But it isn't limited to that.
>
>
>
> A badly set up shaping/congestion management subsystem inside the NAT can
> look "very good" in its echo of ICMP packets, but be terrible in response
> time to trivial HTTP requests from inside, or equally terrible in twitch
> games and video conferencing.
>
>
>
> So, for example, for tuning settings with "Cake" it is useless.
>
>
>
> To be fair, usually the Access Provider has no control of what is done
> after the cable is terminated at the home, so as a way to decide if the
> provider is badly engineering its side, a ping from a server is a
> reasonable quality measure of the provider.
>
>
>
> But not a good measure of the user experience, and if the provider
> provides the NAT box, even if it has a good shaper in it, like Cake or
> fq_codel, it will just confuse the user and create the opportunity for a
> "finger pointing" argument where neither side understands what is going on.
>
>
>
> This is why we need
>
>
>
> 1) a clear definition of lag under load that is from end-to-end in
> latency, and involves, ideally, independent traffic from multiple sources
> through the bottleneck.
>
>
>
> 2) ideally, a better way to localize where the queues are building up and
> present that to users and access providers. The flent graphs are not
> interpretable by most non-experts. What we need is a simple visualization
> of a sketch-map of the path (like traceroute might provide) with queueing
> delay measures shown at key points that the user can understand.
>
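One possible way to turn point 1) above into a number, as a minimal sketch only: compare the median RTT of an idle path with the median RTT while the bottleneck is loaded. The function name `lag_under_load` and the sample values are illustrative assumptions, not a definition anyone in this thread has agreed on, and the RTT samples are assumed to come from probes that traverse the full end-to-end path.

```python
from statistics import median


def lag_under_load(idle_rtts_ms, loaded_rtts_ms):
    """Summarize how much latency the load adds, using medians.

    Both arguments are lists of RTT samples in milliseconds. The samples
    are assumed to be end-to-end (measured from the user's device), not
    echoes reflected at the NAT router.
    """
    idle = median(idle_rtts_ms)
    loaded = median(loaded_rtts_ms)
    return {"idle_ms": idle, "loaded_ms": loaded, "added_ms": loaded - idle}


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(lag_under_load([4, 4, 5, 4], [7, 8, 7, 12]))
```

A median hides spikes such as the occasional 12 ms sample, so a real report would probably also want a high percentile or the maximum.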
> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0 at gmx.de>
> said:
>
> > Hi David,
> >
> > in principle I agree, a NATed IPv4 ICMP probe will be at best reflected
> > at the NAT router (CPE) (some commercial home gateways do not respond to
> > ICMP echo requests in the name of security theatre). So it is pretty hard
> > to measure the full end to end path in that configuration. I believe that
> > IPv6 should make that easier/simpler in that NAT hopefully will be out of
> > the path (but let's see what ingenuity ISPs will come up with).
> > Then again, traditionally the relevant bottlenecks often are a) the
> > internet access link itself and there the CPE is in a reasonable position
> > as a reflector on the other side of the bottleneck as seen from an
> > internet server, b) the home network between CPE and end-host, often with
> > variable rate wifi, here I agree reflecting echos at the CPE hides part
> > of the issue.
> >
> >
> >
> > > On May 2, 2020, at 19:38, David P. Reed <dpreed at deepplum.com> wrote:
> > >
> > > I am still a bit worried about properly defining "latency under load"
> > > for a NAT routed situation. If the test is based on ICMP Ping packets
> > > *from the server*, it will NOT be measuring the full path latency, and
> > > if the potential congestion is in the uplink path from the access
> > > provider's residential box to the access provider's router/switch, it
> > > will NOT measure congestion caused by bufferbloat reliably on either
> > > side, since the bufferbloat will be outside the ICMP Ping path.
> >
> > Puzzled, as I believe it is going to be the residential box that will
> > respond here, or will it be the AFTRs for CG-NAT that reflect the ICMP
> > echo requests?
> >
> > >
> > > I realize that a browser based speed test has to be basically run from
> > > the "server" end, because browsers are not that good at time measurement
> > > on a packet basis. However, there are ways to solve this and avoid the
> > > ICMP Ping issue, with a cooperative server.
> > >
> > > I once built a test that fixed this issue reasonably well. It carefully
> > > created a TCP based RTT measurement channel (over HTTP) that made the
> > > echo have to traverse the whole end-to-end path, which is the best and
> > > only way to accurately define lag under load from the user's
> > > perspective. The client end of an unloaded TCP connection can depend on
> > > TCP (properly prepared by getting it past slowstart) to generate a
> > > single packet response.
> > >
> > > This "TCP ping" is thus compatible with getting the end-to-end
> > > measurement on the server end of a true RTT.
> > >
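A rough sketch of the "TCP ping" idea described above, under stated assumptions: it is not David's tool, it uses a plain TCP echo on loopback rather than an HTTP channel, and the names `start_echo_server` and `tcp_ping` are hypothetical. It only illustrates timing a small payload that has to be answered by the far endpoint's TCP stack, so the echo covers the whole path between the two ends. Here the timing is done by whichever side sends the probe; in the description above that would be the server end.

```python
import socket
import threading
import time


def start_echo_server(host="127.0.0.1"):
    """Start a one-connection TCP echo server on an ephemeral port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            # Disable Nagle so small echoes are sent immediately.
            conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            while True:
                data = conn.recv(64)
                if not data:
                    break
                conn.sendall(data)
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()


def tcp_ping(addr, count=5, payload=b"x"):
    """Return `count` round-trip times in seconds over one TCP connection."""
    rtts = []
    with socket.create_connection(addr) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            t0 = time.perf_counter()
            sock.sendall(payload)
            # One-byte payload, so a single recv returns the whole echo.
            echoed = sock.recv(len(payload))
            rtts.append(time.perf_counter() - t0)
            assert echoed == payload
    return rtts


if __name__ == "__main__":
    addr = start_echo_server()
    for rtt in tcp_ping(addr):
        print(f"rtt = {rtt * 1e3:.3f} ms")
```

Reusing one established connection for all probes keeps the handshake out of the timings. On a real path the connection would also need to be warmed up past slow start, as noted above, which this loopback sketch does not attempt.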
> > > It's like the tcp-traceroute tool, in that it tricks anyone in the
> > > middle boxes into thinking this is a real, serious packet, not an
> > > optional low priority packet.
> > >
> > > The same issue comes up with non-browser-based techniques for
> > > measuring true lag-under-load.
> > >
> > > Now as we move HTTP to QUIC, this actually gets easier to do.
> > >
> > > One other opportunity I haven't explored, but which is pregnant with
> > > potential is the use of WebRTC, which runs over UDP internally. Since
> > > JavaScript has direct access to create WebRTC connections (multiple
> > > ones), this makes detailed testing in the browser quite reasonable.
> > >
> > > And the time measurements can resolve well below 100 microseconds, if
> > > the JS is based on modern JIT compilation (Chrome, Firefox, Edge all
> > > compile to machine code speed if the code is restricted and in a loop).
> > > Then again, there is WebAssembly if you want to write C code that runs
> > > in the browser fast. WebAssembly is a low level language that compiles
> > > to machine code in the browser execution, and still has access to all
> > > the browser networking facilities.
> >
> > Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to
> > Spectre side-channel vulnerabilities many browsers seem to have lowered
> > the timer resolution, but even the ~1ms resolution should be fine for
> > typical RTTs.
> >
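To illustrate why a ~1ms timer is still usable for typical RTTs, here is a small worked sketch. It assumes the coarsened clock simply rounds timestamps down to its resolution (real browsers may also add jitter, which is not modeled), and the helper names and timestamps are made up for the example.

```python
import math


def quantize(t_ms, resolution_ms):
    """Round a timestamp down to the timer resolution, like a coarsened clock."""
    return math.floor(t_ms / resolution_ms) * resolution_ms


def measured_rtt(send_ms, recv_ms, resolution_ms):
    """RTT as seen through a clock that only ticks every `resolution_ms`."""
    return quantize(recv_ms, resolution_ms) - quantize(send_ms, resolution_ms)


if __name__ == "__main__":
    true_rtt = 30.9 - 10.4  # 20.5 ms between send and receive
    seen = measured_rtt(10.4, 30.9, resolution_ms=1.0)
    print(f"true {true_rtt:.1f} ms, measured {seen:.1f} ms")
```

Under this assumption the error of a single sample stays below one timer tick, so a 1 ms clock distorts a 20 ms RTT by at most about 5 percent.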
> > Best Regards
> > Sebastian
> >
> > P.S.: I assume that I simply do not see/understand the full scope of the
> > issue at hand yet.
> >
> >
> > >
> > > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht at gmail.com> said:
> > >
> > > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce at gmail.com> wrote:
> > > > >
> > > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency as ~7ms
> > > >
> > > > I guess one of my questions is that with a switch to BBR netflix is
> > > > going to do pretty well. If fast.com is using bbr, well... that
> > > > excludes much of the current side of the internet.
> > > >
> > > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload the
> > > > > loaded shows as 7-8 and I see it blip upwards of 12ms. But I am no
> > > > > longer using any traffic shaping. Any anti-bufferbloat is from my
> > > > > ISP. A graph of the bloat would be nice.
> > > >
> > > > The tests do need to last a fairly long time.
> > > >
> > > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom <jannie at hanekom.net> wrote:
> > > > >>
> > > > >> Michael Richardson <mcr at sandelman.ca>:
> > > > >> > Does it find/use my nearest Netflix cache?
> > > > >>
> > > > >> Thankfully, it appears so. The DSLReports bloat test was
> > > > >> interesting, but the jitter on the ~240ms base latency from South
> > > > >> Africa (and other parts of the world) was significant enough that
> > > > >> the figures returned were often unreliable and largely unusable -
> > > > >> at least in my experience.
> > > > >>
> > > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency as
> > > > >> ~7ms and mentions servers located in local cities. I finally have
> > > > >> a test I can share with local non-technical people!
> > > > >>
> > > > >> (Agreed, upload test would be nice, but this is a huge step forward
> > > > >> from what I had access to before.)
> > > > >>
> > > > >> Jannie Hanekom
> > > > >>
> > > > >> _______________________________________________
> > > > >> Cake mailing list
> > > > >> Cake at lists.bufferbloat.net
> > > > >> https://lists.bufferbloat.net/listinfo/cake
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Make Music, Not War
> > > >
> > > > Dave Täht
> > > > CTO, TekLibre, LLC
> > > > http://www.teklibre.com
> > > > Tel: 1-831-435-0729
> > > >
> >
> >
>
>
>