> > Sergey - I wasn't assuming anything about fast.com. The document you
> > shared wasn't clear about the methodology's details here. Others, sadly,
> > have actually used ICMP pings in the way I described. I was making a
> > generic comment of concern.
> >
> > That said, it sounds like what you are doing is really helpful (esp. given
> > that your measure is aimed at end-user experiential qualities).

David - my apologies, I incorrectly interpreted your statement as being made
in the context of fast.com measurements. The blog post linked indeed doesn't
provide the latency measurement details - it was written before we added the
extra metrics. We'll see if we can publish an update.

> 1) a clear definition of lag under load that is end-to-end in latency, and
> involves, ideally, independent traffic from multiple sources through the
> bottleneck.

Curious if by multiple sources you mean multiple clients (devices) or
multiple connections sending data?

SERGEY FEDOROV
Director of Engineering
sfedorov@netflix.com
121 Albright Way | Los Gatos, CA 95032


On Sun, May 3, 2020 at 8:07 AM David P. Reed wrote:

> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off
> the entry device that has the external IP address for the NAT gets most of
> the RTT measure, and if there's no queueing built up in the NAT device,
> that's a reasonable measure. But...
>
> However, if the router has "taken up the queueing delay" by rate limiting
> its uplink traffic to slightly less than the capacity (as with Cake and
> other TC shaping that isn't as good as Cake), then there is a queue in the
> TC layer itself. This is what concerns me as a distortion in the
> measurement that can fool one into thinking the TC shaper is doing a good
> job, when in fact lag under load may be quite high from inside the routed
> domain (the home).
>
> As you point out, this unmeasured queueing delay can also be a problem with
> WiFi inside the home. But it isn't limited to that.
>
> A badly set up shaping/congestion-management subsystem inside the NAT can
> look "very good" in its echo of ICMP packets, but be terrible in response
> time to trivial HTTP requests from inside, or equally terrible in twitch
> games and video conferencing.
>
> So, for example, for tuning settings with "Cake" it is useless.
>
> To be fair, usually the access provider has no control over what is done
> after the cable is terminated at the home, so as a way to decide whether
> the provider is badly engineering its side, a ping from a server is a
> reasonable quality measure of the provider.
>
> But it is not a good measure of the user experience, and if the provider
> supplies the NAT box, even if it has a good shaper in it, like Cake or
> fq_codel, it will just confuse the user and create the opportunity for a
> "finger pointing" argument where neither side understands what is going on.
>
> This is why we need
>
> 1) a clear definition of lag under load that is end-to-end in latency, and
> involves, ideally, independent traffic from multiple sources through the
> bottleneck.
>
> 2) ideally, a better way to localize where the queues are building up and
> present that to users and access providers. The flent graphs are not
> interpretable by most non-experts. What we need is a simple visualization
> of a sketch-map of the path (like traceroute might provide) with queueing
> delay measures shown at key points that the user can understand.
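For concreteness, here is a rough browser-side sketch of the kind of
measurement (1) asks for: probe RTT with small HTTP requests while idle,
then probe again while independent bulk transfers load the bottleneck. This
is illustration only - the /ping and /bulk endpoints are made up, and it is
not a description of how fast.com or any other existing test implements it.

    // Sketch: "lag under load" measured end-to-end from the browser.
    // Hypothetical endpoints: /ping returns a tiny response, /bulk a large one.

    async function probeRtt(samples: number): Promise<number[]> {
      const rtts: number[] = [];
      for (let i = 0; i < samples; i++) {
        const t0 = performance.now();
        // Tiny, uncacheable request; resolves once response headers arrive,
        // so this approximates one request/response round trip over TCP.
        await fetch(`/ping?nocache=${Math.random()}`, { cache: "no-store" });
        rtts.push(performance.now() - t0);
      }
      return rtts;
    }

    function median(xs: number[]): number {
      return [...xs].sort((a, b) => a - b)[xs.length >> 1];
    }

    async function lagUnderLoad(): Promise<void> {
      const idle = await probeRtt(20);

      // Independent load flows through the same bottleneck.
      const controller = new AbortController();
      const loads = Array.from({ length: 4 }, () =>
        fetch(`/bulk?nocache=${Math.random()}`, {
          cache: "no-store",
          signal: controller.signal,
        })
          .then((r) => r.arrayBuffer()) // actually pull the bytes to keep the link busy
          .catch(() => undefined)       // swallow the AbortError when we stop the load
      );

      const loaded = await probeRtt(20);

      controller.abort();
      await Promise.all(loads);

      console.log(
        `idle ~${median(idle).toFixed(1)} ms, loaded ~${median(loaded).toFixed(1)} ms`
      );
    }

Note that with HTTP/2 or HTTP/3 the probe requests may get multiplexed onto
the same connection as the load flows, so a careful implementation would keep
the probes on a separate connection (or a separate hostname) so that they
remain independent traffic through the bottleneck.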
> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" said:
>
> > Hi David,
> >
> > in principle I agree, a NATed IPv4 ICMP probe will at best be reflected
> > at the NAT router (CPE) (some commercial home gateways do not respond to
> > ICMP echo requests in the name of security theatre). So it is pretty hard
> > to measure the full end-to-end path in that configuration. I believe that
> > IPv6 should make that easier/simpler in that NAT hopefully will be out of
> > the path (but let's see what ingenuity ISPs will come up with).
> > Then again, traditionally the relevant bottlenecks often are a) the
> > internet access link itself, where the CPE is in a reasonable position as
> > a reflector on the other side of the bottleneck as seen from an internet
> > server, and b) the home network between CPE and end-host, often with
> > variable-rate wifi; here I agree, reflecting echoes at the CPE hides part
> > of the issue.
> >
> > > On May 2, 2020, at 19:38, David P. Reed wrote:
> > >
> > > I am still a bit worried about properly defining "latency under load"
> > > for a NAT-routed situation. If the test is based on ICMP Ping packets
> > > *from the server*, it will NOT be measuring the full path latency, and
> > > if the potential congestion is in the uplink path from the access
> > > provider's residential box to the access provider's router/switch, it
> > > will NOT measure congestion caused by bufferbloat reliably on either
> > > side, since the bufferbloat will be outside the ICMP Ping path.
> >
> > Puzzled, as I believe it is going to be the residential box that will
> > respond here - or will it be the AFTRs for CG-NAT that reflect the ICMP
> > echo requests?
> >
> > > I realize that a browser-based speed test has to be basically run from
> > > the "server" end, because browsers are not that good at time measurement
> > > on a packet basis. However, there are ways to solve this and avoid the
> > > ICMP Ping issue, with a cooperative server.
> > >
> > > I once built a test that fixed this issue reasonably well. It carefully
> > > created a TCP-based RTT measurement channel (over HTTP) that made the
> > > echo have to traverse the whole end-to-end path, which is the best and
> > > only way to accurately define lag under load from the user's
> > > perspective. The client end of an unloaded TCP connection can depend on
> > > TCP (properly prepared by getting it past slow start) to generate a
> > > single-packet response.
> > >
> > > This "TCP ping" is thus compatible with getting the end-to-end
> > > measurement of a true RTT on the server end.
> > >
> > > It's like the tcp-traceroute tool, in that it tricks the middle boxes
> > > into thinking this is a real, serious packet, not an optional
> > > low-priority packet.
> > >
> > > The same issue comes up with non-browser-based techniques for measuring
> > > true lag-under-load.
> > >
> > > Now as we move HTTP to QUIC, this actually gets easier to do.
> > >
> > > One other opportunity I haven't explored, but which is pregnant with
> > > potential, is the use of WebRTC, which runs over UDP internally. Since
> > > JavaScript has direct access to create WebRTC connections (multiple
> > > ones), this makes detailed testing in the browser quite reasonable.
> > >
> > > And the time measurements can resolve well below 100 microseconds, if
> > > the JS is based on modern JIT compilation (Chrome, Firefox and Edge all
> > > compile to machine-code speed if the code is restricted and in a loop).
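As an illustration of that WebRTC idea, a data-channel "ping" could look
roughly like the sketch below. The signaling helper and the echoing peer are
hypothetical, and this is not how fast.com measures latency - it is just the
shape of the standard API calls involved.

    // Sketch: RTT over a WebRTC data channel (carried over UDP/SCTP internally).
    // exchangeSdpWithEchoServer() is a hypothetical signaling helper; it must
    // deliver our offer to a peer that answers it and echoes every message back.

    declare function exchangeSdpWithEchoServer(
      offer: RTCSessionDescriptionInit
    ): Promise<RTCSessionDescriptionInit>;

    async function webrtcPing(count: number): Promise<number[]> {
      const pc = new RTCPeerConnection();
      const channel = pc.createDataChannel("rtt-probe", {
        ordered: false,    // UDP-like: no reordering delay...
        maxRetransmits: 0, // ...and no retransmissions to mask loss
      });

      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);
      const answer = await exchangeSdpWithEchoServer(offer); // hypothetical
      await pc.setRemoteDescription(answer);
      // (Trickle-ICE candidate exchange omitted for brevity.)

      await new Promise<void>((resolve) => (channel.onopen = () => resolve()));

      const rtts: number[] = [];
      for (let i = 0; i < count; i++) {
        const reply = new Promise<void>(
          (resolve) => (channel.onmessage = () => resolve())
        );
        const t0 = performance.now();
        channel.send(String(i)); // tiny payload, echoed straight back by the peer
        await reply;
        rtts.push(performance.now() - t0);
      }

      pc.close();
      return rtts;
    }

Setting ordered: false and maxRetransmits: 0 keeps the channel UDP-like, so
retransmissions don't hide loss or inflate the measured RTT.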
> > > Then again, there is WebAssembly if you want to write C code that runs
> > > fast in the browser. WebAssembly is a low-level language that compiles
> > > to machine code in the browser, and still has access to all the browser
> > > networking facilities.
> >
> > Mmmh, according to https://github.com/w3c/hr-time/issues/56, due to
> > Spectre side-channel vulnerabilities many browsers seem to have lowered
> > the timer resolution, but even the ~1ms resolution should be fine for
> > typical RTTs.
> >
> > Best Regards
> >         Sebastian
> >
> > P.S.: I assume that I simply do not see/understand the full scope of the
> > issue at hand yet.
> >
> > > On Saturday, May 2, 2020 12:52pm, "Dave Taht" said:
> > >
> > > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce wrote:
> > > > >
> > > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency as
> > > > > > ~7ms
> > > >
> > > > I guess one of my questions is that with a switch to BBR Netflix is
> > > > going to do pretty well. If fast.com is using BBR, well... that
> > > > excludes much of the current side of the internet.
> > > >
> > > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload the
> > > > > loaded shows as 7-8 and I see it blip upwards of 12ms. But I am no
> > > > > longer using any traffic shaping. Any anti-bufferbloat is from my
> > > > > ISP. A graph of the bloat would be nice.
> > > >
> > > > The tests do need to last a fairly long time.
> > > >
> > > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom wrote:
> > > > >>
> > > > >> Michael Richardson:
> > > > >> > Does it find/use my nearest Netflix cache?
> > > > >>
> > > > >> Thankfully, it appears so. The DSLReports bloat test was
> > > > >> interesting, but the jitter on the ~240ms base latency from South
> > > > >> Africa (and other parts of the world) was significant enough that
> > > > >> the figures returned were often unreliable and largely unusable -
> > > > >> at least in my experience.
> > > > >>
> > > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency as
> > > > >> ~7ms and mentions servers located in local cities. I finally have
> > > > >> a test I can share with local non-technical people!
> > > > >>
> > > > >> (Agreed, an upload test would be nice, but this is a huge step
> > > > >> forward from what I had access to before.)
> > > > >>
> > > > >> Jannie Hanekom
> > > >
> > > > --
> > > > Make Music, Not War
> > > >
> > > > Dave Täht
> > > > CTO, TekLibre, LLC
> > > > http://www.teklibre.com
> > > > Tel: 1-831-435-0729
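On the timer-resolution point Sebastian raises above, a quick way to estimate
the effective granularity of performance.now() in a given browser is sketched
below (illustration only):

    // Sketch: estimate the effective granularity of performance.now().
    // Browsers clamp it after the Spectre mitigations (see w3c/hr-time#56).

    function timerGranularityMs(samples = 10): number {
      const deltas: number[] = [];
      let prev = performance.now();
      const deadline = Date.now() + 2000; // give up after ~2 s
      while (deltas.length < samples && Date.now() < deadline) {
        const now = performance.now();
        if (now > prev) {
          deltas.push(now - prev); // smallest observed step = effective resolution
          prev = now;
        }
      }
      return deltas.length > 0 ? Math.min(...deltas) : NaN;
    }

    console.log(`performance.now() granularity ~${timerGranularityMs()} ms`);

That matches Sebastian's point that a ~1 ms clamp is still workable for
typical RTTs, though it does limit how precisely small idle-versus-loaded
differences of a few milliseconds can be resolved.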