From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp112.iad3a.emailsrvr.com (smtp112.iad3a.emailsrvr.com [173.203.187.112]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 563763CB37 for ; Sun, 3 May 2020 11:07:00 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=g001.emailsrvr.com; s=20190322-9u7zjiwi; t=1588518420; bh=XqUNM3J5X3kuuAoqivF3ZrXfJ+6B3Mi9lnxTxqD2CiY=; h=Date:Subject:From:To:From; b=DHSM4L+HPlp4ENDVGq0d8LV+s2cQlbwVMUBzjYwbUSHPOaDRxmVf80Gm/Uo1GnSee 2ewr0gWP8uF6koGJG6npABabvXqEIC3bUNG6K6CBYurzmjClv7/7qqEWo2QfH0W0kX KrbdKWAgeeARptfdi/WMNef7oisGs6OTzxK7Xras= Received: from app44.wa-webapps.iad3a (relay-webapps.rsapps.net [172.27.255.140]) by smtp7.relay.iad3a.emailsrvr.com (SMTP Server) with ESMTP id E2ADF2BC0; Sun, 3 May 2020 11:06:59 -0400 (EDT) X-Sender-Id: dpreed@deepplum.com Received: from app44.wa-webapps.iad3a (relay-webapps.rsapps.net [172.27.255.140]) by 0.0.0.0:25 (trex/5.7.12); Sun, 03 May 2020 11:07:00 -0400 Received: from deepplum.com (localhost.localdomain [127.0.0.1]) by app44.wa-webapps.iad3a (Postfix) with ESMTP id A3A6960530; Sun, 3 May 2020 11:06:56 -0400 (EDT) Received: by apps.rackspace.com (Authenticated sender: dpreed@deepplum.com, from: dpreed@deepplum.com) with HTTP; Sun, 3 May 2020 11:06:56 -0400 (EDT) X-Auth-ID: dpreed@deepplum.com Date: Sun, 3 May 2020 11:06:56 -0400 (EDT) From: "David P. 
Reed" To: "Sebastian Moeller" Cc: "=?utf-8?Q?Dave_T=C3=A4ht?=" , "Michael Richardson" , "Make-Wifi-fast" , "Jannie Hanekom" , "Cake List" , "Sergey Fedorov" , "bloat" MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_20200503110656000000_28394" Importance: Normal X-Priority: 3 (Normal) X-Type: html Message-ID: <1588518416.66682155@apps.rackspace.com> X-Mailer: webmail/17.3.10-RC X-Classification-ID: 321812f2-f857-4064-87cf-2192251092dd-1-1 Subject: Re: [Make-wifi-fast] [Cake] [Bloat] dslreports is no longer free X-BeenThere: make-wifi-fast@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 May 2020 15:07:00 -0000 ------=_20200503110656000000_28394 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable =0AThanks Sebastian. I do agree that in many cases, reflecting the ICMP off= the entry device that has the external IP address for the NAT gets most of= the RTT measure, and if there's no queueing built up in the NAT device, th= at's a reasonable measure. But...=0A =0AHowever, if the router has "taken u= p the queueing delay" by rate limiting its uplink traffic to slightly less = than the capacity (as with Cake and other TC shaping that isn't as good as = cake), then there is a queue in the TC layer itself. This is what concerns = me as a distortion in the measurement that can fool one into thinking the T= C shaper is doing a good job, when in fact, lag under load may be quite hig= h from inside the routed domain (the home).=0A =0AAs you point out this unm= easured queueing delay can also be a problem with WiFi inside the home. 
But= it isn't limited to that.=0A =0AA badly set up shaping/congestion manageme= nt subsystem inside the NAT can look "very good" in its echo of ICMP packet= s, but be terrible in response time to trivial HTTP requests from inside, o= r equally terrible in twitch games and video conferencing.=0A =0ASo, for ex= ample, for tuning settings with "Cake" it is useless.=0A =0ATo be fair, usu= ally the Access Provider has no control of what is done after the cable is = terminated at the home, so as a way to decide if the provider is badly engi= neering its side, a ping from a server is a reasonable quality measure of t= he provider. =0A =0ABut not a good measure of the user experience, and if t= he provider provides the NAT box, even if it has a good shaper in it, like = Cake or fq_codel, it will just confuse the user and create the opportunity = for a "finger pointing" argument where neither side understands what is goi= ng on.=0A =0AThis is why we need =0A =0A1) a clear definition of lag under = load that is from end-to-end in latency, and involves, ideally, independent= traffic from multiple sources through the bottleneck.=0A =0A2) ideally, a = better way to localize where the queues are building up and present that to= users and access providers. The flent graphs are not interpretable by mos= t non-experts. What we need is a simple visualization of a sketch-map of th= e path (like traceroute might provide) with queueing delay measures shown = at key points that the user can understand.=0AOn Saturday, May 2, 2020 4:19= pm, "Sebastian Moeller" said:=0A=0A=0A=0A> Hi David,=0A> = =0A> in principle I agree, a NATed IPv4 ICMP probe will be at best reflecte= d at the NAT=0A> router (CPE) (some commercial home gateways do not respond= to ICMP echo requests=0A> in the name of security theatre). So it is prett= y hard to measure the full end to=0A> end path in that configuration. 
I bel= ieve that IPv6 should make that=0A> easier/simpler in that NAT hopefully wi= ll be out of the path (but let's see what=0A> ingenuity ISPs will come up w= ith).=0A> Then again, traditionally the relevant bottlenecks often are a) t= he internet=0A> access link itself and there the CPE is in a reasonable pos= ition as a reflector on=0A> the other side of the bottleneck as seen from a= n internet server, b) the home=0A> network between CPE and end-host, often = with variable rate wifi, here I agree=0A> reflecting echos at the CPE hides= part of the issue.=0A> =0A> =0A> =0A> > On May 2, 2020, at 19:38, David P.= Reed wrote:=0A> >=0A> > I am still a bit worried abo= ut properly defining "latency under load" for a=0A> NAT routed situation. I= f the test is based on ICMP Ping packets *from the server*,=0A> it will NOT= be measuring the full path latency, and if the potential congestion=0A> is= in the uplink path from the access provider's residential box to the acces= s=0A> provider's router/switch, it will NOT measure congestion caused by bu= fferbloat=0A> reliably on either side, since the bufferbloat will be outsid= e the ICMP Ping=0A> path.=0A> =0A> Puzzled, as i believe it is going to be = the residential box that will respond=0A> here, or will it be the AFTRs for= CG-NAT that reflect the ICMP echo requests?=0A> =0A> >=0A> > I realize tha= t a browser based speed test has to be basically run from the=0A> "server" = end, because browsers are not that good at time measurement on a packet=0A>= basis. However, there are ways to solve this and avoid the ICMP Ping issue= , with a=0A> cooperative server.=0A> >=0A> > I once built a test that fixed= this issue reasonably well. It carefully=0A> created a TCP based RTT measu= rement channel (over HTTP) that made the echo have to=0A> traverse the whol= e end-to-end path, which is the best and only way to accurately=0A> define = lag under load from the user's perspective. 
The client end of an unloaded= =0A> TCP connection can depend on TCP (properly prepared by getting it past= slowstart)=0A> to generate a single packet response.=0A> >=0A> > This "TCP= ping" is thus compatible with getting the end-to-end measurement on=0A> th= e server end of a true RTT.=0A> >=0A> > It's like tcp-traceroute tool, in t= hat it tricks anyone in the middle boxes=0A> into thinking this is a real, = serious packet, not an optional low priority=0A> packet.=0A> >=0A> > The sa= me issue comes up with non-browser-based techniques for measuring true=0A> = lag-under-load.=0A> >=0A> > Now as we move HTTP to QUIC, this actually gets= easier to do.=0A> >=0A> > One other opportunity I haven't explored, but wh= ich is pregnant with=0A> potential is the use of WebRTC, which runs over UD= P internally. Since JavaScript=0A> has direct access to create WebRTC conne= ctions (multiple ones), this makes=0A> detailed testing in the browser quit= e reasonable.=0A> >=0A> > And the time measurements can resolve well below = 100 microseconds, if the JS=0A> is based on modern JIT compilation (Chrome,= Firefox, Edge all compile to machine=0A> code speed if the code is restric= ted and in a loop). Then again, there is Web=0A> Assembly if you want to wr= ite C code that runs in the brower fast. 
WebAssembly is=0A> a low level lan= guage that compiles to machine code in the browser execution, and=0A> still= has access to all the browser networking facilities.=0A> =0A> Mmmh, accord= ing to https://github.com/w3c/hr-time/issues/56 due to spectre=0A> side-cha= nnel vulnerabilities many browsers seemed to have lowered the timer=0A> res= olution, but even the ~1ms resolution should be fine for typical RTTs.=0A> = =0A> Best Regards=0A> Sebastian=0A> =0A> P.S.: I assume that I simply do no= t see/understand the full scope of the issue at=0A> hand yet.=0A> =0A> =0A>= >=0A> > On Saturday, May 2, 2020 12:52pm, "Dave Taht" =0A> said:=0A> >=0A> > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce =0A> wrote:=0A> > > >=0A> > > > > Fast.com reports my unlo= aded latency as 4ms, my loaded latency=0A> as ~7ms=0A> > >=0A> > > I guess = one of my questions is that with a switch to BBR netflix is=0A> > > going t= o do pretty well. If fast.com is using bbr, well... that=0A> > > excludes m= uch of the current side of the internet.=0A> > >=0A> > > > For download, I = show 6ms unloaded and 6-7 loaded. But for upload=0A> the loaded=0A> > > sho= ws as 7-8 and I see it blip upwards of 12ms. But I am no longer using=0A> a= ny=0A> > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of= the=0A> bloat would=0A> > > be nice.=0A> > >=0A> > > The tests do need to = last a fairly long time.=0A> > >=0A> > > > On Sat, May 2, 2020 at 9:51 AM J= annie Hanekom=0A> =0A> > > wrote:=0A> > > >>=0A> > > >>= Michael Richardson :=0A> > > >> > Does it find/use my ne= arest Netflix cache?=0A> > > >>=0A> > > >> Thankfully, it appears so. 
The D= SLReports bloat test was=0A> interesting,=0A> > > but=0A> > > >> the jitter= on the ~240ms base latency from South Africa (and=0A> other parts=0A> > > = of=0A> > > >> the world) was significant enough that the figures returned= =0A> were often=0A> > > >> unreliable and largely unusable - at least in my= experience.=0A> > > >>=0A> > > >> Fast.com reports my unloaded latency as = 4ms, my loaded latency=0A> as ~7ms=0A> > > and=0A> > > >> mentions servers = located in local cities. I finally have a test=0A> I can=0A> > > share=0A> = > > >> with local non-technical people!=0A> > > >>=0A> > > >> (Agreed, uplo= ad test would be nice, but this is a huge step=0A> forward from=0A> > > >> = what I had access to before.)=0A> > > >>=0A> > > >> Jannie Hanekom=0A> > > = >>=0A> > > >> _______________________________________________=0A> > > >> Ca= ke mailing list=0A> > > >> Cake@lists.bufferbloat.net=0A> > > >> https://li= sts.bufferbloat.net/listinfo/cake=0A> > > >=0A> > > > _____________________= __________________________=0A> > > > Cake mailing list=0A> > > > Cake@lists= .bufferbloat.net=0A> > > > https://lists.bufferbloat.net/listinfo/cake=0A> = > >=0A> > >=0A> > >=0A> > > --=0A> > > Make Music, Not War=0A> > >=0A> > > = Dave T=C3=A4ht=0A> > > CTO, TekLibre, LLC=0A> > > http://www.teklibre.com= =0A> > > Tel: 1-831-435-0729=0A> > > ______________________________________= _________=0A> > > Cake mailing list=0A> > > Cake@lists.bufferbloat.net=0A> = > > https://lists.bufferbloat.net/listinfo/cake=0A> > >=0A> > _____________= __________________________________=0A> > Cake mailing list=0A> > Cake@lists= .bufferbloat.net=0A> > https://lists.bufferbloat.net/listinfo/cake=0A> =0A>= =0A ------=_20200503110656000000_28394 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Thanks Sebastian. I do agree that in many cases, reflecting the ICMP off the entry device that has the external IP address for the NAT gets most of the RTT measure, and if there's no queueing built up in the NAT device, that's a reasonable measure. But...

However, if the router has "taken up the queueing delay" by rate limiting its uplink traffic to slightly less than the capacity (as with Cake and other TC shaping that isn't as good as Cake), then there is a queue in the TC layer itself. This is what concerns me as a distortion in the measurement that can fool one into thinking the TC shaper is doing a good job, when in fact, lag under load may be quite high from inside the routed domain (the home).

As you point out, this unmeasured queueing delay can also be a problem with WiFi inside the home. But it isn't limited to that.

A badly set up shaping/congestion management subsystem inside the NAT can look "very good" in its echo of ICMP packets, but be terrible in response time to trivial HTTP requests from inside, or equally terrible in twitch games and video conferencing.
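
One way to see this gap from inside the home is to time small round trips over an established TCP connection, once with the link idle and once while the uplink is saturated. The following is a minimal Python sketch under stated assumptions: it needs a cooperative server that echoes each byte back (hypothetical here; any echo service you control would do), and the names echo_rtts_ms, connect_probe and lag_under_load_ms are invented for illustration, not taken from any existing tool.

```python
import socket
import time
from statistics import median


def connect_probe(host, port, timeout=2.0):
    """Open a TCP connection to a cooperative echo server (an assumption)."""
    sock = socket.create_connection((host, port), timeout=timeout)
    # Disable Nagle so each one-byte probe is sent immediately.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def echo_rtts_ms(sock, count=10):
    """Time `count` one-byte round trips on an already connected socket."""
    rtts = []
    for _ in range(count):
        start = time.perf_counter()
        sock.sendall(b"x")
        if sock.recv(1) != b"x":
            raise ConnectionError("echo peer closed or sent unexpected data")
        rtts.append((time.perf_counter() - start) * 1000.0)
    return rtts


def lag_under_load_ms(idle_rtts, loaded_rtts):
    """Added delay under load: median loaded RTT minus median idle RTT."""
    return median(loaded_rtts) - median(idle_rtts)
```

Run echo_rtts_ms once on a quiet link and once during a large upload from another host in the home, then compare the two sample sets with lag_under_load_ms. Because the probe starts inside the routed domain, any queue in the shaper or the WiFi is on its path.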


So, for example, for tuning settings with "Cake", a ping reflected off the NAT box is useless.

To be fair, usually the Access Provider has no control of what is done after the cable is terminated at the home, so as a way to decide if the provider is badly engineering its side, a ping from a server is a reasonable quality measure of the provider.

But it is not a good measure of the user experience, and if the provider supplies the NAT box, even if it has a good shaper in it, like Cake or fq_codel, it will just confuse the user and create the opportunity for a "finger pointing" argument where neither side understands what is going on.

This is why we need:

1) a clear definition of lag under load that measures latency end to end and, ideally, involves independent traffic from multiple sources through the bottleneck.

2) ideally, a better way to localize where the queues are building up and present that to users and access providers. The flent graphs are not interpretable by most non-experts. What we need is a simple visualization of a sketch-map of the path (like traceroute might provide) with queueing delay measures shown at key points that the user can understand.
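
As a rough illustration of point 2, per-hop RTT samples taken idle and under load can be turned into a text sketch-map. This is only a sketch: the hop names and numbers are made up, the function names are hypothetical, and it assumes per-hop RTTs are available at all (traceroute-style probes are often rate limited or deprioritized by routers, which would distort the result).

```python
from statistics import median


def added_delay_per_hop(idle, loaded):
    """Estimate where delay is added under load.

    `idle` and `loaded` are lists of (hop_name, [rtt_ms, ...]) in path order.
    The cumulative added delay at a hop is median(loaded) - min(idle); the
    delay attributed to that hop is the increase over the previous hop.
    """
    rows = []
    previous = 0.0
    for (name, idle_rtts), (_, loaded_rtts) in zip(idle, loaded):
        cumulative = max(0.0, median(loaded_rtts) - min(idle_rtts))
        rows.append((name, max(0.0, cumulative - previous)))
        previous = cumulative
    return rows


def sketch_map(rows, ms_per_mark=5.0):
    """Render one line per hop with a bar proportional to the added delay."""
    lines = []
    for name, added in rows:
        bar = "#" * int(added / ms_per_mark)
        lines.append(f"{name:<12} +{added:6.1f} ms {bar}")
    return "\n".join(lines)


# Made-up samples (milliseconds), for illustration only.
idle = [("home-router", [1.0, 1.2]),
        ("isp-edge", [8.0, 9.0]),
        ("server", [20.0, 21.0])]
loaded = [("home-router", [1.1, 1.3, 1.2]),
          ("isp-edge", [58.0, 60.0, 62.0]),
          ("server", [70.0, 72.0, 74.0])]

print(sketch_map(added_delay_per_hop(idle, loaded)))
```

In this invented data set nearly all of the added delay shows up between the home router and the ISP edge, which is the kind of picture a non-expert could read at a glance.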

On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de> said:

> Hi David,
>
> in principle I agree, a NATed IPv4 ICMP probe will be at best reflected at the NAT
> router (CPE) (some commercial home gateways do not respond to ICMP echo requests
> in the name of security theatre). So it is pretty hard to measure the full end to
> end path in that configuration. I believe that IPv6 should make that
> easier/simpler in that NAT hopefully will be out of the path (but let's see what
> ingenuity ISPs will come up with).
> Then again, traditionally the relevant bottlenecks often are a) the internet
> access link itself and there the CPE is in a reasonable position as a reflector on
> the other side of the bottleneck as seen from an internet server, b) the home
> network between CPE and end-host, often with variable rate wifi, here I agree
> reflecting echos at the CPE hides part of the issue.
>
>
> 
> > On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > I am still a bit worried about properly defining "latency under load" for a
> NAT routed situation. If the test is based on ICMP Ping packets *from the server*,
> it will NOT be measuring the full path latency, and if the potential congestion
> is in the uplink path from the access provider's residential box to the access
> provider's router/switch, it will NOT measure congestion caused by bufferbloat
> reliably on either side, since the bufferbloat will be outside the ICMP Ping
> path.
>
> Puzzled, as I believe it is going to be the residential box that will respond
> here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo requests?
>
> >
> > I realize that a browser based speed test has to be basically run from the
> "server" end, because browsers are not that good at time measurement on a packet
> basis. However, there are ways to solve this and avoid the ICMP Ping issue, with a
> cooperative server.
> >
> > I once built a test that fixed this issue reasonably well. It carefully
> created a TCP based RTT measurement channel (over HTTP) that made the echo have to
> traverse the whole end-to-end path, which is the best and only way to accurately
> define lag under load from the user's perspective. The client end of an unloaded
> TCP connection can depend on TCP (properly prepared by getting it past slowstart)
> to generate a single packet response.
> >
> > This "TCP ping" is thus compatible with getting the end-to-end measurement on
> the server end of a true RTT.
> >
> > It's like tcp-traceroute tool, in that it tricks anyone in the middle boxes
> into thinking this is a real, serious packet, not an optional low priority
> packet.
> >
> > The same issue comes up with non-browser-based techniques for measuring true
> lag-under-load.
> >
> > Now as we move HTTP to QUIC, this actually gets easier to do.
> >
> > One other opportunity I haven't explored, but which is pregnant with
> potential is the use of WebRTC, which runs over UDP internally. Since JavaScript
> has direct access to create WebRTC connections (multiple ones), this makes
> detailed testing in the browser quite reasonable.
> >
> > And the time measurements can resolve well below 100 microseconds, if the JS
> is based on modern JIT compilation (Chrome, Firefox, Edge all compile to machine
> code speed if the code is restricted and in a loop). Then again, there is Web
> Assembly if you want to write C code that runs in the browser fast. WebAssembly is
> a low level language that compiles to machine code in the browser execution, and
> still has access to all the browser networking facilities.
>
> Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to Spectre
> side-channel vulnerabilities many browsers seemed to have lowered the timer
> resolution, but even the ~1ms resolution should be fine for typical RTTs.
>
> Best Regards
> Sebastian
>
> P.S.: I assume that I simply do not see/understand the full scope of the issue at
> hand yet.
> 
>
> >
> > On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
> said:
> >
> > > On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
> wrote:
> > > >
> > > > > Fast.com reports my unloaded latency as 4ms, my loaded latency
> as ~7ms
> > >
> > > I guess one of my questions is that with a switch to BBR netflix is
> > > going to do pretty well. If fast.com is using bbr, well... that
> > > excludes much of the current side of the internet.
> > >
> > > > For download, I show 6ms unloaded and 6-7 loaded. But for upload
> the loaded
> > > shows as 7-8 and I see it blip upwards of 12ms. But I am no longer using
> any
> > > traffic shaping. Any anti-bufferbloat is from my ISP. A graph of the
> bloat would
> > > be nice.
> > >
> > > The tests do need to last a fairly long time.
> > >
> > > > On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
> <jannie@hanekom.net>
> > > wrote:
> > > >>
> > > >> Michael Richardson <mcr@sandelman.ca>:
> > > >> > Does it find/use my nearest Netflix cache?
> > > >>
> > > >> Thankfully, it appears so. The DSLReports bloat test was
> interesting,
> > > but
> > > >> the jitter on the ~240ms base latency from South Africa (and
> other parts
> > > of
> > > >> the world) was significant enough that the figures returned
> were often
> > > >> unreliable and largely unusable - at least in my experience.
> > > >>
> > > >> Fast.com reports my unloaded latency as 4ms, my loaded latency
> as ~7ms
> > > and
> > > >> mentions servers located in local cities. I finally have a test
> I can
> > > share
> > > >> with local non-technical people!
> > > >>
> > > >> (Agreed, upload test would be nice, but this is a huge step
> forward from
> > > >> what I had access to before.)
> > > >>
> > > >> Jannie Hanekom
> > > >>
> > > >> _______________________________________________
> > > >> Cake mailing list
> > > >> Cake@lists.bufferbloat.net
> > > >> https://lists.bufferbloat.net/listinfo/cake
> > > >
> > >
> > >
> > >
> > > --
> > > Make Music, Not War
> > >
> > > Dave Täht
> > > CTO, TekLibre, LLC
> > > http://www.teklibre.com
> > > Tel: 1-831-435-0729
> > >
>
>


------=_20200503110656000000_28394--