From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <moeller0@gmx.de>
Received: from mout.gmx.net (mout.gmx.net [212.227.17.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 5EE9B3B2A4;
 Wed,  6 May 2020 04:09:01 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gmx.net;
 s=badeba3b8450; t=1588752535;
 bh=VKeyD7q5bMQzHkE2TxH/pxkQp1r9qv6D94k5tAJvx6o=;
 h=X-UI-Sender-Class:Subject:From:In-Reply-To:Date:Cc:References:To;
 b=lVSLMwFLE2btu02YgUclg8LdE5fUoezDlmrGrykshE5eo/+XzvqiNCl5PBI7u4SOb
 nKKK3cTdcZpZo8XxcfwEWiLuvh4hqv4mRmVwR39G2R8pesOM5vc03F/fZXgWOCZpl2
 qXD5lx4bwaXMwTVyvMU4RPj9T6msXlOi92TK/r9o=
X-UI-Sender-Class: 01bb95c1-4bf8-414a-932a-4f6e2808ef9c
Received: from [10.11.12.16] ([134.76.241.253]) by mail.gmx.com (mrgmx105
 [212.227.17.168]) with ESMTPSA (Nemesis) id 1MYeMt-1jaFbZ3Ss7-00Vf8j; Wed, 06
 May 2020 10:08:54 +0200
Content-Type: text/plain;
	charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\))
From: Sebastian Moeller <moeller0@gmx.de>
X-Priority: 3 (Normal)
In-Reply-To: <1588518416.66682155@apps.rackspace.com>
Date: Wed, 6 May 2020 10:08:53 +0200
Cc: =?utf-8?Q?Dave_T=C3=A4ht?= <dave.taht@gmail.com>,
 Make-Wifi-fast <make-wifi-fast@lists.bufferbloat.net>,
 Jannie Hanekom <jannie@hanekom.net>,
 Cake List <cake@lists.bufferbloat.net>,
 Sergey Fedorov <sfedorov@netflix.com>, bloat <bloat@lists.bufferbloat.net>,
 =?utf-8?Q?Toke_H=C3=B8iland-J=C3=B8rgensen?= <toke@redhat.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <5A0DCDB9-CF68-4C0D-896D-2C8F8B304FC8@gmx.de>
References: <1588518416.66682155@apps.rackspace.com>
To: "David P. Reed" <dpreed@deepplum.com>
X-Mailer: Apple Mail (2.3445.104.14)
X-Spam-Flag: NO
Subject: Re: [Bloat] [Cake] [Make-wifi-fast]  dslreports is no longer free
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Wed, 06 May 2020 08:09:01 -0000

Dear David,

Thanks for the elaboration below, and indeed I was not appreciating the =
full scope of the challenge.

> On May 3, 2020, at 17:06, David P. Reed <dpreed@deepplum.com> wrote:
>=20
> Thanks Sebastian. I do agree that in many cases, reflecting the ICMP =
off the entry device that has the external IP address for the NAT gets =
most of the RTT measure, and if there's no queueing built up in the NAT =
device, that's a reasonable measure. But...

	Yes, I see; I really hope that with IPv6 coming more and more =
online, and hence less NAT, end-to-end RTT measurements will be simpler =
in the future. But cue the people who will for example recommend to =
drop/ignore ICMP in the name of security theater... It's the same mindset =
that basically recommends to ignore ICMP and/or IP timestamps, because =
"information leakage", while all the information that leaks for a =
standards conformant host is the time since midnight UTC (and =
potentially an idea about the difference between the local clock =
setting)... I fail to understand the rationale/threat model behind =
eschewing this... For our purposes one-way timestamps would be most =
excellent to have to be able to assess on which "leg" overload actually =
happens.
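
	A minimal sketch of how such timestamps could be used, assuming =
the four ICMP timestamp values of one probe (originate, receive, =
transmit, plus the local arrival time) are already at hand; the =
function names are hypothetical and the clock offset between the two =
hosts is not corrected for, so only changes against an unloaded =
baseline are meaningful:

```python
from datetime import datetime, timezone

MS_PER_DAY = 24 * 60 * 60 * 1000
HALF_DAY = MS_PER_DAY // 2


def ms_since_midnight_utc(when):
    """ICMP-style timestamp: milliseconds since midnight UTC."""
    utc = when.astimezone(timezone.utc)
    seconds = utc.hour * 3600 + utc.minute * 60 + utc.second
    return seconds * 1000 + utc.microsecond // 1000


def signed_diff(later, earlier):
    """Difference in ms, wrapped so a probe may cross midnight."""
    return (later - earlier + HALF_DAY) % MS_PER_DAY - HALF_DAY


def leg_delays(originate, receive, transmit, arrival):
    """Return (forward, reverse) delay of one probe in ms.

    Both values still contain the unknown clock offset between the
    hosts (with opposite sign), so they can be negative.
    """
    forward = signed_diff(receive, originate)
    reverse = signed_diff(arrival, transmit)
    return forward, reverse


def added_delay(baseline, loaded):
    """Per-leg delay increase under load; the clock offset cancels."""
    return loaded[0] - baseline[0], loaded[1] - baseline[1]
```

Comparing leg_delays() of an idle probe with one taken during a =
saturating test then shows which direction gained delay, as long as the =
clock offset does not drift between the two probes.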

>=20
> However, if the router has "taken up the queueing delay" by rate =
limiting its uplink traffic to slightly less than the capacity (as with =
Cake and other TC shaping that isn't as good as cake), then there is a =
queue in the TC layer itself. This is what concerns me as a distortion =
in the measurement that can fool one into thinking the TC shaper is =
doing a good job, when in fact, lag under load may be quite high from =
inside the routed domain (the home).

	As long as the shaper is instantiated on the NAT box, the =
latency probes reflected by that NAT-box will also travel through the =
shaper; but now you mention it, in SQM we do ingress shaping via an IFB =
and hence will also shape the incoming latency probes, but I started to =
recommend to do ingress shaping as egress-shaping on the LAN-wards =
interface of a router (to avoid the computational cost of the IFB =
redirection dance, and to allow people to use iptables for ingress*), =
and in such a configuration router reflected/emitted WAN-probes will =
avoid the ingress TC-queues...=20

*) With nftables having a hook at ingress, that second rationale will =
become moot in the near future...


>=20
> As you point out this unmeasured queueing delay can also be a problem =
with WiFi inside the home. But it isn't limited to that.
>=20
> A badly set up shaping/congestion management subsystem inside the NAT =
can look "very good" in its echo of ICMP packets, but be terrible in =
response time to trivial HTTP requests from inside, or equally terrible =
in twitch games and video conferencing.

	Good point, and one of Dave's pet peeves: in former times people =
recommended to up-prioritize ICMP packets to make RTT look good, falling =
exactly into the trap you described.

>=20
> So, for example, for tuning settings with "Cake" it is useless.

	I believe that at least for the way we instantiate things by =
default in SQM-scripts we avoid that pit-fall. What do you think @Toke?

>=20
> To be fair, usually the Access Provider has no control of what is done =
after the cable is terminated at the home, so as a way to decide if the =
provider is badly engineering its side, a ping from a server is a =
reasonable quality measure of the provider.=20

	Most providers in Germany will try to steer customers to rent a =
wifi router from the ISP, so bloat in the wifi link would also be under =
the responsibility of the ISP to some degree, no?


>=20
> But not a good measure of the user experience, and if the provider =
provides the NAT box, even if it has a good shaper in it, like Cake or =
fq_codel, it will just confuse the user and create the opportunity for a =
"finger pointing" argument where neither side understands what is going =
on.
>=20
> This is why we need=20
>=20
> 1) a clear definition of lag under load that is from end-to-end in =
latency, and involves, ideally, independent traffic from multiple =
sources through the bottleneck.

	I am all for it; in addition, in the past we also reasoned that =
this definition needs to be relatively simple so it can be easily =
explained to turn naive laypersons into informed amateurs ;) The multiple =
sources thing is something that dslreports did well; they typically =
tried to serve from multiple server sites and reported some stats per =
site. Now that it is basically gone, it becomes clear how much clue went =
into that speedtest; a pity that most of the competition did not follow =
their lead yet (I am especially looking at you Ookla...).

>=20
> 2) ideally, a better way to localize where the queues are building up =
and present that to users and access providers. =20

	Yes. Now how to do this robustly and reliably escapes me, albeit =
enabling one-way timestamps might help; then a saturating speedtest =
could be accompanied not by a conceptually "simple" ICMP echo request, =
but by a repeated traceroute that gets there-and-back delay measurements =
for the approximated path (approximated because of the complications of =
understanding traceroute results).
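
	As a rough sketch of that localization step, assuming per-hop =
RTTs from an idle and a loaded traceroute run are already collected =
(the function name, the input layout and the threshold are illustrative =
assumptions, not an existing tool): the queue is plausibly at the first =
hop from which all later answering hops show the added delay, since a =
single slow hop may just be rate-limiting its ICMP replies.

```python
def first_bloated_hop(idle_ms, loaded_ms, threshold_ms):
    """Return the 0-based index of the hop where delay starts to rise.

    idle_ms and loaded_ms are per-hop RTTs in ms, with None where a
    hop did not answer. A hop only counts if every later answering
    hop is raised as well. Returns None if no such hop exists.
    """
    candidate = None
    for hop, (idle, busy) in enumerate(zip(idle_ms, loaded_ms)):
        if idle is None or busy is None:
            continue  # hop stayed silent in one of the runs
        if busy - idle > threshold_ms:
            if candidate is None:
                candidate = hop
        else:
            # A later hop is fine, so the earlier rise was local to
            # that router and not a queue on the path.
            candidate = None
    return candidate
```

With idle RTTs of [1, 9, 10, None, 14] ms and loaded RTTs of =
[1, 60, 62, None, 70] ms, a 20 ms threshold points at hop index 1, =
i.e. the link between the first and the second router.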


> The flent graphs are not interpretable by most non-experts.

	And sometimes not even by experts ;)

> What we need is a simple visualization of a sketch-map of the path =
(like traceroute might provide) with queueing delay measures  shown at =
key points that the user can understand.

	I am on the fence; personally I would absolutely love that, but =
I am not sure how the rest of my family would receive something like =
that. I guess it depends on the simplicity of the representation and =
probably, following fast.com's lead, a way to also compress those =
expanded results into a reasonable one-number representation. I hate =
one-number representations for complex issues, but people generally will =
come up with one themselves if none is supplied. (And I get this: =
outside our areas of expertise we all prefer the world to be simple.)

Best Regards
	Sebastian


> On Saturday, May 2, 2020 4:19pm, "Sebastian Moeller" <moeller0@gmx.de> =
said:
>=20
>> Hi David,
>>=20
>> in principle I agree, a NATed IPv4 ICMP probe will be at best =
reflected at the NAT
>> router (CPE) (some commercial home gateways do not respond to ICMP =
echo requests
>> in the name of security theatre). So it is pretty hard to measure the =
full end to
>> end path in that configuration. I believe that IPv6 should make that
>> easier/simpler in that NAT hopefully will be out of the path (but =
let's see what
>> ingenuity ISPs will come up with).
>> Then again, traditionally the relevant bottlenecks often are a) the =
internet
>> access link itself and there the CPE is in a reasonable position as a =
reflector on
>> the other side of the bottleneck as seen from an internet server, b) =
the home
>> network between CPE and end-host, often with variable rate wifi, here =
I agree
>> reflecting echos at the CPE hides part of the issue.
>>=20
>>=20
>>=20
>>> On May 2, 2020, at 19:38, David P. Reed <dpreed@deepplum.com> wrote:
>>>=20
>>> I am still a bit worried about properly defining "latency under =
load" for a
>> NAT routed situation. If the test is based on ICMP Ping packets *from =
the server*,
>> it will NOT be measuring the full path latency, and if the potential =
congestion
>> is in the uplink path from the access provider's residential box to =
the access
>> provider's router/switch, it will NOT measure congestion caused by =
bufferbloat
>> reliably on either side, since the bufferbloat will be outside the =
ICMP Ping
>> path.
>>=20
>> Puzzled, as I believe it is going to be the residential box that will =
respond
>> here, or will it be the AFTRs for CG-NAT that reflect the ICMP echo =
requests?
>>=20
>>>=20
>>> I realize that a browser based speed test has to be basically run =
from the
>> "server" end, because browsers are not that good at time measurement =
on a packet
>> basis. However, there are ways to solve this and avoid the ICMP Ping =
issue, with a
>> cooperative server.
>>>=20
>>> I once built a test that fixed this issue reasonably well. It =
carefully
>> created a TCP based RTT measurement channel (over HTTP) that made the =
echo have to
>> traverse the whole end-to-end path, which is the best and only way to =
accurately
>> define lag under load from the user's perspective. The client end of =
an unloaded
>> TCP connection can depend on TCP (properly prepared by getting it =
past slowstart)
>> to generate a single packet response.
>>>=20
>>> This "TCP ping" is thus compatible with getting the end-to-end =
measurement on
>> the server end of a true RTT.
>>>=20
>>> It's like tcp-traceroute tool, in that it tricks anyone in the =
middle boxes
>> into thinking this is a real, serious packet, not an optional low =
priority
>> packet.
>>>=20
>>> The same issue comes up with non-browser-based techniques for =
measuring true
>> lag-under-load.
>>>=20
>>> Now as we move HTTP to QUIC, this actually gets easier to do.
>>>=20
>>> One other opportunity I haven't explored, but which is pregnant with
>> potential is the use of WebRTC, which runs over UDP internally. Since =
JavaScript
>> has direct access to create WebRTC connections (multiple ones), this =
makes
>> detailed testing in the browser quite reasonable.
>>>=20
>>> And the time measurements can resolve well below 100 microseconds, =
if the JS
>> is based on modern JIT compilation (Chrome, Firefox, Edge all compile =
to machine
>> code speed if the code is restricted and in a loop). Then again, =
there is Web
>> Assembly if you want to write C code that runs in the browser fast. =
WebAssembly is
>> a low level language that compiles to machine code in the browser =
execution, and
>> still has access to all the browser networking facilities.
>>=20
>> Mmmh, according to https://github.com/w3c/hr-time/issues/56 due to =
spectre
>> side-channel vulnerabilities many browsers seem to have lowered the =
timer
>> resolution, but even the ~1ms resolution should be fine for typical =
RTTs.
>>=20
>> Best Regards
>> Sebastian
>>=20
>> P.S.: I assume that I simply do not see/understand the full scope of =
the issue at
>> hand yet.
>>=20
>>=20
>>>=20
>>> On Saturday, May 2, 2020 12:52pm, "Dave Taht" <dave.taht@gmail.com>
>> said:
>>>=20
>>>> On Sat, May 2, 2020 at 9:37 AM Benjamin Cronce <bcronce@gmail.com>
>> wrote:
>>>>>=20
>>>>>> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> as ~7ms
>>>>=20
>>>> I guess one of my questions is that with a switch to BBR netflix is
>>>> going to do pretty well. If fast.com is using bbr, well... that
>>>> excludes much of the current side of the internet.
>>>>=20
>>>>> For download, I show 6ms unloaded and 6-7 loaded. But for upload
>> the loaded
>>>> shows as 7-8 and I see it blip upwards of 12ms. But I am no longer =
using
>> any
>>>> traffic shaping. Any anti-bufferbloat is from my ISP. A graph of =
the
>> bloat would
>>>> be nice.
>>>>=20
>>>> The tests do need to last a fairly long time.
>>>>=20
>>>>> On Sat, May 2, 2020 at 9:51 AM Jannie Hanekom
>> <jannie@hanekom.net>
>>>> wrote:
>>>>>>=20
>>>>>> Michael Richardson <mcr@sandelman.ca>:
>>>>>>> Does it find/use my nearest Netflix cache?
>>>>>>=20
>>>>>> Thankfully, it appears so. The DSLReports bloat test was
>> interesting,
>>>> but
>>>>>> the jitter on the ~240ms base latency from South Africa (and
>> other parts
>>>> of
>>>>>> the world) was significant enough that the figures returned
>> were often
>>>>>> unreliable and largely unusable - at least in my experience.
>>>>>>=20
>>>>>> Fast.com reports my unloaded latency as 4ms, my loaded latency
>> as ~7ms
>>>> and
>>>>>> mentions servers located in local cities. I finally have a test
>> I can
>>>> share
>>>>>> with local non-technical people!
>>>>>>=20
>>>>>> (Agreed, upload test would be nice, but this is a huge step
>> forward from
>>>>>> what I had access to before.)
>>>>>>=20
>>>>>> Jannie Hanekom
>>>>>>=20
>>>>>> _______________________________________________
>>>>>> Cake mailing list
>>>>>> Cake@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/cake
>>>>>=20
>>>>=20
>>>>=20
>>>>=20
>>>> --
>>>> Make Music, Not War
>>>>=20
>>>> Dave T=C3=A4ht
>>>> CTO, TekLibre, LLC
>>>> http://www.teklibre.com
>>>> Tel: 1-831-435-0729
>>>>=20
>>=20
>>=20
>=20