From: Sebastian Moeller <moeller0@gmx.de>
Date: Fri, 19 Jan 2024 14:14:09 +0100
To: Christoph Paasch
Cc: IETF IPPM WG, Rpm
Subject: Re: [Rpm] [ippm] draft-ietf-ippm-responsiveness

Hi Christoph,

> On 16. Jan 2024, at 20:01, Christoph Paasch wrote:
>
> Hello Sebastian,
>
> thanks for the feedback, please see inline!
>
>> On Dec 3, 2023, at 10:13 AM, Sebastian Moeller wrote:
>>
>> Dear IPPM members,
>>
>> On re-reading the current responsiveness draft I stumbled over the following section:
>>
>> Parallel vs Sequential Uplink and Downlink
>>
>> Poor responsiveness can be caused by queues in either (or both) the upstream and the downstream direction. Furthermore, both paths may differ significantly due to access link conditions (e.g., 5G downstream and LTE upstream) or routing changes within the ISPs. To measure responsiveness under working conditions, the algorithm must explore both directions.
>>
>> One approach could be to measure responsiveness in the uplink and downlink in parallel. It would allow for a shorter test run-time.
>>
>> However, a number of caveats come with measuring in parallel:
>>
>> • Half-duplex links may not permit simultaneous uplink and downlink traffic. This restriction means the test might not reach the path's capacity in both directions at once and thus not expose all the potential sources of low responsiveness.
>> • Debuggability of the results becomes harder: During parallel measurement it is impossible to differentiate whether the observed latency happens in the uplink or the downlink direction.
>>
>> Thus, we recommend testing uplink and downlink sequentially. Parallel testing is considered a future extension.
>>
>> I argue that this is not the correct diagnosis and hence not the correct decision.
>> For half-duplex links the given argument is not incorrect, but incomplete: it is quite likely that when forced to multiplex more bi-directional traffic (all TCP testing is bi-directional, so we only argue about the amount of reverse traffic, not whether it exists, and even if we switched to QUIC/UDP we would still need a feedback channel) we will see different "potential sources of low responsiveness", so ignoring either of the two seems ill-advised.
>
> You are saying that parallel bi-directional traffic exposes different sources of responsiveness issues than uni-directional traffic (up and down)? What kind of different sources would that expose? Can you give some examples and maybe a suggestion on how to word things?

[SM] If the bottleneck is a WiFi link, we occasionally see that some OSes are more aggressive than others in acquiring airtime, which easily results in differential throughput for the two directions and often in higher queueing delay for the direction that is 'slowed down'. In theory that should not really happen, but in practice it does, e.g. when the ISP unhelpfully passes undesired DSCP marks into a home network where they are then acted upon by WiFi WMM. To elaborate: Comcast for a long time had an issue where a large fraction (IIRC up to 25%) of packets were inadvertently marked CS1, which in default WMM maps to AC_BK; if the client sends its upload traffic via the default AC_BE, this differential AC usage can result in queueing delays that differ from what one would see when looking at upload and download individually. (If all traffic of a channel uses AC_BK instead of AC_BE, this should not affect latency much.)

Side note: Comcast, after being alerted, took notice of the issue and fixed it, but I think this kind of issue can happen to other ISPs as well.

>
>> Debuggability is not "rocket science" either: all one needs is a three-value timestamp format (similar to what NTP uses) and one can, even without synchronized clocks,
establish baseline OWDs and then, under bi-directional load, see which of these unloaded OWDs actually increases. So I argue that "it is impossible to differentiate whether the observed latency happens in the uplink or the downlink direction" is simply an incorrect assertion... (and we are already doing this successfully in the existing internet as part of the cake-autorate project [h++ps://github.com/lynxthecat/cake-autorate/tree/master], based on ICMP timestamps). The relevant observation here is that we are not necessarily interested in veridical OWDs under idle conditions; we want to see which OWD(s) increase during working conditions, and that works with desynchronized clocks and is also robust against slow clock drift.

>
> Unfortunately, this would require the server to add timestamps to the HTTP response, right?

[SM] Yes, in a sense... but that could be a small process that simply updates the content of that file every couple of milliseconds, so it would not strictly need to be the server process...

> We opted against this because the "power" of the responsiveness methodology is that it is extremely lightweight on the server side. And with lightweight I mean not only from an implementation/CPU perspective but also from a deployment perspective. All one needs to do on the server in order to provide a responsiveness measurement endpoint is to host two files (one very large, one very small) and provide an endpoint to "POST" data to. All of these are standard capabilities in every webserver and can easily be configured. And we have seen a rise of endpoints showing up thanks to the simplicity of deploying it.
>
> So, it is IMO a balance between "deployability" and "debuggability". The responsiveness test is clearly aiming towards being deployable and accessible. Thus I think we would prefer keeping things on the server side simple.
>
> Thoughts?

[SM] I really, really would like some way to get OWDs, if only optionally, but even more than that I think RPM should get as wide a deployment as possible; ubiquity has its own inherent value for measurement platforms, so if this makes deployment harder it would be a no-go.

Now, I get that this is a long shot, but I fear that if the draft does not mention this at all, the chance will be gone forever...

Could we maybe add a description of an optional 'time' payload, so that clients could expect a single standardised format for it, if a server optionally supports it?

> That being said, I'm not entirely opposed to recommending the parallel mode as well. The interesting bit about the parallel mode is not so much the responsiveness measurement but rather the capacity measurement. Because surprisingly many modems/… that are supposedly (according to their spec sheet) able to handle 1 Gbps full-duplex suddenly show their weakness and are no longer able to handle line rate. So it is more about capacity than responsiveness, IMO.

[SM] True, yet such overload also occasionally affects queueing delay and jitter (sure, RPM does not report jitter, but it likely affects the ability of a test to reach the required stability criteria).
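To make the three-value timestamp idea above a bit more concrete, here is a minimal sketch of the client-side bookkeeping (names and numbers are made up for illustration; nothing here is specified in the draft or taken from cake-autorate). The key point is that each per-direction delay sample contains an unknown clock offset, but that offset cancels once the sample is compared against the same direction's idle baseline:

# Minimal sketch (illustrative only): attribute queueing-delay growth to
# uplink or downlink using NTP-style timestamps, without synchronized clocks.
#   t1 = client send time    (client clock)
#   t2 = server receive time (server clock)
#   t3 = server send time    (server clock)
#   t4 = client receive time (client clock)
# t2 - t1 and t4 - t3 each contain an unknown clock offset, but that offset
# cancels out when each direction is compared against its own idle baseline.

class OwdTracker:
    def __init__(self):
        self.base_up = None    # smallest uplink sample seen while idle
        self.base_down = None  # smallest downlink sample seen while idle

    def baseline(self, t1, t2, t3, t4):
        """Feed probes taken under idle conditions."""
        up, down = t2 - t1, t4 - t3
        self.base_up = up if self.base_up is None else min(self.base_up, up)
        self.base_down = down if self.base_down is None else min(self.base_down, down)

    def deltas(self, t1, t2, t3, t4):
        """Per-direction queueing-delay increase relative to the idle baseline."""
        return (t2 - t1) - self.base_up, (t4 - t3) - self.base_down


# Toy numbers: the server clock is ahead by ~1000 units, yet the 30-unit
# uplink queueing delay under load is still attributed to the right direction.
tracker = OwdTracker()
tracker.baseline(t1=0, t2=1010, t3=1011, t4=21)           # idle probe
print(tracker.deltas(t1=100, t2=1140, t3=1141, t4=151))   # -> (30, 0)

The same bookkeeping remains usable under slow clock drift, as long as the baselines are refreshed from time to time.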
> However, as a frequent user of the networkQuality tool, I realize that whenever I want to test my network I end up using the sequential test rather than the parallel test.

[SM] I agree that a full complement of upload, then download, then combined upload & download is a great tool for understanding network behaviour. I also want to applaud Apple's networkQuality as an excellent implementation of the ideas behind this draft, offering a great and well-selected set of options:

USAGE: networkQuality [-C <configuration URL>] [-c] [-d] [-f <protocols>] [-h] [-I <interface>] [-k] [-p] [-r host] [-S <port>] [-s] [-u] [-v]
    -C: Override Configuration URL or path (with scheme file://)
    -c: Produce computer-readable output
    -d: Do not run a download test (implies -s)
    -f: Enforce Protocol selections. Available options:
        h1: Force-enable HTTP/1.1
        h2: Force-enable HTTP/2
        h3: Force-enable HTTP/3 (QUIC)
        L4S: Force-enable L4S
        noL4S: Force-disable L4S
    -h: Show help (this message)
    -I: Bind test to interface (e.g., en0, pdp_ip0, ...)
    -k: Disable certificate validation
    -p: Use iCloud Private Relay
    -r: Connect to host or IP, overriding DNS for initial config request
    -S: Start and run server on specified port. Other specified options ignored
    -s: Run tests sequentially instead of parallel upload/download
    -u: Do not run an upload test (implies -s)
    -v: Verbose output

That covers a lot of cases with a relatively small set of control parameters.

>
> Christoph
>
>> Given these observations, I ask that we change this design parameter to require both measurement modes and to default to parallel testing (or to select randomly between the two modes, but report which one was chosen).
>>
>> Best Regards
>> Sebastian
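As a footnote to the optional 'time' payload idea and the "small process that simply updates the content of that file" mentioned above, here is a minimal sketch of such a side process; the file name, JSON field, and document-root path are assumptions for illustration only and are not part of the draft. A stock web server would keep serving the small file unchanged while this process refreshes its contents:

# Hypothetical sketch of the "small process" idea: keep rewriting a tiny
# static file with the server's current timestamp, so a stock web server
# can serve it without any code changes. File name, JSON field, and
# document root are invented for illustration only.
import json
import os
import tempfile
import time

DOC_ROOT = "/var/www/html"                      # assumption: the web server's document root
PAYLOAD = os.path.join(DOC_ROOT, "small.json")  # assumption: served as the small download resource

def write_timestamp(path: str) -> None:
    # Write to a temp file and rename it into place, so the web server
    # never serves a half-written payload.
    body = json.dumps({"server_send_ns": time.time_ns()})
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as f:
        f.write(body)
    os.replace(tmp, path)

if __name__ == "__main__":
    while True:
        write_timestamp(PAYLOAD)
        time.sleep(0.002)  # "every couple of milliseconds", as suggested above

A client that records its own send and receive times around fetching such a payload would then have, in effect, the three timestamps needed for the per-direction bookkeeping sketched earlier.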