General list for discussing Bufferbloat
* [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
@ 2021-08-13 21:41 Christoph Paasch
  2021-08-15 13:39 ` Erik Auerswald
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Paasch @ 2021-08-13 21:41 UTC (permalink / raw)
  To: bloat; +Cc: Randall Meyer, Omer Shapira, Stuart Cheshire

I already posted this to the RPM-list, but the audience here on bloat should
be interested as well.


This is the specification of Apple's responsiveness/RPM test. We believe that it
would be good for the bufferbloat effort to have a specification of how to
quantify the extent of bufferbloat from a user's perspective. Our
Internet-draft is a first step in that direction and we hope that it will
kick off some collaboration.


Feedback is very welcome!


Cheers,
Christoph


----- Forwarded message from internet-drafts@ietf.org -----

From: internet-drafts@ietf.org
To: Christoph Paasch <cpaasch@apple.com>, Omer Shapira <oesh@apple.com>, Randall Meyer <rrm@apple.com>, Stuart Cheshire
	<cheshire@apple.com>
Date: Fri, 13 Aug 2021 09:43:40 -0700
Subject: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt


A new version of I-D, draft-cpaasch-ippm-responsiveness-00.txt
has been successfully submitted by Christoph Paasch and posted to the
IETF repository.

Name:		draft-cpaasch-ippm-responsiveness
Revision:	00
Title:		Responsiveness under Working Conditions
Document date:	2021-08-13
Group:		Individual Submission
Pages:		12
URL:            https://www.ietf.org/archive/id/draft-cpaasch-ippm-responsiveness-00.txt
Status:         https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
Htmlized:       https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness


Abstract:
   Bufferbloat has been a long-standing problem on the Internet with
   more than a decade of work on standardizing technical solutions,
   implementations and testing.  However, to this date, bufferbloat is
   still a very common problem for the end-users.  Everyone "knows" that
   it is "normal" for a video conference to have problems when somebody
   else on the same home-network is watching a 4K movie.

   The reason for this problem is not the lack of technical solutions,
   but rather a lack of awareness of the problem-space, and a lack of
   tooling to accurately measure the problem.  We believe that exposing
   the problem of bufferbloat to the end-user by measuring the end-
   users' experience at a high level will help to create the necessary
   awareness.

   This document is a first attempt at specifying a measurement
   methodology to evaluate bufferbloat the way common users are
   experiencing it today, using today's most frequently used protocols
   and mechanisms to accurately measure the user-experience.  We also
   provide a way to express the bufferbloat as a measure of "Round-trips
   per minute" (RPM) to have a more intuitive way for the users to
   understand the notion of bufferbloat.

                                                                                  


The IETF Secretariat



----- End forwarded message -----


* Re: [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
  2021-08-13 21:41 [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt Christoph Paasch
@ 2021-08-15 13:39 ` Erik Auerswald
  2021-08-18 22:01   ` Christoph Paasch
  2021-09-21 20:50   ` [Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt Toerless Eckert
  0 siblings, 2 replies; 14+ messages in thread
From: Erik Auerswald @ 2021-08-15 13:39 UTC (permalink / raw)
  To: bloat; +Cc: draft-cpaasch-ippm-responsiveness, ippm

[-- Attachment #1: Type: text/plain, Size: 7157 bytes --]

Hi,

I'd like to thank you for working on a nice I-D describing an interesting
and IMHO useful network measurement metric.

Since feedback was asked for, I'd like to try and provide constructive
feedback.

In general, I like the idea of "Round-trips per Minute" (RPM) as a
metric used to characterize (one aspect of) a network.  I do think that
introducing this would improve the status quo.  Since this RPM definition
comprises a specific way of adding load to the network and measuring a
complex metric, I think it is useful to "standardize" it.

I do not think RPM can replace all other metrics.  This is, in a way,
mentioned in the introduction, where it is suggested to add RPM to
existing measurement platforms.  As such I just want to point this out
more explicitly, but do not intend to diminish the RPM idea by this.
In short, I'd say it's complicated.

Bandwidth matters for bulk data transfer, e.g., downloading a huge update
required for playing a multiplayer game online.

Minimum latency matters for the feasibility of interactive applications,
e.g., controlling a toy car in your room vs. a robotic arm on the ISS
from Earth vs. orbital insertion around Mars from Earth.  For a more
mundane use case consider a voice conference.  (A good decade ago I
experienced a voice conferencing system running over IP that introduced
over one second of (minimum) latency and therefore was awkward to use.)

Expressing 'bufferbloat as a measure of "Round-trips per Minute" (RPM)'
exhibits (at least) two problems:

1. A high RPM value is associated with few bufferbloat problems.

2. A low RPM value may be caused by high minimum delay instead of
   bufferbloat.

I think that RPM (i.e., under working conditions) measures a network's
usefulness for interactive applications, but not necessarily bufferbloat.
I do think that RPM is in itself more generally useful than minimum
latency or bandwidth.

A combination of low minimum latency with low RPM value strongly hints
at bufferbloat.  Other combinations are less easily characterized.
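
To make the arithmetic behind these points concrete, here is a small
back-of-the-envelope sketch (my own illustration, not taken from the I-D;
all numbers are invented) showing how the same low RPM value can come
either from a bloated queue or simply from a long idle path:

    # Round-trips per Minute for a given round-trip time (in seconds).
    def rpm(rtt_seconds):
        return 60.0 / rtt_seconds

    idle_rtt = 0.030                 # 30 ms unloaded RTT
    working_rtt = idle_rtt + 0.470   # plus 470 ms of standing queue under load

    print(rpm(idle_rtt))     # ~2000 RPM idle: the path itself is fast
    print(rpm(working_rtt))  # ~120 RPM loaded: low RPM + low idle RTT hints at bloat
    print(rpm(0.600))        # ~100 RPM on a ~600 ms satellite path with no queuing at all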

Bufferbloat can still lie in hiding, e.g., when a link with bufferbloat
is not yet the bottleneck, or if the communications end-points are not
yet able to saturate the network in between.  Thus high bandwidth can
result in high RPM values despite (hidden) bufferbloat.

The "Measuring is Hard" section mentions additional complications.

All in all, I do think that "measuring bufferbloat" and "measuring RPM"
should not be used synonymously.  The I-D title clearly shows this:
RPM is measuring "Responsiveness under Working Conditions" which may be
affected by bufferbloat, among other potential factors, but is not in
itself bufferbloat.

Under the assumption that only a single value (performance score) is
considered, I do think that RPM is more generally useful than bandwidth
or idle latency.

On a meta-level, I think that the word "bufferbloat" is not used according
to a single self-consistent definition in the I-D.

Additionally, I think that the I-D should reference DNS, HTTP/2, and
TLS 1.3, since these protocols are required for implementing the RPM
measurement.  The same for JSON, I think.  Possibly URL.

Using "rpm.example" instead of "example.apple.com" would result in shorter
lines for the example JSON.

"host123.cdn.example" instead of "hostname123.cdnprovider.com" might be
a more appropriate example DNS name.

Adding an informative reference to RFC 2606 / BCP 32 might raise awareness
of the existence of a BCP on example DNS names.
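
For illustration only, a config shortened along those lines might look
roughly like the following (of the field names, only "test_endpoint" is
taken from the I-D; the URL keys are placeholders I made up):

    import json

    # Hypothetical config JSON using RFC 2606 style example domains.
    # Only "test_endpoint" is a field name from the draft; the other
    # keys are invented for this illustration.
    config = {
        "urls": {
            "small_download_url": "https://rpm.example/api/small",
            "large_download_url": "https://rpm.example/api/large",
            "upload_url": "https://rpm.example/api/upload",
        },
        "test_endpoint": "host123.cdn.example",
    }

    print(json.dumps(config, indent=2))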

Please find both a unified diff against the text rendering of the I-D,
and a word diff produced from the unified diff, attached to this email
in order to suggest editorial changes that are intended to improve the
reading experience.  They are intended for reading and (possibly partial)
manual application, since the text rendering of an I-D is usually not
the preferred form of editing it.

Thanks,
Erik
-- 
Always use the right tool for the job.
                        -- Rob Pike


On Fri, Aug 13, 2021 at 02:41:05PM -0700, Christoph Paasch via Bloat wrote:
> I already posted this to the RPM-list, but the audience here on bloat should
> be interested as well.
> 
> 
> This is the specification of Apple's responsiveness/RPM test. We believe that it
> would be good for the bufferbloat-effort to have a specification of how to
> quantify the extend of bufferbloat from a user's perspective. Our
> Internet-draft is a first step in that direction and we hope that it will
> kick off some collaboration.
> 
> 
> Feedback is very welcome!
> 
> 
> Cheers,
> Christoph
> 
> 
> ----- Forwarded message from internet-drafts@ietf.org -----
> 
> From: internet-drafts@ietf.org
> To: Christoph Paasch <cpaasch@apple.com>, Omer Shapira <oesh@apple.com>, Randall Meyer <rrm@apple.com>, Stuart Cheshire
> 	<cheshire@apple.com>
> Date: Fri, 13 Aug 2021 09:43:40 -0700
> Subject: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
> 
> 
> A new version of I-D, draft-cpaasch-ippm-responsiveness-00.txt
> has been successfully submitted by Christoph Paasch and posted to the
> IETF repository.
> 
> Name:		draft-cpaasch-ippm-responsiveness
> Revision:	00
> Title:		Responsiveness under Working Conditions
> Document date:	2021-08-13
> Group:		Individual Submission
> Pages:		12
> URL:            https://www.ietf.org/archive/id/draft-cpaasch-ippm-responsiveness-00.txt
> Status:         https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
> Htmlized:       https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness
> 
> 
> Abstract:
>    Bufferbloat has been a long-standing problem on the Internet with
>    more than a decade of work on standardizing technical solutions,
>    implementations and testing.  However, to this date, bufferbloat is
>    still a very common problem for the end-users.  Everyone "knows" that
>    it is "normal" for a video conference to have problems when somebody
>    else on the same home-network is watching a 4K movie.
> 
>    The reason for this problem is not the lack of technical solutions,
>    but rather a lack of awareness of the problem-space, and a lack of
>    tooling to accurately measure the problem.  We believe that exposing
>    the problem of bufferbloat to the end-user by measuring the end-
>    users' experience at a high level will help to create the necessary
>    awareness.
> 
>    This document is a first attempt at specifying a measurement
>    methodology to evaluate bufferbloat the way common users are
>    experiencing it today, using today's most frequently used protocols
>    and mechanisms to accurately measure the user-experience.  We also
>    provide a way to express the bufferbloat as a measure of "Round-trips
>    per minute" (RPM) to have a more intuitive way for the users to
>    understand the notion of bufferbloat.
> 
>                                                                                   
> 
> 
> The IETF Secretariat
> 
> 
> 
> ----- End forwarded message -----
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat

[-- Attachment #2: editorial-suggestions-2021-08-15-unified.diff --]
[-- Type: text/x-diff, Size: 19237 bytes --]

--- draft-cpaasch-ippm-responsiveness-00.txt	2021-08-15 12:01:01.213813125 +0200
+++ draft-cpaasch-ippm-responsiveness-00-ea.txt	2021-08-15 15:08:08.013416074 +0200
@@ -17,7 +17,7 @@
 
    Bufferbloat has been a long-standing problem on the Internet with
    more than a decade of work on standardizing technical solutions,
-   implementations and testing.  However, to this date, bufferbloat is
+   implementations, and testing.  However, to this date, bufferbloat is
    still a very common problem for the end-users.  Everyone "knows" that
    it is "normal" for a video conference to have problems when somebody
    else on the same home-network is watching a 4K movie.
@@ -33,8 +33,8 @@
    methodology to evaluate bufferbloat the way common users are
    experiencing it today, using today's most frequently used protocols
    and mechanisms to accurately measure the user-experience.  We also
-   provide a way to express the bufferbloat as a measure of "Round-trips
-   per minute" (RPM) to have a more intuitive way for the users to
+   provide a way to express bufferbloat as a measure of "Round-trips
+   per Minute" (RPM) to have a more intuitive way for the users to
    understand the notion of bufferbloat.
 
 Status of This Memo
@@ -81,14 +81,14 @@
 Table of Contents
 
    1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
-   2.  Measuring is hard . . . . . . . . . . . . . . . . . . . . . .   3
+   2.  Measuring is Hard . . . . . . . . . . . . . . . . . . . . . .   3
    3.  Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
    4.  Measuring Responsiveness  . . . . . . . . . . . . . . . . . .   5
      4.1.  Working Conditions  . . . . . . . . . . . . . . . . . . .   5
        4.1.1.  Parallel vs Sequential Uplink and Downlink  . . . . .   6
-       4.1.2.  From single-flow to multi-flow  . . . . . . . . . . .   7
-       4.1.3.  Reaching saturation . . . . . . . . . . . . . . . . .   7
-       4.1.4.  Final algorithm . . . . . . . . . . . . . . . . . . .   7
+       4.1.2.  From Single-flow to Multi-flow  . . . . . . . . . . .   7
+       4.1.3.  Reaching Saturation . . . . . . . . . . . . . . . . .   7
+       4.1.4.  Final Algorithm . . . . . . . . . . . . . . . . . . .   7
      4.2.  Measuring Responsiveness  . . . . . . . . . . . . . . . .   8
        4.2.1.  Aggregating Round-trips per Minute  . . . . . . . . .   9
        4.2.2.  Statistical Confidence  . . . . . . . . . . . . . . .  10
@@ -103,8 +103,8 @@
 
    For many years, bufferbloat has been known as an unfortunately common
    issue in todays networks [Bufferbloat].  Solutions like FQ-codel
-   [RFC8289] or PIE [RFC8033] have been standardized and are to some
-   extend widely implemented.  Nevertheless, users still suffer from
+   [RFC8290] or PIE [RFC8033] have been standardized and are to some
+   extent widely implemented.  Nevertheless, users still suffer from
    bufferbloat.
 
 
@@ -129,7 +129,7 @@
    bufferbloat problem.
 
    We believe that it is necessary to create a standardized way for
-   measuring the extend of bufferbloat in a network and express it to
+   measuring the extent of bufferbloat in a network and express it to
    the user in a user-friendly way.  This should help existing
    measurement tools to add a bufferbloat measurement to their set of
    metrics.  It will also allow to raise the awareness to the problem
@@ -144,10 +144,10 @@
    classification for those protocols is very common.  It is thus very
    important to use those protocols for the measurements to avoid
    focusing on use-cases that are not actually affecting the end-user.
-   Finally, we propose to use "round-trips per minute" as a metric to
-   express the extend of bufferbloat.
+   Finally, we propose to use "Round-trips per Minute" as a metric to
+   express the extent of bufferbloat.
 
-2.  Measuring is hard
+2.  Measuring is Hard
 
    There are several challenges around measuring bufferbloat accurately
    on the Internet.  These challenges are due to different factors.
@@ -155,7 +155,7 @@
    problem space, and the reproducibility of the measurement.
 
    It is well-known that transparent TCP proxies are widely deployed on
-   port 443 and/or port 80, while less common on other ports.  Thus,
+   port 443 and/or port 80, while less commonly on other ports.  Thus,
    choice of the port-number to measure bufferbloat has a significant
    influence on the result.  Other factors are the protocols being used.
    TCP and UDP traffic may take a largely different path on the Internet
@@ -186,17 +186,17 @@
    measurement.  It seems that it's best to avoid extending the duration
    of the test beyond what's needed.
 
-   The problem space around the bufferbloat is huge.  Traditionally, one
+   The problem space around bufferbloat is huge.  Traditionally, one
    thinks of bufferbloat happening on the routers and switches of the
    Internet.  Thus, simply measuring bufferbloat at the transport layer
    would be sufficient.  However, the networking stacks of the clients
    and servers can also experience huge amounts of bufferbloat.  Data
    sitting in TCP sockets or waiting in the application to be scheduled
    for sending causes artificial latency, which affects user-experience
-   the same way the "traditional" bufferbloat does.
+   the same way "traditional" bufferbloat does.
 
    Finally, measuring bufferbloat requires us to fill the buffers of the
-   bottleneck and when buffer occupancy is at its peak, the latency
+   bottleneck, and when buffer occupancy is at its peak, the latency
    measurement needs to be done.  Achieving this in a reliable and
    reproducible way is not easy.  First, one needs to ensure that
    buffers are actually full for a sustained period of time to allow for
@@ -250,15 +250,15 @@
        bufferbloat.
 
    4.  Finally, in order for this measurement to be user-friendly to a
-       wide audience it is important that such a measurement finishes
-       within a short time-frame and short being anything below 20
+       wide audience, it is important that such a measurement finishes
+       within a short time-frame with short being anything below 20
        seconds.
 
 4.  Measuring Responsiveness
 
    The ability to reliably measure the responsiveness under typical
    working conditions is predicated by the ability to reliably put the
-   network in a state representative of the said conditions.  Once the
+   network in a state representative of said conditions.  Once the
    network has reached the required state, its responsiveness can be
    measured.  The following explains how the former and the latter are
    achieved.
@@ -270,7 +270,7 @@
    experiencing ingress and egress flows that are similar to those when
    used by humans in the typical day-to-day pattern.
 
-   While any network can be put momentarily into working condition by
+   While any network can be put momentarily into working conditions by
    the means of a single HTTP transaction, taking measurements requires
    maintaining such conditions over sufficient time.  Thus, measuring
    the network responsiveness in a consistent way depends on our ability
@@ -286,7 +286,7 @@
    way to achieve this is by creating multiple large bulk data-transfers
    in either downstream or upstream direction.  Similar to conventional
    speed-test applications that also create a varying number of streams
-   to measure throughput.  Working-conditions does the same.  It also
+   to measure throughput.  Working conditions does the same.  It also
    requires a way to detect when the network is in a persistent working
    condition, called "saturation".  This can be achieved by monitoring
    the instantaneous goodput over time.  When the goodput stops
@@ -298,7 +298,7 @@
    o  Should not waste traffic, since the user may be paying for it
 
    o  Should finish within a short time-frame to avoid impacting other
-      users on the same network and/or experience varying conditions
+      users on the same network and/or experiencing varying conditions
 
 4.1.1.  Parallel vs Sequential Uplink and Downlink
 
@@ -308,8 +308,8 @@
    upstream) or the routing in the ISPs.  Users sending data to an
    Internet service will fill the bottleneck on the upstream path to the
    server and thus expose a potential for bufferbloat to happen at this
-   bottleneck.  On the downlink direction any download from an Internet
-   service will encounter a bottleneck and thus exposes another
+   bottleneck.  In the downlink direction any download from an Internet
+   service will encounter a bottleneck and thus expose another
    potential for bufferbloat.  Thus, when measuring responsiveness under
    working conditions it is important to consider both, the upstream and
    the downstream bufferbloat.  This opens the door to measure both
@@ -322,13 +322,16 @@
    seconds of test per direction, while parallel measurement will allow
    for 20 seconds of testing in both directions.
 
-   However, a number caveats come with measuring in parallel: - Half-
-   duplex links may not expose uplink and downlink bufferbloat: A half-
-   duplex link may not allow during parallel measurement to saturate
-   both the uplink and the downlink direction.  Thus, bufferbloat in
-   either of the directions may not be exposed during parallel
-   measurement.  - Debuggability of the results becomes more obscure:
-   During parallel measurement it is impossible to differentiate on
+   However, a number caveats come with measuring in parallel:
+
+   - Half-duplex links may not expose uplink and downlink bufferbloat:
+     A half-duplex link may not allow to saturate both the uplink
+     and the downlink direction during parallel measurement.  Thus,
+     bufferbloat in either of the directions may not be exposed during
+     parallel measurement.
+
+   - Debuggability of the results becomes more obscure:
+     During parallel measurement it is impossible to differentiate on
 
 
 
@@ -338,26 +341,26 @@
 Internet-Draft   Responsiveness under Working Conditions     August 2021
 
 
-   whether the bufferbloat happens in the uplink or the downlink
-   direction.
+     whether the bufferbloat happens in the uplink or the downlink
+     direction.
 
-4.1.2.  From single-flow to multi-flow
+4.1.2.  From Single-flow to Multi-flow
 
-   As described in RFC 6349, a single TCP connection may not be
+   As described in [RFC6349], a single TCP connection may not be
    sufficient to saturate a path between a client and a server.  On a
    high-BDP network, traditional TCP window-size constraints of 4MB are
    often not sufficient to fill the pipe.  Additionally, traditional
-   loss-based TCP congestion control algorithms aggressively reacts to
+   loss-based TCP congestion control algorithms aggressively react to
    packet-loss by reducing the congestion window.  This reaction will
-   reduce the queuing in the network, and thus "artificially" make the
-   bufferbloat appear lesser.
+   reduce the queuing in the network, and thus "artificially" make
+   bufferbloat appear less of a problem.
 
-   The goal of the measurement is to keep the network as busy as
-   possible in a sustained and persistent way.  Thus, using multiple TCP
+   The goal is to keep the network as busy as possible in a sustained
+   and persistent way during the measurement.  Thus, using multiple TCP
    connections is needed for a sustained bufferbloat by gradually adding
-   TCP flows until saturation is needed.
+   TCP flows until saturation is reached.
 
-4.1.3.  Reaching saturation
+4.1.3.  Reaching Saturation
 
    It is best to detect when saturation has been reached so that the
    measurement of responsiveness can start with the confidence that the
@@ -367,8 +370,8 @@
    buffers are completely filled.  Thus, this depends highly on the
    congestion control that is being deployed on the sender-side.
    Congestion control algorithms like BBR may reach high throughput
-   without causing bufferbloat. (because the bandwidth-detection portion
-   of BBR is effectively seeking the bottleneck capacity)
+   without causing bufferbloat (because the bandwidth-detection portion
+   of BBR is effectively seeking the bottleneck capacity).
 
    It is advised to rather use loss-based congestion controls like Cubic
    to "reliably" ensure that the buffers are filled.
@@ -379,7 +382,7 @@
    packet-loss or ECN-marks signaling a congestion or even a full buffer
    of the bottleneck link.
 
-4.1.4.  Final algorithm
+4.1.4.  Final Algorithm
 
    The following is a proposal for an algorithm to reach saturation of a
    network by using HTTP/2 upload (POST) or download (GET) requests of
@@ -404,7 +407,7 @@
    throughput will remain stable.  In the latter case, this means that
    saturation has been reached and - more importantly - is stable.
 
-   In detail, the steps of the algorithm are the following
+   In detail, the steps of the algorithm are the following:
 
    o  Create 4 load-bearing connections
 
@@ -453,7 +456,7 @@
    the different stages of a separate network transaction as well as
    measuring on the load-bearing connections themselves.
 
-   Two aspects are being measured with this approach :
+   Two aspects are being measured with this approach:
 
    1.  How the network handles new connections and their different
        stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
@@ -463,19 +466,19 @@
 
    2.  How the network and the client/server networking stack handles
        the latency on the load-bearing connections themselves.  E.g.,
-       Smart queuing techniques on the bottleneck will allow to keep the
+       smart queuing techniques on the bottleneck will allow to keep the
        latency within a reasonable limit in the network and buffer-
-       reducing techniques like TCP_NOTSENT_LOWAT makes sure the client
+       reducing techniques like TCP_NOTSENT_LOWAT make sure the client
        and server TCP-stack is not a source of significant latency.
 
    To measure the former, we send a DNS-request, establish a TCP-
    connection on port 443, establish a TLS-context using TLS1.3 and send
-   an HTTP2 GET request for an object of a single byte large.  This
+   an HTTP/2 GET request for an object the size of a single byte.  This
    measurement will be repeated multiple times for accuracy.  Each of
    these stages allows to collect a single latency measurement that can
    then be factored into the responsiveness computation.
 
-   To measure the latter, on the load-bearing connections (that uses
+   To measure the latter, on the load-bearing connections (that use
    HTTP/2) a GET request is multiplexed.  This GET request is for a
    1-byte object.  This allows to measure the end-to-end latency on the
    connections that are using the network at full speed.
@@ -492,10 +495,10 @@
    an equal weight to each of these measurements.
 
    Finally, the resulting latency needs to be exposed to the users.
-   Users have been trained to accept metrics that have a notion of "The
+   Users have been trained to accept metrics that have a notion of "the
    higher the better".  Latency measuring in units of seconds however is
    "the lower the better".  Thus, converting the latency measurement to
-   a frequency allows using the familiar notion of "The higher the
+   a frequency allows using the familiar notion of "the higher the
    better".  The term frequency has a very technical connotation.  What
    we are effectively measuring is the number of round-trips from the
 
@@ -513,7 +516,7 @@
    which is a wink to the "revolutions per minute" that we are used to
    in cars.
 
-   Thus, our unit of measure is "Round-trip per Minute" (RPM) that
+   Thus, our unit of measure is "Round-trips per Minute" (RPM) that
    expresses responsiveness under working conditions.
 
 4.2.2.  Statistical Confidence
@@ -527,13 +530,13 @@
 5.  Protocol Specification
 
    By using standard protocols that are most commonly used by end-users,
-   no new protocol needs to be specified.  However, both client and
+   no new protocol needs to be specified.  However, both clients and
    servers need capabilities to execute this kind of measurement as well
-   as a standard to flow to provision the client with the necessary
+   as a standard to follow to provision the client with the necessary
    information.
 
    First, the capabilities of both the client and the server: It is
-   expected that both hosts support HTTP/2 over TLS 1.3.  That the
+   expected that both hosts support HTTP/2 over TLS 1.3, and that the
    client is able to send a GET-request and a POST.  The server needs
    the ability to serve both of these HTTP commands.  Further, the
    server endpoint is accessible through a hostname that can be resolved
@@ -546,13 +549,13 @@
    1.  A config URL/response: This is the configuration file/format used
        by the client.  It's a simple JSON file format that points the
        client at the various URLs mentioned below.  All of the fields
-       are required except "test_endpoint".  If the service-procier can
+       are required except "test_endpoint".  If the service-provider can
        pin all of the requests for a test run to a specific node in the
        service (for a particular run), they can specify that node's name
        in the "test_endpoint" field.  It's preferred that pinning of
        some sort is available.  This is to ensure the measurement is
        against the same paths and not switching hosts during a test run
-       (ie moving from near POP A to near POP B) Sample content of this
+       (i.e., moving from near POP A to near POP B).  Sample content of this
        JSON would be:
 
 
@@ -577,7 +580,7 @@
 
    3.  A "large" URL/response: This needs to serve a status code of 200
        and a body size of at least 8GB.  The body can be bigger, and
-       will need to grow as network speeds increases over time.  The
+       will need to grow as network speeds increase over time.  The
        actual body content is irrelevant.  The client will probably
        never completely download the object.
 
@@ -618,16 +621,19 @@
 Internet-Draft   Responsiveness under Working Conditions     August 2021
 
 
+   [RFC6349]  ...
+
    [RFC8033]  Pan, R., Natarajan, P., Baker, F., and G. White,
               "Proportional Integral Controller Enhanced (PIE): A
               Lightweight Control Scheme to Address the Bufferbloat
               Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
               <https://www.rfc-editor.org/info/rfc8033>.
 
-   [RFC8289]  Nichols, K., Jacobson, V., McGregor, A., Ed., and J.
-              Iyengar, Ed., "Controlled Delay Active Queue Management",
-              RFC 8289, DOI 10.17487/RFC8289, January 2018,
-              <https://www.rfc-editor.org/info/rfc8289>.
+   [RFC8290]  Hoeiland-Joergensen, T., McKenney, P., Taht, D., Ed., and
+              Gettys, J., "The Flow Queue CoDel Packet Scheduler and
+	      Active Queue Management Algorithm", RFC 8290,
+	      DOI 10.17487/RFC8290, January 2018,
+              <https://www.rfc-editor.org/info/rfc8290>.
 
 Authors' Addresses
 

[-- Attachment #3: editorial-suggestions-2021-08-15-word.diff --]
[-- Type: text/x-diff, Size: 16064 bytes --]

[--- draft-cpaasch-ippm-responsiveness-00.txt-]{+++ draft-cpaasch-ippm-responsiveness-00-ea.txt+}	2021-08-15 [-12:01:01.213813125-] {+15:08:08.013416074+} +0200
@@ -17,7 +17,7 @@

   Bufferbloat has been a long-standing problem on the Internet with
   more than a decade of work on standardizing technical solutions,
   [-implementations-]
   {+implementations,+} and testing.  However, to this date, bufferbloat is
   still a very common problem for the end-users.  Everyone "knows" that
   it is "normal" for a video conference to have problems when somebody
   else on the same home-network is watching a 4K movie.
@@ -33,8 +33,8 @@
   methodology to evaluate bufferbloat the way common users are
   experiencing it today, using today's most frequently used protocols
   and mechanisms to accurately measure the user-experience.  We also
   provide a way to express [-the-] bufferbloat as a measure of "Round-trips
   per [-minute"-] {+Minute"+} (RPM) to have a more intuitive way for the users to
   understand the notion of bufferbloat.

Status of This Memo
@@ -81,14 +81,14 @@
Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Measuring is [-hard-] {+Hard+} . . . . . . . . . . . . . . . . . . . . . .   3
   3.  Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
   4.  Measuring Responsiveness  . . . . . . . . . . . . . . . . . .   5
     4.1.  Working Conditions  . . . . . . . . . . . . . . . . . . .   5
       4.1.1.  Parallel vs Sequential Uplink and Downlink  . . . . .   6
       4.1.2.  From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+}  . . . . . . . . . . .   7
       4.1.3.  Reaching [-saturation-] {+Saturation+} . . . . . . . . . . . . . . . . .   7
       4.1.4.  Final [-algorithm-] {+Algorithm+} . . . . . . . . . . . . . . . . . . .   7
     4.2.  Measuring Responsiveness  . . . . . . . . . . . . . . . .   8
       4.2.1.  Aggregating Round-trips per Minute  . . . . . . . . .   9
       4.2.2.  Statistical Confidence  . . . . . . . . . . . . . . .  10
@@ -103,8 +103,8 @@

   For many years, bufferbloat has been known as an unfortunately common
   issue in todays networks [Bufferbloat].  Solutions like FQ-codel
   [-[RFC8289]-]
   {+[RFC8290]+} or PIE [RFC8033] have been standardized and are to some
   [-extend-]
   {+extent+} widely implemented.  Nevertheless, users still suffer from
   bufferbloat.


@@ -129,7 +129,7 @@
   bufferbloat problem.

   We believe that it is necessary to create a standardized way for
   measuring the [-extend-] {+extent+} of bufferbloat in a network and express it to
   the user in a user-friendly way.  This should help existing
   measurement tools to add a bufferbloat measurement to their set of
   metrics.  It will also allow to raise the awareness to the problem
@@ -144,10 +144,10 @@
   classification for those protocols is very common.  It is thus very
   important to use those protocols for the measurements to avoid
   focusing on use-cases that are not actually affecting the end-user.
   Finally, we propose to use [-"round-trips-] {+"Round-trips+} per [-minute"-] {+Minute"+} as a metric to
   express the [-extend-] {+extent+} of bufferbloat.

2.  Measuring is [-hard-] {+Hard+}

   There are several challenges around measuring bufferbloat accurately
   on the Internet.  These challenges are due to different factors.
@@ -155,7 +155,7 @@
   problem space, and the reproducibility of the measurement.

   It is well-known that transparent TCP proxies are widely deployed on
   port 443 and/or port 80, while less [-common-] {+commonly+} on other ports.  Thus,
   choice of the port-number to measure bufferbloat has a significant
   influence on the result.  Other factors are the protocols being used.
   TCP and UDP traffic may take a largely different path on the Internet
@@ -186,17 +186,17 @@
   measurement.  It seems that it's best to avoid extending the duration
   of the test beyond what's needed.

   The problem space around [-the-] bufferbloat is huge.  Traditionally, one
   thinks of bufferbloat happening on the routers and switches of the
   Internet.  Thus, simply measuring bufferbloat at the transport layer
   would be sufficient.  However, the networking stacks of the clients
   and servers can also experience huge amounts of bufferbloat.  Data
   sitting in TCP sockets or waiting in the application to be scheduled
   for sending causes artificial latency, which affects user-experience
   the same way [-the-] "traditional" bufferbloat does.

   Finally, measuring bufferbloat requires us to fill the buffers of the
   [-bottleneck-]
   {+bottleneck,+} and when buffer occupancy is at its peak, the latency
   measurement needs to be done.  Achieving this in a reliable and
   reproducible way is not easy.  First, one needs to ensure that
   buffers are actually full for a sustained period of time to allow for
@@ -250,15 +250,15 @@
       bufferbloat.

   4.  Finally, in order for this measurement to be user-friendly to a
       wide [-audience-] {+audience,+} it is important that such a measurement finishes
       within a short time-frame [-and-] {+with+} short being anything below 20
       seconds.

4.  Measuring Responsiveness

   The ability to reliably measure the responsiveness under typical
   working conditions is predicated by the ability to reliably put the
   network in a state representative of [-the-] said conditions.  Once the
   network has reached the required state, its responsiveness can be
   measured.  The following explains how the former and the latter are
   achieved.
@@ -270,7 +270,7 @@
   experiencing ingress and egress flows that are similar to those when
   used by humans in the typical day-to-day pattern.

   While any network can be put momentarily into working [-condition-] {+conditions+} by
   the means of a single HTTP transaction, taking measurements requires
   maintaining such conditions over sufficient time.  Thus, measuring
   the network responsiveness in a consistent way depends on our ability
@@ -286,7 +286,7 @@
   way to achieve this is by creating multiple large bulk data-transfers
   in either downstream or upstream direction.  Similar to conventional
   speed-test applications that also create a varying number of streams
   to measure throughput.  [-Working-conditions-]  {+Working conditions+} does the same.  It also
   requires a way to detect when the network is in a persistent working
   condition, called "saturation".  This can be achieved by monitoring
   the instantaneous goodput over time.  When the goodput stops
@@ -298,7 +298,7 @@
   o  Should not waste traffic, since the user may be paying for it

   o  Should finish within a short time-frame to avoid impacting other
      users on the same network and/or [-experience-] {+experiencing+} varying conditions

4.1.1.  Parallel vs Sequential Uplink and Downlink

@@ -308,8 +308,8 @@
   upstream) or the routing in the ISPs.  Users sending data to an
   Internet service will fill the bottleneck on the upstream path to the
   server and thus expose a potential for bufferbloat to happen at this
   bottleneck.  [-On-]  {+In+} the downlink direction any download from an Internet
   service will encounter a bottleneck and thus [-exposes-] {+expose+} another
   potential for bufferbloat.  Thus, when measuring responsiveness under
   working conditions it is important to consider both, the upstream and
   the downstream bufferbloat.  This opens the door to measure both
@@ -322,13 +322,16 @@
   seconds of test per direction, while parallel measurement will allow
   for 20 seconds of testing in both directions.

   However, a number caveats come with measuring in parallel:

   - [-Half-
   duplex-] {+Half-duplex+} links may not expose uplink and downlink bufferbloat:
     A [-half-
   duplex-] {+half-duplex+} link may not allow [-during parallel measurement-] to saturate both the uplink
     and the downlink [-direction.-] {+direction during parallel measurement.+}  Thus,
     bufferbloat in either of the directions may not be exposed during
     parallel measurement.

   - Debuggability of the results becomes more obscure:
     During parallel measurement it is impossible to differentiate on



@@ -338,26 +341,26 @@
Internet-Draft   Responsiveness under Working Conditions     August 2021


     whether the bufferbloat happens in the uplink or the downlink
     direction.

4.1.2.  From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+}

   As described in [-RFC 6349,-] {+[RFC6349],+} a single TCP connection may not be
   sufficient to saturate a path between a client and a server.  On a
   high-BDP network, traditional TCP window-size constraints of 4MB are
   often not sufficient to fill the pipe.  Additionally, traditional
   loss-based TCP congestion control algorithms aggressively [-reacts-] {+react+} to
   packet-loss by reducing the congestion window.  This reaction will
   reduce the queuing in the network, and thus "artificially" make [-the-]
   bufferbloat appear [-lesser.-] {+less of a problem.+}

   The goal [-of the measurement-] is to keep the network as busy as possible in a sustained
   and persistent [-way.-] {+way during the measurement.+}  Thus, using multiple TCP
   connections is needed for a sustained bufferbloat by gradually adding
   TCP flows until saturation is [-needed.-] {+reached.+}

4.1.3.  Reaching [-saturation-] {+Saturation+}

   It is best to detect when saturation has been reached so that the
   measurement of responsiveness can start with the confidence that the
@@ -367,8 +370,8 @@
   buffers are completely filled.  Thus, this depends highly on the
   congestion control that is being deployed on the sender-side.
   Congestion control algorithms like BBR may reach high throughput
   without causing [-bufferbloat.-] {+bufferbloat+} (because the bandwidth-detection portion
   of BBR is effectively seeking the bottleneck [-capacity)-] {+capacity).+}

   It is advised to rather use loss-based congestion controls like Cubic
   to "reliably" ensure that the buffers are filled.
@@ -379,7 +382,7 @@
   packet-loss or ECN-marks signaling a congestion or even a full buffer
   of the bottleneck link.

4.1.4.  Final [-algorithm-] {+Algorithm+}

   The following is a proposal for an algorithm to reach saturation of a
   network by using HTTP/2 upload (POST) or download (GET) requests of
@@ -404,7 +407,7 @@
   throughput will remain stable.  In the latter case, this means that
   saturation has been reached and - more importantly - is stable.

   In detail, the steps of the algorithm are the [-following-] {+following:+}

   o  Create 4 load-bearing connections

@@ -453,7 +456,7 @@
   the different stages of a separate network transaction as well as
   measuring on the load-bearing connections themselves.

   Two aspects are being measured with this [-approach :-] {+approach:+}

   1.  How the network handles new connections and their different
       stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
@@ -463,19 +466,19 @@

   2.  How the network and the client/server networking stack handles
       the latency on the load-bearing connections themselves.  E.g.,
       [-Smart-]
       {+smart+} queuing techniques on the bottleneck will allow to keep the
       latency within a reasonable limit in the network and buffer-
       reducing techniques like TCP_NOTSENT_LOWAT [-makes-] {+make+} sure the client
       and server TCP-stack is not a source of significant latency.

   To measure the former, we send a DNS-request, establish a TCP-
   connection on port 443, establish a TLS-context using TLS1.3 and send
   an [-HTTP2-] {+HTTP/2+} GET request for an object {+the size+} of a single [-byte large.-] {+byte.+}  This
   measurement will be repeated multiple times for accuracy.  Each of
   these stages allows to collect a single latency measurement that can
   then be factored into the responsiveness computation.

   To measure the latter, on the load-bearing connections (that [-uses-] {+use+}
   HTTP/2) a GET request is multiplexed.  This GET request is for a
   1-byte object.  This allows to measure the end-to-end latency on the
   connections that are using the network at full speed.
@@ -492,10 +495,10 @@
   an equal weight to each of these measurements.

   Finally, the resulting latency needs to be exposed to the users.
   Users have been trained to accept metrics that have a notion of [-"The-] {+"the+}
   higher the better".  Latency measuring in units of seconds however is
   "the lower the better".  Thus, converting the latency measurement to
   a frequency allows using the familiar notion of [-"The-] {+"the+} higher the
   better".  The term frequency has a very technical connotation.  What
   we are effectively measuring is the number of round-trips from the

@@ -513,7 +516,7 @@
   which is a wink to the "revolutions per minute" that we are used to
   in cars.

   Thus, our unit of measure is [-"Round-trip-] {+"Round-trips+} per Minute" (RPM) that
   expresses responsiveness under working conditions.

4.2.2.  Statistical Confidence
@@ -527,13 +530,13 @@
5.  Protocol Specification

   By using standard protocols that are most commonly used by end-users,
   no new protocol needs to be specified.  However, both [-client-] {+clients+} and
   servers need capabilities to execute this kind of measurement as well
   as a standard to [-flow-] {+follow+} to provision the client with the necessary
   information.

   First, the capabilities of both the client and the server: It is
   expected that both hosts support HTTP/2 over TLS [-1.3.  That-] {+1.3, and that+} the
   client is able to send a GET-request and a POST.  The server needs
   the ability to serve both of these HTTP commands.  Further, the
   server endpoint is accessible through a hostname that can be resolved
@@ -546,13 +549,13 @@
   1.  A config URL/response: This is the configuration file/format used
       by the client.  It's a simple JSON file format that points the
       client at the various URLs mentioned below.  All of the fields
       are required except "test_endpoint".  If the [-service-procier-] {+service-provider+} can
       pin all of the requests for a test run to a specific node in the
       service (for a particular run), they can specify that node's name
       in the "test_endpoint" field.  It's preferred that pinning of
       some sort is available.  This is to ensure the measurement is
       against the same paths and not switching hosts during a test run
       [-(ie-]
       {+(i.e.,+} moving from near POP A to near POP [-B)-] {+B).+}  Sample content of this
       JSON would be:


@@ -577,7 +580,7 @@

   3.  A "large" URL/response: This needs to serve a status code of 200
       and a body size of at least 8GB.  The body can be bigger, and
       will need to grow as network speeds [-increases-] {+increase+} over time.  The
       actual body content is irrelevant.  The client will probably
       never completely download the object.

@@ -618,16 +621,19 @@
Internet-Draft   Responsiveness under Working Conditions     August 2021


   {+[RFC6349]  ...+}

   [RFC8033]  Pan, R., Natarajan, P., Baker, F., and G. White,
              "Proportional Integral Controller Enhanced (PIE): A
              Lightweight Control Scheme to Address the Bufferbloat
              Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
              <https://www.rfc-editor.org/info/rfc8033>.

   [-[RFC8289]  Nichols, K., Jacobson, V., McGregor, A.,-]

   {+[RFC8290]  Hoeiland-Joergensen, T., McKenney, P., Taht, D.,+} Ed., and [-J.
              Iyengar, Ed., "Controlled Delay-]
              {+Gettys, J., "The Flow Queue CoDel Packet Scheduler and+}
	      Active Queue [-Management",-] {+Management Algorithm",+} RFC [-8289,-] {+8290,+}
	      DOI [-10.17487/RFC8289,-] {+10.17487/RFC8290,+} January 2018,
              [-<https://www.rfc-editor.org/info/rfc8289>.-]
              {+<https://www.rfc-editor.org/info/rfc8290>.+}

Authors' Addresses



* Re: [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
  2021-08-15 13:39 ` Erik Auerswald
@ 2021-08-18 22:01   ` Christoph Paasch
  2021-08-19  7:17     ` Erik Auerswald
  2021-09-21 20:50   ` [Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt Toerless Eckert
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Paasch @ 2021-08-18 22:01 UTC (permalink / raw)
  To: Erik Auerswald; +Cc: bloat, draft-cpaasch-ippm-responsiveness, ippm

Hello Erik,

On 08/15/21 - 15:39, Erik Auerswald wrote:
> Hi,
> 
> I'd like to thank you for working on a nice I-D describing an interesting
> and IMHO useful network measurement metric.
> 
> Since feedback was asked for, I'd like to try and provide constructive
> feedback.

thanks a lot for your detailed feedback! Please see further inline.

> In general, I like the idea of "Round-trips per Minute" (RPM) as a
> metric used to characterize (one aspect of) a network.  I do think that
> introducing this would improve the status quo.  Since this RPM definition
> comprises a specific way of adding load to the network and measuring a
> complex metric, I think it is useful to "standardize" it.
> 
> I do not think RPM can replace all other metrics.  This is, in a way,
> mentioned in the introduction, where it is suggested to add RPM to
> existing measurement platforms.  As such I just want to point this out
> more explicitely, but do not intend to diminish the RPM idea by this.
> In short, I'd say it's complicated.

Yes, I fully agree that RPM is not the only metric. It is one among many.
If there is a sentiment in our document that sounds like "RPM is the only
metric that matters", please let me know where so we can reword the text.

> Bandwidth matters for bulk data transfer, e.g., downloading a huge update
> required for playing a multiplayer game online.
> 
> Minimum latency matters for the feasibility of interactive applications,
> e.g., controlling a toy car in your room vs. a robotic arm on the ISS
> from Earth vs. orbital insertion around Mars from Earth.  For a more
> mundane use case consider a voice conference.  (A good decade ago I
> experienced a voice conferencing system running over IP that introduced
> over one second of (minimum) latency and therefore was awkward to use.)

Wrt minimum latency:

To some extent it is a subset of "RPM".
But admittedly, measuring minimum latency on its own is good for debugging
purposes and to know what one can get on a network that is not in persistent
working conditions.

> Expressing 'bufferbloat as a measure of "Round-trips per Minute" (RPM)'
> exhibits (at least) two problems:
> 
> 1. A high RPM value is associated with little bufferbloat problems.
> 
> 2. A low RPM value may be caused by high minimum delay instead of
>    bufferbloat.
> 
> I think that RPM (i.e., under working conditions) measures a network's
> usefulness for interactive applications, but not necessarily bufferbloat.

You are right and we are definitely misrepresenting this in the text.

I filed
https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/8.

If you want, feel free to submit a pull-request; otherwise, we will get to
the issue in the next few weeks.

> I do think that RPM is in itself more generally useful than minimum
> latency or bandwidth.
> 
> A combination of low minimum latency with low RPM value strongly hints
> at bufferbloat.  Other combinations are less easily characterized.
> 
> Bufferbloat can still lie in hiding, e.g., when a link with bufferbloat
> is not yet the bottleneck, or if the communications end-points are not
> yet able to saturate the network inbetween.  Thus high bandwidth can
> result in high RPM values despite (hidden) bufferbloat.
> 
> The "Measuring is Hard" section mentions additional complications.
> 
> All in all, I do think that "measuring bufferbloat" and "measuring RPM"
> should not be used synonymously.  The I-D title clearly shows this:
> RPM is measuring "Responsiveness under Working Conditions" which may be
> affected by bufferbloat, among other potential factors, but is not in
> itself bufferbloat.
> 
> Under the assumption that only a single value (performance score) is
> considered, I do think that RPM is more generally useful than bandwidth
> or idle latency.
> 
> On a meta-level, I think that the word "bufferbloat" is not used according
> to a single self-consistent definition in the I-D.

Fully agree with all your points above on how we misrepresented the relation
between RPM and bufferbloat.

> Additionally, I think that the I-D should reference DNS, HTTP/2, and
> TLS 1.3, since these protocols are required for implementing the RPM
> measurement.  The same for JSON, I think.  Possibly URL.

Yes, we have not given the references & citations enough care.
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/2)

> Using "rpm.example" instead of "example.apple.com" would result in shorter
> lines for the example JSON.
> 
> "host123.cdn.example" instead of "hostname123.cdnprovider.com" might be
> a more appropriate example DNS name.

Oops, we forgot to adjust these to a more generic hostname...
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/9)

> Adding an informative reference to RFC 2606 / BCP 32 might raise awareness
> of the existence of a BCP on example DNS names.
> 
> Please find both a unified diff against the text rendering of the I-D,
> and a word diff produced from the unified diff, attached to this email
> in order to suggest editorial changes that are intended to improve the
> reading experience.  They are intended for reading and (possibly partial)
> manual application, since the text rendering of an I-D is usually not
> the preferred form of editing it.

Thanks a lot for these
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/10)



Regards,
Christoph

> 
> Thanks,
> Erik
> -- 
> Always use the right tool for the job.
>                         -- Rob Pike
> 
> 
> On Fri, Aug 13, 2021 at 02:41:05PM -0700, Christoph Paasch via Bloat wrote:
> > I already posted this to the RPM-list, but the audience here on bloat should
> > be interested as well.
> > 
> > 
> > This is the specification of Apple's responsiveness/RPM test. We believe that it
> > would be good for the bufferbloat-effort to have a specification of how to
> > quantify the extend of bufferbloat from a user's perspective. Our
> > Internet-draft is a first step in that direction and we hope that it will
> > kick off some collaboration.
> > 
> > 
> > Feedback is very welcome!
> > 
> > 
> > Cheers,
> > Christoph
> > 
> > 
> > ----- Forwarded message from internet-drafts@ietf.org -----
> > 
> > From: internet-drafts@ietf.org
> > To: Christoph Paasch <cpaasch@apple.com>, Omer Shapira <oesh@apple.com>, Randall Meyer <rrm@apple.com>, Stuart Cheshire
> > 	<cheshire@apple.com>
> > Date: Fri, 13 Aug 2021 09:43:40 -0700
> > Subject: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
> > 
> > 
> > A new version of I-D, draft-cpaasch-ippm-responsiveness-00.txt
> > has been successfully submitted by Christoph Paasch and posted to the
> > IETF repository.
> > 
> > Name:		draft-cpaasch-ippm-responsiveness
> > Revision:	00
> > Title:		Responsiveness under Working Conditions
> > Document date:	2021-08-13
> > Group:		Individual Submission
> > Pages:		12
> > URL:            https://www.ietf.org/archive/id/draft-cpaasch-ippm-responsiveness-00.txt
> > Status:         https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
> > Htmlized:       https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness
> > 
> > 
> > Abstract:
> >    Bufferbloat has been a long-standing problem on the Internet with
> >    more than a decade of work on standardizing technical solutions,
> >    implementations and testing.  However, to this date, bufferbloat is
> >    still a very common problem for the end-users.  Everyone "knows" that
> >    it is "normal" for a video conference to have problems when somebody
> >    else on the same home-network is watching a 4K movie.
> > 
> >    The reason for this problem is not the lack of technical solutions,
> >    but rather a lack of awareness of the problem-space, and a lack of
> >    tooling to accurately measure the problem.  We believe that exposing
> >    the problem of bufferbloat to the end-user by measuring the end-
> >    users' experience at a high level will help to create the necessary
> >    awareness.
> > 
> >    This document is a first attempt at specifying a measurement
> >    methodology to evaluate bufferbloat the way common users are
> >    experiencing it today, using today's most frequently used protocols
> >    and mechanisms to accurately measure the user-experience.  We also
> >    provide a way to express the bufferbloat as a measure of "Round-trips
> >    per minute" (RPM) to have a more intuitive way for the users to
> >    understand the notion of bufferbloat.
> > 
> >                                                                                   
> > 
> > 
> > The IETF Secretariat
> > 
> > 
> > 
> > ----- End forwarded message -----
> > _______________________________________________
> > Bloat mailing list
> > Bloat@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/bloat

> --- draft-cpaasch-ippm-responsiveness-00.txt	2021-08-15 12:01:01.213813125 +0200
> +++ draft-cpaasch-ippm-responsiveness-00-ea.txt	2021-08-15 15:08:08.013416074 +0200
> @@ -17,7 +17,7 @@
>  
>     Bufferbloat has been a long-standing problem on the Internet with
>     more than a decade of work on standardizing technical solutions,
> -   implementations and testing.  However, to this date, bufferbloat is
> +   implementations, and testing.  However, to this date, bufferbloat is
>     still a very common problem for the end-users.  Everyone "knows" that
>     it is "normal" for a video conference to have problems when somebody
>     else on the same home-network is watching a 4K movie.
> @@ -33,8 +33,8 @@
>     methodology to evaluate bufferbloat the way common users are
>     experiencing it today, using today's most frequently used protocols
>     and mechanisms to accurately measure the user-experience.  We also
> -   provide a way to express the bufferbloat as a measure of "Round-trips
> -   per minute" (RPM) to have a more intuitive way for the users to
> +   provide a way to express bufferbloat as a measure of "Round-trips
> +   per Minute" (RPM) to have a more intuitive way for the users to
>     understand the notion of bufferbloat.
>  
>  Status of This Memo
> @@ -81,14 +81,14 @@
>  Table of Contents
>  
>     1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
> -   2.  Measuring is hard . . . . . . . . . . . . . . . . . . . . . .   3
> +   2.  Measuring is Hard . . . . . . . . . . . . . . . . . . . . . .   3
>     3.  Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
>     4.  Measuring Responsiveness  . . . . . . . . . . . . . . . . . .   5
>       4.1.  Working Conditions  . . . . . . . . . . . . . . . . . . .   5
>         4.1.1.  Parallel vs Sequential Uplink and Downlink  . . . . .   6
> -       4.1.2.  From single-flow to multi-flow  . . . . . . . . . . .   7
> -       4.1.3.  Reaching saturation . . . . . . . . . . . . . . . . .   7
> -       4.1.4.  Final algorithm . . . . . . . . . . . . . . . . . . .   7
> +       4.1.2.  From Single-flow to Multi-flow  . . . . . . . . . . .   7
> +       4.1.3.  Reaching Saturation . . . . . . . . . . . . . . . . .   7
> +       4.1.4.  Final Algorithm . . . . . . . . . . . . . . . . . . .   7
>       4.2.  Measuring Responsiveness  . . . . . . . . . . . . . . . .   8
>         4.2.1.  Aggregating Round-trips per Minute  . . . . . . . . .   9
>         4.2.2.  Statistical Confidence  . . . . . . . . . . . . . . .  10
> @@ -103,8 +103,8 @@
>  
>     For many years, bufferbloat has been known as an unfortunately common
>     issue in todays networks [Bufferbloat].  Solutions like FQ-codel
> -   [RFC8289] or PIE [RFC8033] have been standardized and are to some
> -   extend widely implemented.  Nevertheless, users still suffer from
> +   [RFC8290] or PIE [RFC8033] have been standardized and are to some
> +   extent widely implemented.  Nevertheless, users still suffer from
>     bufferbloat.
>  
>  
> @@ -129,7 +129,7 @@
>     bufferbloat problem.
>  
>     We believe that it is necessary to create a standardized way for
> -   measuring the extend of bufferbloat in a network and express it to
> +   measuring the extent of bufferbloat in a network and express it to
>     the user in a user-friendly way.  This should help existing
>     measurement tools to add a bufferbloat measurement to their set of
>     metrics.  It will also allow to raise the awareness to the problem
> @@ -144,10 +144,10 @@
>     classification for those protocols is very common.  It is thus very
>     important to use those protocols for the measurements to avoid
>     focusing on use-cases that are not actually affecting the end-user.
> -   Finally, we propose to use "round-trips per minute" as a metric to
> -   express the extend of bufferbloat.
> +   Finally, we propose to use "Round-trips per Minute" as a metric to
> +   express the extent of bufferbloat.
>  
> -2.  Measuring is hard
> +2.  Measuring is Hard
>  
>     There are several challenges around measuring bufferbloat accurately
>     on the Internet.  These challenges are due to different factors.
> @@ -155,7 +155,7 @@
>     problem space, and the reproducibility of the measurement.
>  
>     It is well-known that transparent TCP proxies are widely deployed on
> -   port 443 and/or port 80, while less common on other ports.  Thus,
> +   port 443 and/or port 80, while less commonly on other ports.  Thus,
>     choice of the port-number to measure bufferbloat has a significant
>     influence on the result.  Other factors are the protocols being used.
>     TCP and UDP traffic may take a largely different path on the Internet
> @@ -186,17 +186,17 @@
>     measurement.  It seems that it's best to avoid extending the duration
>     of the test beyond what's needed.
>  
> -   The problem space around the bufferbloat is huge.  Traditionally, one
> +   The problem space around bufferbloat is huge.  Traditionally, one
>     thinks of bufferbloat happening on the routers and switches of the
>     Internet.  Thus, simply measuring bufferbloat at the transport layer
>     would be sufficient.  However, the networking stacks of the clients
>     and servers can also experience huge amounts of bufferbloat.  Data
>     sitting in TCP sockets or waiting in the application to be scheduled
>     for sending causes artificial latency, which affects user-experience
> -   the same way the "traditional" bufferbloat does.
> +   the same way "traditional" bufferbloat does.
>  
>     Finally, measuring bufferbloat requires us to fill the buffers of the
> -   bottleneck and when buffer occupancy is at its peak, the latency
> +   bottleneck, and when buffer occupancy is at its peak, the latency
>     measurement needs to be done.  Achieving this in a reliable and
>     reproducible way is not easy.  First, one needs to ensure that
>     buffers are actually full for a sustained period of time to allow for
> @@ -250,15 +250,15 @@
>         bufferbloat.
>  
>     4.  Finally, in order for this measurement to be user-friendly to a
> -       wide audience it is important that such a measurement finishes
> -       within a short time-frame and short being anything below 20
> +       wide audience, it is important that such a measurement finishes
> +       within a short time-frame with short being anything below 20
>         seconds.
>  
>  4.  Measuring Responsiveness
>  
>     The ability to reliably measure the responsiveness under typical
>     working conditions is predicated by the ability to reliably put the
> -   network in a state representative of the said conditions.  Once the
> +   network in a state representative of said conditions.  Once the
>     network has reached the required state, its responsiveness can be
>     measured.  The following explains how the former and the latter are
>     achieved.
> @@ -270,7 +270,7 @@
>     experiencing ingress and egress flows that are similar to those when
>     used by humans in the typical day-to-day pattern.
>  
> -   While any network can be put momentarily into working condition by
> +   While any network can be put momentarily into working conditions by
>     the means of a single HTTP transaction, taking measurements requires
>     maintaining such conditions over sufficient time.  Thus, measuring
>     the network responsiveness in a consistent way depends on our ability
> @@ -286,7 +286,7 @@
>     way to achieve this is by creating multiple large bulk data-transfers
>     in either downstream or upstream direction.  Similar to conventional
>     speed-test applications that also create a varying number of streams
> -   to measure throughput.  Working-conditions does the same.  It also
> +   to measure throughput.  Working conditions does the same.  It also
>     requires a way to detect when the network is in a persistent working
>     condition, called "saturation".  This can be achieved by monitoring
>     the instantaneous goodput over time.  When the goodput stops
> @@ -298,7 +298,7 @@
>     o  Should not waste traffic, since the user may be paying for it
>  
>     o  Should finish within a short time-frame to avoid impacting other
> -      users on the same network and/or experience varying conditions
> +      users on the same network and/or experiencing varying conditions
>  
>  4.1.1.  Parallel vs Sequential Uplink and Downlink
>  
> @@ -308,8 +308,8 @@
>     upstream) or the routing in the ISPs.  Users sending data to an
>     Internet service will fill the bottleneck on the upstream path to the
>     server and thus expose a potential for bufferbloat to happen at this
> -   bottleneck.  On the downlink direction any download from an Internet
> -   service will encounter a bottleneck and thus exposes another
> +   bottleneck.  In the downlink direction any download from an Internet
> +   service will encounter a bottleneck and thus expose another
>     potential for bufferbloat.  Thus, when measuring responsiveness under
>     working conditions it is important to consider both, the upstream and
>     the downstream bufferbloat.  This opens the door to measure both
> @@ -322,13 +322,16 @@
>     seconds of test per direction, while parallel measurement will allow
>     for 20 seconds of testing in both directions.
>  
> -   However, a number caveats come with measuring in parallel: - Half-
> -   duplex links may not expose uplink and downlink bufferbloat: A half-
> -   duplex link may not allow during parallel measurement to saturate
> -   both the uplink and the downlink direction.  Thus, bufferbloat in
> -   either of the directions may not be exposed during parallel
> -   measurement.  - Debuggability of the results becomes more obscure:
> -   During parallel measurement it is impossible to differentiate on
> +   However, a number caveats come with measuring in parallel:
> +
> +   - Half-duplex links may not expose uplink and downlink bufferbloat:
> +     A half-duplex link may not allow to saturate both the uplink
> +     and the downlink direction during parallel measurement.  Thus,
> +     bufferbloat in either of the directions may not be exposed during
> +     parallel measurement.
> +
> +   - Debuggability of the results becomes more obscure:
> +     During parallel measurement it is impossible to differentiate on
>  
>  
>  
> @@ -338,26 +341,26 @@
>  Internet-Draft   Responsiveness under Working Conditions     August 2021
>  
>  
> -   whether the bufferbloat happens in the uplink or the downlink
> -   direction.
> +     whether the bufferbloat happens in the uplink or the downlink
> +     direction.
>  
> -4.1.2.  From single-flow to multi-flow
> +4.1.2.  From Single-flow to Multi-flow
>  
> -   As described in RFC 6349, a single TCP connection may not be
> +   As described in [RFC6349], a single TCP connection may not be
>     sufficient to saturate a path between a client and a server.  On a
>     high-BDP network, traditional TCP window-size constraints of 4MB are
>     often not sufficient to fill the pipe.  Additionally, traditional
> -   loss-based TCP congestion control algorithms aggressively reacts to
> +   loss-based TCP congestion control algorithms aggressively react to
>     packet-loss by reducing the congestion window.  This reaction will
> -   reduce the queuing in the network, and thus "artificially" make the
> -   bufferbloat appear lesser.
> +   reduce the queuing in the network, and thus "artificially" make
> +   bufferbloat appear less of a problem.
>  
> -   The goal of the measurement is to keep the network as busy as
> -   possible in a sustained and persistent way.  Thus, using multiple TCP
> +   The goal is to keep the network as busy as possible in a sustained
> +   and persistent way during the measurement.  Thus, using multiple TCP
>     connections is needed for a sustained bufferbloat by gradually adding
> -   TCP flows until saturation is needed.
> +   TCP flows until saturation is reached.
>  
> -4.1.3.  Reaching saturation
> +4.1.3.  Reaching Saturation
>  
>     It is best to detect when saturation has been reached so that the
>     measurement of responsiveness can start with the confidence that the
> @@ -367,8 +370,8 @@
>     buffers are completely filled.  Thus, this depends highly on the
>     congestion control that is being deployed on the sender-side.
>     Congestion control algorithms like BBR may reach high throughput
> -   without causing bufferbloat. (because the bandwidth-detection portion
> -   of BBR is effectively seeking the bottleneck capacity)
> +   without causing bufferbloat (because the bandwidth-detection portion
> +   of BBR is effectively seeking the bottleneck capacity).
>  
>     It is advised to rather use loss-based congestion controls like Cubic
>     to "reliably" ensure that the buffers are filled.
> @@ -379,7 +382,7 @@
>     packet-loss or ECN-marks signaling a congestion or even a full buffer
>     of the bottleneck link.
>  
> -4.1.4.  Final algorithm
> +4.1.4.  Final Algorithm
>  
>     The following is a proposal for an algorithm to reach saturation of a
>     network by using HTTP/2 upload (POST) or download (GET) requests of
> @@ -404,7 +407,7 @@
>     throughput will remain stable.  In the latter case, this means that
>     saturation has been reached and - more importantly - is stable.
>  
> -   In detail, the steps of the algorithm are the following
> +   In detail, the steps of the algorithm are the following:
>  
>     o  Create 4 load-bearing connections
>  
> @@ -453,7 +456,7 @@
>     the different stages of a separate network transaction as well as
>     measuring on the load-bearing connections themselves.
>  
> -   Two aspects are being measured with this approach :
> +   Two aspects are being measured with this approach:
>  
>     1.  How the network handles new connections and their different
>         stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
> @@ -463,19 +466,19 @@
>  
>     2.  How the network and the client/server networking stack handles
>         the latency on the load-bearing connections themselves.  E.g.,
> -       Smart queuing techniques on the bottleneck will allow to keep the
> +       smart queuing techniques on the bottleneck will allow to keep the
>         latency within a reasonable limit in the network and buffer-
> -       reducing techniques like TCP_NOTSENT_LOWAT makes sure the client
> +       reducing techniques like TCP_NOTSENT_LOWAT make sure the client
>         and server TCP-stack is not a source of significant latency.
>  
>     To measure the former, we send a DNS-request, establish a TCP-
>     connection on port 443, establish a TLS-context using TLS1.3 and send
> -   an HTTP2 GET request for an object of a single byte large.  This
> +   an HTTP/2 GET request for an object the size of a single byte.  This
>     measurement will be repeated multiple times for accuracy.  Each of
>     these stages allows to collect a single latency measurement that can
>     then be factored into the responsiveness computation.
>  
> -   To measure the latter, on the load-bearing connections (that uses
> +   To measure the latter, on the load-bearing connections (that use
>     HTTP/2) a GET request is multiplexed.  This GET request is for a
>     1-byte object.  This allows to measure the end-to-end latency on the
>     connections that are using the network at full speed.
> @@ -492,10 +495,10 @@
>     an equal weight to each of these measurements.
>  
>     Finally, the resulting latency needs to be exposed to the users.
> -   Users have been trained to accept metrics that have a notion of "The
> +   Users have been trained to accept metrics that have a notion of "the
>     higher the better".  Latency measuring in units of seconds however is
>     "the lower the better".  Thus, converting the latency measurement to
> -   a frequency allows using the familiar notion of "The higher the
> +   a frequency allows using the familiar notion of "the higher the
>     better".  The term frequency has a very technical connotation.  What
>     we are effectively measuring is the number of round-trips from the
>  
> @@ -513,7 +516,7 @@
>     which is a wink to the "revolutions per minute" that we are used to
>     in cars.
>  
> -   Thus, our unit of measure is "Round-trip per Minute" (RPM) that
> +   Thus, our unit of measure is "Round-trips per Minute" (RPM) that
>     expresses responsiveness under working conditions.
>  
>  4.2.2.  Statistical Confidence
> @@ -527,13 +530,13 @@
>  5.  Protocol Specification
>  
>     By using standard protocols that are most commonly used by end-users,
> -   no new protocol needs to be specified.  However, both client and
> +   no new protocol needs to be specified.  However, both clients and
>     servers need capabilities to execute this kind of measurement as well
> -   as a standard to flow to provision the client with the necessary
> +   as a standard to follow to provision the client with the necessary
>     information.
>  
>     First, the capabilities of both the client and the server: It is
> -   expected that both hosts support HTTP/2 over TLS 1.3.  That the
> +   expected that both hosts support HTTP/2 over TLS 1.3, and that the
>     client is able to send a GET-request and a POST.  The server needs
>     the ability to serve both of these HTTP commands.  Further, the
>     server endpoint is accessible through a hostname that can be resolved
> @@ -546,13 +549,13 @@
>     1.  A config URL/response: This is the configuration file/format used
>         by the client.  It's a simple JSON file format that points the
>         client at the various URLs mentioned below.  All of the fields
> -       are required except "test_endpoint".  If the service-procier can
> +       are required except "test_endpoint".  If the service-provider can
>         pin all of the requests for a test run to a specific node in the
>         service (for a particular run), they can specify that node's name
>         in the "test_endpoint" field.  It's preferred that pinning of
>         some sort is available.  This is to ensure the measurement is
>         against the same paths and not switching hosts during a test run
> -       (ie moving from near POP A to near POP B) Sample content of this
> +       (i.e., moving from near POP A to near POP B).  Sample content of this
>         JSON would be:
>  
>  
> @@ -577,7 +580,7 @@
>  
>     3.  A "large" URL/response: This needs to serve a status code of 200
>         and a body size of at least 8GB.  The body can be bigger, and
> -       will need to grow as network speeds increases over time.  The
> +       will need to grow as network speeds increase over time.  The
>         actual body content is irrelevant.  The client will probably
>         never completely download the object.
>  
> @@ -618,16 +621,19 @@
>  Internet-Draft   Responsiveness under Working Conditions     August 2021
>  
>  
> +   [RFC6349]  ...
> +
>     [RFC8033]  Pan, R., Natarajan, P., Baker, F., and G. White,
>                "Proportional Integral Controller Enhanced (PIE): A
>                Lightweight Control Scheme to Address the Bufferbloat
>                Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
>                <https://www.rfc-editor.org/info/rfc8033>.
>  
> -   [RFC8289]  Nichols, K., Jacobson, V., McGregor, A., Ed., and J.
> -              Iyengar, Ed., "Controlled Delay Active Queue Management",
> -              RFC 8289, DOI 10.17487/RFC8289, January 2018,
> -              <https://www.rfc-editor.org/info/rfc8289>.
> +   [RFC8290]  Hoeiland-Joergensen, T., McKenney, P., Taht, D., Ed., and
> +              Gettys, J., "The Flow Queue CoDel Packet Scheduler and
> +	      Active Queue Management Algorithm", RFC 8290,
> +	      DOI 10.17487/RFC8290, January 2018,
> +              <https://www.rfc-editor.org/info/rfc8290>.
>  
>  Authors' Addresses
>  

> [--- draft-cpaasch-ippm-responsiveness-00.txt-]{+++ draft-cpaasch-ippm-responsiveness-00-ea.txt+}	2021-08-15 [-12:01:01.213813125-] {+15:08:08.013416074+} +0200
> @@ -17,7 +17,7 @@
> 
>    Bufferbloat has been a long-standing problem on the Internet with
>    more than a decade of work on standardizing technical solutions,
>    [-implementations-]
>    {+implementations,+} and testing.  However, to this date, bufferbloat is
>    still a very common problem for the end-users.  Everyone "knows" that
>    it is "normal" for a video conference to have problems when somebody
>    else on the same home-network is watching a 4K movie.
> @@ -33,8 +33,8 @@
>    methodology to evaluate bufferbloat the way common users are
>    experiencing it today, using today's most frequently used protocols
>    and mechanisms to accurately measure the user-experience.  We also
>    provide a way to express [-the-] bufferbloat as a measure of "Round-trips
>    per [-minute"-] {+Minute"+} (RPM) to have a more intuitive way for the users to
>    understand the notion of bufferbloat.
> 
> Status of This Memo
> @@ -81,14 +81,14 @@
> Table of Contents
> 
>    1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
>    2.  Measuring is [-hard-] {+Hard+} . . . . . . . . . . . . . . . . . . . . . .   3
>    3.  Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
>    4.  Measuring Responsiveness  . . . . . . . . . . . . . . . . . .   5
>      4.1.  Working Conditions  . . . . . . . . . . . . . . . . . . .   5
>        4.1.1.  Parallel vs Sequential Uplink and Downlink  . . . . .   6
>        4.1.2.  From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+}  . . . . . . . . . . .   7
>        4.1.3.  Reaching [-saturation-] {+Saturation+} . . . . . . . . . . . . . . . . .   7
>        4.1.4.  Final [-algorithm-] {+Algorithm+} . . . . . . . . . . . . . . . . . . .   7
>      4.2.  Measuring Responsiveness  . . . . . . . . . . . . . . . .   8
>        4.2.1.  Aggregating Round-trips per Minute  . . . . . . . . .   9
>        4.2.2.  Statistical Confidence  . . . . . . . . . . . . . . .  10
> @@ -103,8 +103,8 @@
> 
>    For many years, bufferbloat has been known as an unfortunately common
>    issue in todays networks [Bufferbloat].  Solutions like FQ-codel
>    [-[RFC8289]-]
>    {+[RFC8290]+} or PIE [RFC8033] have been standardized and are to some
>    [-extend-]
>    {+extent+} widely implemented.  Nevertheless, users still suffer from
>    bufferbloat.
> 
> 
> @@ -129,7 +129,7 @@
>    bufferbloat problem.
> 
>    We believe that it is necessary to create a standardized way for
>    measuring the [-extend-] {+extent+} of bufferbloat in a network and express it to
>    the user in a user-friendly way.  This should help existing
>    measurement tools to add a bufferbloat measurement to their set of
>    metrics.  It will also allow to raise the awareness to the problem
> @@ -144,10 +144,10 @@
>    classification for those protocols is very common.  It is thus very
>    important to use those protocols for the measurements to avoid
>    focusing on use-cases that are not actually affecting the end-user.
>    Finally, we propose to use [-"round-trips-] {+"Round-trips+} per [-minute"-] {+Minute"+} as a metric to
>    express the [-extend-] {+extent+} of bufferbloat.
> 
> 2.  Measuring is [-hard-] {+Hard+}
> 
>    There are several challenges around measuring bufferbloat accurately
>    on the Internet.  These challenges are due to different factors.
> @@ -155,7 +155,7 @@
>    problem space, and the reproducibility of the measurement.
> 
>    It is well-known that transparent TCP proxies are widely deployed on
>    port 443 and/or port 80, while less [-common-] {+commonly+} on other ports.  Thus,
>    choice of the port-number to measure bufferbloat has a significant
>    influence on the result.  Other factors are the protocols being used.
>    TCP and UDP traffic may take a largely different path on the Internet
> @@ -186,17 +186,17 @@
>    measurement.  It seems that it's best to avoid extending the duration
>    of the test beyond what's needed.
> 
>    The problem space around [-the-] bufferbloat is huge.  Traditionally, one
>    thinks of bufferbloat happening on the routers and switches of the
>    Internet.  Thus, simply measuring bufferbloat at the transport layer
>    would be sufficient.  However, the networking stacks of the clients
>    and servers can also experience huge amounts of bufferbloat.  Data
>    sitting in TCP sockets or waiting in the application to be scheduled
>    for sending causes artificial latency, which affects user-experience
>    the same way [-the-] "traditional" bufferbloat does.
> 
>    Finally, measuring bufferbloat requires us to fill the buffers of the
>    [-bottleneck-]
>    {+bottleneck,+} and when buffer occupancy is at its peak, the latency
>    measurement needs to be done.  Achieving this in a reliable and
>    reproducible way is not easy.  First, one needs to ensure that
>    buffers are actually full for a sustained period of time to allow for
> @@ -250,15 +250,15 @@
>        bufferbloat.
> 
>    4.  Finally, in order for this measurement to be user-friendly to a
>        wide [-audience-] {+audience,+} it is important that such a measurement finishes
>        within a short time-frame [-and-] {+with+} short being anything below 20
>        seconds.
> 
> 4.  Measuring Responsiveness
> 
>    The ability to reliably measure the responsiveness under typical
>    working conditions is predicated by the ability to reliably put the
>    network in a state representative of [-the-] said conditions.  Once the
>    network has reached the required state, its responsiveness can be
>    measured.  The following explains how the former and the latter are
>    achieved.
> @@ -270,7 +270,7 @@
>    experiencing ingress and egress flows that are similar to those when
>    used by humans in the typical day-to-day pattern.
> 
>    While any network can be put momentarily into working [-condition-] {+conditions+} by
>    the means of a single HTTP transaction, taking measurements requires
>    maintaining such conditions over sufficient time.  Thus, measuring
>    the network responsiveness in a consistent way depends on our ability
> @@ -286,7 +286,7 @@
>    way to achieve this is by creating multiple large bulk data-transfers
>    in either downstream or upstream direction.  Similar to conventional
>    speed-test applications that also create a varying number of streams
>    to measure throughput.  [-Working-conditions-]  {+Working conditions+} does the same.  It also
>    requires a way to detect when the network is in a persistent working
>    condition, called "saturation".  This can be achieved by monitoring
>    the instantaneous goodput over time.  When the goodput stops
> @@ -298,7 +298,7 @@
>    o  Should not waste traffic, since the user may be paying for it
> 
>    o  Should finish within a short time-frame to avoid impacting other
>       users on the same network and/or [-experience-] {+experiencing+} varying conditions
> 
> 4.1.1.  Parallel vs Sequential Uplink and Downlink
> 
> @@ -308,8 +308,8 @@
>    upstream) or the routing in the ISPs.  Users sending data to an
>    Internet service will fill the bottleneck on the upstream path to the
>    server and thus expose a potential for bufferbloat to happen at this
>    bottleneck.  [-On-]  {+In+} the downlink direction any download from an Internet
>    service will encounter a bottleneck and thus [-exposes-] {+expose+} another
>    potential for bufferbloat.  Thus, when measuring responsiveness under
>    working conditions it is important to consider both, the upstream and
>    the downstream bufferbloat.  This opens the door to measure both
> @@ -322,13 +322,16 @@
>    seconds of test per direction, while parallel measurement will allow
>    for 20 seconds of testing in both directions.
> 
>    However, a number caveats come with measuring in parallel:
> 
>    - [-Half-
>    duplex-] {+Half-duplex+} links may not expose uplink and downlink bufferbloat:
>      A [-half-
>    duplex-] {+half-duplex+} link may not allow [-during parallel measurement-] to saturate both the uplink
>      and the downlink [-direction.-] {+direction during parallel measurement.+}  Thus,
>      bufferbloat in either of the directions may not be exposed during
>      parallel measurement.
> 
>    - Debuggability of the results becomes more obscure:
>      During parallel measurement it is impossible to differentiate on
> 
> 
> 
> @@ -338,26 +341,26 @@
> Internet-Draft   Responsiveness under Working Conditions     August 2021
> 
> 
>      whether the bufferbloat happens in the uplink or the downlink
>      direction.
> 
> 4.1.2.  From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+}
> 
>    As described in [-RFC 6349,-] {+[RFC6349],+} a single TCP connection may not be
>    sufficient to saturate a path between a client and a server.  On a
>    high-BDP network, traditional TCP window-size constraints of 4MB are
>    often not sufficient to fill the pipe.  Additionally, traditional
>    loss-based TCP congestion control algorithms aggressively [-reacts-] {+react+} to
>    packet-loss by reducing the congestion window.  This reaction will
>    reduce the queuing in the network, and thus "artificially" make [-the-]
>    bufferbloat appear [-lesser.-] {+less of a problem.+}
> 
>    The goal [-of the measurement-] is to keep the network as busy as possible in a sustained
>    and persistent [-way.-] {+way during the measurement.+}  Thus, using multiple TCP
>    connections is needed for a sustained bufferbloat by gradually adding
>    TCP flows until saturation is [-needed.-] {+reached.+}
> 
> 4.1.3.  Reaching [-saturation-] {+Saturation+}
> 
>    It is best to detect when saturation has been reached so that the
>    measurement of responsiveness can start with the confidence that the
> @@ -367,8 +370,8 @@
>    buffers are completely filled.  Thus, this depends highly on the
>    congestion control that is being deployed on the sender-side.
>    Congestion control algorithms like BBR may reach high throughput
>    without causing [-bufferbloat.-] {+bufferbloat+} (because the bandwidth-detection portion
>    of BBR is effectively seeking the bottleneck [-capacity)-] {+capacity).+}
> 
>    It is advised to rather use loss-based congestion controls like Cubic
>    to "reliably" ensure that the buffers are filled.
> @@ -379,7 +382,7 @@
>    packet-loss or ECN-marks signaling a congestion or even a full buffer
>    of the bottleneck link.
> 
> 4.1.4.  Final [-algorithm-] {+Algorithm+}
> 
>    The following is a proposal for an algorithm to reach saturation of a
>    network by using HTTP/2 upload (POST) or download (GET) requests of
> @@ -404,7 +407,7 @@
>    throughput will remain stable.  In the latter case, this means that
>    saturation has been reached and - more importantly - is stable.
> 
>    In detail, the steps of the algorithm are the [-following-] {+following:+}
> 
>    o  Create 4 load-bearing connections
> 
> @@ -453,7 +456,7 @@
>    the different stages of a separate network transaction as well as
>    measuring on the load-bearing connections themselves.
> 
>    Two aspects are being measured with this [-approach :-] {+approach:+}
> 
>    1.  How the network handles new connections and their different
>        stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
> @@ -463,19 +466,19 @@
> 
>    2.  How the network and the client/server networking stack handles
>        the latency on the load-bearing connections themselves.  E.g.,
>        [-Smart-]
>        {+smart+} queuing techniques on the bottleneck will allow to keep the
>        latency within a reasonable limit in the network and buffer-
>        reducing techniques like TCP_NOTSENT_LOWAT [-makes-] {+make+} sure the client
>        and server TCP-stack is not a source of significant latency.
> 
>    To measure the former, we send a DNS-request, establish a TCP-
>    connection on port 443, establish a TLS-context using TLS1.3 and send
>    an [-HTTP2-] {+HTTP/2+} GET request for an object {+the size+} of a single [-byte large.-] {+byte.+}  This
>    measurement will be repeated multiple times for accuracy.  Each of
>    these stages allows to collect a single latency measurement that can
>    then be factored into the responsiveness computation.
> 
>    To measure the latter, on the load-bearing connections (that [-uses-] {+use+}
>    HTTP/2) a GET request is multiplexed.  This GET request is for a
>    1-byte object.  This allows to measure the end-to-end latency on the
>    connections that are using the network at full speed.
> @@ -492,10 +495,10 @@
>    an equal weight to each of these measurements.
> 
>    Finally, the resulting latency needs to be exposed to the users.
>    Users have been trained to accept metrics that have a notion of [-"The-] {+"the+}
>    higher the better".  Latency measuring in units of seconds however is
>    "the lower the better".  Thus, converting the latency measurement to
>    a frequency allows using the familiar notion of [-"The-] {+"the+} higher the
>    better".  The term frequency has a very technical connotation.  What
>    we are effectively measuring is the number of round-trips from the
> 
> @@ -513,7 +516,7 @@
>    which is a wink to the "revolutions per minute" that we are used to
>    in cars.
> 
>    Thus, our unit of measure is [-"Round-trip-] {+"Round-trips+} per Minute" (RPM) that
>    expresses responsiveness under working conditions.
> 
> 4.2.2.  Statistical Confidence
> @@ -527,13 +530,13 @@
> 5.  Protocol Specification
> 
>    By using standard protocols that are most commonly used by end-users,
>    no new protocol needs to be specified.  However, both [-client-] {+clients+} and
>    servers need capabilities to execute this kind of measurement as well
>    as a standard to [-flow-] {+follow+} to provision the client with the necessary
>    information.
> 
>    First, the capabilities of both the client and the server: It is
>    expected that both hosts support HTTP/2 over TLS [-1.3.  That-] {+1.3, and that+} the
>    client is able to send a GET-request and a POST.  The server needs
>    the ability to serve both of these HTTP commands.  Further, the
>    server endpoint is accessible through a hostname that can be resolved
> @@ -546,13 +549,13 @@
>    1.  A config URL/response: This is the configuration file/format used
>        by the client.  It's a simple JSON file format that points the
>        client at the various URLs mentioned below.  All of the fields
>        are required except "test_endpoint".  If the [-service-procier-] {+service-provider+} can
>        pin all of the requests for a test run to a specific node in the
>        service (for a particular run), they can specify that node's name
>        in the "test_endpoint" field.  It's preferred that pinning of
>        some sort is available.  This is to ensure the measurement is
>        against the same paths and not switching hosts during a test run
>        [-(ie-]
>        {+(i.e.,+} moving from near POP A to near POP [-B)-] {+B).+}  Sample content of this
>        JSON would be:
> 
> 
> @@ -577,7 +580,7 @@
> 
>    3.  A "large" URL/response: This needs to serve a status code of 200
>        and a body size of at least 8GB.  The body can be bigger, and
>        will need to grow as network speeds [-increases-] {+increase+} over time.  The
>        actual body content is irrelevant.  The client will probably
>        never completely download the object.
> 
> @@ -618,16 +621,19 @@
> Internet-Draft   Responsiveness under Working Conditions     August 2021
> 
> 
>    {+[RFC6349]  ...+}
> 
>    [RFC8033]  Pan, R., Natarajan, P., Baker, F., and G. White,
>               "Proportional Integral Controller Enhanced (PIE): A
>               Lightweight Control Scheme to Address the Bufferbloat
>               Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
>               <https://www.rfc-editor.org/info/rfc8033>.
> 
>    [-[RFC8289]  Nichols, K., Jacobson, V., McGregor, A.,-]
> 
>    {+[RFC8290]  Hoeiland-Joergensen, T., McKenney, P., Taht, D.,+} Ed., and [-J.
>               Iyengar, Ed., "Controlled Delay-]
>               {+Gettys, J., "The Flow Queue CoDel Packet Scheduler and+}
> 	      Active Queue [-Management",-] {+Management Algorithm",+} RFC [-8289,-] {+8290,+}
> 	      DOI [-10.17487/RFC8289,-] {+10.17487/RFC8290,+} January 2018,
>               [-<https://www.rfc-editor.org/info/rfc8289>.-]
>               {+<https://www.rfc-editor.org/info/rfc8290>.+}
> 
> Authors' Addresses
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
  2021-08-18 22:01   ` Christoph Paasch
@ 2021-08-19  7:17     ` Erik Auerswald
  2021-08-19 15:48       ` Christoph Paasch
  0 siblings, 1 reply; 14+ messages in thread
From: Erik Auerswald @ 2021-08-19  7:17 UTC (permalink / raw)
  To: Christoph Paasch; +Cc: bloat, draft-cpaasch-ippm-responsiveness, ippm

Hello Christoph,

On Wed, Aug 18, 2021 at 03:01:42PM -0700, Christoph Paasch wrote:
> On 08/15/21 - 15:39, Erik Auerswald wrote:
> > [...]
> > I do not think RPM can replace all other metrics.  This is, in a way,
> > mentioned in the introduction, where it is suggested to add RPM to
> > existing measurement platforms.  As such I just want to point this out
> > more explicitely, but do not intend to diminish the RPM idea by this.
> > In short, I'd say it's complicated.
> 
> Yes, I fully agree that RPM is not the only metric. It is one among
> many.  If there is a sentiment in our document that sounds like "RPM
> is the only that matters", please let me know where so we can reword
> the text.

Regarding just this, in section 3 (Goals), item 3 (User-friendliness),
the I-D states that '[u]sers commonly look for a single "score" of their
performance.'  This can lead to the impression that RPM is intended to
provide this single score.

I do think that RPM seems more generally useful than either idle latency
or maximum bandwidth, but for a more technically minded audience, all
three provide useful information to get an impression of the usefulness
of a network for different applications.

Thanks,
Erik
-- 
Thinking doesn't guarantee that we won't make mistakes. But not thinking
guarantees that we will.
                        -- Leslie Lamport

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
  2021-08-19  7:17     ` Erik Auerswald
@ 2021-08-19 15:48       ` Christoph Paasch
  2021-08-19 17:50         ` [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt) Dave Collier-Brown
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Paasch @ 2021-08-19 15:48 UTC (permalink / raw)
  To: Erik Auerswald; +Cc: bloat, draft-cpaasch-ippm-responsiveness, ippm

Hello Erik,

On 08/19/21 - 09:17, Erik Auerswald wrote:
> Hello Christoph,
> 
> On Wed, Aug 18, 2021 at 03:01:42PM -0700, Christoph Paasch wrote:
> > On 08/15/21 - 15:39, Erik Auerswald wrote:
> > > [...]
> > > I do not think RPM can replace all other metrics.  This is, in a way,
> > > mentioned in the introduction, where it is suggested to add RPM to
> > > existing measurement platforms.  As such I just want to point this out
> > > more explicitely, but do not intend to diminish the RPM idea by this.
> > > In short, I'd say it's complicated.
> > 
> > Yes, I fully agree that RPM is not the only metric. It is one among
> > many.  If there is a sentiment in our document that sounds like "RPM
> > is the only that matters", please let me know where so we can reword
> > the text.
> 
> Regarding just this, in section 3 (Goals), item 3 (User-friendliness),
> the I-D states that '[u]sers commonly look for a single "score" of their
> performance.'  This can lead to the impression that RPM is intended to
> provide this single score.

yes we can rephrase this: https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/11

> I do think that RPM seems more generally useful than either idle latency
> or maximum bandwidth, but for a more technically minded audience, all
> three provide useful information to get an impression of the usefulness
> of a network for different applications.

I agree. Just measuring RPM is not useful, as one can have excellent RPM but
still have an Internet connection that is barely usable.

However, I still believe that a single score for the user would be great
(that score would not be RPM though). This score should group together a
large list of network properties (RPM, goodput, idle latency, protocol
conformance, ...) and express a value of utility to the user that reflects
how their user experience is affected. It would make it much easier for
non-technical users to compare the quality of their Internet connections
without just focusing on a single throughput metric.
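
Purely as an illustration (and definitely not something the draft proposes),
a minimal sketch of such a grouping could look like the following. The
reference values (1000 RPM, 100 Mb/s, 20 ms) and the equal weights are
invented for this sketch:

    # Fold several network properties into one 0-100 "utility" score.
    # All reference values and weights are invented for illustration only.
    def utility_score(rpm, goodput_mbps, idle_rtt_ms):
        rpm_part     = min(rpm / 1000.0, 1.0)                  # 1000 RPM treated as "excellent"
        goodput_part = min(goodput_mbps / 100.0, 1.0)          # 100 Mb/s treated as "plenty"
        latency_part = min(20.0 / max(idle_rtt_ms, 1.0), 1.0)  # 20 ms idle RTT treated as "excellent"
        return round(100 * (rpm_part + goodput_part + latency_part) / 3)

    print(utility_score(rpm=300, goodput_mbps=180, idle_rtt_ms=40))  # -> 60

A real scheme would of course have to be calibrated against actual
user-experience data rather than guessed constants.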

But that is a different topic than RPM ;-)



Cheers,
Christoph


> 
> Thanks,
> Erik
> -- 
> Thinking doesn't guarantee that we won't make mistakes. But not thinking
> guarantees that we will.
>                         -- Leslie Lamport

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-19 15:48       ` Christoph Paasch
@ 2021-08-19 17:50         ` Dave Collier-Brown
  2021-08-19 21:17           ` Kenneth Porter
  2021-08-21 10:23           ` Erik Auerswald
  0 siblings, 2 replies; 14+ messages in thread
From: Dave Collier-Brown @ 2021-08-19 17:50 UTC (permalink / raw)
  To: bloat

On 2021-08-19 11:48 a.m., Christoph Paasch via Bloat wrote:

> I agree. Just measuring RPM is not useful. As one can have excellent RPM but
> still have an Internet connection that is barely usable.
>
> However, I still believe that a single score for the user would be great
> (that score would not be RPM though). This score should group together a
> large list of network-properties (RPM, goodput, idle latency, protocol
> conformance,...) and express a value of utility to the user that express how
> its user-experience is affected. It would make it much easier for non-technical
> users to compare the quality of their Internet without just focusing on a
> single throughput-metric.
>
> But that is a different topic than RPM ;-)
>
I can't actually draw a picture of it here, but there's a good way to
show multiple limiting factors graphically. One is with a bucket,
http://www.imthird.org/the-limiting-factor-concept, but for comparisons
I like a simpler one.

For an example, imagine a large letter "Y" with the stem labelled
"throughput", one arm labelled "latency" and the other "RPM". There's  a
dotted-line circle drawn over it so that the two arms touch the circle,
and the stem sticks out through it.

The diameter of the circle is the "goodness" of the connection, and you
can line up diagrams like this for Bell, Rogers and Telus and see that
they all have fine throughput but lousy latency and RPM. And it extends
to multiple dimensions as well as multiple suppliers, so you could have
a five-legged Y if you wanted. I like three, myself.

I first saw this used for nutrients in the first edition of "Diet for a
Small Planet", but the editors found it too nerdy and talked the author
into taking it out (:-()

--dave

--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com |              -- Mark Twain



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-19 17:50         ` [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt) Dave Collier-Brown
@ 2021-08-19 21:17           ` Kenneth Porter
  2021-08-20  1:58             ` Dave Collier-Brown
  2021-08-21 10:23           ` Erik Auerswald
  1 sibling, 1 reply; 14+ messages in thread
From: Kenneth Porter @ 2021-08-19 21:17 UTC (permalink / raw)
  To: bloat

--On Thursday, August 19, 2021 2:50 PM -0400 Dave Collier-Brown 
<dave.collier-brown@indexexchange.com> wrote:

> I can't actually draw a picture of it here, but there's a good way to
> show multiple limiting factor graphically. One is with a bucket,
> http://www.imthird.org/the-limiting-factor-concept, but for comparisons
> I like a simpler one
>
> For an example, imagine a large letter "Y" with the stem labelled
> "throughput", one arm labelled "latency" and the other "RPM". There's  a
> dotted-line circle drawn over it so that the two arms touch the circle,
> and the stem sticks out through it.
>
> The diameter of the circle is the "goodness" of the connection, and you
> can line up diagrams like this for Bell, Rogers and Telus and see that
> they all have fine throughput but lousy latency and RPM. And it extends
> to multiple dimensions as well as multiple suppliers, so you could have
> a five-legged Y if you wanted. I like three, myself.
>
> I first saw this used for nutrients in the first edition of "Diet for a
> Small Planet", but the editors found it too nerdy and talked the author
> into taking it out (:-()

American innumeracy is so frustrating.

Do you have a link to the "Y" illustration? I'm having trouble visualizing 
it. Is the Y inverted? I don't understand what you mean by the stem going 
through the circle if the arms are touching it, or what the stem means.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-19 21:17           ` Kenneth Porter
@ 2021-08-20  1:58             ` Dave Collier-Brown
  2021-08-21  1:22               ` Kenneth Porter
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Collier-Brown @ 2021-08-20  1:58 UTC (permalink / raw)
  To: bloat

Look at the barrel link, in that case: I'll send you a sketch off-list

--dave


On 2021-08-19 5:17 p.m., Kenneth Porter wrote:
> --On Thursday, August 19, 2021 2:50 PM -0400 Dave Collier-Brown
> <dave.collier-brown@indexexchange.com> wrote:
>
>> I can't actually draw a picture of it here, but there's a good way to
>> show multiple limiting factor graphically. One is with a bucket,
>> http://www.imthird.org/the-limiting-factor-concept, but for comparisons
>> I like a simpler one
>>
>> For an example, imagine a large letter "Y" with the stem labelled
>> "throughput", one arm labelled "latency" and the other "RPM". There's  a
>> dotted-line circle drawn over it so that the two arms touch the circle,
>> and the stem sticks out through it.
>>
>> The diameter of the circle is the "goodness" of the connection, and you
>> can line up diagrams like this for Bell, Rogers and Telus and see that
>> they all have fine throughput but lousy latency and RPM. And it extends
>> to multiple dimensions as well as multiple suppliers, so you could have
>> a five-legged Y if you wanted. I like three, myself.
>>
>> I first saw this used for nutrients in the first edition of "Diet for a
>> Small Planet", but the editors found it too nerdy and talked the author
>> into taking it out (:-()
>
> American innumeracy is so frustrating.
>
> Do you have a link to the "Y" illustration? I'm having trouble
> visualizing it. Is the Y inverted? I don't understand what you mean by
> the stem going through the circle if the arms are touching it, or what
> the stem means.
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat

--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com |              -- Mark Twain



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-20  1:58             ` Dave Collier-Brown
@ 2021-08-21  1:22               ` Kenneth Porter
  2021-08-21 11:01                 ` Sebastian Moeller
  0 siblings, 1 reply; 14+ messages in thread
From: Kenneth Porter @ 2021-08-21  1:22 UTC (permalink / raw)
  To: bloat

On 8/19/2021 6:58 PM, Dave Collier-Brown wrote:
> Look at the barrel link, in that case: I'll send you a sketch off-list 

Ok, the sketch is of a spoked wheel with 3 spokes, for throughput, 
latency, and RPM, and the spoke for throughput is much longer. The 
circle represents the spoke of smallest radius, indicating the worst 
rating of the service. The ISP will try to sell based on the longest 
spoke to make itself look better than the actual user experience.

It's like taking the barrel illustration and "exploding" the barrel so 
its staves lie flat, radiating from the base. In that case, the shortest 
stave is the worst rating.
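
A minimal sketch of that idea in code (the reference values and the numbers
in the example call are invented): each metric becomes a spoke length
between 0 and 1, and the dotted circle, i.e. the overall rating, is the
shortest spoke:

    # Normalize each metric to a spoke length in [0, 1]; the circle radius
    # (overall rating) is the shortest spoke.  Reference values are invented.
    def spokes(throughput_mbps, idle_rtt_ms, rpm):
        lengths = {
            "throughput": min(throughput_mbps / 100.0, 1.0),       # 100 Mb/s = full-length spoke
            "latency":    min(20.0 / max(idle_rtt_ms, 1.0), 1.0),  # 20 ms    = full-length spoke
            "RPM":        min(rpm / 1000.0, 1.0),                  # 1000 RPM = full-length spoke
        }
        return lengths, min(lengths.values())

    lengths, circle = spokes(throughput_mbps=182.0, idle_rtt_ms=200.0, rpm=230)
    print(lengths, circle)  # a long throughput spoke, short latency and RPM spokes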



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-19 17:50         ` [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt) Dave Collier-Brown
  2021-08-19 21:17           ` Kenneth Porter
@ 2021-08-21 10:23           ` Erik Auerswald
  2021-08-21 16:31             ` Dave Collier-Brown
  1 sibling, 1 reply; 14+ messages in thread
From: Erik Auerswald @ 2021-08-21 10:23 UTC (permalink / raw)
  To: Dave Collier-Brown; +Cc: bloat

Hi,

On Thu, Aug 19, 2021 at 01:50:01PM -0400, Dave Collier-Brown wrote:
> 
> I can't actually draw a picture of it here, but there's a good way to
> show multiple limiting factor graphically. One is with a bucket,
> http://www.imthird.org/the-limiting-factor-concept, but for comparisons
> I like a simpler one
> 
> For an example, imagine a large letter "Y" with the stem labelled
> "throughput", one arm labelled "latency" and the other "RPM". There's  a
> dotted-line circle drawn over it so that the two arms touch the circle,
> and the stem sticks out through it.

Those both have the problem that both throughput and RPM are "the higher
the better", while latency is "the lower the better."

Perhaps the idea of a "score" (e.g., A through F) assigned to each
individual metric that is then combined to produce an "overall score" is
not that bad?  It is not perfect, it hides details, and it cannot please
everyone, but might be useful.

(I personally always want to see the base values and the method to
compute both the individual and the overall score, so I can assess the
applicability of the score to my requirements.)

Thanks,
Erik
-- 
It's impossible to learn very much by simply sitting in a lecture,
or even by simply doing problems that are assigned.
                        -- Richard P. Feynman

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-21  1:22               ` Kenneth Porter
@ 2021-08-21 11:01                 ` Sebastian Moeller
  0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Moeller @ 2021-08-21 11:01 UTC (permalink / raw)
  To: Kenneth Porter; +Cc: bloat

So here is my take:

To assess the usefulness of an internet access link (any link probably, but being an end user myself, I want to limit myself to where I have first-hand experience), there are a number of easier or harder to obtain numbers:

1) access rate: easy to measure (at least as HTTPS/TCP/IP goodput; the gross rate of the bottleneck link is considerably harder to deduce)

2) latency to the end-point under test: also easy to measure (simply by running an ICMP probe stream before and during a speedtest; some of the more enlightened speedtests already report both RTTs)
        IMHO that actually should be a tuple of
        latency without load, RTT(unloaded), and latency under saturating conditions, RTT(saturated)
        => if both numbers exist, the difference RTT(saturated) - RTT(unloaded) = latency under load increase (LULI ;)) can be calculated,
                which is a proxy for the level of latency degradation a user can expect as a function of load (a minimal sketch of this computation follows after this list).

3) latency variation/jitter: also reasonably easy to measure with tools like MTR. Quite a number of use-cases work well with moderately high but static RTTs, but deteriorate quickly if the latency becomes too variable
        (Some access technologies, like WiFi or DOCSIS, are prone to introduce jitter, as is XDSL when ITU G.998.4 G.INP retransmissions enter the picture)

4) packet loss: relatively hard to measure, but some tools like iperf seem to report retransmissions.

5) packet re-ordering: hard to measure (IIUC); if severe enough, this will manifest as apparent packet loss or as a replay attack
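
A minimal sketch of the LULI computation from 2) above (the sample numbers
are made up, just to illustrate):

    import statistics

    # LULI = RTT(saturated) - RTT(unloaded), using medians of the probe
    # samples so a few outliers do not dominate either estimate.
    def luli_ms(unloaded_rtts_ms, saturated_rtts_ms):
        return statistics.median(saturated_rtts_ms) - statistics.median(unloaded_rtts_ms)

    print(luli_ms([22, 24, 23], [195, 210, 180]))  # -> 172 ms of load-induced delay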
	

RPM, IMHO, is a nice additional way to address 2) above, but personally, I am less interested in how many hypothetical transactions I could perform per second than in how long each takes, as the latter still allows me to make reasonable guesses of completion time if I issue transactions in parallel, and the LULI measure can be reasonably compared with whatever latency budget a given task has.


Any attempt at displaying all five of these in one consistent plot is IMHO challenging, because such a plot should use a scale of equal severity for all its axes; otherwise it becomes really hard to parse as an aggregate, no?

Regards
        



> On Aug 21, 2021, at 03:22, Kenneth Porter <shiva@sewingwitch.com> wrote:
> 
> On 8/19/2021 6:58 PM, Dave Collier-Brown wrote:
>> Look at the barrel link, in that case: I'll send you a sketch off-list 
> 
> Ok, the sketch is of a spoked wheel with 3 spokes, for throughput, latency, and RPM, and the spoke for throughput is much longer. The circle represents the spoke of smallest radius, indicating the worst rating of the service. The ISP will try to sell based on the longest spoke to make itself look better than the actual user experience.
> 
> It's like taking the barrel illustration and "exploding" the barrel so its staves lie flat, radiating from the base. In that case, the shortest stave is the worst rating.
> 
> 
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt)
  2021-08-21 10:23           ` Erik Auerswald
@ 2021-08-21 16:31             ` Dave Collier-Brown
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Collier-Brown @ 2021-08-21 16:31 UTC (permalink / raw)
  To: Erik Auerswald; +Cc: bloat

[-- Attachment #1: Type: text/plain, Size: 2287 bytes --]

I like the idea, especially if you did it as an actual "card", like a school report card.

On 2021-08-21 6:23 a.m., Erik Auerswald wrote:

> Perhaps the idea of a "score" (e.g., A through F) assigned to each
> individual metric that is then combined to produce an "overall score" is
> not that bad?  It is not perfect, it hides details, and it cannot please
> everyone, but might be useful.
>
> (I personally always want to see the base values and the method to
> comupte both the individual and the overall score, so I can assess the
> applicability of the score to my requirements.)


How about:

                  Report Card for Little Johnny Rogers (rogers.com)

    Latency:          F  (200 ms, where 42 ms expected for a round-trip to San Francisco)

    Throughput, Up:   B  (16.87 Mb/s where 20 advertised)

    Throughput, Down: A  (182.1 Mb/s where 175 advertised)

    RPM:              D- (23 where 65 expected)

    ----------------------------------------------------------------------------------

    Overall:          F (Lowest of the above marks)
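
A minimal sketch (Python) of how such a card could be computed; the grade
thresholds are purely hypothetical, the only fixed rule being that the
overall mark is the lowest of the individual marks:

    GRADES = "ABCDF"  # best to worst, no +/- modifiers in this sketch

    def grade(measured, expected, higher_is_better=True):
        """Hypothetical thresholds based on the ratio of measured to expected."""
        ratio = measured / expected if higher_is_better else expected / measured
        if ratio >= 1.0: return "A"
        if ratio >= 0.8: return "B"
        if ratio >= 0.6: return "C"
        if ratio >= 0.3: return "D"
        return "F"

    def overall(marks):
        """Overall mark is the worst (lowest) of the individual marks."""
        return max(marks, key=GRADES.index)

    card = {
        "Latency":          grade(200, 42, higher_is_better=False),
        "Throughput, Up":   grade(16.87, 20),
        "Throughput, Down": grade(182.1, 175),
        "RPM":              grade(23, 65),
    }
    card["Overall"] = overall(card.values())
    print(card)  # e.g. {'Latency': 'F', ..., 'Overall': 'F'}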




--dave

--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com<mailto:dave.collier-brown@indexexchange.com> |              -- Mark Twain



[-- Attachment #2: Type: text/html, Size: 3720 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
  2021-08-15 13:39 ` Erik Auerswald
  2021-08-18 22:01   ` Christoph Paasch
@ 2021-09-21 20:50   ` Toerless Eckert
  2021-10-22 23:19     ` Christoph Paasch
  1 sibling, 1 reply; 14+ messages in thread
From: Toerless Eckert @ 2021-09-21 20:50 UTC (permalink / raw)
  To: Erik Auerswald; +Cc: bloat, draft-cpaasch-ippm-responsiveness, ippm

Dear authors

Thanks for the draft

a) Can you please update the name of the draft so that people remembering RPM will find it?
   Something like:

   draft-cpaasch-ippm-rpm-bufferbloat-metric-00
   Round-trips Per Minute (RPM) under load - a Metric for bufferbloat.

b) The draft does not mention, or at least does not have a separate
   section to discuss, where the server against which the test is run
   is located. It should have such a section. I can think of at least
   two key options:
   - the server used for the service in question (e.g., where the content comes from),
   - a server at a well-defined location in the access network provider.

c) I fear that b) leads to the biggest current issue with the metric:
   The longer the path is, such as the full path to a server, the more useful
   the metric is for the user. But the user will effectively get a per-service
   metric. To make this more fun for the authors: imagine the AppleTV server
   nodes have a worse path to a particular user than the Netflix servers, or
   vice versa.

   If we just use a path to some fixed point in the access provider,
   then we take away the user's ability to beat up their OTT services to
   improve their paths.

   If we use only a path toward the service, it will be harder to
   put pressure on the service provider if the service provider is bad.

   So, obviously, I would like to have all three RPMs: to Netflix, to AppleTV,
   and to a well-defined server in Comcast. Then I can triangulate where
   my bufferbloat problem is.

d) Worse yet, without having seen more example numbers (a reference pointing
   to some good collected RPM numbers would be excellent), my
   concern is that instead of fixing bufferbloat on paths, we would simply
   encourage OTTs to co-locate servers at the access provider's own measurement
   point, i.e., as close to the subscriber as possible.

e) To solve d), maybe two ideas:

   - relevant for improving bufferbloat is only (lRPM - iRPM), where
     lRPM would be your current RPM, i.e., under (l)oaded conditions,
     and iRPM is the idle RPM. This still does not take away from the
     fact that a path with more queuing hops or higher queue loads
     will fare worse than a path with shorter physical propagation latency,
     but it does make the metric focus significantly on queueing,
     and should help a lot when we compare services that might not
     have servers in the user's metro area.

   - lRPM/m - RPM under load per mile (roughly).
     - Measure the idle RTT in units of msec (iRTT).

     - Measure the loaded RTT in units of msec (lRTT).

     - Just take iRTT as a measure of the path length.
       Normalizing it to an absolute distance is not of first-order
       importance; we are primarily interested in a relative number,
       and this keeps the example calculation simple.

     - The RTT increase because of queueing is (lRTT - iRTT).

     - (lRTT - iRTT) / iRTT is therefore something like queuing RTT
       per path stretch. I think this is the relative number we want.

     - RPM = iRTT / (lRTT - iRTT) * 1000 turns this into a
       number that increases with the desired non-bufferbloat performance
       and has enough significance in the non-fractional digits.

     - Example: 
        idle RTT:  5msec, loaded RTT: 20 msec =>  333 RPM
        idle RTT: 10msec, loaded RTT: 20 msec => 1000 RPM
        idle RTT: 15msec, loaded RTT: 20 msec => 3000 RPM

        This nicely shows how the RPM will go up when the physical
        path itself gets longer, but the relevant loaded RTT stays
        the same.

        idle RTT:  5msec, loaded RTT: 20 msec =>  333 RPM
        idle RTT: 10msec, loaded RTT: 40 msec =>  333 RPM
        idle RTT: 15msec, loaded RTT: 60 msec =>  333 RPM

        This nicely shows that we can have servers at different
        physical distances and get the same RPM number when the
        bufferbloat is the same, i.e., 15 msec worth of bufferbloat
        for every 5 msec propagation latency segment.
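
        A tiny sketch (Python; the function name is mine, not from the
        draft) of the same calculation, reproducing the numbers above:

            def normalized_rpm(idle_rtt_ms, loaded_rtt_ms):
                """iRTT / (lRTT - iRTT) * 1000: queueing delay relative to path length."""
                return idle_rtt_ms / (loaded_rtt_ms - idle_rtt_ms) * 1000

            for i, l in [(5, 20), (10, 20), (15, 20), (10, 40), (15, 60)]:
                print(f"idle {i} msec, loaded {l} msec -> {normalized_rpm(i, l):.0f} RPM")
            # -> 333, 1000, 3000, 333, 333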

f) I can see how you do NOT want the type of metric I am
   proposing, because it only focusses on the bufferbloat
   factor, and you may want to stick to the full experience of
   the user, where unmistakably the propagation latency cannot
   be ignored, but to repeat from above:

   If we do not use a metric that fairly treats paths of different
   propagation latencies as the same wrt. performance, I am
   quite persuaded we will continue to just see big services
   win out, because they can more easily afford to get closer
   to the user with their (rented/time-shared/owned) servers.

   In other words: right now, RPM is a metric that will specifically
   make it easier for one of the big streaming providers,
   such as that of the authors, to position itself better
   against smaller services streaming from further away.

Cheers
    Toerless
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
  2021-09-21 20:50   ` [Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt Toerless Eckert
@ 2021-10-22 23:19     ` Christoph Paasch
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Paasch @ 2021-10-22 23:19 UTC (permalink / raw)
  To: Toerless Eckert
  Cc: Erik Auerswald, ippm, draft-cpaasch-ippm-responsiveness, bloat

Hello Toerless,

thanks for your feedback! Please see inline:

> On Sep 21, 2021, at 1:50 PM, Toerless Eckert <tte@cs.fau.de> wrote:
> 
> Dear authors
> 
> Thanks for the draft
> 
> a) Can you please update the name of the draft so that people remembering RPM will find it?
>   Something like:
> 
>   draft-cpaasch-ippm-rpm-bufferbloat-metric-00
>   Round-trips Per Minute (RPM) under load - a Metric for bufferbloat.

That's a good point! How does draft-cpaasch-ippm-responsiveness-rpm-00 sound?

I prefer not to have bufferbloat in the name, as it is a loaded term (there are different interpretations of it) and easily misunderstood. Some consider bufferbloat to be only a problem of routers/switches, while others consider it an end-to-end problem. Our test methodology measures end-to-end responsiveness, and end-to-end here goes all the way up to the HTTP/2 implementation.

> b) The draft does not mention, or at least does not have a
>   separate section to discuss where the server is against which the test is run.
>   It should have such a section. I can think of at least two key options,
>   - the server used for the service in question (e.g.: where contents comes from),
>   - a server at a well-defined location in the access network provider.

I see your point. I don't think the draft needs to "mandate" where one should put those servers, but rather include a discussion of where one may want to place the server depending on what one is measuring.

A WiFi AP vendor may want to deploy the server locally in its testbed. A content provider will want to host the server on its content servers. And an ISP will probably want to host it at the border of its domain.

> c) I fear that b) leads to the biggest current issue with the metric:
>   The longer the path is, such as full path to a server, the more useful the
>   metric is for the user. But the user will effectively get a per-service metric.
>   To make this more fun to the authors: Imagine the appleTV server nodes have a worse
>   path to a particular user than the Netflix servers. Or vice versa.
> 
>   If we just use a path to some fixed point in the access provider,
>   then we take away the users ability to beat up their OTT services to
>   improve their paths. 
> 
>   If we use only a path toward the service, it will be harder to 
>   hit on the service provider, if the service provider is bad.
> 
>   So, obviously, i would like to have all three RPM: to Netflix, AppleTV
>   and a well defined server in Comcast. Then i can triangulate where
>   my bufferbloat problem is.

Yes, the location of the server has a huge impact on the resulting number. The goal with responsiveness is to measure user experience. If the user uses Netflix, AppleTV and Comcast streaming, that's what the user is interested in. If the user also uses some remote streaming service on the other side of the ocean, then that responsiveness number is of interest to the user as well.

> d) Worse yet, without having seen more example numbers (a reference pointing
>   to some good collected RPM numbers would be excellent), my
>   concern is that instead of fixing bufferbloat on paths, we would simply
>   encourage OTT to co-locate servers to the access providers own measurement
>   point, aka: as close to the subscriber.

Deploying close to the subscriber does not by itself fix the bufferbloat problem. If the last mile or the HTTP implementation has bufferbloat issues, it is still going to show a bad RPM number. Sure, once we have eradicated bufferbloat from the Internet, closeness to the subscriber will become more of a driving factor. And that would be a good problem to have :-)

> 
> e) To solve d), maybe two ideas:
> 
>   - relevant to improve bufferbloat is only (lRPM - iRPM), where
>     lRPM would be your current RPM, e.g.: under (l)oaded condition),
>     and iRPM is idle RPM. This still does not take away from the
>     fact that a path with more queuing hops or higher queue loads
>     will fare worse than the shorter physical propagation latency path,
>     but it does make the metric significantly be focussed on queueing,
>     and should help a lot when we do compare service that might not
>     have servers in the users metro area.

The difficulty here is that it is nearly impossible to really measure iRPM, because one cannot know whether the network really is idle or not.

> 
>   - lRPM/m - RPM under load per mile (roughly).
>     - Measure idle RTT in units of msec (iRTT)
> 
>     - Measure load RTT in units of msec (lRTT)
> 
>     - Just take iRTT as a measure for the path length.
>       normalizing it absolutely is not of first order
>       important, we are primarily interested in relative number,
>       and this keeps the example calculation simple.
> 
>     - the RTT increase because of queueing is (lRTT - iRTT).
> 
>     - (lRTT - iRTT) / iRTT is therefore something like queuing RTT
>       per path stretch. I think this is the relative number we want.
> 
>     - RPM = iRTT / (lRTT - iRTT) * 1000 turns this into some 
>       number increasing with desired non-bufferbloat performance
>       with enough significance in the non-fractional digits.
> 
>     - Example: 
>        idle RTT:  5msec, loaded RTT: 20 msec =>  333 RPM
>        idle RTT: 10msec, loaded RTT: 20 msec => 1000 RPM
>        idle RTT: 15msec, loaded RTT: 20 msec => 3000 RPM
> 
>        This nicely shows how the RPM will go up when the physical
>        path itself gets longer, but the relevant load RTT stays 
>        the same.
> 
>        idle RTT:  5msec, loaded RTT: 20 msec =>  333 RPM
>        idle RTT: 10msec, loaded RTT: 40 msec =>  333 RPM
>        idle RTT: 15msec, loaded RTT: 60 msec =>  333 RPM
> 
>        This nicely shows that we can have servers at different
>        physical distance and get the same RPM number, when the
>        bufferbloat is the same, e.g.: 15 msec worth of bufferbloat
>        for every 5msec propagation latency segment.
> 
> f) I can see how you do NOT want the type of metric i am
>   proposing, because it only focusses on the bufferbloat
>   factor, and you may want to stick to the full experience of
>   the user, where unmistakably the propagation latency can
>   not be ignored, but to repeat from above:
> 
>   If we do not use a metric that fairly treats paths of different
>   propagation latencies as the same wrt. performance, i am
>   quite persuaded we will continue to just see big services
>   win out, because they can more easily afford to get closer
>   to the user with their (rented/time-shared/owned) servers.
> 
>   Aka: Right now RPM is a metric that will specifically 
>   make it easier for one of the big providers of streaming
>   such as that of the authors to position themselves better
>   against smaller services streaming from further away.

The goal of the responsiveness measurement is not to be a diagnostic tool that pinpoints the exact location of bufferbloat. It is also not a tool to make some streaming service providers look better than others and/or make one "win" against the other.

The goal really is to have an accurate representation of responsiveness under working conditions. If the server endpoint is at a very remote location, then indeed the RPM number will be lower, and the responsiveness measurement should reflect that, because it tries to assess the user experience.


With the standardization of the methodology, our hope is that content providers (small or big) will adopt it so that they can properly assess the network's quality for their users and their services.


Cheers,
Christoph


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-10-22 23:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-13 21:41 [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt Christoph Paasch
2021-08-15 13:39 ` Erik Auerswald
2021-08-18 22:01   ` Christoph Paasch
2021-08-19  7:17     ` Erik Auerswald
2021-08-19 15:48       ` Christoph Paasch
2021-08-19 17:50         ` [Bloat] Sidebar re illustrating quality (was: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt) Dave Collier-Brown
2021-08-19 21:17           ` Kenneth Porter
2021-08-20  1:58             ` Dave Collier-Brown
2021-08-21  1:22               ` Kenneth Porter
2021-08-21 11:01                 ` Sebastian Moeller
2021-08-21 10:23           ` Erik Auerswald
2021-08-21 16:31             ` Dave Collier-Brown
2021-09-21 20:50   ` [Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt Toerless Eckert
2021-10-22 23:19     ` Christoph Paasch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox