From: Erik Auerswald <auerswal@unix-ag.uni-kl.de>
To: bloat@lists.bufferbloat.net
Cc: draft-cpaasch-ippm-responsiveness@ietf.org, ippm@ietf.org
Subject: Re: [Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
Date: Sun, 15 Aug 2021 15:39:22 +0200 [thread overview]
Message-ID: <20210815133922.GA18118@unix-ag.uni-kl.de> (raw)
In-Reply-To: <YRbm8ZqLdi3xs3bl@MacBook-Pro-2.local>
[-- Attachment #1: Type: text/plain, Size: 7157 bytes --]
Hi,
I'd like to thank you for working on a nice I-D describing an interesting
and IMHO useful network measurement metric.
Since feedback was asked for, I'd like to try and provide constructive
feedback.
In general, I like the idea of "Round-trips per Minute" (RPM) as a
metric used to characterize (one aspect of) a network. I do think that
introducing this would improve the status quo. Since this RPM definition
comprises a specific way of adding load to the network and measuring a
complex metric, I think it is useful to "standardize" it.
I do not think RPM can replace all other metrics. This is, in a way,
mentioned in the introduction, where it is suggested to add RPM to
existing measurement platforms. As such, I just want to point this out
explicitly, but do not intend to diminish the RPM idea by doing so.
In short, I'd say it's complicated.
Bandwidth matters for bulk data transfer, e.g., downloading a huge update
required for playing a multiplayer game online.
Minimum latency matters for the feasibility of interactive applications,
e.g., controlling a toy car in your room vs. a robotic arm on the ISS
from Earth vs. orbital insertion around Mars from Earth. For a more
mundane use case consider a voice conference. (A good decade ago I
experienced a voice conferencing system running over IP that introduced
over one second of (minimum) latency and therefore was awkward to use.)
Expressing 'bufferbloat as a measure of "Round-trips per Minute" (RPM)'
exhibits (at least) two problems:
1. A high RPM value is associated with few bufferbloat problems.
2. A low RPM value may be caused by high minimum delay instead of
bufferbloat.
I think that RPM (i.e., latency measured under working conditions)
usefulness for interactive applications, but not necessarily bufferbloat.
I do think that RPM is in itself more generally useful than minimum
latency or bandwidth.
A combination of low minimum latency with low RPM value strongly hints
at bufferbloat. Other combinations are less easily characterized.
Bufferbloat can still lie in hiding, e.g., when a link with bufferbloat
is not yet the bottleneck, or if the communicating end-points are not
yet able to saturate the network in between. Thus, high bandwidth can
result in high RPM values despite (hidden) bufferbloat.
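To make the two caveats above concrete, here is a minimal sketch (my own
illustration, not taken from the I-D, which aggregates several latency
samples) of the basic latency-to-RPM conversion, assuming simply
RPM = 60 / round-trip time in seconds:

```python
# Hypothetical illustration of "Round-trips per Minute" (RPM):
# the number of round-trips that fit into one minute.
def rpm(latency_seconds):
    """Convert a round-trip latency (in seconds) to RPM."""
    return 60.0 / latency_seconds

# Low minimum latency, but heavy queuing under load: low RPM,
# strongly hinting at bufferbloat.
print(round(rpm(0.010)))  # idle:   10 ms -> 6000 RPM
print(round(rpm(0.500)))  # loaded: 500 ms -> 120 RPM

# High minimum latency (e.g., a satellite path) with almost no
# queuing: a similarly low RPM, yet not caused by bufferbloat.
print(round(rpm(0.600)))  # loaded: 600 ms -> 100 RPM
```

The last two cases are nearly indistinguishable by their RPM value
alone, which is why the idle-latency context matters.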
The "Measuring is Hard" section mentions additional complications.
All in all, I do think that "measuring bufferbloat" and "measuring RPM"
should not be used synonymously. The I-D title clearly shows this:
RPM is measuring "Responsiveness under Working Conditions" which may be
affected by bufferbloat, among other potential factors, but is not in
itself bufferbloat.
Under the assumption that only a single value (performance score) is
considered, I do think that RPM is more generally useful than bandwidth
or idle latency.
On a meta-level, I think that the word "bufferbloat" is not used according
to a single self-consistent definition in the I-D.
Additionally, I think that the I-D should reference DNS, HTTP/2, and
TLS 1.3, since these protocols are required for implementing the RPM
measurement. The same goes for JSON, I think, and possibly URL.
Using "rpm.example" instead of "example.apple.com" would result in shorter
lines for the example JSON.
"host123.cdn.example" instead of "hostname123.cdnprovider.com" might be
a more appropriate example DNS name.
Adding an informative reference to RFC 2606 / BCP 32 might raise awareness
of the existence of a BCP on example DNS names.
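As an illustration of the shorter lines, a hypothetical config JSON
using the suggested example names might look like the following. Only
"test_endpoint" is a field name from the I-D; the other fields are my
invention for illustration:

```python
# Hypothetical sketch of a shortened config JSON using RFC 2606-style
# example names ("rpm.example", "host123.cdn.example"). Field names
# other than "test_endpoint" are illustrative, not from the I-D.
import json

config = {
    "test_endpoint": "host123.cdn.example",
    "large_download_url": "https://rpm.example/large",  # illustrative
    "small_download_url": "https://rpm.example/small",  # illustrative
    "upload_url": "https://rpm.example/upload",         # illustrative
}
print(json.dumps(config, indent=2))
```

Every line stays comfortably within the I-D's text-rendering width.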
Please find both a unified diff against the text rendering of the I-D,
and a word diff produced from the unified diff, attached to this email
in order to suggest editorial changes that are intended to improve the
reading experience. They are intended for reading and (possibly partial)
manual application, since the text rendering of an I-D is usually not
the preferred form of editing it.
Thanks,
Erik
--
Always use the right tool for the job.
-- Rob Pike
On Fri, Aug 13, 2021 at 02:41:05PM -0700, Christoph Paasch via Bloat wrote:
> I already posted this to the RPM-list, but the audience here on bloat should
> be interested as well.
>
>
> This is the specification of Apple's responsiveness/RPM test. We believe that it
> would be good for the bufferbloat-effort to have a specification of how to
> quantify the extend of bufferbloat from a user's perspective. Our
> Internet-draft is a first step in that direction and we hope that it will
> kick off some collaboration.
>
>
> Feedback is very welcome!
>
>
> Cheers,
> Christoph
>
>
> ----- Forwarded message from internet-drafts@ietf.org -----
>
> From: internet-drafts@ietf.org
> To: Christoph Paasch <cpaasch@apple.com>, Omer Shapira <oesh@apple.com>, Randall Meyer <rrm@apple.com>, Stuart Cheshire
> <cheshire@apple.com>
> Date: Fri, 13 Aug 2021 09:43:40 -0700
> Subject: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
>
>
> A new version of I-D, draft-cpaasch-ippm-responsiveness-00.txt
> has been successfully submitted by Christoph Paasch and posted to the
> IETF repository.
>
> Name: draft-cpaasch-ippm-responsiveness
> Revision: 00
> Title: Responsiveness under Working Conditions
> Document date: 2021-08-13
> Group: Individual Submission
> Pages: 12
> URL: https://www.ietf.org/archive/id/draft-cpaasch-ippm-responsiveness-00.txt
> Status: https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
> Htmlized: https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness
>
>
> Abstract:
> Bufferbloat has been a long-standing problem on the Internet with
> more than a decade of work on standardizing technical solutions,
> implementations and testing. However, to this date, bufferbloat is
> still a very common problem for the end-users. Everyone "knows" that
> it is "normal" for a video conference to have problems when somebody
> else on the same home-network is watching a 4K movie.
>
> The reason for this problem is not the lack of technical solutions,
> but rather a lack of awareness of the problem-space, and a lack of
> tooling to accurately measure the problem. We believe that exposing
> the problem of bufferbloat to the end-user by measuring the end-
> users' experience at a high level will help to create the necessary
> awareness.
>
> This document is a first attempt at specifying a measurement
> methodology to evaluate bufferbloat the way common users are
> experiencing it today, using today's most frequently used protocols
> and mechanisms to accurately measure the user-experience. We also
> provide a way to express the bufferbloat as a measure of "Round-trips
> per minute" (RPM) to have a more intuitive way for the users to
> understand the notion of bufferbloat.
>
>
>
>
> The IETF Secretariat
>
>
>
> ----- End forwarded message -----
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
[-- Attachment #2: editorial-suggestions-2021-08-15-unified.diff --]
[-- Type: text/x-diff, Size: 19237 bytes --]
--- draft-cpaasch-ippm-responsiveness-00.txt 2021-08-15 12:01:01.213813125 +0200
+++ draft-cpaasch-ippm-responsiveness-00-ea.txt 2021-08-15 15:08:08.013416074 +0200
@@ -17,7 +17,7 @@
Bufferbloat has been a long-standing problem on the Internet with
more than a decade of work on standardizing technical solutions,
- implementations and testing. However, to this date, bufferbloat is
+ implementations, and testing. However, to this date, bufferbloat is
still a very common problem for the end-users. Everyone "knows" that
it is "normal" for a video conference to have problems when somebody
else on the same home-network is watching a 4K movie.
@@ -33,8 +33,8 @@
methodology to evaluate bufferbloat the way common users are
experiencing it today, using today's most frequently used protocols
and mechanisms to accurately measure the user-experience. We also
- provide a way to express the bufferbloat as a measure of "Round-trips
- per minute" (RPM) to have a more intuitive way for the users to
+ provide a way to express bufferbloat as a measure of "Round-trips
+ per Minute" (RPM) to have a more intuitive way for the users to
understand the notion of bufferbloat.
Status of This Memo
@@ -81,14 +81,14 @@
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
- 2. Measuring is hard . . . . . . . . . . . . . . . . . . . . . . 3
+ 2. Measuring is Hard . . . . . . . . . . . . . . . . . . . . . . 3
3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Measuring Responsiveness . . . . . . . . . . . . . . . . . . 5
4.1. Working Conditions . . . . . . . . . . . . . . . . . . . 5
4.1.1. Parallel vs Sequential Uplink and Downlink . . . . . 6
- 4.1.2. From single-flow to multi-flow . . . . . . . . . . . 7
- 4.1.3. Reaching saturation . . . . . . . . . . . . . . . . . 7
- 4.1.4. Final algorithm . . . . . . . . . . . . . . . . . . . 7
+ 4.1.2. From Single-flow to Multi-flow . . . . . . . . . . . 7
+ 4.1.3. Reaching Saturation . . . . . . . . . . . . . . . . . 7
+ 4.1.4. Final Algorithm . . . . . . . . . . . . . . . . . . . 7
4.2. Measuring Responsiveness . . . . . . . . . . . . . . . . 8
4.2.1. Aggregating Round-trips per Minute . . . . . . . . . 9
4.2.2. Statistical Confidence . . . . . . . . . . . . . . . 10
@@ -103,8 +103,8 @@
For many years, bufferbloat has been known as an unfortunately common
issue in todays networks [Bufferbloat]. Solutions like FQ-codel
- [RFC8289] or PIE [RFC8033] have been standardized and are to some
- extend widely implemented. Nevertheless, users still suffer from
+ [RFC8290] or PIE [RFC8033] have been standardized and are to some
+ extent widely implemented. Nevertheless, users still suffer from
bufferbloat.
@@ -129,7 +129,7 @@
bufferbloat problem.
We believe that it is necessary to create a standardized way for
- measuring the extend of bufferbloat in a network and express it to
+ measuring the extent of bufferbloat in a network and express it to
the user in a user-friendly way. This should help existing
measurement tools to add a bufferbloat measurement to their set of
metrics. It will also allow to raise the awareness to the problem
@@ -144,10 +144,10 @@
classification for those protocols is very common. It is thus very
important to use those protocols for the measurements to avoid
focusing on use-cases that are not actually affecting the end-user.
- Finally, we propose to use "round-trips per minute" as a metric to
- express the extend of bufferbloat.
+ Finally, we propose to use "Round-trips per Minute" as a metric to
+ express the extent of bufferbloat.
-2. Measuring is hard
+2. Measuring is Hard
There are several challenges around measuring bufferbloat accurately
on the Internet. These challenges are due to different factors.
@@ -155,7 +155,7 @@
problem space, and the reproducibility of the measurement.
It is well-known that transparent TCP proxies are widely deployed on
- port 443 and/or port 80, while less common on other ports. Thus,
+ port 443 and/or port 80, while less commonly on other ports. Thus,
choice of the port-number to measure bufferbloat has a significant
influence on the result. Other factors are the protocols being used.
TCP and UDP traffic may take a largely different path on the Internet
@@ -186,17 +186,17 @@
measurement. It seems that it's best to avoid extending the duration
of the test beyond what's needed.
- The problem space around the bufferbloat is huge. Traditionally, one
+ The problem space around bufferbloat is huge. Traditionally, one
thinks of bufferbloat happening on the routers and switches of the
Internet. Thus, simply measuring bufferbloat at the transport layer
would be sufficient. However, the networking stacks of the clients
and servers can also experience huge amounts of bufferbloat. Data
sitting in TCP sockets or waiting in the application to be scheduled
for sending causes artificial latency, which affects user-experience
- the same way the "traditional" bufferbloat does.
+ the same way "traditional" bufferbloat does.
Finally, measuring bufferbloat requires us to fill the buffers of the
- bottleneck and when buffer occupancy is at its peak, the latency
+ bottleneck, and when buffer occupancy is at its peak, the latency
measurement needs to be done. Achieving this in a reliable and
reproducible way is not easy. First, one needs to ensure that
buffers are actually full for a sustained period of time to allow for
@@ -250,15 +250,15 @@
bufferbloat.
4. Finally, in order for this measurement to be user-friendly to a
- wide audience it is important that such a measurement finishes
- within a short time-frame and short being anything below 20
+ wide audience, it is important that such a measurement finishes
+ within a short time-frame with short being anything below 20
seconds.
4. Measuring Responsiveness
The ability to reliably measure the responsiveness under typical
working conditions is predicated by the ability to reliably put the
- network in a state representative of the said conditions. Once the
+ network in a state representative of said conditions. Once the
network has reached the required state, its responsiveness can be
measured. The following explains how the former and the latter are
achieved.
@@ -270,7 +270,7 @@
experiencing ingress and egress flows that are similar to those when
used by humans in the typical day-to-day pattern.
- While any network can be put momentarily into working condition by
+ While any network can be put momentarily into working conditions by
the means of a single HTTP transaction, taking measurements requires
maintaining such conditions over sufficient time. Thus, measuring
the network responsiveness in a consistent way depends on our ability
@@ -286,7 +286,7 @@
way to achieve this is by creating multiple large bulk data-transfers
in either downstream or upstream direction. Similar to conventional
speed-test applications that also create a varying number of streams
- to measure throughput. Working-conditions does the same. It also
+ to measure throughput. Working conditions does the same. It also
requires a way to detect when the network is in a persistent working
condition, called "saturation". This can be achieved by monitoring
the instantaneous goodput over time. When the goodput stops
@@ -298,7 +298,7 @@
o Should not waste traffic, since the user may be paying for it
o Should finish within a short time-frame to avoid impacting other
- users on the same network and/or experience varying conditions
+ users on the same network and/or experiencing varying conditions
4.1.1. Parallel vs Sequential Uplink and Downlink
@@ -308,8 +308,8 @@
upstream) or the routing in the ISPs. Users sending data to an
Internet service will fill the bottleneck on the upstream path to the
server and thus expose a potential for bufferbloat to happen at this
- bottleneck. On the downlink direction any download from an Internet
- service will encounter a bottleneck and thus exposes another
+ bottleneck. In the downlink direction any download from an Internet
+ service will encounter a bottleneck and thus expose another
potential for bufferbloat. Thus, when measuring responsiveness under
working conditions it is important to consider both, the upstream and
the downstream bufferbloat. This opens the door to measure both
@@ -322,13 +322,16 @@
seconds of test per direction, while parallel measurement will allow
for 20 seconds of testing in both directions.
- However, a number caveats come with measuring in parallel: - Half-
- duplex links may not expose uplink and downlink bufferbloat: A half-
- duplex link may not allow during parallel measurement to saturate
- both the uplink and the downlink direction. Thus, bufferbloat in
- either of the directions may not be exposed during parallel
- measurement. - Debuggability of the results becomes more obscure:
- During parallel measurement it is impossible to differentiate on
+ However, a number caveats come with measuring in parallel:
+
+ - Half-duplex links may not expose uplink and downlink bufferbloat:
+ A half-duplex link may not allow to saturate both the uplink
+ and the downlink direction during parallel measurement. Thus,
+ bufferbloat in either of the directions may not be exposed during
+ parallel measurement.
+
+ - Debuggability of the results becomes more obscure:
+ During parallel measurement it is impossible to differentiate on
@@ -338,26 +341,26 @@
Internet-Draft Responsiveness under Working Conditions August 2021
- whether the bufferbloat happens in the uplink or the downlink
- direction.
+ whether the bufferbloat happens in the uplink or the downlink
+ direction.
-4.1.2. From single-flow to multi-flow
+4.1.2. From Single-flow to Multi-flow
- As described in RFC 6349, a single TCP connection may not be
+ As described in [RFC6349], a single TCP connection may not be
sufficient to saturate a path between a client and a server. On a
high-BDP network, traditional TCP window-size constraints of 4MB are
often not sufficient to fill the pipe. Additionally, traditional
- loss-based TCP congestion control algorithms aggressively reacts to
+ loss-based TCP congestion control algorithms aggressively react to
packet-loss by reducing the congestion window. This reaction will
- reduce the queuing in the network, and thus "artificially" make the
- bufferbloat appear lesser.
+ reduce the queuing in the network, and thus "artificially" make
+ bufferbloat appear less of a problem.
- The goal of the measurement is to keep the network as busy as
- possible in a sustained and persistent way. Thus, using multiple TCP
+ The goal is to keep the network as busy as possible in a sustained
+ and persistent way during the measurement. Thus, using multiple TCP
connections is needed for a sustained bufferbloat by gradually adding
- TCP flows until saturation is needed.
+ TCP flows until saturation is reached.
-4.1.3. Reaching saturation
+4.1.3. Reaching Saturation
It is best to detect when saturation has been reached so that the
measurement of responsiveness can start with the confidence that the
@@ -367,8 +370,8 @@
buffers are completely filled. Thus, this depends highly on the
congestion control that is being deployed on the sender-side.
Congestion control algorithms like BBR may reach high throughput
- without causing bufferbloat. (because the bandwidth-detection portion
- of BBR is effectively seeking the bottleneck capacity)
+ without causing bufferbloat (because the bandwidth-detection portion
+ of BBR is effectively seeking the bottleneck capacity).
It is advised to rather use loss-based congestion controls like Cubic
to "reliably" ensure that the buffers are filled.
@@ -379,7 +382,7 @@
packet-loss or ECN-marks signaling a congestion or even a full buffer
of the bottleneck link.
-4.1.4. Final algorithm
+4.1.4. Final Algorithm
The following is a proposal for an algorithm to reach saturation of a
network by using HTTP/2 upload (POST) or download (GET) requests of
@@ -404,7 +407,7 @@
throughput will remain stable. In the latter case, this means that
saturation has been reached and - more importantly - is stable.
- In detail, the steps of the algorithm are the following
+ In detail, the steps of the algorithm are the following:
o Create 4 load-bearing connections
@@ -453,7 +456,7 @@
the different stages of a separate network transaction as well as
measuring on the load-bearing connections themselves.
- Two aspects are being measured with this approach :
+ Two aspects are being measured with this approach:
1. How the network handles new connections and their different
stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
@@ -463,19 +466,19 @@
2. How the network and the client/server networking stack handles
the latency on the load-bearing connections themselves. E.g.,
- Smart queuing techniques on the bottleneck will allow to keep the
+ smart queuing techniques on the bottleneck will allow to keep the
latency within a reasonable limit in the network and buffer-
- reducing techniques like TCP_NOTSENT_LOWAT makes sure the client
+ reducing techniques like TCP_NOTSENT_LOWAT make sure the client
and server TCP-stack is not a source of significant latency.
To measure the former, we send a DNS-request, establish a TCP-
connection on port 443, establish a TLS-context using TLS1.3 and send
- an HTTP2 GET request for an object of a single byte large. This
+ an HTTP/2 GET request for an object the size of a single byte. This
measurement will be repeated multiple times for accuracy. Each of
these stages allows to collect a single latency measurement that can
then be factored into the responsiveness computation.
- To measure the latter, on the load-bearing connections (that uses
+ To measure the latter, on the load-bearing connections (that use
HTTP/2) a GET request is multiplexed. This GET request is for a
1-byte object. This allows to measure the end-to-end latency on the
connections that are using the network at full speed.
@@ -492,10 +495,10 @@
an equal weight to each of these measurements.
Finally, the resulting latency needs to be exposed to the users.
- Users have been trained to accept metrics that have a notion of "The
+ Users have been trained to accept metrics that have a notion of "the
higher the better". Latency measuring in units of seconds however is
"the lower the better". Thus, converting the latency measurement to
- a frequency allows using the familiar notion of "The higher the
+ a frequency allows using the familiar notion of "the higher the
better". The term frequency has a very technical connotation. What
we are effectively measuring is the number of round-trips from the
@@ -513,7 +516,7 @@
which is a wink to the "revolutions per minute" that we are used to
in cars.
- Thus, our unit of measure is "Round-trip per Minute" (RPM) that
+ Thus, our unit of measure is "Round-trips per Minute" (RPM) that
expresses responsiveness under working conditions.
4.2.2. Statistical Confidence
@@ -527,13 +530,13 @@
5. Protocol Specification
By using standard protocols that are most commonly used by end-users,
- no new protocol needs to be specified. However, both client and
+ no new protocol needs to be specified. However, both clients and
servers need capabilities to execute this kind of measurement as well
- as a standard to flow to provision the client with the necessary
+ as a standard to follow to provision the client with the necessary
information.
First, the capabilities of both the client and the server: It is
- expected that both hosts support HTTP/2 over TLS 1.3. That the
+ expected that both hosts support HTTP/2 over TLS 1.3, and that the
client is able to send a GET-request and a POST. The server needs
the ability to serve both of these HTTP commands. Further, the
server endpoint is accessible through a hostname that can be resolved
@@ -546,13 +549,13 @@
1. A config URL/response: This is the configuration file/format used
by the client. It's a simple JSON file format that points the
client at the various URLs mentioned below. All of the fields
- are required except "test_endpoint". If the service-procier can
+ are required except "test_endpoint". If the service-provider can
pin all of the requests for a test run to a specific node in the
service (for a particular run), they can specify that node's name
in the "test_endpoint" field. It's preferred that pinning of
some sort is available. This is to ensure the measurement is
against the same paths and not switching hosts during a test run
- (ie moving from near POP A to near POP B) Sample content of this
+ (i.e., moving from near POP A to near POP B). Sample content of this
JSON would be:
@@ -577,7 +580,7 @@
3. A "large" URL/response: This needs to serve a status code of 200
and a body size of at least 8GB. The body can be bigger, and
- will need to grow as network speeds increases over time. The
+ will need to grow as network speeds increase over time. The
actual body content is irrelevant. The client will probably
never completely download the object.
@@ -618,16 +621,19 @@
Internet-Draft Responsiveness under Working Conditions August 2021
+ [RFC6349] ...
+
[RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White,
"Proportional Integral Controller Enhanced (PIE): A
Lightweight Control Scheme to Address the Bufferbloat
Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
<https://www.rfc-editor.org/info/rfc8033>.
- [RFC8289] Nichols, K., Jacobson, V., McGregor, A., Ed., and J.
- Iyengar, Ed., "Controlled Delay Active Queue Management",
- RFC 8289, DOI 10.17487/RFC8289, January 2018,
- <https://www.rfc-editor.org/info/rfc8289>.
+ [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Ed., and
+ Gettys, J., "The Flow Queue CoDel Packet Scheduler and
+ Active Queue Management Algorithm", RFC 8290,
+ DOI 10.17487/RFC8290, January 2018,
+ <https://www.rfc-editor.org/info/rfc8290>.
Authors' Addresses
[-- Attachment #3: editorial-suggestions-2021-08-15-word.diff --]
[-- Type: text/x-diff, Size: 16064 bytes --]
[--- draft-cpaasch-ippm-responsiveness-00.txt-]{+++ draft-cpaasch-ippm-responsiveness-00-ea.txt+} 2021-08-15 [-12:01:01.213813125-] {+15:08:08.013416074+} +0200
@@ -17,7 +17,7 @@
Bufferbloat has been a long-standing problem on the Internet with
more than a decade of work on standardizing technical solutions,
[-implementations-]
{+implementations,+} and testing. However, to this date, bufferbloat is
still a very common problem for the end-users. Everyone "knows" that
it is "normal" for a video conference to have problems when somebody
else on the same home-network is watching a 4K movie.
@@ -33,8 +33,8 @@
methodology to evaluate bufferbloat the way common users are
experiencing it today, using today's most frequently used protocols
and mechanisms to accurately measure the user-experience. We also
provide a way to express [-the-] bufferbloat as a measure of "Round-trips
per [-minute"-] {+Minute"+} (RPM) to have a more intuitive way for the users to
understand the notion of bufferbloat.
Status of This Memo
@@ -81,14 +81,14 @@
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Measuring is [-hard-] {+Hard+} . . . . . . . . . . . . . . . . . . . . . . 3
3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
4. Measuring Responsiveness . . . . . . . . . . . . . . . . . . 5
4.1. Working Conditions . . . . . . . . . . . . . . . . . . . 5
4.1.1. Parallel vs Sequential Uplink and Downlink . . . . . 6
4.1.2. From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+} . . . . . . . . . . . 7
4.1.3. Reaching [-saturation-] {+Saturation+} . . . . . . . . . . . . . . . . . 7
4.1.4. Final [-algorithm-] {+Algorithm+} . . . . . . . . . . . . . . . . . . . 7
4.2. Measuring Responsiveness . . . . . . . . . . . . . . . . 8
4.2.1. Aggregating Round-trips per Minute . . . . . . . . . 9
4.2.2. Statistical Confidence . . . . . . . . . . . . . . . 10
@@ -103,8 +103,8 @@
For many years, bufferbloat has been known as an unfortunately common
issue in todays networks [Bufferbloat]. Solutions like FQ-codel
[-[RFC8289]-]
{+[RFC8290]+} or PIE [RFC8033] have been standardized and are to some
[-extend-]
{+extent+} widely implemented. Nevertheless, users still suffer from
bufferbloat.
@@ -129,7 +129,7 @@
bufferbloat problem.
We believe that it is necessary to create a standardized way for
measuring the [-extend-] {+extent+} of bufferbloat in a network and express it to
the user in a user-friendly way. This should help existing
measurement tools to add a bufferbloat measurement to their set of
metrics. It will also allow to raise the awareness to the problem
@@ -144,10 +144,10 @@
classification for those protocols is very common. It is thus very
important to use those protocols for the measurements to avoid
focusing on use-cases that are not actually affecting the end-user.
Finally, we propose to use [-"round-trips-] {+"Round-trips+} per [-minute"-] {+Minute"+} as a metric to
express the [-extend-] {+extent+} of bufferbloat.
2. Measuring is [-hard-] {+Hard+}
There are several challenges around measuring bufferbloat accurately
on the Internet. These challenges are due to different factors.
@@ -155,7 +155,7 @@
problem space, and the reproducibility of the measurement.
It is well-known that transparent TCP proxies are widely deployed on
port 443 and/or port 80, while less [-common-] {+commonly+} on other ports. Thus,
choice of the port-number to measure bufferbloat has a significant
influence on the result. Other factors are the protocols being used.
TCP and UDP traffic may take a largely different path on the Internet
@@ -186,17 +186,17 @@
measurement. It seems that it's best to avoid extending the duration
of the test beyond what's needed.
The problem space around [-the-] bufferbloat is huge. Traditionally, one
thinks of bufferbloat happening on the routers and switches of the
Internet. Thus, simply measuring bufferbloat at the transport layer
would be sufficient. However, the networking stacks of the clients
and servers can also experience huge amounts of bufferbloat. Data
sitting in TCP sockets or waiting in the application to be scheduled
for sending causes artificial latency, which affects user-experience
the same way [-the-] "traditional" bufferbloat does.
Finally, measuring bufferbloat requires us to fill the buffers of the
[-bottleneck-]
{+bottleneck,+} and when buffer occupancy is at its peak, the latency
measurement needs to be done. Achieving this in a reliable and
reproducible way is not easy. First, one needs to ensure that
buffers are actually full for a sustained period of time to allow for
@@ -250,15 +250,15 @@
bufferbloat.
4. Finally, in order for this measurement to be user-friendly to a
wide [-audience-] {+audience,+} it is important that such a measurement finishes
within a short time-frame [-and-] {+with+} short being anything below 20
seconds.
4. Measuring Responsiveness
The ability to reliably measure the responsiveness under typical
working conditions is predicated by the ability to reliably put the
network in a state representative of [-the-] said conditions. Once the
network has reached the required state, its responsiveness can be
measured. The following explains how the former and the latter are
achieved.
@@ -270,7 +270,7 @@
experiencing ingress and egress flows that are similar to those when
used by humans in the typical day-to-day pattern.
While any network can be put momentarily into working [-condition-] {+conditions+} by
the means of a single HTTP transaction, taking measurements requires
maintaining such conditions over sufficient time. Thus, measuring
the network responsiveness in a consistent way depends on our ability
@@ -286,7 +286,7 @@
way to achieve this is by creating multiple large bulk data-transfers
in either downstream or upstream direction. Similar to conventional
speed-test applications that also create a varying number of streams
to measure throughput. [-Working-conditions-] {+Working conditions+} does the same. It also
requires a way to detect when the network is in a persistent working
condition, called "saturation". This can be achieved by monitoring
the instantaneous goodput over time. When the goodput stops
@@ -298,7 +298,7 @@
o Should not waste traffic, since the user may be paying for it
o Should finish within a short time-frame to avoid impacting other
users on the same network and/or [-experience-] {+experiencing+} varying conditions
4.1.1. Parallel vs Sequential Uplink and Downlink
@@ -308,8 +308,8 @@
upstream) or the routing in the ISPs. Users sending data to an
Internet service will fill the bottleneck on the upstream path to the
server and thus expose a potential for bufferbloat to happen at this
bottleneck. [-On-] {+In+} the downlink direction any download from an Internet
service will encounter a bottleneck and thus [-exposes-] {+expose+} another
potential for bufferbloat. Thus, when measuring responsiveness under
working conditions it is important to consider [-both,-] {+both+} the upstream and
the downstream bufferbloat. This opens the door to measure both
@@ -322,13 +322,16 @@
seconds of test per direction, while parallel measurement will allow
for 20 seconds of testing in both directions.
However, a number {+of+} caveats come with measuring in parallel:
- [-Half-
duplex-] {+Half-duplex+} links may not expose uplink and downlink bufferbloat:
A [-half-
duplex-] {+half-duplex+} link may not allow [-during parallel measurement-]
[-to saturate-] {+saturating+} both the uplink
and the downlink [-direction.-] {+direction during parallel measurement.+} Thus,
bufferbloat in either of the directions may not be exposed during
parallel measurement.
- Debuggability of the results becomes more obscure:
During parallel measurement it is impossible to differentiate on
@@ -338,26 +341,26 @@
Internet-Draft Responsiveness under Working Conditions August 2021
whether the bufferbloat happens in the uplink or the downlink
direction.
4.1.2. From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+}
As described in [-RFC 6349,-] {+[RFC6349],+} a single TCP connection may not be
sufficient to saturate a path between a client and a server. On a
high-BDP network, traditional TCP window-size constraints of 4MB are
often not sufficient to fill the pipe. Additionally, traditional
loss-based TCP congestion control algorithms aggressively [-reacts-] {+react+} to
packet-loss by reducing the congestion window. This reaction will
reduce the queuing in the network, and thus "artificially" make [-the-]
bufferbloat appear [-lesser.-] {+less of a problem.+}
The goal [-of the measurement-] is to keep the network as busy as possible in a sustained
and persistent [-way.-] {+way during the measurement.+} Thus, using multiple TCP
connections is needed for a sustained bufferbloat by gradually adding
TCP flows until saturation is [-needed.-] {+reached.+}
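As an aside, the gradual-ramping idea could be sketched roughly as
follows. This is purely illustrative on my part: the callback names,
the one-second interval, the four initial connections, and the 5%
stability tolerance are my own assumptions, not values taken from the
draft.

```python
import time

def reach_saturation(measure_goodput, add_connection,
                     initial_conns=4, interval=1.0, tolerance=0.05):
    """Ramp up load-bearing connections until instantaneous goodput
    stabilizes. `measure_goodput()` returns the goodput (bytes/s) over
    the last interval; `add_connection()` starts one more bulk transfer.
    Hypothetical sketch only."""
    for _ in range(initial_conns):
        add_connection()          # start with several load-bearing flows
    prev = 0.0
    while True:
        time.sleep(interval)
        curr = measure_goodput()
        if prev and curr <= prev * (1 + tolerance):
            return curr           # goodput stopped growing: saturated
        add_connection()          # still growing: add another flow
        prev = curr
```

The design point is that flows are only added while goodput still
grows, so a high-BDP path ends up with as many flows as it takes to
fill the pipe.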
4.1.3. Reaching [-saturation-] {+Saturation+}
It is best to detect when saturation has been reached so that the
measurement of responsiveness can start with the confidence that the
@@ -367,8 +370,8 @@
buffers are completely filled. Thus, this depends highly on the
congestion control that is being deployed on the sender-side.
Congestion control algorithms like BBR may reach high throughput
without causing [-bufferbloat.-] {+bufferbloat+} (because the bandwidth-detection portion
of BBR is effectively seeking the bottleneck [-capacity)-] {+capacity).+}
It is advised to [-rather use-] {+instead use+} loss-based congestion controls like Cubic
to "reliably" ensure that the buffers are filled.
@@ -379,7 +382,7 @@
packet-loss or ECN-marks signaling [-a congestion-] {+congestion+} or even a full buffer
of the bottleneck link.
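To illustrate the "monitoring the instantaneous goodput over time"
approach mentioned in 4.1.3, a stability check could look roughly like
this. The window size and tolerance are my own illustrative choices,
not the draft's:

```python
def is_saturated(goodput_samples, window=4, tolerance=0.05):
    """Return True once the moving average of instantaneous goodput
    (e.g., bytes/s sampled once per second) stops growing by more than
    `tolerance` relative to the previous window. Sketch only."""
    if len(goodput_samples) < 2 * window:
        return False  # not enough samples to compare two windows
    prev = sum(goodput_samples[-2 * window:-window]) / window
    curr = sum(goodput_samples[-window:]) / window
    return curr <= prev * (1 + tolerance)
```

A flat series such as eight equal samples would report saturation,
while a strictly growing series would not.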
4.1.4. Final [-algorithm-] {+Algorithm+}
The following is a proposal for an algorithm to reach saturation of a
network by using HTTP/2 upload (POST) or download (GET) requests of
@@ -404,7 +407,7 @@
throughput will remain stable. In the latter case, this means that
saturation has been reached and - more importantly - is stable.
In detail, the steps of the algorithm are the [-following-] {+following:+}
o Create 4 load-bearing connections
@@ -453,7 +456,7 @@
the different stages of a separate network transaction as well as
measuring on the load-bearing connections themselves.
Two aspects are being measured with this [-approach :-] {+approach:+}
1. How the network handles new connections and their different
stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
@@ -463,19 +466,19 @@
2. How the network and the client/server networking stack handles
the latency on the load-bearing connections themselves. E.g.,
[-Smart-]
{+smart+} queuing techniques on the bottleneck will [-allow to keep-] {+help keep+} the
latency within a reasonable limit in the network and buffer-
reducing techniques like TCP_NOTSENT_LOWAT [-makes-] {+make+} sure the client
and server TCP-stack is not a source of significant latency.
To measure the former, we send a DNS-request, establish a TCP-
connection on port 443, establish a TLS-context using TLS1.3 and send
an [-HTTP2-] {+HTTP/2+} GET request for an object {+the size+} of a single [-byte large.-] {+byte.+} This
measurement will be repeated multiple times for accuracy. Each of
these stages [-allows to collect-] {+allows collecting+} a single latency measurement that can
then be factored into the responsiveness computation.
To measure the latter, on the load-bearing connections (that [-uses-] {+use+}
HTTP/2) a GET request is multiplexed. This GET request is for a
1-byte object. This [-allows to measure-] {+allows measuring+} the end-to-end latency on the
connections that are using the network at full speed.
@@ -492,10 +495,10 @@
an equal weight to each of these measurements.
Finally, the resulting latency needs to be exposed to the users.
Users have been trained to accept metrics that have a notion of [-"The-] {+"the+}
higher the better". Latency measuring in units of seconds however is
"the lower the better". Thus, converting the latency measurement to
a frequency allows using the familiar notion of [-"The-] {+"the+} higher the
better". The term frequency has a very technical connotation. What
we are effectively measuring is the number of round-trips from the
@@ -513,7 +516,7 @@
which is a wink to the "revolutions per minute" that we are used to
in cars.
Thus, our unit of measure is [-"Round-trip-] {+"Round-trips+} per Minute" (RPM) that
expresses responsiveness under working conditions.
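For concreteness, with the equal weighting of measurements mentioned
above, the conversion amounts to something like the following (the
function name is mine, not the draft's):

```python
def responsiveness_rpm(latencies_seconds):
    """Convert round-trip latency measurements (in seconds) into
    Round-trips per Minute, giving each measurement equal weight.
    Illustrative sketch of the unit conversion only."""
    mean_latency = sum(latencies_seconds) / len(latencies_seconds)
    return 60.0 / mean_latency
```

A mean round-trip latency of 200 ms thus corresponds to 300 RPM, and
lower latency yields a higher, "better" number, as intended.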
4.2.2. Statistical Confidence
@@ -527,13 +530,13 @@
5. Protocol Specification
By using standard protocols that are most commonly used by end-users,
no new protocol needs to be specified. However, both [-client-] {+clients+} and
servers need capabilities to execute this kind of measurement as well
as a standard to [-flow-] {+follow+} to provision the client with the necessary
information.
First, the capabilities of both the client and the server: It is
expected that both hosts support HTTP/2 over TLS [-1.3. That-] {+1.3, and that+} the
client is able to send a GET-request and a POST. The server needs
the ability to serve both of these HTTP commands. Further, the
server endpoint is accessible through a hostname that can be resolved
@@ -546,13 +549,13 @@
1. A config URL/response: This is the configuration file/format used
by the client. It's a simple JSON file format that points the
client at the various URLs mentioned below. All of the fields
are required except "test_endpoint". If the [-service-procier-] {+service-provider+} can
pin all of the requests for a test run to a specific node in the
service (for a particular run), they can specify that node's name
in the "test_endpoint" field. It's preferred that pinning of
some sort is available. This is to ensure the measurement is
against the same paths and not switching hosts during a test run
[-(ie-]
{+(i.e.,+} moving from near POP A to near POP [-B)-] {+B).+} Sample content of this
JSON would be:
@@ -577,7 +580,7 @@
3. A "large" URL/response: This needs to serve a status code of 200
and a body size of at least 8GB. The body can be bigger, and
will need to grow as network speeds [-increases-] {+increase+} over time. The
actual body content is irrelevant. The client will probably
never completely download the object.
@@ -618,16 +621,19 @@
{+[RFC6349] ...+}
[RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White,
"Proportional Integral Controller Enhanced (PIE): A
Lightweight Control Scheme to Address the Bufferbloat
Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
<https://www.rfc-editor.org/info/rfc8033>.
[-[RFC8289] Nichols, K., Jacobson, V., McGregor, A.,-]
{+[RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D.,+} Ed., and [-J.
Iyengar, Ed., "Controlled Delay-]
{+Gettys, J., "The Flow Queue CoDel Packet Scheduler and+}
Active Queue [-Management",-] {+Management Algorithm",+} RFC [-8289,-] {+8290,+}
DOI [-10.17487/RFC8289,-] {+10.17487/RFC8290,+} January 2018,
[-<https://www.rfc-editor.org/info/rfc8289>.-]
{+<https://www.rfc-editor.org/info/rfc8290>.+}
Authors' Addresses