[Bloat] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
Christoph Paasch
cpaasch at apple.com
Wed Aug 18 18:01:42 EDT 2021
Hello Erik,
On 08/15/21 - 15:39, Erik Auerswald wrote:
> Hi,
>
> I'd like to thank you for working on a nice I-D describing an interesting
> and IMHO useful network measurement metric.
>
> Since feedback was asked for, I'd like to try and provide constructive
> feedback.
Thanks a lot for your detailed feedback! Please see my replies inline.
> In general, I like the idea of "Round-trips per Minute" (RPM) as a
> metric used to characterize (one aspect of) a network. I do think that
> introducing this would improve the status quo. Since this RPM definition
> comprises a specific way of adding load to the network and measuring a
> complex metric, I think it is useful to "standardize" it.
>
> I do not think RPM can replace all other metrics. This is, in a way,
> mentioned in the introduction, where it is suggested to add RPM to
> existing measurement platforms. As such I just want to point this out
> more explicitly, but do not intend to diminish the RPM idea by this.
> In short, I'd say it's complicated.
Yes, I fully agree that RPM is not the only metric. It is one among many.
If there is any wording in our document that sounds like "RPM is the only
metric that matters", please let me know where so we can reword the text.
> Bandwidth matters for bulk data transfer, e.g., downloading a huge update
> required for playing a multiplayer game online.
>
> Minimum latency matters for the feasibility of interactive applications,
> e.g., controlling a toy car in your room vs. a robotic arm on the ISS
> from Earth vs. orbital insertion around Mars from Earth. For a more
> mundane use case consider a voice conference. (A good decade ago I
> experienced a voice conferencing system running over IP that introduced
> over one second of (minimum) latency and therefore was awkward to use.)
Regarding minimum latency:
To some extent it is a subset of "RPM".
But admittedly, measuring minimum latency on its own is good for debugging
purposes and for knowing what one can expect from a network that is not under
persistent working conditions.
> Expressing 'bufferbloat as a measure of "Round-trips per Minute" (RPM)'
> exhibits (at least) two problems:
>
> 1. A high RPM value is associated with little bufferbloat problems.
>
> 2. A low RPM value may be caused by high minimum delay instead of
> bufferbloat.
>
> I think that RPM (i.e., under working conditions) measures a network's
> usefulness for interactive applications, but not necessarily bufferbloat.
You are right, and we are definitely misrepresenting this in the text.
I filed
https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/8.
If you want, feel free to submit a pull request; otherwise, we will get to
the issue in the coming weeks.
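To illustrate your second point with some rough numbers (purely illustrative,
not from the draft, and the link characteristics below are made up):

    # RPM as we define it is simply 60 seconds divided by the round-trip
    # time measured under working conditions.
    def rpm(rtt_under_load_s: float) -> float:
        return 60.0 / rtt_under_load_s

    # Hypothetical links: a geostationary satellite path with ~600 ms base
    # RTT and no queuing, vs. a cable path with 20 ms base RTT plus 400 ms
    # of standing queue under load.
    print(rpm(0.600))          # ~100 RPM: low, yet no bufferbloat at all
    print(rpm(0.020 + 0.400))  # ~143 RPM: low precisely because of bufferbloat

Both cases end up with a similarly low RPM, so RPM alone cannot tell a long
path apart from a bloated one, which is exactly the distinction we need to
make clearer in the text.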
> I do think that RPM is in itself more generally useful than minimum
> latency or bandwidth.
>
> A combination of low minimum latency with low RPM value strongly hints
> at bufferbloat. Other combinations are less easily characterized.
>
> Bufferbloat can still lie in hiding, e.g., when a link with bufferbloat
> is not yet the bottleneck, or if the communications end-points are not
> yet able to saturate the network in between. Thus high bandwidth can
> result in high RPM values despite (hidden) bufferbloat.
>
> The "Measuring is Hard" section mentions additional complications.
>
> All in all, I do think that "measuring bufferbloat" and "measuring RPM"
> should not be used synonymously. The I-D title clearly shows this:
> RPM is measuring "Responsiveness under Working Conditions" which may be
> affected by bufferbloat, among other potential factors, but is not in
> itself bufferbloat.
>
> Under the assumption that only a single value (performance score) is
> considered, I do think that RPM is more generally useful than bandwidth
> or idle latency.
>
> On a meta-level, I think that the word "bufferbloat" is not used according
> to a single self-consistent definition in the I-D.
I fully agree with all your points above on how we misrepresented the relationship
between RPM and bufferbloat.
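One way to make that relation explicit (again, just a sketch, not something
the draft specifies today) would be to report the idle round-trip time next
to the working-conditions measurement, so that the queuing-induced part of
the delay stands out:

    # Sketch only: split the loaded RTT into the path's base delay and the
    # queuing-induced extra delay. rtt_idle_s is measured on an unloaded
    # network, rtt_loaded_s once saturation has been reached.
    def added_queuing_delay_ms(rtt_idle_s: float, rtt_loaded_s: float) -> float:
        return max(0.0, rtt_loaded_s - rtt_idle_s) * 1000.0

    def rpm(rtt_s: float) -> float:
        return 60.0 / rtt_s

    rtt_idle, rtt_loaded = 0.020, 0.420   # the hypothetical cable link again
    print(rpm(rtt_loaded))                                # ~143 RPM under working conditions
    print(added_queuing_delay_ms(rtt_idle, rtt_loaded))   # ~400 ms added by queuing

Something along those lines would keep RPM as the headline number while making
the bufferbloat component itself visible.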
> Additionally, I think that the I-D should reference DNS, HTTP/2, and
> TLS 1.3, since these protocols are required for implementing the RPM
> measurement. The same for JSON, I think. Possibly URL.
Yes, we have not given the references & citations enough care.
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/2)
> Using "rpm.example" instead of "example.apple.com" would result in shorter
> lines for the example JSON.
>
> "host123.cdn.example" instead of "hostname123.cdnprovider.com" might be
> a more appropriate example DNS name.
Oops, we forgot to change these to more generic hostnames...
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/9)
> Adding an informative reference to RFC 2606 / BCP 32 might raise awareness
> of the existence of a BCP on example DNS names.
>
> Please find both a unified diff against the text rendering of the I-D,
> and a word diff produced from the unified diff, attached to this email
> in order to suggest editorial changes that are intended to improve the
> reading experience. They are intended for reading and (possibly partial)
> manual application, since the text rendering of an I-D is usually not
> the preferred form of editing it.
Thanks a lot for these
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/10)
Regards,
Christoph
>
> Thanks,
> Erik
> --
> Always use the right tool for the job.
> -- Rob Pike
>
>
> On Fri, Aug 13, 2021 at 02:41:05PM -0700, Christoph Paasch via Bloat wrote:
> > I already posted this to the RPM-list, but the audience here on bloat should
> > be interested as well.
> >
> >
> > This is the specification of Apple's responsiveness/RPM test. We believe that it
> > would be good for the bufferbloat-effort to have a specification of how to
> > quantify the extent of bufferbloat from a user's perspective. Our
> > Internet-draft is a first step in that direction and we hope that it will
> > kick off some collaboration.
> >
> >
> > Feedback is very welcome!
> >
> >
> > Cheers,
> > Christoph
> >
> >
> > ----- Forwarded message from internet-drafts at ietf.org -----
> >
> > From: internet-drafts at ietf.org
> > To: Christoph Paasch <cpaasch at apple.com>, Omer Shapira <oesh at apple.com>, Randall Meyer <rrm at apple.com>, Stuart Cheshire
> > <cheshire at apple.com>
> > Date: Fri, 13 Aug 2021 09:43:40 -0700
> > Subject: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
> >
> >
> > A new version of I-D, draft-cpaasch-ippm-responsiveness-00.txt
> > has been successfully submitted by Christoph Paasch and posted to the
> > IETF repository.
> >
> > Name: draft-cpaasch-ippm-responsiveness
> > Revision: 00
> > Title: Responsiveness under Working Conditions
> > Document date: 2021-08-13
> > Group: Individual Submission
> > Pages: 12
> > URL: https://www.ietf.org/archive/id/draft-cpaasch-ippm-responsiveness-00.txt
> > Status: https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
> > Htmlized: https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness
> >
> >
> > Abstract:
> > Bufferbloat has been a long-standing problem on the Internet with
> > more than a decade of work on standardizing technical solutions,
> > implementations and testing. However, to this date, bufferbloat is
> > still a very common problem for the end-users. Everyone "knows" that
> > it is "normal" for a video conference to have problems when somebody
> > else on the same home-network is watching a 4K movie.
> >
> > The reason for this problem is not the lack of technical solutions,
> > but rather a lack of awareness of the problem-space, and a lack of
> > tooling to accurately measure the problem. We believe that exposing
> > the problem of bufferbloat to the end-user by measuring the end-
> > users' experience at a high level will help to create the necessary
> > awareness.
> >
> > This document is a first attempt at specifying a measurement
> > methodology to evaluate bufferbloat the way common users are
> > experiencing it today, using today's most frequently used protocols
> > and mechanisms to accurately measure the user-experience. We also
> > provide a way to express the bufferbloat as a measure of "Round-trips
> > per minute" (RPM) to have a more intuitive way for the users to
> > understand the notion of bufferbloat.
> >
> >
> >
> >
> > The IETF Secretariat
> >
> >
> >
> > ----- End forwarded message -----
> > _______________________________________________
> > Bloat mailing list
> > Bloat at lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/bloat
> --- draft-cpaasch-ippm-responsiveness-00.txt 2021-08-15 12:01:01.213813125 +0200
> +++ draft-cpaasch-ippm-responsiveness-00-ea.txt 2021-08-15 15:08:08.013416074 +0200
> @@ -17,7 +17,7 @@
>
> Bufferbloat has been a long-standing problem on the Internet with
> more than a decade of work on standardizing technical solutions,
> - implementations and testing. However, to this date, bufferbloat is
> + implementations, and testing. However, to this date, bufferbloat is
> still a very common problem for the end-users. Everyone "knows" that
> it is "normal" for a video conference to have problems when somebody
> else on the same home-network is watching a 4K movie.
> @@ -33,8 +33,8 @@
> methodology to evaluate bufferbloat the way common users are
> experiencing it today, using today's most frequently used protocols
> and mechanisms to accurately measure the user-experience. We also
> - provide a way to express the bufferbloat as a measure of "Round-trips
> - per minute" (RPM) to have a more intuitive way for the users to
> + provide a way to express bufferbloat as a measure of "Round-trips
> + per Minute" (RPM) to have a more intuitive way for the users to
> understand the notion of bufferbloat.
>
> Status of This Memo
> @@ -81,14 +81,14 @@
> Table of Contents
>
> 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
> - 2. Measuring is hard . . . . . . . . . . . . . . . . . . . . . . 3
> + 2. Measuring is Hard . . . . . . . . . . . . . . . . . . . . . . 3
> 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
> 4. Measuring Responsiveness . . . . . . . . . . . . . . . . . . 5
> 4.1. Working Conditions . . . . . . . . . . . . . . . . . . . 5
> 4.1.1. Parallel vs Sequential Uplink and Downlink . . . . . 6
> - 4.1.2. From single-flow to multi-flow . . . . . . . . . . . 7
> - 4.1.3. Reaching saturation . . . . . . . . . . . . . . . . . 7
> - 4.1.4. Final algorithm . . . . . . . . . . . . . . . . . . . 7
> + 4.1.2. From Single-flow to Multi-flow . . . . . . . . . . . 7
> + 4.1.3. Reaching Saturation . . . . . . . . . . . . . . . . . 7
> + 4.1.4. Final Algorithm . . . . . . . . . . . . . . . . . . . 7
> 4.2. Measuring Responsiveness . . . . . . . . . . . . . . . . 8
> 4.2.1. Aggregating Round-trips per Minute . . . . . . . . . 9
> 4.2.2. Statistical Confidence . . . . . . . . . . . . . . . 10
> @@ -103,8 +103,8 @@
>
> For many years, bufferbloat has been known as an unfortunately common
> issue in todays networks [Bufferbloat]. Solutions like FQ-codel
> - [RFC8289] or PIE [RFC8033] have been standardized and are to some
> - extend widely implemented. Nevertheless, users still suffer from
> + [RFC8290] or PIE [RFC8033] have been standardized and are to some
> + extent widely implemented. Nevertheless, users still suffer from
> bufferbloat.
>
>
> @@ -129,7 +129,7 @@
> bufferbloat problem.
>
> We believe that it is necessary to create a standardized way for
> - measuring the extend of bufferbloat in a network and express it to
> + measuring the extent of bufferbloat in a network and express it to
> the user in a user-friendly way. This should help existing
> measurement tools to add a bufferbloat measurement to their set of
> metrics. It will also allow to raise the awareness to the problem
> @@ -144,10 +144,10 @@
> classification for those protocols is very common. It is thus very
> important to use those protocols for the measurements to avoid
> focusing on use-cases that are not actually affecting the end-user.
> - Finally, we propose to use "round-trips per minute" as a metric to
> - express the extend of bufferbloat.
> + Finally, we propose to use "Round-trips per Minute" as a metric to
> + express the extent of bufferbloat.
>
> -2. Measuring is hard
> +2. Measuring is Hard
>
> There are several challenges around measuring bufferbloat accurately
> on the Internet. These challenges are due to different factors.
> @@ -155,7 +155,7 @@
> problem space, and the reproducibility of the measurement.
>
> It is well-known that transparent TCP proxies are widely deployed on
> - port 443 and/or port 80, while less common on other ports. Thus,
> + port 443 and/or port 80, while less commonly on other ports. Thus,
> choice of the port-number to measure bufferbloat has a significant
> influence on the result. Other factors are the protocols being used.
> TCP and UDP traffic may take a largely different path on the Internet
> @@ -186,17 +186,17 @@
> measurement. It seems that it's best to avoid extending the duration
> of the test beyond what's needed.
>
> - The problem space around the bufferbloat is huge. Traditionally, one
> + The problem space around bufferbloat is huge. Traditionally, one
> thinks of bufferbloat happening on the routers and switches of the
> Internet. Thus, simply measuring bufferbloat at the transport layer
> would be sufficient. However, the networking stacks of the clients
> and servers can also experience huge amounts of bufferbloat. Data
> sitting in TCP sockets or waiting in the application to be scheduled
> for sending causes artificial latency, which affects user-experience
> - the same way the "traditional" bufferbloat does.
> + the same way "traditional" bufferbloat does.
>
> Finally, measuring bufferbloat requires us to fill the buffers of the
> - bottleneck and when buffer occupancy is at its peak, the latency
> + bottleneck, and when buffer occupancy is at its peak, the latency
> measurement needs to be done. Achieving this in a reliable and
> reproducible way is not easy. First, one needs to ensure that
> buffers are actually full for a sustained period of time to allow for
> @@ -250,15 +250,15 @@
> bufferbloat.
>
> 4. Finally, in order for this measurement to be user-friendly to a
> - wide audience it is important that such a measurement finishes
> - within a short time-frame and short being anything below 20
> + wide audience, it is important that such a measurement finishes
> + within a short time-frame with short being anything below 20
> seconds.
>
> 4. Measuring Responsiveness
>
> The ability to reliably measure the responsiveness under typical
> working conditions is predicated by the ability to reliably put the
> - network in a state representative of the said conditions. Once the
> + network in a state representative of said conditions. Once the
> network has reached the required state, its responsiveness can be
> measured. The following explains how the former and the latter are
> achieved.
> @@ -270,7 +270,7 @@
> experiencing ingress and egress flows that are similar to those when
> used by humans in the typical day-to-day pattern.
>
> - While any network can be put momentarily into working condition by
> + While any network can be put momentarily into working conditions by
> the means of a single HTTP transaction, taking measurements requires
> maintaining such conditions over sufficient time. Thus, measuring
> the network responsiveness in a consistent way depends on our ability
> @@ -286,7 +286,7 @@
> way to achieve this is by creating multiple large bulk data-transfers
> in either downstream or upstream direction. Similar to conventional
> speed-test applications that also create a varying number of streams
> - to measure throughput. Working-conditions does the same. It also
> + to measure throughput. Working conditions does the same. It also
> requires a way to detect when the network is in a persistent working
> condition, called "saturation". This can be achieved by monitoring
> the instantaneous goodput over time. When the goodput stops
> @@ -298,7 +298,7 @@
> o Should not waste traffic, since the user may be paying for it
>
> o Should finish within a short time-frame to avoid impacting other
> - users on the same network and/or experience varying conditions
> + users on the same network and/or experiencing varying conditions
>
> 4.1.1. Parallel vs Sequential Uplink and Downlink
>
> @@ -308,8 +308,8 @@
> upstream) or the routing in the ISPs. Users sending data to an
> Internet service will fill the bottleneck on the upstream path to the
> server and thus expose a potential for bufferbloat to happen at this
> - bottleneck. On the downlink direction any download from an Internet
> - service will encounter a bottleneck and thus exposes another
> + bottleneck. In the downlink direction any download from an Internet
> + service will encounter a bottleneck and thus expose another
> potential for bufferbloat. Thus, when measuring responsiveness under
> working conditions it is important to consider both, the upstream and
> the downstream bufferbloat. This opens the door to measure both
> @@ -322,13 +322,16 @@
> seconds of test per direction, while parallel measurement will allow
> for 20 seconds of testing in both directions.
>
> - However, a number caveats come with measuring in parallel: - Half-
> - duplex links may not expose uplink and downlink bufferbloat: A half-
> - duplex link may not allow during parallel measurement to saturate
> - both the uplink and the downlink direction. Thus, bufferbloat in
> - either of the directions may not be exposed during parallel
> - measurement. - Debuggability of the results becomes more obscure:
> - During parallel measurement it is impossible to differentiate on
> + However, a number caveats come with measuring in parallel:
> +
> + - Half-duplex links may not expose uplink and downlink bufferbloat:
> + A half-duplex link may not allow to saturate both the uplink
> + and the downlink direction during parallel measurement. Thus,
> + bufferbloat in either of the directions may not be exposed during
> + parallel measurement.
> +
> + - Debuggability of the results becomes more obscure:
> + During parallel measurement it is impossible to differentiate on
>
>
>
> @@ -338,26 +341,26 @@
> Internet-Draft Responsiveness under Working Conditions August 2021
>
>
> - whether the bufferbloat happens in the uplink or the downlink
> - direction.
> + whether the bufferbloat happens in the uplink or the downlink
> + direction.
>
> -4.1.2. From single-flow to multi-flow
> +4.1.2. From Single-flow to Multi-flow
>
> - As described in RFC 6349, a single TCP connection may not be
> + As described in [RFC6349], a single TCP connection may not be
> sufficient to saturate a path between a client and a server. On a
> high-BDP network, traditional TCP window-size constraints of 4MB are
> often not sufficient to fill the pipe. Additionally, traditional
> - loss-based TCP congestion control algorithms aggressively reacts to
> + loss-based TCP congestion control algorithms aggressively react to
> packet-loss by reducing the congestion window. This reaction will
> - reduce the queuing in the network, and thus "artificially" make the
> - bufferbloat appear lesser.
> + reduce the queuing in the network, and thus "artificially" make
> + bufferbloat appear less of a problem.
>
> - The goal of the measurement is to keep the network as busy as
> - possible in a sustained and persistent way. Thus, using multiple TCP
> + The goal is to keep the network as busy as possible in a sustained
> + and persistent way during the measurement. Thus, using multiple TCP
> connections is needed for a sustained bufferbloat by gradually adding
> - TCP flows until saturation is needed.
> + TCP flows until saturation is reached.
>
> -4.1.3. Reaching saturation
> +4.1.3. Reaching Saturation
>
> It is best to detect when saturation has been reached so that the
> measurement of responsiveness can start with the confidence that the
> @@ -367,8 +370,8 @@
> buffers are completely filled. Thus, this depends highly on the
> congestion control that is being deployed on the sender-side.
> Congestion control algorithms like BBR may reach high throughput
> - without causing bufferbloat. (because the bandwidth-detection portion
> - of BBR is effectively seeking the bottleneck capacity)
> + without causing bufferbloat (because the bandwidth-detection portion
> + of BBR is effectively seeking the bottleneck capacity).
>
> It is advised to rather use loss-based congestion controls like Cubic
> to "reliably" ensure that the buffers are filled.
> @@ -379,7 +382,7 @@
> packet-loss or ECN-marks signaling a congestion or even a full buffer
> of the bottleneck link.
>
> -4.1.4. Final algorithm
> +4.1.4. Final Algorithm
>
> The following is a proposal for an algorithm to reach saturation of a
> network by using HTTP/2 upload (POST) or download (GET) requests of
> @@ -404,7 +407,7 @@
> throughput will remain stable. In the latter case, this means that
> saturation has been reached and - more importantly - is stable.
>
> - In detail, the steps of the algorithm are the following
> + In detail, the steps of the algorithm are the following:
>
> o Create 4 load-bearing connections
>
> @@ -453,7 +456,7 @@
> the different stages of a separate network transaction as well as
> measuring on the load-bearing connections themselves.
>
> - Two aspects are being measured with this approach :
> + Two aspects are being measured with this approach:
>
> 1. How the network handles new connections and their different
> stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
> @@ -463,19 +466,19 @@
>
> 2. How the network and the client/server networking stack handles
> the latency on the load-bearing connections themselves. E.g.,
> - Smart queuing techniques on the bottleneck will allow to keep the
> + smart queuing techniques on the bottleneck will allow to keep the
> latency within a reasonable limit in the network and buffer-
> - reducing techniques like TCP_NOTSENT_LOWAT makes sure the client
> + reducing techniques like TCP_NOTSENT_LOWAT make sure the client
> and server TCP-stack is not a source of significant latency.
>
> To measure the former, we send a DNS-request, establish a TCP-
> connection on port 443, establish a TLS-context using TLS1.3 and send
> - an HTTP2 GET request for an object of a single byte large. This
> + an HTTP/2 GET request for an object the size of a single byte. This
> measurement will be repeated multiple times for accuracy. Each of
> these stages allows to collect a single latency measurement that can
> then be factored into the responsiveness computation.
>
> - To measure the latter, on the load-bearing connections (that uses
> + To measure the latter, on the load-bearing connections (that use
> HTTP/2) a GET request is multiplexed. This GET request is for a
> 1-byte object. This allows to measure the end-to-end latency on the
> connections that are using the network at full speed.
> @@ -492,10 +495,10 @@
> an equal weight to each of these measurements.
>
> Finally, the resulting latency needs to be exposed to the users.
> - Users have been trained to accept metrics that have a notion of "The
> + Users have been trained to accept metrics that have a notion of "the
> higher the better". Latency measuring in units of seconds however is
> "the lower the better". Thus, converting the latency measurement to
> - a frequency allows using the familiar notion of "The higher the
> + a frequency allows using the familiar notion of "the higher the
> better". The term frequency has a very technical connotation. What
> we are effectively measuring is the number of round-trips from the
>
> @@ -513,7 +516,7 @@
> which is a wink to the "revolutions per minute" that we are used to
> in cars.
>
> - Thus, our unit of measure is "Round-trip per Minute" (RPM) that
> + Thus, our unit of measure is "Round-trips per Minute" (RPM) that
> expresses responsiveness under working conditions.
>
> 4.2.2. Statistical Confidence
> @@ -527,13 +530,13 @@
> 5. Protocol Specification
>
> By using standard protocols that are most commonly used by end-users,
> - no new protocol needs to be specified. However, both client and
> + no new protocol needs to be specified. However, both clients and
> servers need capabilities to execute this kind of measurement as well
> - as a standard to flow to provision the client with the necessary
> + as a standard to follow to provision the client with the necessary
> information.
>
> First, the capabilities of both the client and the server: It is
> - expected that both hosts support HTTP/2 over TLS 1.3. That the
> + expected that both hosts support HTTP/2 over TLS 1.3, and that the
> client is able to send a GET-request and a POST. The server needs
> the ability to serve both of these HTTP commands. Further, the
> server endpoint is accessible through a hostname that can be resolved
> @@ -546,13 +549,13 @@
> 1. A config URL/response: This is the configuration file/format used
> by the client. It's a simple JSON file format that points the
> client at the various URLs mentioned below. All of the fields
> - are required except "test_endpoint". If the service-procier can
> + are required except "test_endpoint". If the service-provider can
> pin all of the requests for a test run to a specific node in the
> service (for a particular run), they can specify that node's name
> in the "test_endpoint" field. It's preferred that pinning of
> some sort is available. This is to ensure the measurement is
> against the same paths and not switching hosts during a test run
> - (ie moving from near POP A to near POP B) Sample content of this
> + (i.e., moving from near POP A to near POP B). Sample content of this
> JSON would be:
>
>
> @@ -577,7 +580,7 @@
>
> 3. A "large" URL/response: This needs to serve a status code of 200
> and a body size of at least 8GB. The body can be bigger, and
> - will need to grow as network speeds increases over time. The
> + will need to grow as network speeds increase over time. The
> actual body content is irrelevant. The client will probably
> never completely download the object.
>
> @@ -618,16 +621,19 @@
> Internet-Draft Responsiveness under Working Conditions August 2021
>
>
> + [RFC6349] ...
> +
> [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White,
> "Proportional Integral Controller Enhanced (PIE): A
> Lightweight Control Scheme to Address the Bufferbloat
> Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
> <https://www.rfc-editor.org/info/rfc8033>.
>
> - [RFC8289] Nichols, K., Jacobson, V., McGregor, A., Ed., and J.
> - Iyengar, Ed., "Controlled Delay Active Queue Management",
> - RFC 8289, DOI 10.17487/RFC8289, January 2018,
> - <https://www.rfc-editor.org/info/rfc8289>.
> + [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Ed., and
> + Gettys, J., "The Flow Queue CoDel Packet Scheduler and
> + Active Queue Management Algorithm", RFC 8290,
> + DOI 10.17487/RFC8290, January 2018,
> + <https://www.rfc-editor.org/info/rfc8290>.
>
> Authors' Addresses
>
> [--- draft-cpaasch-ippm-responsiveness-00.txt-]{+++ draft-cpaasch-ippm-responsiveness-00-ea.txt+} 2021-08-15 [-12:01:01.213813125-] {+15:08:08.013416074+} +0200
> @@ -17,7 +17,7 @@
>
> Bufferbloat has been a long-standing problem on the Internet with
> more than a decade of work on standardizing technical solutions,
> [-implementations-]
> {+implementations,+} and testing. However, to this date, bufferbloat is
> still a very common problem for the end-users. Everyone "knows" that
> it is "normal" for a video conference to have problems when somebody
> else on the same home-network is watching a 4K movie.
> @@ -33,8 +33,8 @@
> methodology to evaluate bufferbloat the way common users are
> experiencing it today, using today's most frequently used protocols
> and mechanisms to accurately measure the user-experience. We also
> provide a way to express [-the-] bufferbloat as a measure of "Round-trips
> per [-minute"-] {+Minute"+} (RPM) to have a more intuitive way for the users to
> understand the notion of bufferbloat.
>
> Status of This Memo
> @@ -81,14 +81,14 @@
> Table of Contents
>
> 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
> 2. Measuring is [-hard-] {+Hard+} . . . . . . . . . . . . . . . . . . . . . . 3
> 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
> 4. Measuring Responsiveness . . . . . . . . . . . . . . . . . . 5
> 4.1. Working Conditions . . . . . . . . . . . . . . . . . . . 5
> 4.1.1. Parallel vs Sequential Uplink and Downlink . . . . . 6
> 4.1.2. From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+} . . . . . . . . . . . 7
> 4.1.3. Reaching [-saturation-] {+Saturation+} . . . . . . . . . . . . . . . . . 7
> 4.1.4. Final [-algorithm-] {+Algorithm+} . . . . . . . . . . . . . . . . . . . 7
> 4.2. Measuring Responsiveness . . . . . . . . . . . . . . . . 8
> 4.2.1. Aggregating Round-trips per Minute . . . . . . . . . 9
> 4.2.2. Statistical Confidence . . . . . . . . . . . . . . . 10
> @@ -103,8 +103,8 @@
>
> For many years, bufferbloat has been known as an unfortunately common
> issue in todays networks [Bufferbloat]. Solutions like FQ-codel
> [-[RFC8289]-]
> {+[RFC8290]+} or PIE [RFC8033] have been standardized and are to some
> [-extend-]
> {+extent+} widely implemented. Nevertheless, users still suffer from
> bufferbloat.
>
>
> @@ -129,7 +129,7 @@
> bufferbloat problem.
>
> We believe that it is necessary to create a standardized way for
> measuring the [-extend-] {+extent+} of bufferbloat in a network and express it to
> the user in a user-friendly way. This should help existing
> measurement tools to add a bufferbloat measurement to their set of
> metrics. It will also allow to raise the awareness to the problem
> @@ -144,10 +144,10 @@
> classification for those protocols is very common. It is thus very
> important to use those protocols for the measurements to avoid
> focusing on use-cases that are not actually affecting the end-user.
> Finally, we propose to use [-"round-trips-] {+"Round-trips+} per [-minute"-] {+Minute"+} as a metric to
> express the [-extend-] {+extent+} of bufferbloat.
>
> 2. Measuring is [-hard-] {+Hard+}
>
> There are several challenges around measuring bufferbloat accurately
> on the Internet. These challenges are due to different factors.
> @@ -155,7 +155,7 @@
> problem space, and the reproducibility of the measurement.
>
> It is well-known that transparent TCP proxies are widely deployed on
> port 443 and/or port 80, while less [-common-] {+commonly+} on other ports. Thus,
> choice of the port-number to measure bufferbloat has a significant
> influence on the result. Other factors are the protocols being used.
> TCP and UDP traffic may take a largely different path on the Internet
> @@ -186,17 +186,17 @@
> measurement. It seems that it's best to avoid extending the duration
> of the test beyond what's needed.
>
> The problem space around [-the-] bufferbloat is huge. Traditionally, one
> thinks of bufferbloat happening on the routers and switches of the
> Internet. Thus, simply measuring bufferbloat at the transport layer
> would be sufficient. However, the networking stacks of the clients
> and servers can also experience huge amounts of bufferbloat. Data
> sitting in TCP sockets or waiting in the application to be scheduled
> for sending causes artificial latency, which affects user-experience
> the same way [-the-] "traditional" bufferbloat does.
>
> Finally, measuring bufferbloat requires us to fill the buffers of the
> [-bottleneck-]
> {+bottleneck,+} and when buffer occupancy is at its peak, the latency
> measurement needs to be done. Achieving this in a reliable and
> reproducible way is not easy. First, one needs to ensure that
> buffers are actually full for a sustained period of time to allow for
> @@ -250,15 +250,15 @@
> bufferbloat.
>
> 4. Finally, in order for this measurement to be user-friendly to a
> wide [-audience-] {+audience,+} it is important that such a measurement finishes
> within a short time-frame [-and-] {+with+} short being anything below 20
> seconds.
>
> 4. Measuring Responsiveness
>
> The ability to reliably measure the responsiveness under typical
> working conditions is predicated by the ability to reliably put the
> network in a state representative of [-the-] said conditions. Once the
> network has reached the required state, its responsiveness can be
> measured. The following explains how the former and the latter are
> achieved.
> @@ -270,7 +270,7 @@
> experiencing ingress and egress flows that are similar to those when
> used by humans in the typical day-to-day pattern.
>
> While any network can be put momentarily into working [-condition-] {+conditions+} by
> the means of a single HTTP transaction, taking measurements requires
> maintaining such conditions over sufficient time. Thus, measuring
> the network responsiveness in a consistent way depends on our ability
> @@ -286,7 +286,7 @@
> way to achieve this is by creating multiple large bulk data-transfers
> in either downstream or upstream direction. Similar to conventional
> speed-test applications that also create a varying number of streams
> to measure throughput. [-Working-conditions-] {+Working conditions+} does the same. It also
> requires a way to detect when the network is in a persistent working
> condition, called "saturation". This can be achieved by monitoring
> the instantaneous goodput over time. When the goodput stops
> @@ -298,7 +298,7 @@
> o Should not waste traffic, since the user may be paying for it
>
> o Should finish within a short time-frame to avoid impacting other
> users on the same network and/or [-experience-] {+experiencing+} varying conditions
>
> 4.1.1. Parallel vs Sequential Uplink and Downlink
>
> @@ -308,8 +308,8 @@
> upstream) or the routing in the ISPs. Users sending data to an
> Internet service will fill the bottleneck on the upstream path to the
> server and thus expose a potential for bufferbloat to happen at this
> bottleneck. [-On-] {+In+} the downlink direction any download from an Internet
> service will encounter a bottleneck and thus [-exposes-] {+expose+} another
> potential for bufferbloat. Thus, when measuring responsiveness under
> working conditions it is important to consider both, the upstream and
> the downstream bufferbloat. This opens the door to measure both
> @@ -322,13 +322,16 @@
> seconds of test per direction, while parallel measurement will allow
> for 20 seconds of testing in both directions.
>
> However, a number caveats come with measuring in parallel:
>
> - [-Half-
> duplex-] {+Half-duplex+} links may not expose uplink and downlink bufferbloat:
> A [-half-
> duplex-] {+half-duplex+} link may not allow [-during parallel measurement-] to saturate both the uplink
> and the downlink [-direction.-] {+direction during parallel measurement.+} Thus,
> bufferbloat in either of the directions may not be exposed during
> parallel measurement.
>
> - Debuggability of the results becomes more obscure:
> During parallel measurement it is impossible to differentiate on
>
>
>
> @@ -338,26 +341,26 @@
> Internet-Draft Responsiveness under Working Conditions August 2021
>
>
> whether the bufferbloat happens in the uplink or the downlink
> direction.
>
> 4.1.2. From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+}
>
> As described in [-RFC 6349,-] {+[RFC6349],+} a single TCP connection may not be
> sufficient to saturate a path between a client and a server. On a
> high-BDP network, traditional TCP window-size constraints of 4MB are
> often not sufficient to fill the pipe. Additionally, traditional
> loss-based TCP congestion control algorithms aggressively [-reacts-] {+react+} to
> packet-loss by reducing the congestion window. This reaction will
> reduce the queuing in the network, and thus "artificially" make [-the-]
> bufferbloat appear [-lesser.-] {+less of a problem.+}
>
> The goal [-of the measurement-] is to keep the network as busy as possible in a sustained
> and persistent [-way.-] {+way during the measurement.+} Thus, using multiple TCP
> connections is needed for a sustained bufferbloat by gradually adding
> TCP flows until saturation is [-needed.-] {+reached.+}
>
> 4.1.3. Reaching [-saturation-] {+Saturation+}
>
> It is best to detect when saturation has been reached so that the
> measurement of responsiveness can start with the confidence that the
> @@ -367,8 +370,8 @@
> buffers are completely filled. Thus, this depends highly on the
> congestion control that is being deployed on the sender-side.
> Congestion control algorithms like BBR may reach high throughput
> without causing [-bufferbloat.-] {+bufferbloat+} (because the bandwidth-detection portion
> of BBR is effectively seeking the bottleneck [-capacity)-] {+capacity).+}
>
> It is advised to rather use loss-based congestion controls like Cubic
> to "reliably" ensure that the buffers are filled.
> @@ -379,7 +382,7 @@
> packet-loss or ECN-marks signaling a congestion or even a full buffer
> of the bottleneck link.
>
> 4.1.4. Final [-algorithm-] {+Algorithm+}
>
> The following is a proposal for an algorithm to reach saturation of a
> network by using HTTP/2 upload (POST) or download (GET) requests of
> @@ -404,7 +407,7 @@
> throughput will remain stable. In the latter case, this means that
> saturation has been reached and - more importantly - is stable.
>
> In detail, the steps of the algorithm are the [-following-] {+following:+}
>
> o Create 4 load-bearing connections
>
> @@ -453,7 +456,7 @@
> the different stages of a separate network transaction as well as
> measuring on the load-bearing connections themselves.
>
> Two aspects are being measured with this [-approach :-] {+approach:+}
>
> 1. How the network handles new connections and their different
> stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2
> @@ -463,19 +466,19 @@
>
> 2. How the network and the client/server networking stack handles
> the latency on the load-bearing connections themselves. E.g.,
> [-Smart-]
> {+smart+} queuing techniques on the bottleneck will allow to keep the
> latency within a reasonable limit in the network and buffer-
> reducing techniques like TCP_NOTSENT_LOWAT [-makes-] {+make+} sure the client
> and server TCP-stack is not a source of significant latency.
>
> To measure the former, we send a DNS-request, establish a TCP-
> connection on port 443, establish a TLS-context using TLS1.3 and send
> an [-HTTP2-] {+HTTP/2+} GET request for an object {+the size+} of a single [-byte large.-] {+byte.+} This
> measurement will be repeated multiple times for accuracy. Each of
> these stages allows to collect a single latency measurement that can
> then be factored into the responsiveness computation.
>
> To measure the latter, on the load-bearing connections (that [-uses-] {+use+}
> HTTP/2) a GET request is multiplexed. This GET request is for a
> 1-byte object. This allows to measure the end-to-end latency on the
> connections that are using the network at full speed.
> @@ -492,10 +495,10 @@
> an equal weight to each of these measurements.
>
> Finally, the resulting latency needs to be exposed to the users.
> Users have been trained to accept metrics that have a notion of [-"The-] {+"the+}
> higher the better". Latency measuring in units of seconds however is
> "the lower the better". Thus, converting the latency measurement to
> a frequency allows using the familiar notion of [-"The-] {+"the+} higher the
> better". The term frequency has a very technical connotation. What
> we are effectively measuring is the number of round-trips from the
>
> @@ -513,7 +516,7 @@
> which is a wink to the "revolutions per minute" that we are used to
> in cars.
>
> Thus, our unit of measure is [-"Round-trip-] {+"Round-trips+} per Minute" (RPM) that
> expresses responsiveness under working conditions.
>
> 4.2.2. Statistical Confidence
> @@ -527,13 +530,13 @@
> 5. Protocol Specification
>
> By using standard protocols that are most commonly used by end-users,
> no new protocol needs to be specified. However, both [-client-] {+clients+} and
> servers need capabilities to execute this kind of measurement as well
> as a standard to [-flow-] {+follow+} to provision the client with the necessary
> information.
>
> First, the capabilities of both the client and the server: It is
> expected that both hosts support HTTP/2 over TLS [-1.3. That-] {+1.3, and that+} the
> client is able to send a GET-request and a POST. The server needs
> the ability to serve both of these HTTP commands. Further, the
> server endpoint is accessible through a hostname that can be resolved
> @@ -546,13 +549,13 @@
> 1. A config URL/response: This is the configuration file/format used
> by the client. It's a simple JSON file format that points the
> client at the various URLs mentioned below. All of the fields
> are required except "test_endpoint". If the [-service-procier-] {+service-provider+} can
> pin all of the requests for a test run to a specific node in the
> service (for a particular run), they can specify that node's name
> in the "test_endpoint" field. It's preferred that pinning of
> some sort is available. This is to ensure the measurement is
> against the same paths and not switching hosts during a test run
> [-(ie-]
> {+(i.e.,+} moving from near POP A to near POP [-B)-] {+B).+} Sample content of this
> JSON would be:
>
>
> @@ -577,7 +580,7 @@
>
> 3. A "large" URL/response: This needs to serve a status code of 200
> and a body size of at least 8GB. The body can be bigger, and
> will need to grow as network speeds [-increases-] {+increase+} over time. The
> actual body content is irrelevant. The client will probably
> never completely download the object.
>
> @@ -618,16 +621,19 @@
> Internet-Draft Responsiveness under Working Conditions August 2021
>
>
> {+[RFC6349] ...+}
>
> [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White,
> "Proportional Integral Controller Enhanced (PIE): A
> Lightweight Control Scheme to Address the Bufferbloat
> Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017,
> <https://www.rfc-editor.org/info/rfc8033>.
>
> [-[RFC8289] Nichols, K., Jacobson, V., McGregor, A.,-]
>
> {+[RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D.,+} Ed., and [-J.
> Iyengar, Ed., "Controlled Delay-]
> {+Gettys, J., "The Flow Queue CoDel Packet Scheduler and+}
> Active Queue [-Management",-] {+Management Algorithm",+} RFC [-8289,-] {+8290,+}
> DOI [-10.17487/RFC8289,-] {+10.17487/RFC8290,+} January 2018,
> [-<https://www.rfc-editor.org/info/rfc8289>.-]
> {+<https://www.rfc-editor.org/info/rfc8290>.+}
>
> Authors' Addresses
>