[Bloat] Credit and/or collaboration on a responsiveness metric?

Tue Jul 6 17:56:17 EDT 2021

In coming up with metrics, I would really encourage you to think about
using t-digest to gather statistics in some of your online
measurements. I'm not sure users see "average" behavior. If someone
is getting great latency numbers most of the time, with a small
percentage of unacceptable values, I don't think their meeting
experiences that "average" latency as its performance.
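
A minimal sketch of the point, using made-up numbers. Exact percentiles
from Python's standard library stand in here for a t-digest, which would
estimate the same quantiles online in bounded memory; the function name
and sample values are illustrative assumptions only.

```python
import statistics


def summarize(latencies_ms):
    """Return the mean, median, and 99th percentile of latency samples."""
    # quantiles(n=100) returns the 99 cut points between percentiles 1..99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(latencies_ms),
        "p50": cuts[49],
        "p99": cuts[98],
    }


# Hypothetical samples: 95% of probes see 20 ms, 5% see 800 ms.
samples = [20.0] * 95 + [800.0] * 5
summary = summarize(samples)
print(summary)
```

With these samples the mean is 59 ms, a latency that no probe actually
experienced: the median is 20 ms and the 99th percentile is 800 ms.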

	Kathie

On 7/6/21 12:04 PM, Matt Mathis via Bloat wrote:
> (Adding other Apple developers back in)
> Jonathan, you didn't even go in the direction I was expecting.   My
> fragmentary ideas:
> 
> Round counting (the underlying primitive) can be measured or estimated
> in several different ways at different layers:
> TCP/transport layer:
> 
>     From a .pcap, count rounds: data->ACK->data.  Easiest in reverse
>     time, but timeouts are hard
>     From polled smoothed RTT (TCP_INFO or Web100): SUM
>     (poll_interval/SRTT) is an estimate of the number of elapsed rounds
>     - The SRTT algorithm has been quite stable for  decades
>     - This algorithm could be applied to ~4 billion MLab traces
>     collected over the last 11 years, which are not exposed in the
>     current data processing pipeline (my current project)
> 
> transport ABI (untested idea):
> 
>     Use instrumented minimal TCP or QUIC applications (e.g. chargen,
>     echo and discard) to count rounds
>     For WFID, this would also include the socket buffer backlog, and how
>     intelligently the kernel manages buffer space
> 
> library:
> 
>     Use ping messages in http, websockets and other "application" protocols
> 
>     This also includes the library buffers and their management
> 
> 
> Note that (rounds per second) * (throughput) is network power
> 
> Also the number of rounds "consumed" by an application can be measured.
> 
> I will see everybody on Dave's call.
> 
> Thanks,
> --MM--
> The best way to predict the future is to create it.  - Alan Kay
> 
> We must not tolerate intolerance;
>        however our response must be carefully measured: 
>             too strong would be hypocritical and risks spiraling out of
> control;
>             too weak risks being mistaken for tacit approval.
> 
> 
> On Tue, Jul 6, 2021 at 1:48 AM Jonathan Morton <chromatix99 at gmail.com>
> wrote:
> 
>     > On 6 Jul, 2021, at 2:21 am, Matt Mathis <mattmathis at google.com> wrote:
>     >
>     > The rounds-based responsiveness metric is awesome!  There are
>     > several slightly different versions, with slightly different
>     > properties....
>     >
>     > I would like to write a little paper (probably for the IAB
>     > workshop), but don't want to shortchange anybody else's credit, or
>     > worse, scoop somebody else's work in progress.  I don't really know
>     > if I am retracing somebody else's steps, or on a parallel but
>     > different path (more likely).  I would be really sad to publish
>     > something and then find out later that I trashed some PhD student's
>     > thesis....
> 
>     It's possible that I had some small influence in originating it,
>     although Dave did most of the corporate marketing.
> 
>     My idea was simply to express delays and latencies as a frequency,
>     in Hz, so that "bigger numbers are better", rather than always in
>     milliseconds, where "smaller numbers are better".  The advantage of
>     Hz is that you can directly compare it to framerates of video or
>     gameplay.
> 
>     Conversely, an advantage of "rounds per minute" is that you don't
>     need to deal with fractions or rounding for relatively modest and
>     common levels of bloat, where latencies of 1-5 seconds are typical.
> 
>     I'm not overly concerned with taking credit for it, though.  It's a
>     reasonably obvious idea to anyone who takes a genuine interest in
>     this field, and other people did most of the hard work.
> 
>     > Please let me know if you know of anybody else working in this
>     > space, of any publications that might be in progress, or if people
>     > might be interested in another collaborator.
> 
>     There are two distinct types of latency that RPM can be used to
>     measure, and I have written a short Internet Draft describing the
>     distinction:
> 
>     https://www.ietf.org/archive/id/draft-morton-tsvwg-interflow-intraflow-delays-00.html
> 
>     Briefly, "inter-flow delays" (or BFID) are what you measure with an
>     independent latency-measuring flow, and "intra-flow delays" (or
>     WFID) are what you measure by inserting latency probes into an
>     existing flow (whether at the protocol level with HTTP/2, or by
>     extracting it from existing application activity).  The two
>     typically differ when the path bottleneck has a flow-isolating
>     queue, or when the application flow experiences loss and
>     retransmission recovery.
> 
>     I think both measures are important in different contexts.  An
>     individual application may be concerned with its own intra-flow
>     delay, as that determines how quickly it can respond to changes in
>     network conditions or user intent.  Network engineers should be
>     concerned with inter-flow delays, as those determine what effect a
>     bulk application load has on other, more latency-sensitive
>     applications.  The two are also optimally controlled by different
>     mechanisms - FQ versus AQM - which is why the combination of the two
>     is so powerful.
> 
>     Feel free to use material from the above with appropriate attribution.
> 
>      - Jonathan Morton
> 
> 
>
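
A minimal sketch of the arithmetic discussed in the quoted messages above:
the SUM(poll_interval/SRTT) estimate of elapsed rounds, the conversion of a
latency into Hz or rounds per minute, and network power as (rounds per
second) * (throughput). The function names, polling interval, and sample
values are hypothetical, and obtaining the SRTT samples (e.g. by polling
TCP_INFO) is left out.

```python
def estimate_rounds(poll_interval_s, srtt_samples_s):
    """Estimate elapsed rounds as SUM(poll_interval / SRTT).

    Each polled SRTT sample contributes the number of round trips that
    fit into one polling interval at that SRTT.
    """
    return sum(poll_interval_s / srtt for srtt in srtt_samples_s if srtt > 0)


def latency_to_hz(latency_s):
    """Express a latency as a frequency, so bigger numbers are better."""
    return 1.0 / latency_s


def latency_to_rpm(latency_s):
    """Express a latency as rounds per minute."""
    return 60.0 / latency_s


def network_power(throughput_bps, rtt_s):
    """(rounds per second) * (throughput)."""
    return latency_to_hz(rtt_s) * throughput_bps


# Hypothetical trace: SRTT polled every 100 ms, rising from 50 ms to 200 ms.
rounds = estimate_rounds(0.1, [0.05, 0.05, 0.1, 0.2])
print(f"estimated rounds: {rounds:.1f}")

# A bloated 2 s latency is a fractional 0.5 Hz but a whole 30 RPM.
print(f"{latency_to_hz(2.0):.1f} Hz, {latency_to_rpm(2.0):.0f} RPM")
```

The second print shows why rounds per minute is convenient for latencies
in the 1-5 second range, while Hz remains the natural unit for comparing
against video or game frame rates.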