[Bloat] Credit and/or collaboration on a responsiveness metric?
Kathleen Nichols
nichols at pollere.net
Tue Jul 6 17:56:17 EDT 2021
In coming up with metrics, I would really encourage you to think about
making use of t-digest to gather statistics in some of your on-line
measurements. I'm not sure users see "average" behavior. If
someone is getting great latency numbers most of the time, with a small
percentage of unacceptable values, I don't think their meeting
experiences that "average" latency as the performance.
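
A minimal sketch of the gap between the mean and the tail, using only
Python's standard library rather than an actual t-digest implementation.
The latency samples are invented for illustration. A t-digest would
approximate the same quantiles from a stream without keeping every sample.

```python
import statistics

# Hypothetical latency samples in milliseconds: 97% good, 3% unacceptable.
samples_ms = [20.0] * 97 + [2000.0] * 3

mean_ms = statistics.fmean(samples_ms)

# quantiles(n=100) returns the 99 cut points between percentiles.
percentiles = statistics.quantiles(samples_ms, n=100, method="inclusive")
p50_ms = percentiles[49]   # median
p99_ms = percentiles[98]   # 99th percentile

print(f"mean={mean_ms:.1f} ms  p50={p50_ms:.1f} ms  p99={p99_ms:.1f} ms")
```

With these made-up numbers the mean is a value that no sample actually
takes, while the median and the 99th percentile describe the two
experiences the user really has.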
Kathie
On 7/6/21 12:04 PM, Matt Mathis via Bloat wrote:
> (Adding other Apple developers back in)
> Jonathan, you didn't even go in the direction I was expecting. My
> fragmentary ideas:
>
> Round counting (the underlying primitive) can be measured or estimated
> in several different ways at different layers:
> TCP/transport layer:
>
> From a .pcap, count rounds: data->ACK->data. This is easiest in
> reverse time, but timeouts are hard.
> From polled smoothed RTT (TCP_INFO or Web100): SUM
> (poll_interval/SRTT) is an estimate of the number of elapsed rounds.
> - The SRTT algorithm has been quite stable for decades.
> - This algorithm could be applied to ~4 billion MLab traces
> collected over the last 11 years, which are not exposed in the current
> data processing pipeline (my current project).
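
The polled-SRTT estimate above can be sketched as follows. This is only an
illustration under stated assumptions: the function name `estimate_rounds`,
the fixed poll interval, and the sample values are hypothetical, and a real
version would read SRTT from TCP_INFO or Web100 rather than from a list.

```python
def estimate_rounds(srtt_samples_s, poll_interval_s):
    """Estimate elapsed rounds as SUM(poll_interval / SRTT) over all polls.

    Each poll contributes the number of round trips that would fit into
    one poll interval at the smoothed RTT observed at that poll.
    """
    return sum(poll_interval_s / srtt for srtt in srtt_samples_s if srtt > 0)


# Hypothetical SRTT values (seconds), polled every 100 ms while a queue builds.
srtt_samples_s = [0.020, 0.020, 0.050, 0.200, 0.200]
poll_interval_s = 0.100

rounds = estimate_rounds(srtt_samples_s, poll_interval_s)
elapsed_s = poll_interval_s * len(srtt_samples_s)
rounds_per_minute = rounds / elapsed_s * 60.0

print(f"~{rounds:.1f} rounds in {elapsed_s:.1f} s ({rounds_per_minute:.0f} per minute)")
```

Samples with a non-positive SRTT are skipped here; that is a choice made
for the sketch, not something specified in the thread.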
>
> transport ABI (untested idea):
>
> Use instrumented minimal TCP or QUIC applications (e.g. chargen,
> echo and discard) to count rounds
> For WFID, this would also include the socket buffer backlog, and how
> intelligently the kernel manages buffer space
>
> library:
>
> Use ping messages in HTTP, WebSockets and other "application" protocols
>
> This also includes the library buffers and their management
>
>
> Note that (rounds per second) * (throughput) is network power
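
A small worked example of that product, assuming rounds per second is
taken as 1/RTT, in which case it reduces to the familiar throughput/delay
form of network power. The function name and the numbers are hypothetical.

```python
def network_power(rtt_s, throughput_bps):
    """Return (rounds per second) * (throughput), with rounds/s = 1 / RTT."""
    rounds_per_second = 1.0 / rtt_s
    return rounds_per_second * throughput_bps


# Same hypothetical 100 Mbit/s throughput, with and without a bloated queue.
unloaded = network_power(rtt_s=0.020, throughput_bps=100e6)
bloated = network_power(rtt_s=0.500, throughput_bps=100e6)

print(f"power drops by a factor of {unloaded / bloated:.0f}")
```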
>
> Also the number of rounds "consumed" by an application can be measured.
>
> I will see everybody on Dave's call.
>
> Thanks,
> --MM--
> The best way to predict the future is to create it. - Alan Kay
>
> We must not tolerate intolerance;
> however our response must be carefully measured:
> too strong would be hypocritical and risks spiraling out of
> control;
> too weak risks being mistaken for tacit approval.
>
>
> On Tue, Jul 6, 2021 at 1:48 AM Jonathan Morton <chromatix99 at gmail.com> wrote:
>
> > On 6 Jul, 2021, at 2:21 am, Matt Mathis <mattmathis at google.com> wrote:
> >
> > The rounds-based responsiveness metric is awesome! There are
> > several slightly different versions, with slightly different
> > properties....
> >
> > I would like to write a little paper (probably for the IAB
> > workshop), but don't want to shortchange anybody else's credit, or
> > worse, scoop somebody else's work in progress. I don't really know
> > if I am retracing somebody else's steps, or on a parallel but
> > different path (more likely). I would be really sad to publish
> > something and then find out later that I trashed some PhD student's
> > thesis....
>
> It's possible that I had some small influence in originating it,
> although Dave did most of the corporate marketing.
>
> My idea was simply to express delays and latencies as a frequency,
> in Hz, so that "bigger numbers are better", rather than always in
> milliseconds, where "smaller numbers are better". The advantage of
> Hz is that you can directly compare it to framerates of video or
> gameplay.
>
> Conversely, an advantage of "rounds per minute" is that you don't
> need to deal with fractions or rounding for relatively modest and
> common levels of bloat, where latencies of 1-5 seconds are typical.
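
The two unit conversions described above are simple reciprocals. A minimal
sketch, with hypothetical function names and example latencies chosen to
match the ranges mentioned in the thread:

```python
def latency_to_hz(latency_s):
    """Express a latency as a frequency in Hz (bigger is better)."""
    return 1.0 / latency_s


def latency_to_rpm(latency_s):
    """Express a latency as rounds per minute (bigger is better)."""
    return 60.0 / latency_s


# A 20 ms path, then the 1-5 second range described as typical bloat.
for latency_s in (0.020, 1.0, 5.0):
    hz = latency_to_hz(latency_s)
    rpm = latency_to_rpm(latency_s)
    print(f"{latency_s * 1000:7.0f} ms -> {hz:6.2f} Hz, {rpm:6.0f} RPM")
```

At 5 seconds of latency the Hz figure is a fraction (0.2 Hz) while the RPM
figure is still a whole number (12), which is the advantage claimed for
rounds per minute.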
>
> I'm not overly concerned with taking credit for it, though. It's a
> reasonably obvious idea to anyone who takes a genuine interest in
> this field, and other people did most of the hard work.
>
> > Please let me know if you know of anybody else working in this
> > space, of any publications that might be in progress, or if people
> > might be interested in another collaborator.
>
> There are two distinct types of latency that RPM can be used to
> measure, and I have written a short Internet Draft describing the
> distinction:
>
>
> https://www.ietf.org/archive/id/draft-morton-tsvwg-interflow-intraflow-delays-00.html
>
> Briefly, "inter-flow delays" (or BFID) are what you measure with an
> independent latency-measuring flow, and "intra-flow delays" (or
> WFID) are what you measure by inserting latency probes into an
> existing flow (whether at the protocol level with HTTP/2, or by
> extracting it from existing application activity). The two
> typically differ when the path bottleneck has a flow-isolating
> queue, or when the application flow experiences loss and
> retransmission recovery.
>
> I think both measures are important in different contexts. An
> individual application may be concerned with its own intra-flow
> delay, as that determines how quickly it can respond to changes in
> network conditions or user intent. Network engineers should be
> concerned with inter-flow delays, as those determine what effect a
> bulk application load has on other, more latency-sensitive
> applications. The two are also optimally controlled by different
> mechanisms - FQ versus AQM - which is why the combination of the two
> is so powerful.
>
> Feel free to use material from the above with appropriate attribution.
>
> - Jonathan Morton
>
>
> _______________________________________________
> Bloat mailing list
> Bloat at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
>