[Bloat] Credit and/or collaboration on a responsiveness metric?

Matt Mathis mattmathis at google.com
Tue Jul 6 15:04:35 EDT 2021


(Adding other Apple developers back in)
Jonathan, you didn't even go in the direction I was expecting. My
fragmentary ideas:

Round counting (the underlying primitive) can be measured or estimated in
several different ways at different layers:
TCP/transport layer:

From a .pcap, count rounds: data->ACK->data. This is easiest working
backward in time, but timeouts are hard to handle.
From polled smoothed RTT (TCP_INFO or Web100): SUM(poll_interval/SRTT) is
an estimate of the number of elapsed rounds
- The SRTT algorithm has been quite stable for decades
- This algorithm could be applied to ~4 billion MLab traces collected
over the last 11 years, which are not exposed in the current data
processing pipeline (my current project)
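
A minimal sketch of the SUM(poll_interval/SRTT) estimate, assuming the
polled samples are already available as (poll_interval, SRTT) pairs in
seconds. The function names and the sample layout are hypothetical; how the
samples are read (TCP_INFO, Web100 or an archived trace) is left out.

```python
def estimate_rounds(samples):
    """Estimate elapsed rounds from polled SRTT samples.

    samples: iterable of (poll_interval_s, srtt_s) pairs, both in seconds.
    Each poll interval contributes poll_interval / SRTT rounds.
    """
    total = 0.0
    for poll_interval_s, srtt_s in samples:
        if srtt_s <= 0:
            # Assumption: skip polls with no valid SRTT yet.
            continue
        total += poll_interval_s / srtt_s
    return total


def rounds_per_minute(samples):
    """Convert the round estimate to rounds per minute over the polled time."""
    samples = list(samples)
    elapsed_s = sum(interval for interval, _ in samples)
    if elapsed_s <= 0:
        return 0.0
    return estimate_rounds(samples) * 60.0 / elapsed_s


if __name__ == "__main__":
    # Two one-second polls: SRTT of 500 ms, then 250 ms.
    polls = [(1.0, 0.5), (1.0, 0.25)]
    print(estimate_rounds(polls), rounds_per_minute(polls))
```

Skipping polls without a valid SRTT is one choice among several; carrying
the last good value forward would be another.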

transport ABI (untested idea):

Use instrumented minimal TCP or QUIC applications (e.g. chargen, echo and
discard) to count rounds
For WFID, this would also include the socket buffer backlog, and how
intelligently the kernel manages buffer space

library:

Use ping messages in http, websockets and other "application" protocols

This also includes the library buffers and their management


Note that (rounds per second) * (throughput) is network power
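
As a rough illustration of that product, assuming one round takes one RTT,
so that rounds per second is 1/RTT and the product reduces to throughput
divided by delay. The function names and units are hypothetical choices for
the sketch.

```python
def rounds_per_second(rtt_s):
    """Rounds per second, assuming one round takes one RTT (in seconds)."""
    return 1.0 / rtt_s


def network_power(throughput_bps, rtt_s):
    """(rounds per second) * (throughput), i.e. throughput / RTT."""
    return rounds_per_second(rtt_s) * throughput_bps


if __name__ == "__main__":
    # Same 10 Mbit/s throughput, unloaded (50 ms) versus bloated (500 ms) RTT.
    for rtt in (0.05, 0.5):
        print(rtt, rounds_per_second(rtt) * 60.0, network_power(10e6, rtt))
```

At fixed throughput, a tenfold increase in RTT cuts both the rounds rate
and the power by a factor of ten.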

Also the number of rounds "consumed" by an application can be measured.

I will see everybody on Dave's call.

Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

We must not tolerate intolerance;
       however our response must be carefully measured:
            too strong would be hypocritical and risks spiraling out of
control;
            too weak risks being mistaken for tacit approval.


On Tue, Jul 6, 2021 at 1:48 AM Jonathan Morton <chromatix99 at gmail.com>
wrote:

> > On 6 Jul, 2021, at 2:21 am, Matt Mathis <mattmathis at google.com> wrote:
> >
> > The rounds based responsiveness metric is awesome!   There are several
> slightly different versions, with slightly different properties....
> >
> > I would like to write a little paper (probably for the IAB workshop),
> but don't want to short change anybody else's credit, or worse, scoop
> somebody else's work in progress.   I don't really know if I am retracing
> somebody else's steps, or on a parallel but different path (more likely).
>  I would be really sad to publish something and then find out later that I
> trashed some PhD students' thesis....
>
> It's possible that I had some small influence in originating it, although
> Dave did most of the corporate marketing.
>
> My idea was simply to express delays and latencies as a frequency, in Hz,
> so that "bigger numbers are better", rather than always in milliseconds,
> where "smaller numbers are better".  The advantage of Hz is that you can
> directly compare it to framerates of video or gameplay.
>
> Conversely, an advantage of "rounds per minute" is that you don't need to
> deal with fractions or rounding for relatively modest and common levels of
> bloat, where latencies of 1-5 seconds are typical.
>
> I'm not overly concerned with taking credit for it, though.  It's a
> reasonably obvious idea to anyone who takes a genuine interest in this
> field, and other people did most of the hard work.
>
> > Please let me know if you know of anybody else working in this space, of
> any publications that might be in progress or if people might be interested
> in another collaborator.
>
> There are two distinct types of latency that RPM can be used to measure,
> and I have written a short Internet Draft describing the distinction:
>
>
> https://www.ietf.org/archive/id/draft-morton-tsvwg-interflow-intraflow-delays-00.html
>
> Briefly, "inter-flow delays" (or BFID) are what you measure with an
> independent latency-measuring flow, and "intra-flow delays" (or WFID) are
> what you measure by inserting latency probes into an existing flow (whether
> at the protocol level with HTTP2, or by extracting it from existing
> application activity).  The two typically differ when the path bottleneck
> has a flow-isolating queue, or when the application flow experiences loss
> and retransmission recovery.
>
> I think both measures are important in different contexts.  An individual
> application may be concerned with its own intra-flow delay, as that
> determines how quickly it can respond to changes in network conditions or
> user intent.  Network engineers should be concerned with inter-flow delays,
> as those determine what effect a bulk application load has on other, more
> latency-sensitive applications.  The two are also optimally controlled by
> different mechanisms - FQ versus AQM - which is why the combination of the
> two is so powerful.
>
> Feel free to use material from the above with appropriate attribution.
>
>  - Jonathan Morton