[Rpm] Outch! I found a problem with responsiveness

Stuart Cheshire cheshire at apple.com
Tue Oct 5 13:26:12 EDT 2021


On 4 Oct 2021, at 16:23, Matt Mathis via Rpm <rpm at lists.bufferbloat.net> wrote:

> It has a super Heisenberg problem, to the point where it  is unlikely to have much predictive value under conditions that are different from the measurement itself.    The problem comes from the unbound specification for "under load" and the impact of the varying drop/mark rate changing the number of rounds needed to complete a transaction, such as a page load.
> 
> For modern TCP on an otherwise unloaded link with any minimally correct queue management (including drop tail), the page load time is insensitive to the details of the queue management.    There will be a little bit of link idle in the first few RTT (early slowstart), and then under a huge range of conditions for both the web page and the AQM, TCP will maintain at least a short queue at the bottleneck

Surely you mean: TCP will maintain an EVER GROWING queue at the bottleneck? (Of course, the congestion control algorithm in use affects the precise nature of queue growth here. For simplicity here I’m assuming Reno or CUBIC.)
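To make the "ever growing queue" concrete, here is a toy per-RTT sketch of a Reno-style AIMD sender behind a drop-tail bottleneck. All numbers (link rate, buffer size, starting cwnd) are invented for illustration; this is not a model of any real stack:

```python
# Toy, per-RTT sketch of Reno-style AIMD filling a drop-tail bottleneck.
# All constants below are illustrative assumptions, not measurements.

LINK_PKTS_PER_RTT = 100   # bottleneck drains 100 packets per RTT
BUFFER_PKTS = 250         # drop-tail buffer capacity (packets)

cwnd = 10.0               # congestion window (packets in flight)
queue = 0                 # standing queue at the bottleneck

history = []
for rtt in range(200):
    arriving = int(cwnd)
    # The queue absorbs whatever exceeds the link's per-RTT capacity.
    queue = max(0, queue + arriving - LINK_PKTS_PER_RTT)
    if queue > BUFFER_PKTS:          # drop-tail overflow => loss
        queue = BUFFER_PKTS
        cwnd = cwnd / 2              # multiplicative decrease
    else:
        cwnd += 1                    # additive increase (1 pkt/RTT)
    history.append(queue)

# Once cwnd exceeds the bandwidth-delay product, the queue only ever
# shrinks at loss events: between losses it grows until the buffer fills.
print(max(history))
```

The point of the sketch is the shape, not the numbers: with a deep drop-tail buffer the standing queue ratchets up to the full buffer depth before each loss, which is exactly the latency a user of that link experiences.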

> TCP will also avoid sending any duplicate data, so the total data sent will be determined by the total number of bytes in the page, and the total elapsed time, by the page size and link rate (plus the idle from startup).

You are focusing on time-to-completion for a flow. For clicking “send” on an email, this is a useful metric. For watching a two-hour movie, served as a single large HTTP GET for the entire media file, and playing it as it arrives, time-to-completion is not very interesting. What matters is consistent smooth delivery of the bytes within that flow, so the video can be played as it arrives. And if I get bored with that video and click another, the amount of (now unwanted) stale packets sitting in the bottleneck queue is what limits how quickly I get to see the new video start playing.
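The cost of those stale packets is simple arithmetic: the new request's packets sit behind them, so the first new byte cannot arrive before the old queue drains. A back-of-envelope sketch, with invented but plausible numbers:

```python
# Back-of-envelope: how long stale queued data delays the new video.
# The link rate and queue depth below are illustrative assumptions.

link_rate_bps = 25e6             # 25 Mb/s downlink
bloated_queue_bytes = 3_000_000  # ~3 MB of stale video in the bottleneck buffer

# The new video's packets queue behind the stale ones, so the user waits
# at least this long before the first new byte arrives:
drain_seconds = bloated_queue_bytes * 8 / link_rate_bps
print(f"{drain_seconds:.2f} s")  # 0.96 s of dead air from bufferbloat alone
```

That near-second of dead air is pure queueing delay; it exists regardless of how fast the link is at moving the bytes it eventually delivers.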

> If AQM is used to increase the responsiveness, the losses or ECN marks will cause the browser to take additional RTTs to load the page.  If there is no cross traffic, these two effects (more rounds at higher RPM) will exactly counterbalance each other.

Right: Improving responsiveness has *no* downside on time-to-completion for a flow. Throughput -- in bytes per second -- is unchanged. What improving responsiveness does is improve what happens throughout the lifetime of the transfer, without affecting the end time either for better or for worse.
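The counterbalance Matt describes can be put in one line of arithmetic: load time is roughly rounds × RTT, AQM raises the round count while cutting the per-round RTT, and with no cross traffic the product comes out the same. The numbers below are deliberately chosen toy values that make the products match exactly; they are not measurements:

```python
# Toy illustration of the "more rounds at higher RPM" counterbalance.
# All figures are invented to make the arithmetic transparent.

base_rtt_s = 0.020           # path RTT with empty queues

# Deep drop-tail queue: few rounds, but each round drags ~200 ms of queue.
rounds_droptail = 10
rtt_droptail = base_rtt_s + 0.200

# AQM keeps the queue short, but marks/losses cost extra recovery rounds.
rounds_aqm = 110
rtt_aqm = base_rtt_s

t_droptail = rounds_droptail * rtt_droptail   # 2.2 s
t_aqm = rounds_aqm * rtt_aqm                  # 2.2 s
print(t_droptail, t_aqm)
```

Time-to-completion is identical either way; what differs is everything else sharing that link during those 2.2 seconds.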

> This is perhaps why there are BB deniers: for many simple tasks it has zero impact.

Of course. In the development of any technology we solve the most obvious problems first, and the less obvious ones later.

If there was a bug that occasionally resulted in a corrupted file system and loss of data, would people argue that we shouldn’t fix it on the grounds that sometimes it *doesn’t* corrupt the file system?

If your car’s brakes didn’t work, would people argue that it doesn’t matter, because -- statistically speaking -- the brake pedal is depressed for only a tiny percentage of the overall time you spend driving?

Stuart Cheshire
