[Rpm] Outch! I found a problem with responsiveness

Tue Oct 5 12:18:55 EDT 2021

Hello Matt,

On 10/04/21 - 16:23, Matt Mathis via Rpm wrote:
> It has a super Heisenberg problem, to the point where it is unlikely to
> have much predictive value under conditions that are different from the
> measurement itself.  The problem comes from the unbound specification for
> "under load" and the impact of the varying drop/mark rate changing the
> number of rounds needed to complete a transaction, such as a page load.

This is absolutely right. That is why the metric is not just "Responsiveness" but
"Responsiveness under working conditions", and why it is important to specify the
"working conditions" properly. They need to use a "realistic" workload
while at the same time exploring the boundaries. This is why we chose a set
of HTTP/2 bulk data transfers using standard congestion controls.
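
To make the unit concrete: RPM counts round-trips per minute, so a round-trip
latency measured while the link is loaded converts to RPM as 60 divided by the
latency in seconds. Below is a minimal Python sketch of that conversion only.
The function name and the sample latencies are made up for illustration, and
it says nothing about how a real measurement tool aggregates its probes.

```python
def rpm_from_latency(latency_seconds: float) -> float:
    """Convert a round-trip latency in seconds to round-trips per minute."""
    if latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return 60.0 / latency_seconds


# Hypothetical samples: 20 ms on an idle link, 500 ms under working conditions.
idle_rpm = rpm_from_latency(0.020)
loaded_rpm = rpm_from_latency(0.500)
print(f"idle: {idle_rpm:.0f} RPM, under load: {loaded_rpm:.0f} RPM")
```

The gap between the two numbers is what the "working conditions" are meant to
expose.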

> For modern TCP on an otherwise unloaded link with any minimally correct
> queue management (including drop tail), the page load time is insensitive
> to the details of the queue management.  There will be a little bit of
> link idle in the first few RTT (early slowstart), and then under a huge
> range of conditions for both the web page and the AQM, TCP will maintain at
> least a short queue at the bottleneck with zero idle, up until the last
> segment is delivered.  TCP will also avoid sending any duplicate data, so
> the total data sent will be determined by the total number of bytes in the
> page, and the total elapsed time, by the page size and link rate (plus the
> idle from startup).
> 
> If AQM is used to increase the responsiveness, the losses or ECN marks will
> cause the browser to take additional RTTs to load the page.  If there is no
> cross traffic, these two effects (more rounds at higher RPM) will exactly
> counterbalance each other.
> 
> This is perhaps why there are BB deniers: for many simple tasks it has zero
> impact.

That's right. BB is a transient problem that is extremely short-lived.
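
Matt's argument above can be sketched as a toy calculation. This is a minimal
Python sketch under his stated assumptions (a single transfer, no cross
traffic, and a bottleneck that never goes idle after startup). The function
names, the page size, the link rate and the two RTT values are hypothetical
and chosen only for illustration.

```python
def page_load_time(page_bytes: int, link_rate_bps: float,
                   startup_idle_s: float = 0.0) -> float:
    """Elapsed time when the bottleneck stays busy: serialization plus startup idle."""
    return page_bytes * 8 / link_rate_bps + startup_idle_s


def rounds_needed(load_time_s: float, rtt_under_load_s: float) -> float:
    """Number of round trips that fit into the load time at a given loaded RTT."""
    return load_time_s / rtt_under_load_s


# Hypothetical numbers: 2 MB page, 16 Mbit/s link, 100 ms idle during slow start.
load_time = page_load_time(2_000_000, 16_000_000, startup_idle_s=0.1)

# Deep queue (high RTT under load) versus AQM (low RTT under load).
for label, rtt in (("deep queue", 0.200), ("AQM", 0.050)):
    rpm = 60.0 / rtt
    rounds = rounds_needed(load_time, rtt)
    print(f"{label}: {rpm:.0f} RPM, {rounds:.1f} rounds, {rounds * rtt:.2f} s total")
```

In this model the AQM case has four times the RPM and needs four times the
rounds, so the total time comes out the same, which is exactly the
counterbalancing effect described above.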

Having tried for the past year to reliably demo the user-visible
impact of bufferbloat, I have learned two things:

1. When it happens, it is bad - really bad.
2. However, it is very difficult to trigger it "on-demand".

> A concrete definition for "under load" should help to compare metrics
> between implementations, but may not help predicting application
> performance.

"Responsiveness under working conditions" is a metric similar to the throughput
measured by tools like speedtest. Sure, speedtest may measure close to 1 Gbps
on my home network, but that does not mean that I can actually send
my emails at 1 Gbps.
The same is true for responsiveness: the measurement pushes the network to its
limit and explores its capabilities at that point.

Talk to you soon!

Cheers,
Christoph