[Starlink] [Rpm] [ippm] [M-Lab-Discuss] misery metrics & consequences

Tue Oct 25 13:02:56 EDT 2022

I don't understand the information in the link. It looks like lines on a 
map leading to some form of dials that are too small to read.

One can sample and create a Gaussian per the central limit theorem (CLT) 
if the underlying process probability density functions converge, i.e. 
can be integrated to 1.  With that said, normalizing does lose 
information and doesn't say much about tails, outliers, etc.

We should be careful in assuming only the tails matter and that all 
traffic follows heavy-tailed distributions. With bufferbloat it's the 
minimum of the latency PDF that shifts. CoDel watches a minimum. 
"Jacobson suggested that average queue length actually contains no 
information at all about packet demand or network load.[3][5] He 
suggested that a better metric might be the minimum queue length during 
a sliding time window."
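
To make the minimum-versus-average point concrete, here is a minimal 
sketch of a sliding-window minimum. It is not CoDel itself: the window is 
counted in samples rather than in time, and the name sliding_min and the 
RTT values are assumptions made up for illustration.

```python
from collections import deque

def sliding_min(samples, window):
    """Return the minimum of every full run of `window` consecutive samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    candidates = deque()  # indices whose values increase from front to back
    minima = []
    for i, value in enumerate(samples):
        # Drop older candidates that can never be the minimum again.
        while candidates and samples[candidates[-1]] >= value:
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it has slid out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        if i >= window - 1:
            minima.append(samples[candidates[0]])
    return minima

# Hypothetical RTT samples in ms: a standing queue builds up after sample 3.
rtt_ms = [20, 22, 21, 60, 65, 70, 62]
print(sliding_min(rtt_ms, 3))
```

When a standing queue forms, the windowed minimum itself rises, which is 
the shift in the latency floor described above.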

We need statistical tools that also allow for the analysis of 
non-parametric distributions. Hotelling's T2 assumes the multivariate 
distributions are Gaussian. Kolmogorov-Smirnov tests can be used for 
non-parametric distributions. We find both are needed for the statistical 
process control (SPC) used by our automation systems.
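
As a rough illustration of the non-parametric side, the two-sample 
Kolmogorov-Smirnov statistic is the largest gap between two empirical 
CDFs. The sketch below computes only that statistic, with no p-value, 
using the standard library. The name ks_statistic and the sample data are 
assumptions for illustration, not the code used in our systems.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so check those.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap

# Hypothetical latency samples in ms from two runs.
baseline = [1, 2, 3, 4]
shifted = [3, 4, 5, 6]
print(ks_statistic(baseline, shifted))
```

No distributional form is assumed here, which is why the test remains 
usable when the data are clearly not Gaussian.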

Sample subgroups of one really don't give sufficient information about 
any type of distribution, parametric or non-parametric.

Bob
> Dear all,
> 
> After some time in silence on the IPPM list, I would like to make some
> comments here. As we presented in draft-ietf-ippm-route-00 (now
> RFC 9198), the main problem is that traffic follows heavy-tailed
> distributions when it is seen from the end-to-end points: the origin
> of most of the issues in that video. Therefore, treating it as a
> parametric distribution is not possible, unless you are dealing with a
> complex distribution like the Stable distribution:
> 
> 	• B. Mandelbrot, “New methods in statistical economics,” Journal of
> political economy, vol. 71, no. 5, pp. 421–440, 1963.
> 
> 	•  ——, “The variation of certain speculative prices,” The journal of
> business, vol. 36, no. 4, pp. 394–419, 1963.
> 
> (and so it will be extremely complex and have high computing demands.)
> This is why we propose to use quartiles to characterize delays in
> RFC 9198. I am now doing some research to understand how the delay
> can change with network load, using the quartiles.
> You can see some measurements done during the pandemic, showing the
> congestion as a function of the time (24 hours maximum):
> 
> https://cnet.fi.uba.ar/ignacio.alvarez-hamelin/RIPE-Atlas-measurement-24681441_m_win_data_world_map.html
> 
> [you can zoom in and out, pan it, clicking on the Xs you can close
> dialogs, to reopen them click on the link]
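
As a rough illustration of the quartile summary proposed above, here is a 
minimal sketch using Python's standard statistics.quantiles with its 
default method. The helper name delay_quartiles and the sample delays are 
made up for illustration. They are not taken from RFC 9198 or from the 
linked measurements.

```python
import statistics

def delay_quartiles(delays_ms):
    """Return (Q1, median, Q3) for a list of delay samples in milliseconds."""
    q1, q2, q3 = statistics.quantiles(delays_ms, n=4)
    return q1, q2, q3

# Hypothetical RTT samples in ms with a single heavy-tail outlier.
delays = [10, 11, 12, 13, 14, 15, 500]
q1, median, q3 = delay_quartiles(delays)
print("Q1, median, Q3:", q1, median, q3)
print("IQR:", q3 - q1)
print("mean:", statistics.mean(delays))
```

The single 500 ms sample drags the mean well above Q3 while the quartiles 
barely move, which is the motivation for summarizing heavy-tailed delays 
this way.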
> 
> 
> 
> Best,
> 
> 	Ignacio
> 
> 
> _______________________________________________________________
> 
> Dr. Ing. José Ignacio Alvarez-Hamelin
> CONICET and Facultad de Ingeniería, Universidad de Buenos Aires
> Av. Paseo Colón 850 - C1063ACV - Buenos Aires - Argentina
> +54 (11) 5285 0716 / 5285 0705
> e-mail: ihameli at cnet.fi.uba.ar
> web: http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/
> _______________________________________________________________
> 
> 
> 
>> On 23 Oct 2022, at 08:57, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> 
>> Hi Glenn,
>> 
>> 
>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm 
>>> <rpm at lists.bufferbloat.net> wrote:
>>> 
>>> As a classic dyed-in-the-wool empiricist, granted that you can 
>>> identify "misery" factors, given a population of 1,000 users, how do 
>>> you propose deriving a misery index for that population?
>>> 
>>> We can measure download, upload, ping, jitter pretty much without 
>>> user intervention.  For the measurements you hypothesize, how do you 
>>> automatically extract those indices without subjective user 
>>> contamination?
>>> 
>>> I.e.  my download speed sucks. Measure the download speed.
>>> 
>>> My isp doesn't fix my problem. Measure what? How?
>>> 
>>> Human survey technology is 70+ years old and it still has problems 
>>> figuring out how to correlate opinion with fact.
>>> 
>>> Without an objective measurement scheme that doesn't require human 
>>> interaction, the misery index is a cool hypothesis with no way to 
>>> link to actual data.  What objective measurements can be made?  
>>> Answer that and the index becomes useful. Otherwise it's just 
>>> consumer whining.
>>> 
>>> Not trying to be combative here, in fact I like the concept you 
>>> support, but I'm hard pressed to see how the concept can lead to 
>>> data, and the data lead to policy proposals.
>> 
>> 	[SM] So it seems that outside of seemingly simple-to-test throughput 
>> numbers*, the next most important quality number (or the most 
>> important, depending on subjective ranking) is how latency changes 
>> under "load". Absolute latency is also important, albeit static high 
>> latency can be worked around within limits, so the change under load 
>> seems more relevant.
>> 	All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's 
>> bounceback test offer methods to assess latency change under load**, as 
>> do Waveform's bufferbloat test and even, to a degree, Ookla's 
>> speedtest.net. IMHO something like latency increase under load or 
>> Apple's responsiveness measure RPM (basically the inverse of the 
>> latency under load calculated on a per-minute basis, so it scales in 
>> the typical higher-numbers-are-better way, unlike raw latency-under-load 
>> numbers where smaller is better) would be a suitable objective measure.
>> 	IMHO what networkQuality is missing ATM is to measure and report the 
>> unloaded RPM as well as the loaded one: the first gives a measure of the 
>> static latency, the second of how well things keep working if 
>> capacity gets tight. They report the base RTT, which can be converted 
>> to RPM. As an example:
>> 
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
>> 
>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, 
>> while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on 
>> this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a 
>> reduction of 100-100*2123/3750 = 43.4%, and that is with competent AQM 
>> and scheduling on the router.
>> 
>> Without competent AQM/shaping I get:
>> ==== SUMMARY ====
>> Upload capacity: 15.101 Mbps
>> Download capacity: 97.664 Mbps
>> Upload flows: 20
>> Download flows: 12
>> Responsiveness: Medium (427 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:51:50
>> End: 10/23/22, 13:52:06
>> OS Version: Version 12.6 (Build 21G115)
>> 
>> latency under load: 60000/427 = 140.52 ms
>> base RPM: 60000/16 = 3750 RPM
>> reduction RPM: 100-100*427/3750 = 88.6%
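
The arithmetic in the two networkQuality examples above can be sketched 
as a few helper functions. This is only an illustration of the 
conversions quoted in the message: the function names are assumptions and 
are not part of Apple's networkQuality tool.

```python
MS_PER_MINUTE = 60_000

def rtt_ms_to_rpm(rtt_ms):
    """Round trips per minute that fit into a given round-trip time in ms."""
    return MS_PER_MINUTE / rtt_ms

def rpm_to_rtt_ms(rpm):
    """Round-trip time in ms implied by a responsiveness value in RPM."""
    return MS_PER_MINUTE / rpm

def rpm_reduction_percent(base_rtt_ms, loaded_rpm):
    """Percentage by which load reduces RPM relative to the unloaded RPM."""
    base_rpm = rtt_ms_to_rpm(base_rtt_ms)
    return 100 - 100 * loaded_rpm / base_rpm

# Values taken from the two summaries above (Base RTT 16 ms in both runs).
for label, loaded_rpm in (("with AQM", 2123), ("without AQM", 427)):
    print(label,
          round(rpm_to_rtt_ms(loaded_rpm), 2), "ms under load,",
          round(rpm_reduction_percent(16, loaded_rpm), 1), "% reduction")
```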
>> 
>> 
>> I understand Apple's desire to have a single reported number with a 
>> single qualifier medium/high/... because in the end a link is only 
>> reliably usable if responsiveness under load stays acceptable, but 
>> with two numbers it is easier to see what one's ISP could do to help. 
>> (I guess some ISPs might already be unhappy with the single number, so 
>> this needs some diplomacy/tact)
>> 
>> Regards
>> 	Sebastian
>> 
>> 
>> 
>> *) Seemingly, as quite a few ISPs operate their own speedtest servers in 
>> their network and ignore customers not reaching the contracted rates 
>> to speedtest servers located in different ASs. As the product is 
>> called internet access, I am inclined to expect that my ISP maintains 
>> sufficient peering/transit capacity to reach the next tier of AS at my 
>> contracted rate (the EU legislature seems to agree, see Regulation (EU) 
>> 2015/2120).
>> 
>> **) Most do so by creating load themselves and measuring throughput at 
>> the same time; bounceback, IIUC, will focus on the latency measurement 
>> and leave the load generation optional (so it offers a mode to measure 
>> responsiveness of a live network with minimal measurement traffic). 
>> @Bob, please correct me if this is wrong.
>> 
>> 
>>> 
>>> 
>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht at gmail.com> wrote:
>>> One of the best talks I've ever seen on how to measure customer
>>> satisfaction properly just went up after the P99 Conference.
>>> 
>>> It's called Misery Metrics.
>>> 
>>> After a deep dive into why the ways we think about and act on
>>> percentiles, bins, and other statistical methods for the web and
>>> internet are *so wrong* (well worth watching and thinking
>>> about if you are relying on or creating network metrics today), it
>>> then points to the real metrics that matter to users and the ultimate
>>> success of an internet business: timeouts, retries, misses, failed
>>> queries, angry phone calls, abandoned shopping carts and loss of
>>> engagement.
>>> 
>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>> 
>>> The ending advice was - don't aim to make a specific percentile
>>> acceptable, aim for an acceptable % of misery.
>>> 
>>> I enjoyed the p99 conference more than any conference I've attended 
>>> in years.
>>> 
>>> --
>>> This song goes out to all the folk that thought Stadia would work:
>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>> Dave Täht CEO, TekLibre, LLC
>>> 
>>> _______________________________________________
>>> Rpm mailing list
>>> Rpm at lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/rpm
>> 
>> _______________________________________________
>> ippm mailing list
>> ippm at ietf.org
>> https://www.ietf.org/mailman/listinfo/ippm
> 