[Starlink] [Rpm] [tsvwg] [M-Lab-Discuss] misery metrics & consequences

rjmcmahon rjmcmahon at rjmcmahon.com
Tue Oct 25 11:17:18 EDT 2022


One sample per subgroup is typically insufficient from an SPC
(statistical process control) perspective, e.g. for Shewhart control
charts. Below are some suggestions:

https://bookdown.org/lawson/an_introduction_to_acceptance_sampling_and_spc_with_r26/shewhart-control-charts-in-phase-i.html

o) Define the subgroup size: Initially, this is a constant number of 4 
or 5 items per each subgroup taken over a short enough interval of time 
so that variation among them is due only to common causes.

o) Define the Subgroup Frequency: The subgroups collected should be 
spaced out in time, but collected often enough so that they can 
represent opportunities for the process to change.

o) Define the number of subgroups: Generally 25 or more subgroups are 
necessary to establish the characteristics of a stable process. If some 
subgroups are eliminated before calculating the revised control limits 
due to the discovery of assignable causes, additional subgroups may need 
to be collected so that there are at least 25 subgroups used in 
calculating the revised limits.
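
A minimal Python sketch of the three steps above, for X-bar and R
charts. The function names and the choice to drop a trailing partial
subgroup are illustrative assumptions; the constants are the standard
Shewhart table values for subgroup sizes 4 and 5.

```python
from statistics import mean

# Standard Shewhart constants for X-bar/R charts, keyed by subgroup size n.
CONSTANTS = {
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282, "d2": 2.059},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114, "d2": 2.326},
}


def make_subgroups(samples, n):
    """Split consecutive samples into subgroups of size n.

    Any trailing partial subgroup is dropped (an assumption of this sketch).
    """
    usable = len(samples) - len(samples) % n
    return [samples[i:i + n] for i in range(0, usable, n)]


def xbar_r_limits(subgroups, min_subgroups=25):
    """Return Phase I center line, sigma estimate and control limits."""
    if len(subgroups) < min_subgroups:
        raise ValueError("need at least %d subgroups" % min_subgroups)
    n = len(subgroups[0])
    if n not in CONSTANTS:
        raise ValueError("no constants tabulated here for n=%d" % n)
    c = CONSTANTS[n]

    xbars = [mean(g) for g in subgroups]           # subgroup means
    ranges = [max(g) - min(g) for g in subgroups]  # subgroup ranges
    grand_mean = mean(xbars)
    rbar = mean(ranges)

    return {
        "center": grand_mean,
        "sigma_hat": rbar / c["d2"],  # table-based estimate of process sigma
        "xbar_lcl": grand_mean - c["A2"] * rbar,
        "xbar_ucl": grand_mean + c["A2"] * rbar,
        "r_lcl": c["D3"] * rbar,
        "r_ucl": c["D4"] * rbar,
    }


if __name__ == "__main__":
    # Hypothetical latency samples in ms: 25 subgroups of 5.
    latencies = [10.0, 11.0, 12.0, 13.0, 14.0] * 25
    print(xbar_r_limits(make_subgroups(latencies, 5)))
```

If subgroups are removed because an assignable cause was found, the
min_subgroups check is what forces collecting more before the limits
are recomputed.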

Then report the mean and variance estimated from the control chart
tables for the chosen subgroup size. Also, keep in mind that
subgrouping normalizes the samples (subgroup means tend toward a normal
distribution), so information is lost if the underlying distribution is
not normal. That's why we give the full histogram in iperf 2: one can
compare it against a normal distribution.
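
One way to make that comparison, as a rough sketch: fit a normal
distribution to the raw samples and compare observed per-bin counts
with the counts that normal would predict. The function name and bin
edges are hypothetical, and this does not parse iperf 2's own histogram
output; it assumes the raw samples are available.

```python
from statistics import NormalDist, mean, pstdev


def normal_comparison(samples, edges):
    """Compare a histogram of samples against a fitted normal.

    For each bin [edges[i], edges[i+1]) return a tuple
    (low, high, observed_count, expected_count), where expected_count
    comes from a normal with the samples' mean and standard deviation.
    Requires a non-zero standard deviation.
    """
    fitted = NormalDist(mean(samples), pstdev(samples))
    total = len(samples)
    rows = []
    for low, high in zip(edges, edges[1:]):
        observed = sum(1 for s in samples if low <= s < high)
        expected = total * (fitted.cdf(high) - fitted.cdf(low))
        rows.append((low, high, observed, expected))
    return rows


if __name__ == "__main__":
    # Hypothetical latencies in ms with a heavy tail: mostly 10, a few 100.
    latencies = [10.0] * 95 + [100.0] * 5
    for low, high, obs, exp in normal_comparison(latencies, [0, 30, 60, 90, 110]):
        print("[%5.1f, %5.1f) observed=%3d expected=%7.3f" % (low, high, obs, exp))
```

Bins where the observed count is far above the expected count, typically
in the upper tail for latency, are the information a mean and variance
alone would hide.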

https://en.wikipedia.org/wiki/Control_chart

Bob

> On Mon, Oct 24, 2022 at 7:44 PM Christoph Paasch
> <cpaasch=40apple.com at dmarc.ietf.org> wrote:
> 
>> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0 at gmx.de>
>> wrote:
>> Hi Christoph
>> 
>> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch at apple.com>
>> wrote:
>> 
>> Hello Sebastian,
>> 
>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink
>> <starlink at lists.bufferbloat.net> wrote:
>> 
>> Hi Glenn,
>> 
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm
>> <rpm at lists.bufferbloat.net> wrote:
>> 
>> As a classic dyed-in-the-wool empiricist, granted that you can
>> identify "misery" factors, given a population of 1,000 users, how do
>> you propose deriving a misery index for that population?
>> 
>> We can measure download, upload, ping, jitter pretty much without
>> user intervention.  For the measurements you hypothesize, how do
>> you automatically extract those indices without subjective user
>> contamination?
>> 
>> I.e.  my download speed sucks. Measure the download speed.
>> 
>> My isp doesn't fix my problem. Measure what? How?
>> 
>> Human survey technology is 70+ years old and it still has problems
>> figuring out how to correlate opinion with fact.
>> 
>> Without an objective measurement scheme that doesn't require human
>> interaction, the misery index is a cool hypothesis with no way to
>> link to actual data.  What objective measurements can be made?
>> Answer that and the index becomes useful. Otherwise it's just
>> consumer whining.
>> 
>> Not trying to be combative here, in fact I like the concept you
>> support, but I'm hard pressed to see how the concept can lead to
>> data, and the data lead to policy proposals.
>> 
>> [SM] So it seems that outside of seemingly simple to test
>> throughput numbers*, the next most important quality number (or the
>> most important depending on subjective ranking) is how does latency
>> change under "load". Absolute latency is also important albeit
>> static high latency can be worked around within limits so the change
>> under load seems more relevant.
>> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's
>> bounceback test offer methods to assess latency change under load**,
>> as do Waveform's bufferbloat tests and even to a degree Ookla's
>> speedtest.net [1]. IMHO something like latency increase under load
>> or Apple's responsiveness measure RPM (basically the inverse of the
>> latency under load calculated on a per minute basis, so it scales in
>> the typical higher-numbers-are-better way, unlike raw latency under
>> load numbers where smaller is better) would be a suitable metric.
>> IMHO what networkQuality is missing ATM is to measure and report
>> the unloaded RPM as well as the loaded one: the first gives a
>> measure of the static latency, the second of how well things keep
>> working if capacity gets tight. They report the base RTT, which can
>> be converted to RPM. As an example:
>> 
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>> 
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
> 
> You should update to latest macOS:
> 
> $ networkQuality
> ==== SUMMARY ====
> Uplink capacity: 326.789 Mbps
> Downlink capacity: 446.359 Mbps
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
> 
> ;-)
> 
>  [SM] I wish... just updated to the latest and greatest for this
> hardware (A1398):
> 
> macbook-pro:DPZ smoeller$ networkQuality
> ==== SUMMARY ====
> 
> Upload capacity: 7.478 Mbps
> Download capacity: 2.415 Mbps
> Upload flows: 16
> Download flows: 20
> Responsiveness: Low (90 RPM)
> macbook-pro:DPZ smoeller$ networkQuality -v
> ==== SUMMARY ====
> 
> Upload capacity: 5.830 Mbps
> Download capacity: 6.077 Mbps
> Upload flows: 12
> Download flows: 20
> Responsiveness: Low (56 RPM)
> Base RTT: 134
> Start: 10/24/22, 22:47:48
> End: 10/24/22, 22:48:09
> OS Version: Version 12.6.1 (Build 21G217)
> macbook-pro:DPZ smoeller$
> 
> Still, I only see the "Base RTT" with the -v switch and I am not sure
> whether that is identical to your "Idle Latency".
> 
> I guess I need to convince my employer to exchange that macbook
> (actually because the battery starts bulging and not because I am
> behind with networkQuality versions ;) )
> 
> Yes, you would need macOS Ventura to get the latest and greatest.
> 
>>> But, what I read is: You are suggesting that “Idle Latency”
>>> should be expressed in RPM as well? Or, Responsiveness expressed
>>> in millisecond ?
>> 
>> [SM] Yes, I am fine with either (or both); the idea is to make it
>> really easy to see whether/how much "working conditions" deteriorate
>> the responsiveness / increase the latency-under-load. At least in
>> verbose mode it would be sweet if networkQuality could expose that
>> information.
> 
> I see - let me think about that…
> 
> +1 w/ Sebastian's point here. IMHO it would be great if the
> responsiveness under load and when idle were reported:
> 
>   (a) symmetrically, with the same metrics for both cases, and
> 
>   (b) in both RPM and ms terms for both cases
> 
> So instead of:
> 
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
> 
> Perhaps something like:
> 
> Loaded Responsiveness: High (XXXX RPM)
> Loaded Latency: X.XXX milli-seconds
> Idle Responsiveness: High (XXXX RPM)
> Idle Latency: X.XXX milli-seconds
> 
> Having both RPM and ms available for loaded and unloaded cases would
> seem to make it easier to compare loaded and idle performance more
> directly and in a more apples-to-apples way.
> 
> best,
> neal
> 
> 
> 
> Links:
> ------
> [1] http://speedtest.net
> _______________________________________________
> Rpm mailing list
> Rpm at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/rpm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ControlChartConstantsAndFormulae.pdf
Type: application/pdf
Size: 59376 bytes
Desc: not available
URL: <https://lists.bufferbloat.net/pipermail/starlink/attachments/20221025/fde3bbca/attachment-0001.pdf>

