* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 11:57 ` [Starlink] [Rpm] " Sebastian Moeller
@ 2022-10-23 12:17 ` Dave Collier-Brown
2022-10-23 12:26 ` Sebastian Moeller
2022-10-24 20:08 ` Christoph Paasch
` (2 subsequent siblings)
3 siblings, 1 reply; 30+ messages in thread
From: Dave Collier-Brown @ 2022-10-23 12:17 UTC (permalink / raw)
To: starlink
If our business-transaction customers are made miserable by timeouts, by analogy it follows that home internet users are made miserable by
* stalls, "buffering" and complete disappearance in conference-calls
* "shouting down the well" distortion in any kind of audio
Disappearance is probably disconnection, and easy to measure.
The others are delay-related, and can be computed from timestamps and sequence numbers.
Can we provide a tool to expose the latter? A "miserometer"?
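A minimal sketch of what such a "miserometer" could compute from timestamps and sequence numbers. The record layout, the function name `misery_summary`, and the 200 ms stall threshold are assumptions made for illustration, not an existing tool. Delay variation is measured against the smallest delay seen, so a constant clock offset between sender and receiver cancels out.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int         # sender's sequence number
    sent: float      # sender timestamp in seconds
    received: float  # receiver timestamp in seconds


def misery_summary(packets, stall_gap=0.2):
    """Summarise loss, delay variation and stalls for one media flow.

    stall_gap is a hypothetical threshold: an inter-arrival gap longer
    than this many seconds is counted as a stall.
    """
    if not packets:
        return {"lost": 0, "max_delay_variation": 0.0, "stalls": 0}

    ordered = sorted(packets, key=lambda p: p.seq)

    # Missing sequence numbers between the first and last packet seen.
    expected = ordered[-1].seq - ordered[0].seq + 1
    lost = expected - len({p.seq for p in ordered})

    # Delay relative to the best case; a fixed clock offset cancels out.
    delays = [p.received - p.sent for p in ordered]
    base = min(delays)
    max_variation = max(d - base for d in delays)

    # Long gaps between consecutive arrivals show up as "buffering".
    arrivals = sorted(p.received for p in ordered)
    stalls = sum(1 for a, b in zip(arrivals, arrivals[1:]) if b - a > stall_gap)

    return {"lost": lost, "max_delay_variation": max_variation, "stalls": stalls}
```

Feeding it the packets of one conference-call flow would give a per-call loss count, worst-case added delay, and stall count that could be compared against what the user reported.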
--dave
On 10/23/22 07:57, Sebastian Moeller via Starlink wrote:
Hi Glenn,
On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
As a classic dyed-in-the-wool empiricist: granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
I.e. my download speed sucks. Measure the download speed.
My isp doesn't fix my problem. Measure what? How?
Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
[SM] So it seems that outside of seemingly simple-to-test throughput numbers*, the next most important quality number (or the most important, depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, although static high latency can be worked around within limits, so the change under load seems more relevant.
Flent's RRUL test, Apple's networkQuality/RPM, and iperf2's bounceback test all offer methods to assess latency change under load**, as do Waveform's bufferbloat test and even, to a degree, Ookla's speedtest.net. IMHO a suitable metric would be something like latency increase under load or Apple's responsiveness measure RPM (basically the inverse of the latency under load, calculated on a per-minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency-under-load numbers where smaller is better).
IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded RPM: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. It does report the base RTT, which can be converted to RPM. As an example:
macbook:~ user$ networkQuality -v
==== SUMMARY ====
Upload capacity: 24.341 Mbps
Download capacity: 91.951 Mbps
Upload flows: 20
Download flows: 16
Responsiveness: High (2123 RPM)
Base RTT: 16
Start: 10/23/22, 13:44:39
End: 10/23/22, 13:44:53
OS Version: Version 12.6 (Build 21G115)
Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction of 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
Without competent AQM/shaping I get:
==== SUMMARY ====
Upload capacity: 15.101 Mbps
Download capacity: 97.664 Mbps
Upload flows: 20
Download flows: 12
Responsiveness: Medium (427 RPM)
Base RTT: 16
Start: 10/23/22, 13:51:50
End: 10/23/22, 13:52:06
OS Version: Version 12.6 (Build 21G115)
latency under load: 60000/427 = 140.52 ms
base RPM: 60000/16 = 3750 RPM
reduction RPM: 100-100*427/3750 = 88.6%
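The arithmetic in both examples can be reproduced with a small sketch. The function names are made up for illustration and are not part of networkQuality. It assumes the reported Base RTT is in milliseconds.

```python
def rpm_to_latency_ms(rpm):
    """Round trips per minute -> milliseconds per round trip."""
    return 60_000 / rpm


def rtt_ms_to_rpm(rtt_ms):
    """Milliseconds per round trip -> round trips per minute."""
    return 60_000 / rtt_ms


def responsiveness_loss_percent(loaded_rpm, base_rtt_ms):
    """Percentage by which load reduces RPM relative to the unloaded link."""
    base_rpm = rtt_ms_to_rpm(base_rtt_ms)
    return 100 - 100 * loaded_rpm / base_rpm


# The two runs quoted above, both with a Base RTT of 16 ms.
for label, loaded_rpm in (("with AQM", 2123), ("without AQM", 427)):
    print(
        f"{label}: {rpm_to_latency_ms(loaded_rpm):.2f} ms under load, "
        f"base {rtt_ms_to_rpm(16):.0f} RPM, "
        f"reduction {responsiveness_loss_percent(loaded_rpm, 16):.1f}%"
    )
```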
I understand Apple's desire to have a single reported number with a single qualifier (medium/high/...), because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact.)
Regards
Sebastian
*) Seemingly, because quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislature seems to agree, see Regulation (EU) 2015/2120).
**) Most do so by creating load themselves and measuring throughput at the same time; bounceback, IIUC, will focus on the latency measurement and leave the load generation optional (so it offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
One of the best talks I've ever seen on how to measure customer
satisfaction properly just went up after the P99 Conference.
It's called Misery Metrics.
After a deep dive into why the ways we think about and act on
percentiles, bins, and other statistical methods for how we use the
web and internet are *so wrong* (well worth watching and thinking
about if you are relying on or creating network metrics today), it
then points to the real metrics that matter to users and the ultimate
success of an internet business: Timeouts, retries, misses, failed
queries, angry phone calls, abandoned shopping carts and loss of
engagement.
https://www.p99conf.io/session/misery-metrics-consequences/
The ending advice was - don't aim to make a specific percentile
acceptable, aim for an acceptable % of misery.
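As a rough illustration of "an acceptable % of misery", the sketch below counts the share of sessions that hit at least one misery event. The session fields and the rule in `is_miserable` are hypothetical and are not taken from the talk.

```python
def misery_percent(sessions, is_miserable):
    """Percentage of sessions for which is_miserable(session) is true."""
    if not sessions:
        return 0.0
    miserable = sum(1 for s in sessions if is_miserable(s))
    return 100 * miserable / len(sessions)


def is_miserable(session):
    # Hypothetical rule: any timeout, or three or more retries.
    return session["timeouts"] > 0 or session["retries"] >= 3


sessions = [
    {"timeouts": 0, "retries": 0},
    {"timeouts": 1, "retries": 0},
    {"timeouts": 0, "retries": 5},
    {"timeouts": 0, "retries": 1},
]
print(f"{misery_percent(sessions, is_miserable):.1f}% of sessions were miserable")
```

Unlike a latency percentile, this number answers directly how many users had a bad time, whatever the rule for "bad" turns out to be.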
I enjoyed the p99 conference more than any conference I've attended in years.
--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC
--
You received this message because you are subscribed to the Google Groups "discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
_______________________________________________
Rpm mailing list
Rpm@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/rpm
_______________________________________________
Starlink mailing list
Starlink@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/starlink
--
David Collier-Brown, | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com | -- Mark Twain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 12:17 ` Dave Collier-Brown
@ 2022-10-23 12:26 ` Sebastian Moeller
2022-10-23 13:11 ` Dave Collier-Brown
0 siblings, 1 reply; 30+ messages in thread
From: Sebastian Moeller @ 2022-10-23 12:26 UTC (permalink / raw)
To: Dave Collier-Brown; +Cc: starlink
Hi David,
> On Oct 23, 2022, at 14:17, Dave Collier-Brown via Starlink <starlink@lists.bufferbloat.net> wrote:
>
> If our business-transaction customers are made miserable by timeouts, by analogy it follows that home internet users are made miserable by
>
> • stalls, "buffering" and complete disappearance in conference-calls
> • "shouting down the well" distortion in any kind of audio
> Disappearance is probably disconnection, and easy to measure
>
> The others are delay-related, and can be computed from timestamps and sequence numbers.
>
> Can we provide a tool to expose the latter? A "miserometer"?
[SM] Kathy Nichols' pping (https://github.com/pollere/pping) might be an option, either on the ISP side or run on CPEs with some method to harvest the collected data from the ISP side. Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin bit, if it is present. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, at the risk of making congestion slightly worse). None of this will allow measuring congestion within the home network, though, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
Regards
Sebastian
*) I think there are dedicated devices available that can ping large numbers of IPs in a periodic fashion.
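The active-polling idea could look roughly like the sketch below, which only decides how long to wait before the next probe and leaves the actual ping out. The class name `CpePoller`, the intervals, and the "more than twice the baseline RTT" anomaly rule are all assumptions for illustration.

```python
class CpePoller:
    """Chooses the probe interval for one CPE from its recent RTT samples."""

    def __init__(self, normal_s=1.0, fast_s=0.1, anomaly_factor=2.0):
        self.normal_s = normal_s              # coarse polling, about once per second
        self.fast_s = fast_s                  # faster polling while things look wrong
        self.anomaly_factor = anomaly_factor  # hypothetical anomaly threshold
        self.baseline_ms = None               # lowest RTT seen so far

    def observe(self, rtt_ms):
        """Record one probe result (None means lost) and return the
        number of seconds to wait before the next probe."""
        if rtt_ms is None:
            return self.fast_s
        if self.baseline_ms is None or rtt_ms < self.baseline_ms:
            self.baseline_ms = rtt_ms
        anomalous = rtt_ms > self.anomaly_factor * self.baseline_ms
        return self.fast_s if anomalous else self.normal_s
```

The fast interval is the knob that trades resolution against extra probe traffic on an already congested link.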
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 12:26 ` Sebastian Moeller
@ 2022-10-23 13:11 ` Dave Collier-Brown
2022-10-23 13:52 ` Sebastian Moeller
0 siblings, 1 reply; 30+ messages in thread
From: Dave Collier-Brown @ 2022-10-23 13:11 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: starlink
On 10/23/22 08:26, Sebastian Moeller wrote:
> [SM] Kathy Nichols' pping (https://github.com/pollere/pping) might be an option, either on the ISP side or run on CPEs with some method to harvest the collected data from the ISP side.
Yes: I use pping to investigate occasional problems at work, but I was
thinking more about home networks, where some big speed-changes happen
and local congestion happens.
> Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin bit, if it is present. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, at the risk of making congestion slightly worse). None of this will allow measuring congestion within the home network, though, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
If one has a good way to capture RTT and data rate for one problematic
app, say zoom, then one could see that network problems were happening
at the same time as lags and dropouts.
ISPs would positively /hate/ that, of course.
--dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 13:11 ` Dave Collier-Brown
@ 2022-10-23 13:52 ` Sebastian Moeller
2022-10-23 14:00 ` Dave Collier-Brown
2022-10-23 14:08 ` Dave Collier-Brown
0 siblings, 2 replies; 30+ messages in thread
From: Sebastian Moeller @ 2022-10-23 13:52 UTC (permalink / raw)
To: Dave Collier-Brown; +Cc: starlink
Hi David,
> On Oct 23, 2022, at 15:11, Dave Collier-Brown <dave.collier-Brown@indexexchange.com> wrote:
>
> On 10/23/22 08:26, Sebastian Moeller wrote:
>
>> [SM] Kathy Nichols' pping (https://github.com/pollere/pping) might be an option, either on the ISP side or run on CPEs with some method to harvest the collected data from the ISP side.
> Yes: I use pping to investigate occasional problems at work, but I was thinking more about home networks, where some big speed-changes happen and local congestion happens.
[SM] Okay. In the context of cake-autorate (https://github.com/lynxthecat/CAKE-autorate/blob/main/README.md) we implemented flight-recorder-style logging that continuously logs the last X (configurable) epoch and stores both the shaper rates and the achieved rates as well as the results from the latency probes. The script can be run with rate setting disabled to record the relevant data; the user just needs to remember to export the data after experiencing interesting/abnormal events. Sure, this does not have per-application resolution, but it should give some idea about current latency as well as current traffic. I will admit, though, that this logging is not exactly cheap CPU-wise and lacks the precision of packet captures... but it can be operated as a flight recorder from which relevant events can be exported/stored post hoc...
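To illustrate the flight-recorder idea only (this is not cake-autorate's actual code, which I have not reproduced here), a minimal Python sketch could keep a time-bounded buffer in memory and dump it on request. The class name, field names, and the 600-second window are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, asdict
import json
import time


@dataclass
class Sample:
    timestamp: float           # seconds since the Unix epoch
    shaper_rate_kbps: float    # rate the shaper was set to
    achieved_rate_kbps: float  # rate actually observed
    rtt_ms: float              # result of one latency probe


class FlightRecorder:
    """Keep only the most recent `window_s` seconds of samples in memory."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self._samples = deque()

    def record(self, sample: Sample) -> None:
        self._samples.append(sample)
        # Drop everything older than the retention window.
        cutoff = sample.timestamp - self.window_s
        while self._samples and self._samples[0].timestamp < cutoff:
            self._samples.popleft()

    def export(self) -> str:
        """Serialize the retained samples, one JSON object per line."""
        return "\n".join(json.dumps(asdict(s)) for s in self._samples)


if __name__ == "__main__":
    recorder = FlightRecorder(window_s=600)  # hypothetical retention window
    now = time.time()
    for i in range(5):
        recorder.record(Sample(now + i, 20000.0, 18500.0, 25.0 + i))
    print(recorder.export())
```

The user-facing step is then just "call export() after something odd happened"; memory use is bounded by the window length times the sampling rate.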
>
>> Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin bit, if that exists. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, thereby risking making congestion slightly worse). None of this will allow measuring congestion within the home network, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
> If one has a good way to capture RTT and data rate for one problematic app, say zoom, then one could see that network problems were happening at the same time as lags and dropouts.
[SM] As above, logging all traffic is relatively easy; per-application or per-flow logging will require different tools or packet captures...
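For the active-polling fallback quoted above (probe each CPE roughly once per second and poll faster once an anomaly shows up), the scheduling logic might look like the following Python sketch. The probe itself is passed in as a function so that no particular ping tool is assumed; the 0.2 s fast interval and the factor of 2 over baseline are made-up illustration values, not recommendations.

```python
from typing import Callable, Iterator, Optional, Tuple


def next_interval(
    rtt_ms: Optional[float],
    baseline_ms: float,
    *,
    normal_s: float = 1.0,        # "once per second or so"
    fast_s: float = 0.2,          # hypothetical faster rate after an anomaly
    anomaly_factor: float = 2.0,  # hypothetical anomaly threshold
) -> float:
    """Return the delay before the next probe.

    A lost probe (rtt_ms is None) or an RTT above anomaly_factor * baseline
    counts as an anomaly and switches to the faster polling interval.
    """
    if rtt_ms is None or rtt_ms > anomaly_factor * baseline_ms:
        return fast_s
    return normal_s


def poll(
    probe: Callable[[], Optional[float]], baseline_ms: float, count: int
) -> Iterator[Tuple[Optional[float], float]]:
    """Run `count` probes; yield (rtt_ms, seconds to wait before the next one)."""
    for _ in range(count):
        rtt = probe()
        yield rtt, next_interval(rtt, baseline_ms)


if __name__ == "__main__":
    # Canned RTTs stand in for real ping results; None marks a lost probe.
    canned = iter([16.0, 17.5, 80.0, None, 18.0])
    for rtt, wait_s in poll(lambda: next(canned), baseline_ms=16.0, count=5):
        print(rtt, wait_s)
```

A real poller would sleep for the yielded interval between probes; as noted in the quoted text, the faster rate adds a little load exactly when the link may already be congested.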
> ISPs would positively hate that, of course.
[SM] That assumes they come out of this looking bad; if the outcome implies that the local WiFi is the root cause, ISPs might actually appreciate it ;)
Regards
Sebastian
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 13:52 ` Sebastian Moeller
@ 2022-10-23 14:00 ` Dave Collier-Brown
2022-10-23 14:08 ` Sebastian Moeller
2022-10-23 14:08 ` Dave Collier-Brown
1 sibling, 1 reply; 30+ messages in thread
From: Dave Collier-Brown @ 2022-10-23 14:00 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: starlink
On 10/23/22 09:52, Sebastian Moeller wrote:
> [SM] Okay. In the context of cake-autorate (https://github.com/lynxthecat/CAKE-autorate/blob/main/README.md) we implemented flight-recorder-style logging that continuously logs the last X (configurable) epoch and stores both the shaper rates and the achieved rates as well as the results from the latency probes. The script can be run with rate setting disabled to record the relevant data; the user just needs to remember to export the data after experiencing interesting/abnormal events. Sure, this does not have per-application resolution, but it should give some idea about current latency as well as current traffic. I will admit, though, that this logging is not exactly cheap CPU-wise and lacks the precision of packet captures... but it can be operated as a flight recorder from which relevant events can be exported/stored post hoc...
>
Ah, now that sounds interesting. I especially like the use of "flight
recorder", as one could express using it as "whenever your MS Teams
session crashes ..."
>>> Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin bit, if that exists. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, thereby risking making congestion slightly worse). None of this will allow measuring congestion within the home network, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
>> If one has a good way to capture RTT and data rate for one problematic app, say zoom, then one could see that network problems were happening at the same time as lags and dropouts.
> [SM] As above, logging all traffic is relatively easy; per-application or per-flow logging will require different tools or packet captures...
>
>> ISPs would positively hate that, of course.
> [SM] That assumes they come out of this looking bad; if the outcome implies that the local WiFi is the root cause, ISPs might actually appreciate it ;)
Ah, I hadn't thought of that!
--dave
--
David Collier-Brown, | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com | -- Mark Twain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 14:00 ` Dave Collier-Brown
@ 2022-10-23 14:08 ` Sebastian Moeller
0 siblings, 0 replies; 30+ messages in thread
From: Sebastian Moeller @ 2022-10-23 14:08 UTC (permalink / raw)
To: Dave Collier-Brown; +Cc: starlink
Hi David,
> On Oct 23, 2022, at 16:00, Dave Collier-Brown <dave.collier-Brown@indexexchange.com> wrote:
>
>
> On 10/23/22 09:52, Sebastian Moeller wrote:
>> [SM] Okay. In the context of cake-autorate (https://github.com/lynxthecat/CAKE-autorate/blob/main/README.md) we implemented flight-recorder-style logging that continuously logs the last X (configurable) epoch and stores both the shaper rates and the achieved rates as well as the results from the latency probes. The script can be run with rate setting disabled to record the relevant data; the user just needs to remember to export the data after experiencing interesting/abnormal events. Sure, this does not have per-application resolution, but it should give some idea about current latency as well as current traffic. I will admit, though, that this logging is not exactly cheap CPU-wise and lacks the precision of packet captures... but it can be operated as a flight recorder from which relevant events can be exported/stored post hoc...
>>
> Ah, now that sounds interesting. I especially like the use of "flight recorder", as one could express using it as "whenever your MS Teams session crashes ..."
[SM] Yepp, there is a cost though: such a log file easily gets into the multi-dozen-MB size range if the logging interval is set high, but the last 10 minutes should not be that costly (then again, many routers are both storage- and RAM-limited, so this might be a problem).
>>>> Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin bit, if that exists. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, thereby risking making congestion slightly worse). None of this will allow measuring congestion within the home network, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
>>> If one has a good way to capture RTT and data rate for one problematic app, say zoom, then one could see that network problems were happening at the same time as lags and dropouts.
>> [SM] As above, logging all traffic is relatively easy; per-application or per-flow logging will require different tools or packet captures...
>>
>>> ISPs would positively hate that, of course.
>> [SM] That assumes they come out of this looking bad; if the outcome implies that the local WiFi is the root cause, ISPs might actually appreciate it ;)
>
> Ah, I hadn't thought of that!
[SM] I admit that currently any such localization would require an additional trace showing congestion/latency increase from internal hosts, while the autorate logs show no such issue on the WAN link. This is where iperf2's bounceback could be helpful.
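As a rough sketch of that localization step, assuming one already has RTT samples taken from an internal host plus the WAN-side latency samples from the router log, the comparison could be as simple as the Python below. The function names, the use of the median, and the 30 ms threshold are all assumptions for illustration; neither cake-autorate nor iperf2 is being quoted here.

```python
from statistics import median


def latency_increase_ms(samples_ms, baseline_ms):
    """Median RTT increase over the idle baseline, floored at zero."""
    return max(0.0, median(samples_ms) - baseline_ms)


def localize(lan_samples_ms, lan_baseline_ms,
             wan_samples_ms, wan_baseline_ms,
             threshold_ms=30.0):  # hypothetical "this hurts" threshold
    """Guess where the latency increase originates.

    lan_*: RTTs measured from an internal host (e.g. over WiFi).
    wan_*: RTTs measured by the router across the access link.
    """
    lan_bad = latency_increase_ms(lan_samples_ms, lan_baseline_ms) > threshold_ms
    wan_bad = latency_increase_ms(wan_samples_ms, wan_baseline_ms) > threshold_ms
    if wan_bad:
        return "access link"
    if lan_bad:
        return "home network"
    return "no congestion seen"


if __name__ == "__main__":
    # Internal host sees a large increase while the WAN trace stays flat.
    print(localize([60, 70, 80], 5, [17, 16, 18], 16))
```

When the WAN trace is itself bad the sketch blames the access link first, since the internal trace cannot separate the two in that case.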
Regards
Sebastian
>
> --dave
>
>
>>
>> Regards
>> Sebastian
>>
>>> --dave
>>>
>>>
>>>
>>>
>>>> Regards
>>>> Sebastian
>>>>
>>>> *) I think there are dedicated devices available that allow to ping large numbers of IPs in a periodic fashion.
>>>>
>>>>
>>>>
>>>>> --dave
>>>>>
>>>>>
>>>>>
>>>>> On 10/23/22 07:57, Sebastian Moeller via Starlink wrote:
>>>>>
>>>>>> Hi Glenn,
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>>>>>>
>>>>>>> As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>>>>>>
>>>>>>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>>>>>>
>>>>>>> I.e. my download speed sucks. Measure the download speed.
>>>>>>>
>>>>>>> My isp doesn't fix my problem. Measure what? How?
>>>>>>>
>>>>>>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>>>>>>
>>>>>>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>>>>>>
>>>>>>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>>>>>>>
>>>>>>>
>>>>>> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, albeit static high latency can be worked around within limits, so the change under load seems more relevant.
>>>>>> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do Waveform's bufferbloat tests and even to a degree Ookla's speedtest.net. IMHO a good candidate is something like latency increase under load or Apple's responsiveness measure RPM (basically the inverse of the latency under load calculated on a per minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency under load numbers where smaller is better).
>>>>>> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded one: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT, which can be converted to RPM. As an example:
>>>>>>
>>>>>> macbook:~ user$ networkQuality -v
>>>>>> ==== SUMMARY ====
>>>>>> Upload capacity: 24.341 Mbps
>>>>>> Download capacity: 91.951 Mbps
>>>>>> Upload flows: 20
>>>>>> Download flows: 16
>>>>>> Responsiveness: High (2123 RPM)
>>>>>> Base RTT: 16
>>>>>> Start: 10/23/22, 13:44:39
>>>>>> End: 10/23/22, 13:44:53
>>>>>> OS Version: Version 12.6 (Build 21G115)
>>>>>>
>>>>>> Here 2123 RPM corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>>>>>>
>>>>>> Without competent AQM/shaping I get:
>>>>>> ==== SUMMARY ====
>>>>>> Upload capacity: 15.101 Mbps
>>>>>> Download capacity: 97.664 Mbps
>>>>>> Upload flows: 20
>>>>>> Download flows: 12
>>>>>> Responsiveness: Medium (427 RPM)
>>>>>> Base RTT: 16
>>>>>> Start: 10/23/22, 13:51:50
>>>>>> End: 10/23/22, 13:52:06
>>>>>> OS Version: Version 12.6 (Build 21G115)
>>>>>> latency under load: 60000/427 = 140.52 ms
>>>>>> base RPM: 60000/16 = 3750 RPM
>>>>>> reduction RPM: 100-100*427/3750 = 88.6%
>>>>>>
>>>>>>
>>>>>> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>>>>>>
>>>>>> Regards
>>>>>> Sebastian
>>>>>>
>>>>>>
>>>>>>
>>>>>> *) Seemingly, as quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates into speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislature seems to agree, see EU regulation 2015/2120).
>>>>>>
>>>>>> **) Most do by creating load themselves and measuring throughput at the same time, bounceback IIUC will focus on the latency measurement and leave the load generation optional (so offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht
>>>>>>>
>>>>>>> <dave.taht@gmail.com>
>>>>>>>
>>>>>>> wrote:
>>>>>>> One of the best talks I've ever seen on how to measure customer
>>>>>>> satisfaction properly just went up after the P99 Conference.
>>>>>>>
>>>>>>> It's called Misery Metrics.
>>>>>>>
>>>>>>> After going through a deep dive as to why and how we think and act on
>>>>>>> percentiles, bins, and other statistical methods as to how we use the
>>>>>>> web and internet are *so wrong* (well worth watching and thinking
>>>>>>> about if you are relying on or creating network metrics today), it
>>>>>>> then points to the real metrics that matter to users and the ultimate
>>>>>>> success of an internet business: Timeouts, retries, misses, failed
>>>>>>> queries, angry phone calls, abandoned shopping carts and loss of
>>>>>>> engagement.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The ending advice was - don't aim to make a specific percentile
>>>>>>> acceptable, aim for an acceptable % of misery.
>>>>>>>
>>>>>>> I enjoyed the p99 conference more than any conference I've attended in years.
>>>>>>>
>>>>>>> --
>>>>>>> This song goes out to all the folk that thought Stadia would work:
>>>>>>>
>>>>>>>
>>>>>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>>>>>>
>>>>>>>
>>>>>>> Dave Täht CEO, TekLibre, LLC
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google Groups "discuss" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to
>>>>>>>
>>>>>>> discuss+unsubscribe@measurementlab.net
>>>>>>>
>>>>>>> .
>>>>>>> To view this discussion on the web visit
>>>>>>>
>>>>>>> https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com
>>>>>>>
>>>>>>> .
>>>>>>> _______________________________________________
>>>>>>> Rpm mailing list
>>>>>>>
>>>>>>>
>>>>>>> Rpm@lists.bufferbloat.net
>>>>>>> https://lists.bufferbloat.net/listinfo/rpm
>>>>>> _______________________________________________
>>>>>> Starlink mailing list
>>>>>>
>>>>>>
>>>>>> Starlink@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/starlink
>>>>> --
>>>>> David Collier-Brown, | Always do right. This will gratify
>>>>> System Programmer and Author | some people and astonish the rest
>>>>>
>>>>>
>>>>> dave.collier-brown@indexexchange.com
>>>>> | -- Mark Twain
>>>>>
>>>>> CONFIDENTIALITY NOTICE AND DISCLAIMER : This telecommunication, including any and all attachments, contains confidential information intended only for the person(s) to whom it is addressed. Any dissemination, distribution, copying or disclosure is strictly prohibited and is not a waiver of confidentiality. If you have received this telecommunication in error, please notify the sender immediately by return electronic mail and delete the message from your inbox and deleted items folders. This telecommunication does not constitute an express or implied agreement to conduct transactions by electronic means, nor does it constitute a contract offer, a contract amendment or an acceptance of a contract offer. Contract terms contained in this telecommunication are subject to legal review and the completion of formal documentation and are not binding until same is confirmed in writing and has been signed by an authorized signatory.
>>>>>
>>>>> _______________________________________________
>>>>> Starlink mailing list
>>>>>
>>>>> Starlink@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/starlink
>>> --
>>> David Collier-Brown, | Always do right. This will gratify
>>> System Programmer and Author | some people and astonish the rest
>>>
>>> dave.collier-brown@indexexchange.com | -- Mark Twain
>
> --
> David Collier-Brown, | Always do right. This will gratify
> System Programmer and Author | some people and astonish the rest
> dave.collier-brown@indexexchange.com | -- Mark Twain
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 13:52 ` Sebastian Moeller
2022-10-23 14:00 ` Dave Collier-Brown
@ 2022-10-23 14:08 ` Dave Collier-Brown
2022-10-23 15:01 ` tom
1 sibling, 1 reply; 30+ messages in thread
From: Dave Collier-Brown @ 2022-10-23 14:08 UTC (permalink / raw)
To: Sebastian Moeller; +Cc: starlink
[-- Attachment #1: Type: text/plain, Size: 15374 bytes --]
OK, it's pretty clear that we're already measuring and adapting to
misery, does anyone have a good reason to want to provide a "misery meter"?
I'd normally be tempted, but I'm working in the ML team in a startup,
and have been having trouble even /reading/ email this year (;-))
--dave
On 10/23/22 09:52, Sebastian Moeller wrote:
> [EXTERNAL] This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
> Hi David,
>
>
>> On Oct 23, 2022, at 15:11, Dave Collier-Brown<dave.collier-Brown@indexexchange.com> wrote:
>>
>> On 10/23/22 08:26, Sebastian Moeller wrote:
>>
>>> [SM] Kathy Nichols' pping (https://github.com/pollere/pping) might be an option, either on the ISP side or run on CPEs with some method to harvest the collected data from the ISP side.
>> Yes: I use pping to investigate occasional problems at work, but I was thinking more about home networks, where some big speed-changes happen and local congestion happens.
> [SM] Okay. In the context of cake-autorate (https://github.com/lynxthecat/CAKE-autorate/blob/main/README.md) we implemented a flight recorder type logging that continuously logs the last X (configurable) epochs and stores both shaper and achieved rates as well as the results from the latency probes. This script can be used with rate setting disabled to record relevant data, and the user just needs to remember to export the data after experiencing interesting/abnormal events. Sure, this does not have per-application resolution, but it should give some idea about current latency as well as current traffic. I will admit though that this logging is not exactly cheap CPU-wise and lacks the precision of packet captures... but it can be operated as a flight recorder where relevant events can be exported/stored post-hoc...
>
>
>>> Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin-bit if that exists. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, at the risk of making congestion slightly worse). None of this will allow measuring congestion within the home network though, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
>> If one has a good way to capture RTT and data rate for one problematic app, say zoom, then one could see that network problems were happening at the same time as lags and dropouts.
> [SM] As above logging all traffic is relatively easy, per application or per flow will require different tools or packet captures...
>
>> ISPs would positively hate that, of course.
> [SM] Assuming they come out of this looking bad, if the outcome is to imply the local WiFi being the root cause ISPs might actually appreciate it ;)
>
> Regards
> Sebastian
>
>> --dave
>>
>>
>>
>>
>>> Regards
>>> Sebastian
>>>
>>> *) I think there are dedicated devices available that allow to ping large numbers of IPs in a periodic fashion.
>>>
>>>
>>>
>>>> --dave
>>>>
>>>>
>>>>
>>>> On 10/23/22 07:57, Sebastian Moeller via Starlink wrote:
>>>>
>>>>> Hi Glenn,
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm<rpm@lists.bufferbloat.net>
>>>>>>
>>>>>> wrote:
>>>>>>
>>>>>> As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>>>>>
>>>>>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>>>>>
>>>>>> I.e. my download speed sucks. Measure the download speed.
>>>>>>
>>>>>> My isp doesn't fix my problem. Measure what? How?
>>>>>>
>>>>>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>>>>>
>>>>>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>>>>>
>>>>>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>>>>>>
>>>>>>
>>>>> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, albeit static high latency can be worked around within limits, so the change under load seems more relevant.
>>>>> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do Waveform's bufferbloat tests and even to a degree Ookla's speedtest.net. IMHO a good candidate is something like latency increase under load or Apple's responsiveness measure RPM (basically the inverse of the latency under load calculated on a per minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency under load numbers where smaller is better).
>>>>> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded one: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT, which can be converted to RPM. As an example:
>>>>>
>>>>> macbook:~ user$ networkQuality -v
>>>>> ==== SUMMARY ====
>>>>> Upload capacity: 24.341 Mbps
>>>>> Download capacity: 91.951 Mbps
>>>>> Upload flows: 20
>>>>> Download flows: 16
>>>>> Responsiveness: High (2123 RPM)
>>>>> Base RTT: 16
>>>>> Start: 10/23/22, 13:44:39
>>>>> End: 10/23/22, 13:44:53
>>>>> OS Version: Version 12.6 (Build 21G115)
>>>>>
>>>>> Here 2123 RPM corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>>>>>
>>>>> Without competent AQM/shaping I get:
>>>>> ==== SUMMARY ====
>>>>> Upload capacity: 15.101 Mbps
>>>>> Download capacity: 97.664 Mbps
>>>>> Upload flows: 20
>>>>> Download flows: 12
>>>>> Responsiveness: Medium (427 RPM)
>>>>> Base RTT: 16
>>>>> Start: 10/23/22, 13:51:50
>>>>> End: 10/23/22, 13:52:06
>>>>> OS Version: Version 12.6 (Build 21G115)
>>>>> latency under load: 60000/427 = 140.52 ms
>>>>> base RPM: 60000/16 = 3750 RPM
>>>>> reduction RPM: 100-100*427/3750 = 88.6%
>>>>>
>>>>>
>>>>> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>>>>>
>>>>> Regards
>>>>> Sebastian
>>>>>
>>>>>
>>>>>
>>>>> *) Seemingly, as quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates into speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislature seems to agree, see EU regulation 2015/2120).
>>>>>
>>>>> **) Most do by creating load themselves and measuring throughput at the same time, bounceback IIUC will focus on the latency measurement and leave the load generation optional (so offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht
>>>>>>
>>>>>> <dave.taht@gmail.com>
>>>>>>
>>>>>> wrote:
>>>>>> One of the best talks I've ever seen on how to measure customer
>>>>>> satisfaction properly just went up after the P99 Conference.
>>>>>>
>>>>>> It's called Misery Metrics.
>>>>>>
>>>>>> After going through a deep dive as to why and how we think and act on
>>>>>> percentiles, bins, and other statistical methods as to how we use the
>>>>>> web and internet are *so wrong* (well worth watching and thinking
>>>>>> about if you are relying on or creating network metrics today), it
>>>>>> then points to the real metrics that matter to users and the ultimate
>>>>>> success of an internet business: Timeouts, retries, misses, failed
>>>>>> queries, angry phone calls, abandoned shopping carts and loss of
>>>>>> engagement.
>>>>>>
>>>>>>
>>>>>>
>>>>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>>>>
>>>>>>
>>>>>>
>>>>>> The ending advice was - don't aim to make a specific percentile
>>>>>> acceptable, aim for an acceptable % of misery.
>>>>>>
>>>>>> I enjoyed the p99 conference more than any conference I've attended in years.
>>>>>>
>>>>>> --
>>>>>> This song goes out to all the folk that thought Stadia would work:
>>>>>>
>>>>>>
>>>>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>>>>>
>>>>>>
>>>>>> Dave Täht CEO, TekLibre, LLC
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google Groups "discuss" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to
>>>>>>
>>>>>> discuss+unsubscribe@measurementlab.net
>>>>>>
>>>>>> .
>>>>>> To view this discussion on the web visit
>>>>>>
>>>>>> https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com
>>>>>>
>>>>>> .
>>>>>> _______________________________________________
>>>>>> Rpm mailing list
>>>>>>
>>>>>>
>>>>>> Rpm@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/rpm
>>>>> _______________________________________________
>>>>> Starlink mailing list
>>>>>
>>>>>
>>>>> Starlink@lists.bufferbloat.net
>>>>> https://lists.bufferbloat.net/listinfo/starlink
>>>> --
>>>> David Collier-Brown, | Always do right. This will gratify
>>>> System Programmer and Author | some people and astonish the rest
>>>>
>>>>
>>>> dave.collier-brown@indexexchange.com
>>>> | -- Mark Twain
>>>>
>>>>
>>>> _______________________________________________
>>>> Starlink mailing list
>>>>
>>>> Starlink@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/starlink
>> --
>> David Collier-Brown, | Always do right. This will gratify
>> System Programmer and Author | some people and astonish the rest
>>
>> dave.collier-brown@indexexchange.com | -- Mark Twain
--
David Collier-Brown, | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com | -- Mark Twain
[-- Attachment #2: Type: text/html, Size: 21509 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 14:08 ` Dave Collier-Brown
@ 2022-10-23 15:01 ` tom
0 siblings, 0 replies; 30+ messages in thread
From: tom @ 2022-10-23 15:01 UTC (permalink / raw)
To: 'Dave Collier-Brown', 'Sebastian Moeller'; +Cc: starlink
[-- Attachment #1: Type: text/plain, Size: 18587 bytes --]
I think a “good reason” is a way to meaningfully compare the usefulness of various ISPs for teleconferencing. Needs are obviously different from web browsing, email, or streaming, but critical to many people. Lots of federal dollars are going out to build out “broadband”, but generally with upload and download speed as the only metrics and no acknowledgement of the “misery” that jitter or just long latency can impose.
From: Starlink <starlink-bounces@lists.bufferbloat.net> On Behalf Of Dave Collier-Brown via Starlink
Sent: Sunday, October 23, 2022 10:08 AM
To: Sebastian Moeller <moeller0@gmx.de>
Cc: starlink@lists.bufferbloat.net
Subject: Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
OK, it's pretty clear that we're already measuring and adapting to misery, does anyone have a good reason to want to provide a "misery meter"?
I'd normally be tempted, but I'm working in the ML team in a startup, and have been having trouble even reading email this year (;-))
--dave
On 10/23/22 09:52, Sebastian Moeller wrote:
Hi David,
On Oct 23, 2022, at 15:11, Dave Collier-Brown <dave.collier-Brown@indexexchange.com> wrote:
On 10/23/22 08:26, Sebastian Moeller wrote:
[SM] Kathy Nichols' pping (https://github.com/pollere/pping) might be an option, either on the ISP side or run on CPEs with some method to harvest the collected data from the ISP side.
Yes: I use pping to investigate occasional problems at work, but I was thinking more about home networks, where some big speed-changes happen and local congestion happens.
[SM] Okay. In the context of cake-autorate (https://github.com/lynxthecat/CAKE-autorate/blob/main/README.md) we implemented a flight recorder type logging that continuously logs the last X (configurable) epochs and stores both shaper and achieved rates as well as the results from the latency probes. This script can be used with rate setting disabled to record relevant data, and the user just needs to remember to export the data after experiencing interesting/abnormal events. Sure, this does not have per-application resolution, but it should give some idea about current latency as well as current traffic. I will admit though that this logging is not exactly cheap CPU-wise and lacks the precision of packet captures... but it can be operated as a flight recorder where relevant events can be exported/stored post-hoc...
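To illustrate the flight-recorder idea only (this is not cake-autorate's code; the class, field names and units below are assumptions made up for the sketch), a bounded buffer that keeps the last X samples and can be dumped after the fact might look like:

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class Sample:
    t: float              # timestamp in seconds (hypothetical field)
    shaper_kbps: float    # rate the shaper was set to
    achieved_kbps: float  # rate actually observed
    rtt_ms: float         # result of a latency probe


class FlightRecorder:
    """Keep only the most recent max_samples entries, like a flight recorder."""

    def __init__(self, max_samples: int) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._buf: Deque[Sample] = deque(maxlen=max_samples)

    def log(self, sample: Sample) -> None:
        self._buf.append(sample)

    def export(self) -> List[Sample]:
        # Snapshot of what is currently retained, oldest first.
        return list(self._buf)


rec = FlightRecorder(max_samples=3)
for i in range(5):
    rec.log(Sample(t=float(i), shaper_kbps=90000, achieved_kbps=50000, rtt_ms=20.0 + i))
print([s.t for s in rec.export()])
```

The user would call something like export() after noticing a bad video call; everything older than the window is already gone, which is what keeps the storage cost bounded.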
Protocols with fewer readable fields, like QUIC, would require special care to evaluate the spin-bit if that exists. Or just resort to active polling and ping* each CPE once per second or so (for a coarse resolution; you could increase the polling rate on detecting anomalies, at the risk of making congestion slightly worse). None of this will allow measuring congestion within the home network though, but it might still be a worthwhile diagnostic to know that the access link is OK while the user reports latency issues.
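The "poll slowly, speed up on anomalies" part could be sketched roughly as below. The function name, the 1 s / 0.2 s intervals and the 30 ms threshold are invented for illustration and do not come from any existing tool:

```python
from typing import Optional


def next_poll_interval(rtt_ms: Optional[float],
                       baseline_ms: float,
                       slow_s: float = 1.0,
                       fast_s: float = 0.2,
                       threshold_ms: float = 30.0) -> float:
    """Return how many seconds to wait before pinging this CPE again.

    rtt_ms is the latest ping result, or None if the probe was lost.
    """
    # A lost probe, or an RTT well above baseline, counts as an anomaly:
    # poll faster to get a finer-grained picture of the event.
    if rtt_ms is None or rtt_ms - baseline_ms > threshold_ms:
        return fast_s
    return slow_s


print(next_poll_interval(rtt_ms=80.0, baseline_ms=16.0))
```

A real poller would probably also cap how long it stays in the fast mode, since the extra probes add (a little) load exactly when the link is already struggling.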
If one has a good way to capture RTT and data rate for one problematic app, say zoom, then one could see that network problems were happening at the same time as lags and dropouts.
[SM] As above logging all traffic is relatively easy, per application or per flow will require different tools or packet captures...
ISPs would positively hate that, of course.
[SM] Assuming they come out of this looking bad, if the outcome is to imply the local WiFi being the root cause ISPs might actually appreciate it ;)
Regards
Sebastian
--dave
Regards
Sebastian
*) I think there are dedicated devices available that allow to ping large numbers of IPs in a periodic fashion.
--dave
On 10/23/22 07:57, Sebastian Moeller via Starlink wrote:
Hi Glenn,
On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net>
wrote:
As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
I.e. my download speed sucks. Measure the download speed.
My isp doesn't fix my problem. Measure what? How?
Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
[SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important, depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, albeit static high latency can be worked around within limits, so the change under load seems more relevant.
All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do waveform's bufferbloat test and even, to a degree, Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM would be a good candidate (RPM is basically the inverse of the latency under load calculated on a per-minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency-under-load numbers where smaller is better).
IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT, which can be converted to RPM. As an example:
macbook:~ user$ networkQuality -v
==== SUMMARY ====
Upload capacity: 24.341 Mbps
Download capacity: 91.951 Mbps
Upload flows: 20
Download flows: 16
Responsiveness: High (2123 RPM)
Base RTT: 16
Start: 10/23/22, 13:44:39
End: 10/23/22, 13:44:53
OS Version: Version 12.6 (Build 21G115)
Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
Without competent AQM/shaping I get:
==== SUMMARY ====
Upload capacity: 15.101 Mbps
Download capacity: 97.664 Mbps
Upload flows: 20
Download flows: 12
Responsiveness: Medium (427 RPM)
Base RTT: 16
Start: 10/23/22, 13:51:50
End: 10/23/22, 13:52:06
OS Version: Version 12.6 (Build 21G115)
latency under load: 60000/427 = 140.52 ms
base RPM: 60000/16 = 3750 RPM
reduction RPM: 100-100*427/3750 = 88.6%
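The conversions used in the two examples above can be sketched in a few lines of Python. This only restates the arithmetic from this mail; the function names are made up for illustration, and it assumes the "Base RTT" value is in milliseconds:

```python
MS_PER_MINUTE = 60_000  # RPM counts round trips per minute


def rpm_to_ms(rpm: float) -> float:
    """Round trips per minute -> milliseconds per round trip."""
    return MS_PER_MINUTE / rpm


def ms_to_rpm(latency_ms: float) -> float:
    """Milliseconds per round trip -> round trips per minute."""
    return MS_PER_MINUTE / latency_ms


def responsiveness_loss_percent(base_rtt_ms: float, loaded_rpm: float) -> float:
    """Percentage by which load reduces RPM relative to the idle link."""
    base_rpm = ms_to_rpm(base_rtt_ms)
    return 100.0 - 100.0 * loaded_rpm / base_rpm


if __name__ == "__main__":
    base_rtt_ms = 16.0
    for label, loaded_rpm in (("with AQM", 2123), ("without AQM", 427)):
        print(f"{label}: {rpm_to_ms(loaded_rpm):.2f} ms under load, "
              f"base {ms_to_rpm(base_rtt_ms):.0f} RPM, "
              f"reduction {responsiveness_loss_percent(base_rtt_ms, loaded_rpm):.1f}%")
```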
I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
Regards
Sebastian
*) "Seemingly", as quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislator seems to agree, see EU Regulation 2015/2120).
**) Most do so by creating load themselves and measuring throughput at the same time; bounceback, IIUC, will focus on the latency measurement and leave the load generation optional (so it offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
One of the best talks I've ever seen on how to measure customer
satisfaction properly just went up after the P99 Conference.
It's called Misery Metrics.
After going through a deep dive as to why and how we think and act on
percentiles, bins, and other statistical methods as to how we use the
web and internet are *so wrong* (well worth watching and thinking
about if you are relying on or creating network metrics today), it
then points to the real metrics that matter to users and the ultimate
success of an internet business: Timeouts, retries, misses, failed
queries, angry phone calls, abandoned shopping carts and loss of
engagement.
https://www.p99conf.io/session/misery-metrics-consequences/
The ending advice was - don't aim to make a specific percentile
acceptable, aim for an acceptable % of misery.
I enjoyed the p99 conference more than any conference I've attended in years.
--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC
--
You received this message because you are subscribed to the Google Groups "discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
_______________________________________________
Rpm mailing list
Rpm@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/rpm
_______________________________________________
Starlink mailing list
Starlink@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/starlink
--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dave.collier-brown@indexexchange.com |              -- Mark Twain
[-- Attachment #2: Type: text/html, Size: 27310 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 11:57 ` [Starlink] [Rpm] " Sebastian Moeller
2022-10-23 12:17 ` Dave Collier-Brown
@ 2022-10-24 20:08 ` Christoph Paasch
2022-10-24 20:57 ` Sebastian Moeller
2022-10-25 15:50 ` [Starlink] [ippm] " J Ignacio Alvarez-Hamelin
2022-10-25 16:07 ` [Starlink] " J Ignacio Alvarez-Hamelin
3 siblings, 1 reply; 30+ messages in thread
From: Christoph Paasch @ 2022-10-24 20:08 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Glenn Fishbine, Rpm, tsvwg IETF list, IETF IPPM WG,
Dave Taht via Starlink,
Measurement Analysis and Tools Working Group, discuss
[-- Attachment #1: Type: text/plain, Size: 7488 bytes --]
Hello Sebastian,
> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink <starlink@lists.bufferbloat.net> wrote:
>
> Hi Glenn,
>
>
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>
>> As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>
>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>
>> I.e. my download speed sucks. Measure the download speed.
>>
>> My isp doesn't fix my problem. Measure what? How?
>>
>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>
>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>
>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>
> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important, depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, albeit static high latency can be worked around within limits, so the change under load seems more relevant.
> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do waveform's bufferbloat test and even, to a degree, Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM would be a good candidate (RPM is basically the inverse of the latency under load calculated on a per-minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency-under-load numbers where smaller is better).
> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT, which can be converted to RPM. As an example:
>
> macbook:~ user$ networkQuality -v
> ==== SUMMARY ====
> Upload capacity: 24.341 Mbps
> Download capacity: 91.951 Mbps
> Upload flows: 20
> Download flows: 16
> Responsiveness: High (2123 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:44:39
> End: 10/23/22, 13:44:53
> OS Version: Version 12.6 (Build 21G115)
You should update to latest macOS:
$ networkQuality
==== SUMMARY ====
Uplink capacity: 326.789 Mbps
Downlink capacity: 446.359 Mbps
Responsiveness: High (2195 RPM)
Idle Latency: 5.833 milli-seconds
;-)
But, what I read is: You are suggesting that “Idle Latency” should be expressed in RPM as well? Or, Responsiveness expressed in milliseconds?
Christoph
>
> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>
> Without competent AQM/shaping I get:
> ==== SUMMARY ====
> Upload capacity: 15.101 Mbps
> Download capacity: 97.664 Mbps
> Upload flows: 20
> Download flows: 12
> Responsiveness: Medium (427 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:51:50
> End: 10/23/22, 13:52:06
> OS Version: Version 12.6 (Build 21G115)
> latency under load: 60000/427 = 140.52 ms
> base RPM: 60000/16 = 3750 RPM
> reduction RPM: 100-100*427/3750 = 88.6%
>
>
> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>
> Regards
> Sebastian
>
>
>
> *) "Seemingly", as quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislator seems to agree, see EU Regulation 2015/2120).
>
> **) Most do so by creating load themselves and measuring throughput at the same time; bounceback, IIUC, will focus on the latency measurement and leave the load generation optional (so it offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>
>
>>
>>
>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>> One of the best talks I've ever seen on how to measure customer
>> satisfaction properly just went up after the P99 Conference.
>>
>> It's called Misery Metrics.
>>
>> After going through a deep dive as to why and how we think and act on
>> percentiles, bins, and other statistical methods as to how we use the
>> web and internet are *so wrong* (well worth watching and thinking
>> about if you are relying on or creating network metrics today), it
>> then points to the real metrics that matter to users and the ultimate
>> success of an internet business: Timeouts, retries, misses, failed
>> queries, angry phone calls, abandoned shopping carts and loss of
>> engagement.
>>
>> https://www.p99conf.io/session/misery-metrics-consequences/
>>
>> The ending advice was - don't aim to make a specific percentile
>> acceptable, aim for an acceptable % of misery.
>>
>> I enjoyed the p99 conference more than any conference I've attended in years.
>>
>> --
>> This song goes out to all the folk that thought Stadia would work:
>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>> Dave Täht CEO, TekLibre, LLC
>>
>> --
>> You received this message because you are subscribed to the Google Groups "discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
>> To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>> _______________________________________________
>> Rpm mailing list
>> Rpm@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/rpm
>
> _______________________________________________
> Starlink mailing list
> Starlink@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/starlink
[-- Attachment #2: Type: text/html, Size: 43924 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-24 20:08 ` Christoph Paasch
@ 2022-10-24 20:57 ` Sebastian Moeller
2022-10-24 23:44 ` Christoph Paasch
0 siblings, 1 reply; 30+ messages in thread
From: Sebastian Moeller @ 2022-10-24 20:57 UTC (permalink / raw)
To: Christoph Paasch
Cc: Glenn Fishbine, Rpm, tsvwg IETF list, IETF IPPM WG,
Dave Taht via Starlink,
Measurement Analysis and Tools Working Group, discuss
Hi Christoph
> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com> wrote:
>
> Hello Sebastian,
>
>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink <starlink@lists.bufferbloat.net> wrote:
>>
>> Hi Glenn,
>>
>>
>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>>
>>> As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>>
>>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>>
>>> I.e. my download speed sucks. Measure the download speed.
>>>
>>> My isp doesn't fix my problem. Measure what? How?
>>>
>>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>>
>>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>>
>>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>>
>> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important, depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, albeit static high latency can be worked around within limits, so the change under load seems more relevant.
>> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do waveform's bufferbloat test and even, to a degree, Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM would be a good candidate (RPM is basically the inverse of the latency under load calculated on a per-minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency-under-load numbers where smaller is better).
>> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT, which can be converted to RPM. As an example:
>>
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
>
> You should update to latest macOS:
>
> $ networkQuality
> ==== SUMMARY ====
> Uplink capacity: 326.789 Mbps
> Downlink capacity: 446.359 Mbps
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
>
> ;-)
>
[SM] I wish... just updated to the latest and greatest for this hardware (A1398):
macbook-pro:DPZ smoeller$ networkQuality
==== SUMMARY ====
Upload capacity: 7.478 Mbps
Download capacity: 2.415 Mbps
Upload flows: 16
Download flows: 20
Responsiveness: Low (90 RPM)
macbook-pro:DPZ smoeller$ networkQuality -v
==== SUMMARY ====
Upload capacity: 5.830 Mbps
Download capacity: 6.077 Mbps
Upload flows: 12
Download flows: 20
Responsiveness: Low (56 RPM)
Base RTT: 134
Start: 10/24/22, 22:47:48
End: 10/24/22, 22:48:09
OS Version: Version 12.6.1 (Build 21G217)
macbook-pro:DPZ smoeller$
Still, I only see the "Base RTT" with the -v switch and I am not sure whether that is identical to your "Idle Latency".
I guess I need to convince my employer to exchange that macbook (actually because the battery starts bulging and not because I am behind with networkQuality versions ;) )
> But, what I read is: You are suggesting that “Idle Latency” should be expressed in RPM as well? Or, Responsiveness expressed in milliseconds?
[SM] Yes, I am fine with either (or both); the idea is to make it really easy to see whether/how much "working conditions" deteriorate the responsiveness / increase the latency-under-load. At least in verbose mode it would be sweet if networkQuality could expose that information.
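To illustrate what such a two-sided report could look like, here is a rough sketch that parses the verbose summary text as pasted above and returns idle and loaded figures in both units. It is not part of networkQuality; the function name and the dictionary keys are made up, and it assumes "Base RTT" is given in milliseconds:

```python
import re

# Verbose summary as pasted earlier in this mail (older macOS output format).
SAMPLE = """\
==== SUMMARY ====
Upload capacity: 5.830 Mbps
Download capacity: 6.077 Mbps
Upload flows: 12
Download flows: 20
Responsiveness: Low (56 RPM)
Base RTT: 134
"""


def summarize(text: str) -> dict:
    """Extract loaded RPM and base RTT, and express both in ms and RPM."""
    rpm_match = re.search(r"Responsiveness:\s*\w+\s*\((\d+)\s*RPM\)", text)
    rtt_match = re.search(r"Base RTT:\s*([\d.]+)", text)
    if rpm_match is None or rtt_match is None:
        raise ValueError("summary lacks Responsiveness or Base RTT line")
    loaded_rpm = int(rpm_match.group(1))
    idle_ms = float(rtt_match.group(1))  # assumed to be milliseconds
    return {
        "idle_ms": idle_ms,
        "idle_rpm": 60000 / idle_ms,
        "loaded_ms": 60000 / loaded_rpm,
        "loaded_rpm": loaded_rpm,
    }


if __name__ == "__main__":
    r = summarize(SAMPLE)
    print(f"idle:   {r['idle_ms']:.1f} ms  ({r['idle_rpm']:.0f} RPM)")
    print(f"loaded: {r['loaded_ms']:.1f} ms  ({r['loaded_rpm']} RPM)")
```

Printing both lines side by side makes the deterioration under working conditions visible at a glance, in whichever unit the reader prefers.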
Regards
Sebastian
>
>
> Christoph
>
>
>
>>
>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>>
>> Without competent AQM/shaping I get:
>> ==== SUMMARY ====
>> Upload capacity: 15.101 Mbps
>> Download capacity: 97.664 Mbps
>> Upload flows: 20
>> Download flows: 12
>> Responsiveness: Medium (427 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:51:50
>> End: 10/23/22, 13:52:06
>> OS Version: Version 12.6 (Build 21G115)
>> latency under load: 60000/427 = 140.52 ms
>> base RPM: 60000/16 = 3750 RPM
>> reduction RPM: 100-100*427/3750 = 88.6%
>>
>>
>> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>>
>> Regards
>> Sebastian
>>
>>
>>
>> *) "Seemingly", as quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislator seems to agree, see EU Regulation 2015/2120).
>>
>> **) Most do so by creating load themselves and measuring throughput at the same time; bounceback, IIUC, will focus on the latency measurement and leave the load generation optional (so it offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>>
>>
>>>
>>>
>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>>> One of the best talks I've ever seen on how to measure customer
>>> satisfaction properly just went up after the P99 Conference.
>>>
>>> It's called Misery Metrics.
>>>
>>> After going through a deep dive as to why and how we think and act on
>>> percentiles, bins, and other statistical methods as to how we use the
>>> web and internet are *so wrong* (well worth watching and thinking
>>> about if you are relying on or creating network metrics today), it
>>> then points to the real metrics that matter to users and the ultimate
>>> success of an internet business: Timeouts, retries, misses, failed
>>> queries, angry phone calls, abandoned shopping carts and loss of
>>> engagement.
>>>
>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>
>>> The ending advice was - don't aim to make a specific percentile
>>> acceptable, aim for an acceptable % of misery.
>>>
>>> I enjoyed the p99 conference more than any conference I've attended in years.
>>>
>>> --
>>> This song goes out to all the folk that thought Stadia would work:
>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>> Dave Täht CEO, TekLibre, LLC
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups "discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
>>> To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>>> _______________________________________________
>>> Rpm mailing list
>>> Rpm@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/rpm
>>
>> _______________________________________________
>> Starlink mailing list
>> Starlink@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/starlink
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-24 20:57 ` Sebastian Moeller
@ 2022-10-24 23:44 ` Christoph Paasch
2022-10-25 0:08 ` rjmcmahon
2022-10-25 2:28 ` [Starlink] [tsvwg] " Neal Cardwell
0 siblings, 2 replies; 30+ messages in thread
From: Christoph Paasch @ 2022-10-24 23:44 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Glenn Fishbine, Rpm, tsvwg IETF list, IETF IPPM WG,
Dave Taht via Starlink,
Measurement Analysis and Tools Working Group, discuss
[-- Attachment #1: Type: text/plain, Size: 9368 bytes --]
> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Christoph
>
>> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com> wrote:
>>
>> Hello Sebastian,
>>
>>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink <starlink@lists.bufferbloat.net> wrote:
>>>
>>> Hi Glenn,
>>>
>>>
>>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>>>
>>>> As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>>>
>>>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>>>
>>>> I.e. my download speed sucks. Measure the download speed.
>>>>
>>>> My isp doesn't fix my problem. Measure what? How?
>>>>
>>>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>>>
>>>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>>>
>>>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>>>
>>> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important, depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, albeit static high latency can be worked around within limits, so the change under load seems more relevant.
>>> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do waveform's bufferbloat test and even, to a degree, Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM would be a good candidate (RPM is basically the inverse of the latency under load calculated on a per-minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency-under-load numbers where smaller is better).
>>> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT, which can be converted to RPM. As an example:
>>>
>>> macbook:~ user$ networkQuality -v
>>> ==== SUMMARY ====
>>> Upload capacity: 24.341 Mbps
>>> Download capacity: 91.951 Mbps
>>> Upload flows: 20
>>> Download flows: 16
>>> Responsiveness: High (2123 RPM)
>>> Base RTT: 16
>>> Start: 10/23/22, 13:44:39
>>> End: 10/23/22, 13:44:53
>>> OS Version: Version 12.6 (Build 21G115)
>>
>> You should update to latest macOS:
>>
>> $ networkQuality
>> ==== SUMMARY ====
>> Uplink capacity: 326.789 Mbps
>> Downlink capacity: 446.359 Mbps
>> Responsiveness: High (2195 RPM)
>> Idle Latency: 5.833 milli-seconds
>>
>> ;-)
>>
>
>
> [SM] I wish... just updated to the latest and greatest for this hardware (A1398):
>
> macbook-pro:DPZ smoeller$ networkQuality
> ==== SUMMARY ====
> Upload capacity: 7.478 Mbps
> Download capacity: 2.415 Mbps
> Upload flows: 16
> Download flows: 20
> Responsiveness: Low (90 RPM)
> macbook-pro:DPZ smoeller$ networkQuality -v
> ==== SUMMARY ====
> Upload capacity: 5.830 Mbps
> Download capacity: 6.077 Mbps
> Upload flows: 12
> Download flows: 20
> Responsiveness: Low (56 RPM)
> Base RTT: 134
> Start: 10/24/22, 22:47:48
> End: 10/24/22, 22:48:09
> OS Version: Version 12.6.1 (Build 21G217)
> macbook-pro:DPZ smoeller$
>
> Still, I only see the "Base RTT" with the -v switch and I am not sure whether that is identical to your "Idle Latency".
>
>
> I guess I need to convince my employer to exchange that macbook (actually because the battery starts bulging and not because I am behind with networkQuality versions ;) )
Yes, you would need macOS Ventura to get the latest and greatest.
>> But, what I read is: You are suggesting that “Idle Latency” should be expressed in RPM as well? Or, Responsiveness expressed in millisecond ?
>
> [SM] Yes, I am fine with either (or both); the idea is to make it really easy to see whether/how much "working conditions" deteriorate the responsiveness / increase the latency-under-load. At least in verbose mode it would be sweet if networkQuality could expose that information.
I see - let me think about that…
Christoph
>>>
>>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction of 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>>>
>>> Without competent AQM/shaping I get:
>>> ==== SUMMARY ====
>>> Upload capacity: 15.101 Mbps
>>> Download capacity: 97.664 Mbps
>>> Upload flows: 20
>>> Download flows: 12
>>> Responsiveness: Medium (427 RPM)
>>> Base RTT: 16
>>> Start: 10/23/22, 13:51:50
>>> End: 10/23/22, 13:52:06
>>> OS Version: Version 12.6 (Build 21G115)
>>> latency under load: 60000/427 = 140.52 ms
>>> base RPM: 60000/16 = 3750 RPM
>>> reduction RPM: 100-100*427/3750 = 88.6%
>>>
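The arithmetic in the two examples above can be sketched in a few lines of Python. This is only an illustration: the function names are invented for this sketch and are not part of networkQuality, and it assumes that "Base RTT" is reported in milliseconds, as the text above treats it.

```python
# Sketch of the RPM <-> latency conversions used in the examples above.
# Assumption: "Base RTT" from networkQuality -v is in milliseconds.

MS_PER_MINUTE = 60_000.0


def rpm_from_latency_ms(latency_ms: float) -> float:
    """Round trips per minute that fit into the given latency."""
    return MS_PER_MINUTE / latency_ms


def latency_ms_from_rpm(rpm: float) -> float:
    """Latency in milliseconds implied by a responsiveness value in RPM."""
    return MS_PER_MINUTE / rpm


def responsiveness_loss_percent(base_rtt_ms: float, loaded_rpm: float) -> float:
    """Percent reduction from the unloaded (base) RPM to the loaded RPM."""
    base_rpm = rpm_from_latency_ms(base_rtt_ms)
    return 100.0 - 100.0 * loaded_rpm / base_rpm


if __name__ == "__main__":
    # (label, Base RTT in ms, Responsiveness in RPM) taken from the two runs above
    runs = [("with AQM", 16.0, 2123.0), ("without AQM", 16.0, 427.0)]
    for label, base_rtt_ms, loaded_rpm in runs:
        print(
            f"{label}: latency under load {latency_ms_from_rpm(loaded_rpm):.2f} ms, "
            f"base {rpm_from_latency_ms(base_rtt_ms):.0f} RPM, "
            f"reduction {responsiveness_loss_percent(base_rtt_ms, loaded_rpm):.1f}%"
        )
```

Reporting both the base RPM and the loaded RPM, or the percentage between them, is the two-number view argued for here.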
>>>
>>> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>>>
>>> Regards
>>> Sebastian
>>>
>>>
>>>
>>> *) "Seemingly" because quite a few ISPs operate their own speedtest servers inside their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislator seems to agree, see Regulation (EU) 2015/2120).
>>>
>>> **) Most do so by creating load themselves and measuring throughput at the same time; bounceback, IIUC, focuses on the latency measurement and leaves the load generation optional (so it offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>>>
>>>
>>>>
>>>>
>>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>>>> One of the best talks I've ever seen on how to measure customer
>>>> satisfaction properly just went up after the P99 Conference.
>>>>
>>>> It's called Misery Metrics.
>>>>
>>>> After going through a deep dive as to why and how we think and act on
>>>> percentiles, bins, and other statistical methods as to how we use the
>>>> web and internet are *so wrong* (well worth watching and thinking
>>>> about if you are relying on or creating network metrics today), it
>>>> then points to the real metrics that matter to users and the ultimate
>>>> success of an internet business: Timeouts, retries, misses, failed
>>>> queries, angry phone calls, abandoned shopping carts and loss of
>>>> engagement.
>>>>
>>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>>
>>>> The ending advice was - don't aim to make a specific percentile
>>>> acceptable, aim for an acceptable % of misery.
>>>>
>>>> I enjoyed the p99 conference more than any conference I've attended in years.
>>>>
>>>> --
>>>> This song goes out to all the folk that thought Stadia would work:
>>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>>> Dave Täht CEO, TekLibre, LLC
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups "discuss" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
>>>> To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>>>> _______________________________________________
>>>> Rpm mailing list
>>>> Rpm@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/rpm
>>>
>>> _______________________________________________
>>> Starlink mailing list
>>> Starlink@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/starlink
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-24 23:44 ` Christoph Paasch
@ 2022-10-25 0:08 ` rjmcmahon
2022-10-25 0:12 ` rjmcmahon
2022-10-25 2:28 ` [Starlink] [tsvwg] " Neal Cardwell
1 sibling, 1 reply; 30+ messages in thread
From: rjmcmahon @ 2022-10-25 0:08 UTC (permalink / raw)
To: Christoph Paasch
Cc: Sebastian Moeller, Dave Taht via Starlink, tsvwg IETF list,
IETF IPPM WG, Rpm, Glenn Fishbine,
Measurement Analysis and Tools Working Group, discuss
Be careful about assuming network loads always worsen latency over
networks. Below is an example over WiFi with a Raspberry Pi over a
Netgear Nighthawk RAXE500 to a 1G wired Linux host, without a load then
with an upstream load. I've noticed similar behavior with some hardware
forwarding planes where a busier AP outperforms, in terms of latency, a
lightly loaded one. (Note: iperf 2 uses responses per second (RPS) as
the SI unit of time is the second.)
rjmcmahon@ubuntu:/usr/local/src/iperf2-code$ iperf -c 192.168.1.69 -i 1
--bounceback
------------------------------------------------------------
Client connecting to 192.168.1.69, TCP port 5001 with pid 4148 (1 flows)
Write buffer size: 100 Byte
Bursting: 100 Byte writes 10 times every 1.00 second(s)
Bounce-back test (size= 100 Byte) (server hold req=0 usecs &
tcp_quickack)
TOS set to 0x0 and nodelay (Nagle off)
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 1] local 192.168.1.40%eth0 port 53750 connected with 192.168.1.69
port 5001 (bb w/quickack len/hold=100/0) (sock=3)
(icwnd/mss/irtt=14/1448/341) (ct=0.44 ms) on 2022-10-25 00:00:48 (UTC)
[ ID] Interval Transfer Bandwidth BB
cnt=avg/min/max/stdev Rtry Cwnd/RTT RPS
[ 1] 0.00-1.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.340/0.206/0.852/0.188 ms 0 14K/220 us 2941 rps
[ 1] 1.00-2.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.481/0.362/0.572/0.057 ms 0 14K/327 us 2078 rps
[ 1] 2.00-3.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.471/0.344/0.694/0.089 ms 0 14K/340 us 2123 rps
[ 1] 3.00-4.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.406/0.330/0.595/0.072 ms 0 14K/318 us 2465 rps
[ 1] 4.00-5.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.471/0.405/0.603/0.057 ms 0 14K/348 us 2124 rps
[ 1] 5.00-6.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.428/0.355/0.641/0.079 ms 0 14K/324 us 2337 rps
[ 1] 6.00-7.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.429/0.329/0.616/0.086 ms 0 14K/306 us 2329 rps
[ 1] 7.00-8.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.445/0.325/0.673/0.092 ms 0 14K/321 us 2248 rps
[ 1] 8.00-9.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.423/0.348/0.604/0.074 ms 0 14K/299 us 2366 rps
[ 1] 9.00-10.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.463/0.369/0.729/0.108 ms 0 14K/306 us 2159 rps
[ 1] 0.00-10.01 sec 19.7 KBytes 16.1 Kbits/sec
101=0.438/0.206/0.852/0.102 ms 0 14K/1192 us 2285 rps
[ 1] 0.00-10.01 sec BB8-PDF:
bin(w=100us):cnt(101)=3:5,4:30,5:48,6:9,7:7,8:1,9:1
(5.00/95.00/99.7%=4/7/9,Outliers=0,obl/obu=0/0)
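The percentile bins on the BB8-PDF line can be reproduced from the histogram counts. The Python sketch below is a reading of the output, not iperf 2's actual code: it assumes the reported percentile is the first bin at which the cumulative count reaches the requested share of samples, and that bin n covers the range from (n-1)*w up to n*w, which is consistent with the min/max values printed above. The function names are made up for this sketch.

```python
# Sketch: recover the 5/95/99.7 percentile bins from a BB8-PDF histogram string.
# Assumption: a percentile maps to the first bin whose cumulative count
# reaches that share of all samples (this matches both runs in this mail).


def parse_bins(spec: str) -> dict[int, int]:
    """Parse 'bin:count,bin:count,...' into a {bin_index: count} mapping."""
    bins: dict[int, int] = {}
    for item in spec.split(","):
        index, count = item.split(":")
        bins[int(index)] = int(count)
    return bins


def percentile_bin(bins: dict[int, int], percent: float) -> int:
    """Return the first bin at which the cumulative share reaches `percent`."""
    total = sum(bins.values())
    cumulative = 0
    for index in sorted(bins):
        cumulative += bins[index]
        if cumulative * 100 >= percent * total:
            return index
    raise ValueError("empty histogram")


if __name__ == "__main__":
    bin_width_us = 100  # from 'bin(w=100us)'
    bins = parse_bins("3:5,4:30,5:48,6:9,7:7,8:1,9:1")
    for percent in (5.0, 95.0, 99.7):
        index = percentile_bin(bins, percent)
        # Upper edge of the bin in microseconds, under the assumption above.
        print(f"{percent}% -> bin {index} (up to {index * bin_width_us} us)")
```

Under these assumptions the unloaded run has 95% of its bounce-backs at or below roughly 700 us.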
rjmcmahon@ubuntu:/usr/local/src/iperf2-code$ iperf -c 192.168.1.69 -i 1
--bounceback --bounceback-congest=up,1
------------------------------------------------------------
Client connecting to 192.168.1.69, TCP port 5001 with pid 4152 (1 flows)
Write buffer size: 100 Byte
Bursting: 100 Byte writes 10 times every 1.00 second(s)
Bounce-back test (size= 100 Byte) (server hold req=0 usecs &
tcp_quickack)
TOS set to 0x0 and nodelay (Nagle off)
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 2] local 192.168.1.40%eth0 port 50462 connected with 192.168.1.69
port 5001 (sock=4) (qack) (icwnd/mss/irtt=14/1448/475) (ct=0.63 ms) on
2022-10-25 00:01:07 (UTC)
[ 1] local 192.168.1.40%eth0 port 50472 connected with 192.168.1.69
port 5001 (bb w/quickack len/hold=100/0) (sock=3)
(icwnd/mss/irtt=14/1448/375) (ct=0.67 ms) on 2022-10-25 00:01:07 (UTC)
[ ID] Interval Transfer Bandwidth Write/Err Rtry
Cwnd/RTT(var) NetPwr
[ 2] 0.00-1.00 sec 3.73 MBytes 31.3 Mbits/sec 39069/0 0
59K/133(1) us 29376
[ ID] Interval Transfer Bandwidth BB
cnt=avg/min/max/stdev Rtry Cwnd/RTT RPS
[ 1] 0.00-1.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.282/0.178/0.887/0.214 ms 0 14K/191 us 3547 rps
[ 2] 1.00-2.00 sec 3.77 MBytes 31.6 Mbits/sec 39512/0 0
59K/133(1) us 29708
[ 1] 1.00-2.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.196/0.149/0.240/0.024 ms 0 14K/161 us 5115 rps
[ 2] 2.00-3.00 sec 3.77 MBytes 31.6 Mbits/sec 39558/0 0
59K/125(8) us 31646
[ 1] 2.00-3.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.163/0.134/0.192/0.018 ms 0 14K/136 us 6124 rps
[ 2] 3.00-4.00 sec 3.77 MBytes 31.6 Mbits/sec 39560/0 0
59K/133(1) us 29744
[ 1] 3.00-4.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.171/0.149/0.216/0.019 ms 0 14K/133 us 5838 rps
[ 2] 4.00-5.00 sec 3.76 MBytes 31.6 Mbits/sec 39460/0 0
59K/131(2) us 30122
[ 1] 4.00-5.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.188/0.131/0.242/0.029 ms 0 14K/143 us 5308 rps
[ 2] 5.00-6.00 sec 3.77 MBytes 31.6 Mbits/sec 39545/0 0
59K/133(0) us 29733
[ 1] 5.00-6.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.197/0.147/0.255/0.031 ms 0 14K/149 us 5079 rps
[ 2] 6.00-7.00 sec 3.78 MBytes 31.7 Mbits/sec 39631/0 0
59K/133(1) us 29798
[ 1] 6.00-7.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.196/0.146/0.229/0.025 ms 0 14K/151 us 5102 rps
[ 2] 7.00-8.00 sec 3.77 MBytes 31.6 Mbits/sec 39497/0 0
59K/133(0) us 29697
[ 1] 7.00-8.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.190/0.147/0.225/0.028 ms 0 14K/155 us 5260 rps
[ 2] 8.00-9.00 sec 3.77 MBytes 31.6 Mbits/sec 39533/0 0
59K/126(4) us 31375
[ 1] 8.00-9.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.185/0.158/0.208/0.017 ms 0 14K/148 us 5414 rps
[ 2] 9.00-10.00 sec 3.77 MBytes 31.6 Mbits/sec 39519/0 0
59K/133(1) us 29714
[ 1] 9.00-10.00 sec 1.95 KBytes 16.0 Kbits/sec
10=0.165/0.134/0.232/0.028 ms 0 14K/131 us 6064 rps
[ 2] 0.00-10.01 sec 37.7 MBytes 31.6 Mbits/sec 394886/0 0
59K/1430(2595) us 2759
[ 1] 0.00-10.01 sec 19.7 KBytes 16.1 Kbits/sec
101=0.194/0.131/0.887/0.075 ms 0 14K/1385 us 5167 rps
[ 1] 0.00-10.01 sec BB8-PDF: bin(w=100us):cnt(101)=2:69,3:31,9:1
(5.00/95.00/99.7%=2/3/9,Outliers=0,obl/obu=0/0)
[ CT] final connect times (min/avg/max/stdev) = 0.627/0.647/0.666/27.577
ms (tot/err) = 2/0
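To compare the two runs on the RPM scale used elsewhere in this thread, the RPS column can be related to the mean bounce-back time. The Python sketch below assumes RPS is simply the reciprocal of the mean bounce-back time, which agrees with the per-interval lines above to within rounding (for example 0.340 ms and 2941 rps); the summary lines differ slightly (1000/0.438 is about 2283, printed as 2285), presumably because the printed mean is rounded. The helper names are hypothetical.

```python
# Sketch: relate iperf 2 bounce-back means to RPS/RPM and compare idle vs loaded.
# Assumption: RPS ~= 1 / mean bounce-back time (matches the output to rounding).


def rps_from_mean_ms(mean_ms: float) -> float:
    """Responses per second implied by a mean bounce-back time in ms."""
    return 1000.0 / mean_ms


def rpm_from_rps(rps: float) -> float:
    """Convert responses per second to the per-minute scale Apple uses."""
    return rps * 60.0


def latency_change_percent(idle_ms: float, loaded_ms: float) -> float:
    """Relative change in mean latency from idle to loaded; negative = faster."""
    return 100.0 * (loaded_ms - idle_ms) / idle_ms


# Summary-line values copied from the two runs above.
IDLE_MEAN_MS, IDLE_RPS = 0.438, 2285
LOADED_MEAN_MS, LOADED_RPS = 0.194, 5167

if __name__ == "__main__":
    print(f"idle:   {rpm_from_rps(IDLE_RPS):.0f} RPM")
    print(f"loaded: {rpm_from_rps(LOADED_RPS):.0f} RPM")
    change = latency_change_percent(IDLE_MEAN_MS, LOADED_MEAN_MS)
    print(f"mean latency change under load: {change:.1f}%")
```

On this link the mean bounce-back time under upstream load is lower than when idle, which is the point of the example.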
Bob
>> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de>
>> wrote:
>> Hi Christoph
>>
>> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com>
>> wrote:
>>
>> Hello Sebastian,
>>
>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink
>> <starlink@lists.bufferbloat.net> wrote:
>>
>> Hi Glenn,
>>
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm
>> <rpm@lists.bufferbloat.net> wrote:
>>
>> As a classic dyed-in-the-wool empiricist, granted that you can
>> identify "misery" factors, given a population of 1,000 users, how do
>> you propose deriving a misery index for that population?
>>
>> We can measure download, upload, ping, jitter pretty much without
>> user intervention. For the measurements you hypothesize, how do
>> you automatically extract those indices without subjective user
>> contamination?
>>
>> I.e. my download speed sucks. Measure the download speed.
>>
>> My isp doesn't fix my problem. Measure what? How?
>>
>> Human survey technology is 70+ years old and it still has problems
>> figuring out how to correlate opinion with fact.
>>
>> Without an objective measurement scheme that doesn't require human
>> interaction, the misery index is a cool hypothesis with no way to
>> link to actual data. What objective measurements can be made?
>> Answer that and the index becomes useful. Otherwise it's just
>> consumer whining.
>>
>> Not trying to be combative here, in fact I like the concept you
>> support, but I'm hard pressed to see how the concept can lead to
>> data, and the data lead to policy proposals.
>>
>> [SM] So it seems that outside of seemingly simple to test
>> throughput numbers*, the next most important quality number (or the
>> most important, depending on subjective ranking) is how latency
>> changes under "load". Absolute latency is also important, albeit
>> static high latency can be worked around within limits, so the
>> change under load seems more relevant.
>> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's
>> bounceback test offer methods to assess latency change under load**,
>> as do Waveform's bufferbloat test and even, to a degree, Ookla's
>> speedtest.net. IMHO something like latency increase under load or
>> Apple's responsiveness measure RPM (basically the inverse of the
>> latency under load calculated on a per-minute basis, so it scales in
>> the typical higher-numbers-are-better way, unlike raw
>> latency-under-load numbers where smaller is better) would be a
>> suitable objective measure.
>> IMHO what networkQuality is missing ATM is measuring and reporting
>> the unloaded RPM as well as the loaded one: the first gives a
>> measure of the static latency, the second of how well things keep
>> working if capacity gets tight. It does report the base RTT, which
>> can be converted to RPM. As an example:
>>
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>>
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
>
> You should update to latest macOS:
>
> $ networkQuality
> ==== SUMMARY ====
> Uplink capacity: 326.789 Mbps
> Downlink capacity: 446.359 Mbps
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
>
> ;-)
>
> [SM] I wish... just updated to the latest and greatest for this
> hardware (A1398):
>
> macbook-pro:DPZ smoeller$ networkQuality
> ==== SUMMARY ====
>
> Upload capacity: 7.478 Mbps
> Download capacity: 2.415 Mbps
> Upload flows: 16
> Download flows: 20
> Responsiveness: Low (90 RPM)
> macbook-pro:DPZ smoeller$ networkQuality -v
> ==== SUMMARY ====
>
> Upload capacity: 5.830 Mbps
> Download capacity: 6.077 Mbps
> Upload flows: 12
> Download flows: 20
> Responsiveness: Low (56 RPM)
> Base RTT: 134
> Start: 10/24/22, 22:47:48
> End: 10/24/22, 22:48:09
> OS Version: Version 12.6.1 (Build 21G217)
> macbook-pro:DPZ smoeller$
>
> Still, I only see the "Base RTT" with the -v switch and I am not sure
> whether that is identical to your "Idle Latency".
>
> I guess I need to convince my employer to exchange that macbook
> (actually because the battery starts bulging and not because I am
> behind with networkQuality versions ;) )
>
> Yes, you would need macOS Ventura to get the latest and greatest.
>
>>> But, what I read is: You are suggesting that “Idle Latency”
>>> should be expressed in RPM as well? Or, Responsiveness expressed
>>> in millisecond ?
>>
>> [SM] Yes, I am fine with either (or both); the idea is to make it
>> really easy to see whether/how much "working conditions" deteriorate
>> the responsiveness / increase the latency-under-load. At least in
>> verbose mode it would be sweet if networkQuality could expose that
>> information.
>
> I see - let me think about that…
>
> Christoph
>
>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under
>> load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM,
>> so on this link load reduces the responsiveness by 3750-2123 = 1627
>> RPM, a reduction of 100-100*2123/3750 = 43.4%, and that is with
>> competent AQM and scheduling on the router.
>>
>> Without competent AQM/shaping I get:
>> ==== SUMMARY ====
>>
>> Upload capacity: 15.101 Mbps
>> Download capacity: 97.664 Mbps
>> Upload flows: 20
>> Download flows: 12
>> Responsiveness: Medium (427 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:51:50
>> End: 10/23/22, 13:52:06
>> OS Version: Version 12.6 (Build 21G115)
>> latency under load: 60000/427 = 140.52 ms
>> base RPM: 60000/16 = 3750 RPM
>> reduction RPM: 100-100*427/3750 = 88.6%
>>
>> I understand apple's desire to have a single reported number with a
>> single qualifier medium/high/... because in the end a link is only
>> reliably usable if responsiveness under load stays acceptable, but
>> with two numbers it is easier to see what one's ISP could do to
>> help. (I guess some ISPs might already be unhappy with the single
>> number, so this needs some diplomacy/tact)
>>
>> Regards
>> Sebastian
>>
>> *) "Seemingly" because quite a few ISPs operate their own speedtest
>> servers inside their network and ignore customers not reaching the
>> contracted rates to speedtest servers located in different ASs. As
>> the product is called internet access, I am inclined to expect that
>> my ISP maintains sufficient peering/transit capacity to reach the
>> next tier of AS at my contracted rate (the EU legislator seems to
>> agree, see Regulation (EU) 2015/2120).
>>
>> **) Most do so by creating load themselves and measuring throughput
>> at the same time; bounceback, IIUC, focuses on the latency
>> measurement and leaves the load generation optional (so it offers a
>> mode to measure responsiveness of a live network with minimal
>> measurement traffic). @Bob, please correct me if this is wrong.
>>
>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>> One of the best talks I've ever seen on how to measure customer
>> satisfaction properly just went up after the P99 Conference.
>>
>> It's called Misery Metrics.
>>
>> After going through a deep dive as to why and how we think and act
>> on
>> percentiles, bins, and other statistical methods as to how we use
>> the
>> web and internet are *so wrong* (well worth watching and thinking
>> about if you are relying on or creating network metrics today), it
>> then points to the real metrics that matter to users and the
>> ultimate
>> success of an internet business: Timeouts, retries, misses, failed
>> queries, angry phone calls, abandoned shopping carts and loss of
>> engagement.
>>
>> https://www.p99conf.io/session/misery-metrics-consequences/
>>
>> The ending advice was - don't aim to make a specific percentile
>> acceptable, aim for an acceptable % of misery.
>>
>> I enjoyed the p99 conference more than any conference I've attended
>> in years.
>>
>> --
>> This song goes out to all the folk that thought Stadia would work:
>>
> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>> Dave Täht CEO, TekLibre, LLC
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "discuss" group.
>> To unsubscribe from this group and stop receiving emails from it,
>> send an email to discuss+unsubscribe@measurementlab.net.
>> To view this discussion on the web visit
>>
> https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>> _______________________________________________
>> Rpm mailing list
>> Rpm@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/rpm
>>
>> _______________________________________________
>> Starlink mailing list
>> Starlink@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/starlink
> _______________________________________________
> Rpm mailing list
> Rpm@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/rpm
* Re: [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-25 0:08 ` rjmcmahon
@ 2022-10-25 0:12 ` rjmcmahon
0 siblings, 0 replies; 30+ messages in thread
From: rjmcmahon @ 2022-10-25 0:12 UTC (permalink / raw)
To: rjmcmahon
Cc: Christoph Paasch, Rpm, tsvwg IETF list, IETF IPPM WG,
Dave Taht via Starlink, Glenn Fishbine,
Measurement Analysis and Tools Working Group, discuss
Sorry, the RPi4 rebooted and this test was over wired ethernet. But the
point is the same.
Bob
> Be careful about assuming network loads always worsen latency over
> networks. Below is an example over WiFi with a Raspberry Pi over a
> Netgear Nighthawk RAXE500 to a 1G wired Linux host, without a load
> then with an upstream load. I've noticed similar behavior with some
> hardware forwarding planes where a busier AP outperforms, in terms of
> latency, a lightly loaded one. (Note: iperf 2 uses responses per
> second (RPS) as the SI unit of time is the second.)
>
> rjmcmahon@ubuntu:/usr/local/src/iperf2-code$ iperf -c 192.168.1.69 -i
> 1 --bounceback
> ------------------------------------------------------------
> Client connecting to 192.168.1.69, TCP port 5001 with pid 4148 (1
> flows)
> Write buffer size: 100 Byte
> Bursting: 100 Byte writes 10 times every 1.00 second(s)
> Bounce-back test (size= 100 Byte) (server hold req=0 usecs &
> tcp_quickack)
> TOS set to 0x0 and nodelay (Nagle off)
> TCP window size: 16.0 KByte (default)
> ------------------------------------------------------------
> [ 1] local 192.168.1.40%eth0 port 53750 connected with 192.168.1.69
> port 5001 (bb w/quickack len/hold=100/0) (sock=3)
> (icwnd/mss/irtt=14/1448/341) (ct=0.44 ms) on 2022-10-25 00:00:48 (UTC)
> [ ID] Interval Transfer Bandwidth BB
> cnt=avg/min/max/stdev Rtry Cwnd/RTT RPS
> [ 1] 0.00-1.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.340/0.206/0.852/0.188 ms 0 14K/220 us 2941 rps
> [ 1] 1.00-2.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.481/0.362/0.572/0.057 ms 0 14K/327 us 2078 rps
> [ 1] 2.00-3.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.471/0.344/0.694/0.089 ms 0 14K/340 us 2123 rps
> [ 1] 3.00-4.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.406/0.330/0.595/0.072 ms 0 14K/318 us 2465 rps
> [ 1] 4.00-5.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.471/0.405/0.603/0.057 ms 0 14K/348 us 2124 rps
> [ 1] 5.00-6.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.428/0.355/0.641/0.079 ms 0 14K/324 us 2337 rps
> [ 1] 6.00-7.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.429/0.329/0.616/0.086 ms 0 14K/306 us 2329 rps
> [ 1] 7.00-8.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.445/0.325/0.673/0.092 ms 0 14K/321 us 2248 rps
> [ 1] 8.00-9.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.423/0.348/0.604/0.074 ms 0 14K/299 us 2366 rps
> [ 1] 9.00-10.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.463/0.369/0.729/0.108 ms 0 14K/306 us 2159 rps
> [ 1] 0.00-10.01 sec 19.7 KBytes 16.1 Kbits/sec
> 101=0.438/0.206/0.852/0.102 ms 0 14K/1192 us 2285 rps
> [ 1] 0.00-10.01 sec BB8-PDF:
> bin(w=100us):cnt(101)=3:5,4:30,5:48,6:9,7:7,8:1,9:1
> (5.00/95.00/99.7%=4/7/9,Outliers=0,obl/obu=0/0)
>
>
> rjmcmahon@ubuntu:/usr/local/src/iperf2-code$ iperf -c 192.168.1.69 -i
> 1 --bounceback --bounceback-congest=up,1
> ------------------------------------------------------------
> Client connecting to 192.168.1.69, TCP port 5001 with pid 4152 (1
> flows)
> Write buffer size: 100 Byte
> Bursting: 100 Byte writes 10 times every 1.00 second(s)
> Bounce-back test (size= 100 Byte) (server hold req=0 usecs &
> tcp_quickack)
> TOS set to 0x0 and nodelay (Nagle off)
> TCP window size: 16.0 KByte (default)
> ------------------------------------------------------------
> [ 2] local 192.168.1.40%eth0 port 50462 connected with 192.168.1.69
> port 5001 (sock=4) (qack) (icwnd/mss/irtt=14/1448/475) (ct=0.63 ms) on
> 2022-10-25 00:01:07 (UTC)
> [ 1] local 192.168.1.40%eth0 port 50472 connected with 192.168.1.69
> port 5001 (bb w/quickack len/hold=100/0) (sock=3)
> (icwnd/mss/irtt=14/1448/375) (ct=0.67 ms) on 2022-10-25 00:01:07 (UTC)
> [ ID] Interval Transfer Bandwidth Write/Err Rtry
> Cwnd/RTT(var) NetPwr
> [ 2] 0.00-1.00 sec 3.73 MBytes 31.3 Mbits/sec 39069/0 0
> 59K/133(1) us 29376
> [ ID] Interval Transfer Bandwidth BB
> cnt=avg/min/max/stdev Rtry Cwnd/RTT RPS
> [ 1] 0.00-1.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.282/0.178/0.887/0.214 ms 0 14K/191 us 3547 rps
> [ 2] 1.00-2.00 sec 3.77 MBytes 31.6 Mbits/sec 39512/0 0
> 59K/133(1) us 29708
> [ 1] 1.00-2.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.196/0.149/0.240/0.024 ms 0 14K/161 us 5115 rps
> [ 2] 2.00-3.00 sec 3.77 MBytes 31.6 Mbits/sec 39558/0 0
> 59K/125(8) us 31646
> [ 1] 2.00-3.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.163/0.134/0.192/0.018 ms 0 14K/136 us 6124 rps
> [ 2] 3.00-4.00 sec 3.77 MBytes 31.6 Mbits/sec 39560/0 0
> 59K/133(1) us 29744
> [ 1] 3.00-4.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.171/0.149/0.216/0.019 ms 0 14K/133 us 5838 rps
> [ 2] 4.00-5.00 sec 3.76 MBytes 31.6 Mbits/sec 39460/0 0
> 59K/131(2) us 30122
> [ 1] 4.00-5.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.188/0.131/0.242/0.029 ms 0 14K/143 us 5308 rps
> [ 2] 5.00-6.00 sec 3.77 MBytes 31.6 Mbits/sec 39545/0 0
> 59K/133(0) us 29733
> [ 1] 5.00-6.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.197/0.147/0.255/0.031 ms 0 14K/149 us 5079 rps
> [ 2] 6.00-7.00 sec 3.78 MBytes 31.7 Mbits/sec 39631/0 0
> 59K/133(1) us 29798
> [ 1] 6.00-7.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.196/0.146/0.229/0.025 ms 0 14K/151 us 5102 rps
> [ 2] 7.00-8.00 sec 3.77 MBytes 31.6 Mbits/sec 39497/0 0
> 59K/133(0) us 29697
> [ 1] 7.00-8.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.190/0.147/0.225/0.028 ms 0 14K/155 us 5260 rps
> [ 2] 8.00-9.00 sec 3.77 MBytes 31.6 Mbits/sec 39533/0 0
> 59K/126(4) us 31375
> [ 1] 8.00-9.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.185/0.158/0.208/0.017 ms 0 14K/148 us 5414 rps
> [ 2] 9.00-10.00 sec 3.77 MBytes 31.6 Mbits/sec 39519/0 0
> 59K/133(1) us 29714
> [ 1] 9.00-10.00 sec 1.95 KBytes 16.0 Kbits/sec
> 10=0.165/0.134/0.232/0.028 ms 0 14K/131 us 6064 rps
> [ 2] 0.00-10.01 sec 37.7 MBytes 31.6 Mbits/sec 394886/0 0
> 59K/1430(2595) us 2759
> [ 1] 0.00-10.01 sec 19.7 KBytes 16.1 Kbits/sec
> 101=0.194/0.131/0.887/0.075 ms 0 14K/1385 us 5167 rps
> [ 1] 0.00-10.01 sec BB8-PDF: bin(w=100us):cnt(101)=2:69,3:31,9:1
> (5.00/95.00/99.7%=2/3/9,Outliers=0,obl/obu=0/0)
> [ CT] final connect times (min/avg/max/stdev) =
> 0.627/0.647/0.666/27.577 ms (tot/err) = 2/0
>
>
> Bob
>>> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de>
>>> wrote:
>>> Hi Christoph
>>>
>>> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com>
>>> wrote:
>>>
>>> Hello Sebastian,
>>>
>>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink
>>> <starlink@lists.bufferbloat.net> wrote:
>>>
>>> Hi Glenn,
>>>
>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm
>>> <rpm@lists.bufferbloat.net> wrote:
>>>
>>> As a classic dyed-in-the-wool empiricist, granted that you can
>>> identify "misery" factors, given a population of 1,000 users, how do
>>> you propose deriving a misery index for that population?
>>>
>>> We can measure download, upload, ping, jitter pretty much without
>>> user intervention. For the measurements you hypothesize, how do
>>> you automatically extract those indices without subjective user
>>> contamination?
>>>
>>> I.e. my download speed sucks. Measure the download speed.
>>>
>>> My isp doesn't fix my problem. Measure what? How?
>>>
>>> Human survey technology is 70+ years old and it still has problems
>>> figuring out how to correlate opinion with fact.
>>>
>>> Without an objective measurement scheme that doesn't require human
>>> interaction, the misery index is a cool hypothesis with no way to
>>> link to actual data. What objective measurements can be made?
>>> Answer that and the index becomes useful. Otherwise it's just
>>> consumer whining.
>>>
>>> Not trying to be combative here, in fact I like the concept you
>>> support, but I'm hard pressed to see how the concept can lead to
>>> data, and the data lead to policy proposals.
>>>
>>> [SM] So it seems that outside of seemingly simple to test
>>> throughput numbers*, the next most important quality number (or the
>>> most important, depending on subjective ranking) is how latency
>>> changes under "load". Absolute latency is also important, albeit
>>> static high latency can be worked around within limits, so the
>>> change under load seems more relevant.
>>> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's
>>> bounceback test offer methods to assess latency change under load**,
>>> as do Waveform's bufferbloat test and even, to a degree, Ookla's
>>> speedtest.net. IMHO something like latency increase under load or
>>> Apple's responsiveness measure RPM (basically the inverse of the
>>> latency under load calculated on a per-minute basis, so it scales in
>>> the typical higher-numbers-are-better way, unlike raw
>>> latency-under-load numbers where smaller is better) would be a
>>> suitable objective measure.
>>> IMHO what networkQuality is missing ATM is measuring and reporting
>>> the unloaded RPM as well as the loaded one: the first gives a
>>> measure of the static latency, the second of how well things keep
>>> working if capacity gets tight. It does report the base RTT, which
>>> can be converted to RPM. As an example:
>>>
>>> macbook:~ user$ networkQuality -v
>>> ==== SUMMARY ====
>>>
>>> Upload capacity: 24.341 Mbps
>>> Download capacity: 91.951 Mbps
>>> Upload flows: 20
>>> Download flows: 16
>>> Responsiveness: High (2123 RPM)
>>> Base RTT: 16
>>> Start: 10/23/22, 13:44:39
>>> End: 10/23/22, 13:44:53
>>> OS Version: Version 12.6 (Build 21G115)
>>
>> You should update to latest macOS:
>>
>> $ networkQuality
>> ==== SUMMARY ====
>> Uplink capacity: 326.789 Mbps
>> Downlink capacity: 446.359 Mbps
>> Responsiveness: High (2195 RPM)
>> Idle Latency: 5.833 milli-seconds
>>
>> ;-)
>>
>> [SM] I wish... just updated to the latest and greatest for this
>> hardware (A1398):
>>
>> macbook-pro:DPZ smoeller$ networkQuality
>> ==== SUMMARY ====
>>
>> Upload capacity: 7.478 Mbps
>> Download capacity: 2.415 Mbps
>> Upload flows: 16
>> Download flows: 20
>> Responsiveness: Low (90 RPM)
>> macbook-pro:DPZ smoeller$ networkQuality -v
>> ==== SUMMARY ====
>>
>> Upload capacity: 5.830 Mbps
>> Download capacity: 6.077 Mbps
>> Upload flows: 12
>> Download flows: 20
>> Responsiveness: Low (56 RPM)
>> Base RTT: 134
>> Start: 10/24/22, 22:47:48
>> End: 10/24/22, 22:48:09
>> OS Version: Version 12.6.1 (Build 21G217)
>> macbook-pro:DPZ smoeller$
>>
>> Still, I only see the "Base RTT" with the -v switch and I am not sure
>> whether that is identical to your "Idle Latency".
>>
>> I guess I need to convince my employer to exchange that macbook
>> (actually because the battery starts bulging and not because I am
>> behind with networkQuality versions ;) )
>>
>> Yes, you would need macOS Ventura to get the latest and greatest.
>>
>>>> But, what I read is: You are suggesting that “Idle Latency”
>>>> should be expressed in RPM as well? Or, Responsiveness expressed
>>>> in millisecond ?
>>>
>>> [SM] Yes, I am fine with either (or both); the idea is to make it
>>> really easy to see whether/how much "working conditions" deteriorate
>>> the responsiveness / increase the latency-under-load. At least in
>>> verbose mode it would be sweet if networkQuality could expose that
>>> information.
>>
>> I see - let me think about that…
>>
>> Christoph
>>
>>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under
>>> load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM,
>>> so on this link load reduces the responsiveness by 3750-2123 = 1627
>>> RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with
>>> competent AQM and scheduling on the router.
>>>
>>> Without competent AQM/shaping I get:
>>> ==== SUMMARY ====
>>>
>>> Upload capacity: 15.101 Mbps
>>> Download capacity: 97.664 Mbps
>>> Upload flows: 20
>>> Download flows: 12
>>> Responsiveness: Medium (427 RPM)
>>> Base RTT: 16
>>> Start: 10/23/22, 13:51:50
>>> End: 10/23/22, 13:52:06
>>> OS Version: Version 12.6 (Build 21G115)
>>> latency under load: 60000/427 = 140.52 ms
>>> base RPM: 60000/16 = 3750 RPM
>>> reduction RPM: 100-100*427/3750 = 88.6%
>>>
>>> I understand apple's desire to have a single reported number with a
>>> single qualifier medium/high/... because in the end a link is only
>>> reliably usable if responsiveness under load stays acceptable, but
>>> with two numbers it is easier to see what one's ISP could do to
>>> help. (I guess some ISPs might already be unhappy with the single
>>> number, so this needs some diplomacy/tact)
>>>
>>> Regards
>>> Sebastian
>>>
>>> *) Seemingly, quite a few ISPs operate their own speedtest servers
>>> in their network and ignore customers not reaching the contracted
>>> rates to speedtest servers located in different ASs. As the
>>> product is called internet access, I am inclined to expect that my
>>> ISP maintains sufficient peering/transit capacity to reach the next
>>> tier of AS at my contracted rate (the EU legislature seems to agree,
>>> see EU Regulation 2015/2120).
>>>
>>> **) Most do by creating load themselves and measuring throughput at
>>> the same time, bounceback IIUC will focus on the latency measurement
>>> and leave the load generation optional (so offers a mode to measure
>>> responsiveness of a live network with minimal measurement traffic).
>>> @Bob, please correct me if this is wrong.
>>>
>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>>> One of the best talks I've ever seen on how to measure customer
>>> satisfaction properly just went up after the P99 Conference.
>>>
>>> It's called Misery Metrics.
>>>
>>> After going through a deep dive as to why and how we think and act on
>>> percentiles, bins, and other statistical methods as to how we use the
>>> web and internet are *so wrong* (well worth watching and thinking
>>> about if you are relying on or creating network metrics today), it
>>> then points to the real metrics that matter to users and the ultimate
>>> success of an internet business: Timeouts, retries, misses, failed
>>> queries, angry phone calls, abandoned shopping carts and loss of
>>> engagement.
>>>
>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>
>>> The ending advice was - don't aim to make a specific percentile
>>> acceptable, aim for an acceptable % of misery.
>>>
>>> I enjoyed the p99 conference more than any conference I've attended
>>> in years.
>>>
>>> --
>>> This song goes out to all the folk that thought Stadia would work:
>>>
>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>> Dave Täht CEO, TekLibre, LLC
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it,
>>> send an email to discuss+unsubscribe@measurementlab.net.
>>> To view this discussion on the web visit
>>>
>> https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>>> _______________________________________________
>>> Rpm mailing list
>>> Rpm@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/rpm
>>>
>>> _______________________________________________
>>> Starlink mailing list
>>> Starlink@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/starlink
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [tsvwg] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-24 23:44 ` Christoph Paasch
2022-10-25 0:08 ` rjmcmahon
@ 2022-10-25 2:28 ` Neal Cardwell
2022-10-25 15:17 ` [Starlink] [Rpm] [tsvwg] " rjmcmahon
2022-10-31 8:59 ` [Starlink] [ippm] [tsvwg] [Rpm] " Ruediger.Geib
1 sibling, 2 replies; 30+ messages in thread
From: Neal Cardwell @ 2022-10-25 2:28 UTC (permalink / raw)
To: Christoph Paasch
Cc: Sebastian Moeller, Glenn Fishbine, Rpm, tsvwg IETF list,
IETF IPPM WG, Dave Taht via Starlink,
Measurement Analysis and Tools Working Group, discuss
[-- Attachment #1: Type: text/plain, Size: 5699 bytes --]
On Mon, Oct 24, 2022 at 7:44 PM Christoph Paasch <cpaasch=
40apple.com@dmarc.ietf.org> wrote:
>
>
> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Christoph
>
> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com> wrote:
>
> Hello Sebastian,
>
> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink <
> starlink@lists.bufferbloat.net> wrote:
>
> Hi Glenn,
>
>
> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <
> rpm@lists.bufferbloat.net> wrote:
>
> As a classic dyed-in-the-wool empiricist, granted that you can identify
> "misery" factors, given a population of 1,000 users, how do you propose
> deriving a misery index for that population?
>
> We can measure download, upload, ping, jitter pretty much without user
> intervention. For the measurements you hypothesize, how do you
> automatically extract those indices without subjective user contamination?
>
> I.e. my download speed sucks. Measure the download speed.
>
> My isp doesn't fix my problem. Measure what? How?
>
> Human survey technology is 70+ years old and it still has problems
> figuring out how to correlate opinion with fact.
>
> Without an objective measurement scheme that doesn't require human
> interaction, the misery index is a cool hypothesis with no way to link to
> actual data. What objective measurements can be made? Answer that and the
> index becomes useful. Otherwise it's just consumer whining.
>
> Not trying to be combative here, in fact I like the concept you support,
> but I'm hard pressed to see how the concept can lead to data, and the data
> lead to policy proposals.
>
>
> [SM] So it seems that outside of seemingly simple to test throughput
> numbers*, the next most important quality number (or the most important
> depending on subjective ranking) is how does latency change under "load".
> Absolute latency is also important albeit static high latency can be worked
> around within limits so the change under load seems more relevant.
> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's
> bounceback test offer methods to assess latency change under load**, as do
> Waveform's bufferbloat test and even to a degree Ookla's speedtest.net.
> IMHO something like latency increase under load or apple's responsiveness
> measure RPM (basically the inverse of the latency under load calculated on
> a per minute basis, so it scales in the typical higher-numbers-are-better
> way, unlike raw latency under load numbers where smaller is better).
> IMHO what networkQuality is missing ATM is to measure and report the
> unloaded RPM as well as the loaded: the first gives a measure of the
> static latency, the second of how well things keep working if capacity
> gets tight. They report the base RTT which can be converted to RPM. As an
> example:
>
> macbook:~ user$ networkQuality -v
> ==== SUMMARY ====
>
> Upload capacity: 24.341 Mbps
> Download capacity: 91.951 Mbps
> Upload flows: 20
> Download flows: 16
> Responsiveness: High (2123 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:44:39
> End: 10/23/22, 13:44:53
> OS Version: Version 12.6 (Build 21G115)
>
>
> You should update to latest macOS:
>
> $ networkQuality
> ==== SUMMARY ====
> Uplink capacity: 326.789 Mbps
> Downlink capacity: 446.359 Mbps
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
>
> ;-)
>
>
>
> [SM] I wish... just updated to the latest and greatest for this hardware
> (A1398):
>
> macbook-pro:DPZ smoeller$ networkQuality
> ==== SUMMARY ====
>
> Upload capacity: 7.478 Mbps
> Download capacity: 2.415 Mbps
> Upload flows: 16
> Download flows: 20
> Responsiveness: Low (90 RPM)
> macbook-pro:DPZ smoeller$ networkQuality -v
> ==== SUMMARY ====
>
> Upload capacity: 5.830 Mbps
> Download capacity: 6.077 Mbps
> Upload flows: 12
> Download flows: 20
> Responsiveness: Low (56 RPM)
> Base RTT: 134
> Start: 10/24/22, 22:47:48
> End: 10/24/22, 22:48:09
> OS Version: Version 12.6.1 (Build 21G217)
> macbook-pro:DPZ smoeller$
>
> Still, I only see the "Base RTT" with the -v switch and I am not sure
> whether that is identical to your "Idle Latency".
>
>
> I guess I need to convince my employer to exchange that macbook (actually
> because the battery starts bulging and not because I am behind with
> networkQuality versions ;) )
>
>
> Yes, you would need macOS Ventura to get the latest and greatest.
>
> But, what I read is: You are suggesting that “Idle Latency” should be
> expressed in RPM as well? Or Responsiveness expressed in milliseconds?
>
>
> [SM] Yes, I am fine with either (or both); the idea is to make it really
> easy to see whether/how much "working conditions" deteriorate the
> responsiveness / increase the latency-under-load. At least in verbose mode
> it would be sweet if networkQuality could expose that information.
>
>
> I see - let me think about that…
>
+1 w/ Sebastian's point here. IMHO it would be great if the responsiveness
under load and when idle were reported:
(a) symmetrically, with the same metrics for both cases, and
(b) in both RPM and ms terms for both cases
So instead of:
Responsiveness: High (2195 RPM)
Idle Latency: 5.833 milli-seconds
Perhaps something like:
Loaded Responsiveness: High (XXXX RPM)
Loaded Latency: X.XXX milli-seconds
Idle Responsiveness: High (XXXX RPM)
Idle Latency: X.XXX milli-seconds
Having both RPM and ms available for loaded and unloaded cases would seem
to make it easier to compare loaded and idle performance more directly and
in a more apples-to-apples way.
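A minimal sketch of the arithmetic behind such a report, assuming the
relation RPM = 60000 / (latency in ms) that Sebastian used above. The
function names and the extra "Reduction under load" line are made up for
illustration; this is not networkQuality's code or output format, and the
High/Medium/Low qualifiers are left out because their thresholds are not
given in this thread.

```python
MS_PER_MINUTE = 60_000  # one minute expressed in milliseconds


def rpm_to_ms(rpm: float) -> float:
    """Convert round trips per minute to milliseconds per round trip."""
    return MS_PER_MINUTE / rpm


def ms_to_rpm(latency_ms: float) -> float:
    """Convert milliseconds per round trip to round trips per minute."""
    return MS_PER_MINUTE / latency_ms


def reduction_percent(loaded_rpm: float, idle_rpm: float) -> float:
    """Percentage of idle responsiveness that is lost under load."""
    return 100.0 - 100.0 * loaded_rpm / idle_rpm


def symmetric_report(loaded_rpm: float, idle_ms: float) -> str:
    """Report loaded and idle cases in both RPM and ms (hypothetical layout)."""
    loaded_ms = rpm_to_ms(loaded_rpm)
    idle_rpm = ms_to_rpm(idle_ms)
    return "\n".join([
        f"Loaded Responsiveness: {loaded_rpm:.0f} RPM",
        f"Loaded Latency: {loaded_ms:.3f} milli-seconds",
        f"Idle Responsiveness: {idle_rpm:.0f} RPM",
        f"Idle Latency: {idle_ms:.3f} milli-seconds",
        f"Reduction under load: {reduction_percent(loaded_rpm, idle_rpm):.1f}%",
    ])


if __name__ == "__main__":
    # Inputs taken from the two-line Ventura output quoted above.
    print(symmetric_report(loaded_rpm=2195, idle_ms=5.833))
```

With all four values side by side, the loaded/idle comparison no longer
requires the reader to do the 60000/x conversion by hand.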
best,
neal
[-- Attachment #2: Type: text/html, Size: 22724 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [tsvwg] [M-Lab-Discuss] misery metrics & consequences
2022-10-25 2:28 ` [Starlink] [tsvwg] " Neal Cardwell
@ 2022-10-25 15:17 ` rjmcmahon
2022-10-31 8:59 ` [Starlink] [ippm] [tsvwg] [Rpm] " Ruediger.Geib
1 sibling, 0 replies; 30+ messages in thread
From: rjmcmahon @ 2022-10-25 15:17 UTC (permalink / raw)
To: Neal Cardwell
Cc: Christoph Paasch, Dave Taht via Starlink, tsvwg IETF list,
IETF IPPM WG, Rpm, Glenn Fishbine,
Measurement Analysis and Tools Working Group, discuss
[-- Attachment #1: Type: text/plain, Size: 7294 bytes --]
From an SPC (statistical process control) perspective, one sample per
subgroup is typically insufficient, e.g. for Shewhart control charts.
Below are some suggestions:
https://bookdown.org/lawson/an_introduction_to_acceptance_sampling_and_spc_with_r26/shewhart-control-charts-in-phase-i.html
o) Define the subgroup size: Initially, this is a constant number of 4
or 5 items per each subgroup taken over a short enough interval of time
so that variation among them is due only to common causes.
o) Define the Subgroup Frequency: The subgroups collected should be
spaced out in time, but collected often enough so that they can
represent opportunities for the process to change.
o) Define the number of subgroups: Generally 25 or more subgroups are
necessary to establish the characteristics of a stable process. If some
subgroups are eliminated before calculating the revised control limits
due to the discovery of assignable causes, additional subgroups may need
to be collected so that there are at least 25 subgroups used in
calculating the revised limits.
Then return the mean and variance per the control chart tables and the
subgroup size. Also, keep in mind that the subgrouping is normalizing
the samples so information is lost if the underlying distribution is not
normal. That's why we give the full histogram in iperf 2. One can
compare against normal.
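A rough sketch of the subgrouping steps above, assuming an X-bar/R chart
(subgroup means and ranges) rather than any particular tool's
implementation. The function names are invented for illustration; A2, D3
and D4 are the usual control chart table constants for subgroup sizes 4
and 5.

```python
import statistics

# Standard X-bar/R control chart constants, keyed by subgroup size.
CHART_CONSTANTS = {
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
}


def xbar_r_limits(subgroups):
    """Compute X-bar and R chart limits from equally sized subgroups."""
    size = len(subgroups[0])
    if any(len(group) != size for group in subgroups):
        raise ValueError("all subgroups must have the same size")
    if size not in CHART_CONSTANTS:
        raise ValueError("only subgroup sizes 4 and 5 are tabulated here")
    consts = CHART_CONSTANTS[size]

    means = [statistics.fmean(group) for group in subgroups]
    ranges = [max(group) - min(group) for group in subgroups]
    grand_mean = statistics.fmean(means)  # center line of the X-bar chart
    mean_range = statistics.fmean(ranges)  # center line of the R chart

    return {
        "xbar_center": grand_mean,
        "xbar_lcl": grand_mean - consts["A2"] * mean_range,
        "xbar_ucl": grand_mean + consts["A2"] * mean_range,
        "r_center": mean_range,
        "r_lcl": consts["D3"] * mean_range,
        "r_ucl": consts["D4"] * mean_range,
    }


def out_of_control(subgroups, limits):
    """Indices of subgroups whose mean falls outside the X-bar limits."""
    return [
        index
        for index, group in enumerate(subgroups)
        if not limits["xbar_lcl"] <= statistics.fmean(group) <= limits["xbar_ucl"]
    ]


if __name__ == "__main__":
    # Made-up RTT samples in ms: 25 subgroups of 5, as suggested above.
    rtt_subgroups = [[20.0, 21.0, 19.5, 20.5, 22.0] for _ in range(25)]
    print(xbar_r_limits(rtt_subgroups))
```

As noted, these limits only summarize the data well if the subgroup means
are close to normal, which is one reason to keep the full histogram too.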
https://en.wikipedia.org/wiki/Control_chart
Bob
> On Mon, Oct 24, 2022 at 7:44 PM Christoph Paasch
> <cpaasch=40apple.com@dmarc.ietf.org> wrote:
>
>> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de>
>> wrote:
>> Hi Christoph
>>
>> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com>
>> wrote:
>>
>> Hello Sebastian,
>>
>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink
>> <starlink@lists.bufferbloat.net> wrote:
>>
>> Hi Glenn,
>>
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm
>> <rpm@lists.bufferbloat.net> wrote:
>>
>> As a classic dyed-in-the-wool empiricist, granted that you can
>> identify "misery" factors, given a population of 1,000 users, how do
>> you propose deriving a misery index for that population?
>>
>> We can measure download, upload, ping, jitter pretty much without
>> user intervention. For the measurements you hypothesize, how do
>> you automatically extract those indices without subjective user
>> contamination?
>>
>> I.e. my download speed sucks. Measure the download speed.
>>
>> My isp doesn't fix my problem. Measure what? How?
>>
>> Human survey technology is 70+ years old and it still has problems
>> figuring out how to correlate opinion with fact.
>>
>> Without an objective measurement scheme that doesn't require human
>> interaction, the misery index is a cool hypothesis with no way to
>> link to actual data. What objective measurements can be made?
>> Answer that and the index becomes useful. Otherwise it's just
>> consumer whining.
>>
>> Not trying to be combative here, in fact I like the concept you
>> support, but I'm hard pressed to see how the concept can lead to
>> data, and the data lead to policy proposals.
>>
>> [SM] So it seems that outside of seemingly simple to test
>> throughput numbers*, the next most important quality number (or the
>> most important depending on subjective ranking) is how does latency
>> change under "load". Absolute latency is also important albeit
>> static high latency can be worked around within limits so the change
>> under load seems more relevant.
>> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's
>> bounceback test offer methods to assess latency change under load**,
>> as do Waveform's bufferbloat test and even to a degree Ookla's
>> speedtest.net [1]. IMHO something like latency increase under load
>> or apple's responsiveness measure RPM (basically the inverse of the
>> latency under load calculated on a per minute basis, so it scales in
>> the typical higher-numbers-are-better way, unlike raw latency under
>> load numbers where smaller is better).
>> IMHO what networkQuality is missing ATM is to measure and report
>> the unloaded RPM as well as the loaded: the first gives a measure
>> of the static latency, the second of how well things keep working
>> if capacity gets tight. They report the base RTT which can be
>> converted to RPM. As an example:
>>
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>>
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
>
> You should update to latest macOS:
>
> $ networkQuality
> ==== SUMMARY ====
> Uplink capacity: 326.789 Mbps
> Downlink capacity: 446.359 Mbps
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
>
> ;-)
>
> [SM] I wish... just updated to the latest and greatest for this
> hardware (A1398):
>
> macbook-pro:DPZ smoeller$ networkQuality
> ==== SUMMARY ====
>
> Upload capacity: 7.478 Mbps
> Download capacity: 2.415 Mbps
> Upload flows: 16
> Download flows: 20
> Responsiveness: Low (90 RPM)
> macbook-pro:DPZ smoeller$ networkQuality -v
> ==== SUMMARY ====
>
> Upload capacity: 5.830 Mbps
> Download capacity: 6.077 Mbps
> Upload flows: 12
> Download flows: 20
> Responsiveness: Low (56 RPM)
> Base RTT: 134
> Start: 10/24/22, 22:47:48
> End: 10/24/22, 22:48:09
> OS Version: Version 12.6.1 (Build 21G217)
> macbook-pro:DPZ smoeller$
>
> Still, I only see the "Base RTT" with the -v switch and I am not sure
> whether that is identical to your "Idle Latency".
>
> I guess I need to convince my employer to exchange that macbook
> (actually because the battery starts bulging and not because I am
> behind with networkQuality versions ;) )
>
> Yes, you would need macOS Ventura to get the latest and greatest.
>
>>> But, what I read is: You are suggesting that “Idle Latency”
>>> should be expressed in RPM as well? Or Responsiveness expressed
>>> in milliseconds?
>>
>> [SM] Yes, I am fine with either (or both); the idea is to make it
>> really easy to see whether/how much "working conditions" deteriorate
>> the responsiveness / increase the latency-under-load. At least in
>> verbose mode it would be sweet if networkQuality could expose that
>> information.
>
> I see - let me think about that…
>
> +1 w/ Sebastian's point here. IMHO it would be great if the
> responsiveness under load and when idle were reported:
>
> (a) symmetrically, with the same metrics for both cases, and
>
> (b) in both RPM and ms terms for both cases
>
> So instead of:
>
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
>
> Perhaps something like:
>
> Loaded Responsiveness: High (XXXX RPM)
> Loaded Latency: X.XXX milli-seconds
> Idle Responsiveness: High (XXXX RPM)
> Idle Latency: X.XXX milli-seconds
>
> Having both RPM and ms available for loaded and unloaded cases would
> seem to make it easier to compare loaded and idle performance more
> directly and in a more apples-to-apples way.
>
> best,
> neal
>
>
>
> Links:
> ------
> [1] http://speedtest.net
[-- Attachment #2: ControlChartConstantsAndFormulae.pdf --]
[-- Type: application/pdf, Size: 59376 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [ippm] [tsvwg] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-25 2:28 ` [Starlink] [tsvwg] " Neal Cardwell
2022-10-25 15:17 ` [Starlink] [Rpm] [tsvwg] " rjmcmahon
@ 2022-10-31 8:59 ` Ruediger.Geib
1 sibling, 0 replies; 30+ messages in thread
From: Ruediger.Geib @ 2022-10-31 8:59 UTC (permalink / raw)
To: ncardwell=40google.com, cpaasch=40apple.com, moeller0
Cc: glenn, rpm, tsvwg, ippm, starlink, mat-wg, discuss
[-- Attachment #1: Type: text/plain, Size: 6384 bytes --]
Folks,
I support Neal and Sebastian; I’d also prefer that a baseline/no-load metric be visible alongside the actual (loaded) metric, at least optionally.
Regards,
Ruediger
From: ippm <ippm-bounces@ietf.org> On Behalf Of Neal Cardwell
Sent: Tuesday, October 25, 2022 04:29
To: Christoph Paasch <cpaasch=40apple.com@dmarc.ietf.org>
Cc: Sebastian Moeller <moeller0@gmx.de>; Glenn Fishbine <glenn@breakingpointsolutions.com>; Rpm <rpm@lists.bufferbloat.net>; tsvwg IETF list <tsvwg@ietf.org>; IETF IPPM WG <ippm@ietf.org>; Dave Taht via Starlink <starlink@lists.bufferbloat.net>; Measurement Analysis and Tools Working Group <mat-wg@ripe.net>; discuss <discuss@measurementlab.net>
Subject: Re: [ippm] [tsvwg] [Starlink] [Rpm] [M-Lab-Discuss] misery metrics & consequences
On Mon, Oct 24, 2022 at 7:44 PM Christoph Paasch <cpaasch=40apple.com@dmarc.ietf.org> wrote:
On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
Hi Christoph
On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com> wrote:
Hello Sebastian,
On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink <starlink@lists.bufferbloat.net> wrote:
Hi Glenn,
On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
I.e. my download speed sucks. Measure the download speed.
My isp doesn't fix my problem. Measure what? How?
Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
[SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important depending on subjective ranking) is how does latency change under "load". Absolute latency is also important albeit static high latency can be worked around within limits so the change under load seems more relevant.
All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do Waveform's bufferbloat test and even to a degree Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM (basically the inverse of the latency under load calculated on a per minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency under load numbers where smaller is better).
IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT which can be converted to RPM. As an example:
macbook:~ user$ networkQuality -v
==== SUMMARY ====
Upload capacity: 24.341 Mbps
Download capacity: 91.951 Mbps
Upload flows: 20
Download flows: 16
Responsiveness: High (2123 RPM)
Base RTT: 16
Start: 10/23/22, 13:44:39
End: 10/23/22, 13:44:53
OS Version: Version 12.6 (Build 21G115)
You should update to latest macOS:
$ networkQuality
==== SUMMARY ====
Uplink capacity: 326.789 Mbps
Downlink capacity: 446.359 Mbps
Responsiveness: High (2195 RPM)
Idle Latency: 5.833 milli-seconds
;-)
[SM] I wish... just updated to the latest and greatest for this hardware (A1398):
macbook-pro:DPZ smoeller$ networkQuality
==== SUMMARY ====
Upload capacity: 7.478 Mbps
Download capacity: 2.415 Mbps
Upload flows: 16
Download flows: 20
Responsiveness: Low (90 RPM)
macbook-pro:DPZ smoeller$ networkQuality -v
==== SUMMARY ====
Upload capacity: 5.830 Mbps
Download capacity: 6.077 Mbps
Upload flows: 12
Download flows: 20
Responsiveness: Low (56 RPM)
Base RTT: 134
Start: 10/24/22, 22:47:48
End: 10/24/22, 22:48:09
OS Version: Version 12.6.1 (Build 21G217)
macbook-pro:DPZ smoeller$
Still, I only see the "Base RTT" with the -v switch and I am not sure whether that is identical to your "Idle Latency".
I guess I need to convince my employer to exchange that macbook (actually because the battery starts bulging and not because I am behind with networkQuality versions ;) )
Yes, you would need macOS Ventura to get the latest and greatest.
But, what I read is: You are suggesting that “Idle Latency” should be expressed in RPM as well? Or Responsiveness expressed in milliseconds?
[SM] Yes, I am fine with either (or both); the idea is to make it really easy to see whether/how much "working conditions" deteriorate the responsiveness / increase the latency-under-load. At least in verbose mode it would be sweet if networkQuality could expose that information.
I see - let me think about that…
+1 w/ Sebastian's point here. IMHO it would be great if the responsiveness under load and when idle were reported:
(a) symmetrically, with the same metrics for both cases, and
(b) in both RPM and ms terms for both cases
So instead of:
Responsiveness: High (2195 RPM)
Idle Latency: 5.833 milli-seconds
Perhaps something like:
Loaded Responsiveness: High (XXXX RPM)
Loaded Latency: X.XXX milli-seconds
Idle Responsiveness: High (XXXX RPM)
Idle Latency: X.XXX milli-seconds
Having both RPM and ms available for loaded and unloaded cases would seem to make it easier to compare loaded and idle performance more directly and in a more apples-to-apples way.
best,
neal
[-- Attachment #2: Type: text/html, Size: 14994 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [ippm] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 11:57 ` [Starlink] [Rpm] " Sebastian Moeller
2022-10-23 12:17 ` Dave Collier-Brown
2022-10-24 20:08 ` Christoph Paasch
@ 2022-10-25 15:50 ` J Ignacio Alvarez-Hamelin
2022-10-26 8:14 ` [Starlink] [mat-wg] " Dave Taht
2022-10-25 16:07 ` [Starlink] " J Ignacio Alvarez-Hamelin
3 siblings, 1 reply; 30+ messages in thread
From: J Ignacio Alvarez-Hamelin @ 2022-10-25 15:50 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Glenn Fishbine, Rpm, tsvwg IETF list, IETF IPPM WG,
Dave Taht via Starlink,
Measurement Analysis and Tools Working Group, discuss
[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]
Dear all,
After some time in silence on the IPPM list, I would like to make some comments here. As we presented in draft-ietf-ippm-route-00 (now RFC 9198), the main problem is that traffic follows heavy-tailed distributions when seen from the end-to-end points; this is the origin of most of the issues in that video. Therefore, treating it as a parametric distribution is not possible, unless you are dealing with a complex distribution like the Stable distribution:
• B. Mandelbrot, “New methods in statistical economics,” Journal of political economy, vol. 71, no. 5, pp. 421–440, 1963.
• ——, “The variation of certain speculative prices,” The journal of business, vol. 36, no. 4, pp. 394–419, 1963.
(and so it would be extremely complex and computationally demanding.)
This is why we propose using quartiles to characterize delays in RFC 9198. I am now doing some research to understand how the delay changes with network load, using the quartiles.
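To illustrate why quartiles are more robust than the mean for heavy-tailed delays, here is a small sketch using Python's standard library. The delay values are invented for the example (they are not from the attached measurements), and the function name is hypothetical; this is not the procedure specified in RFC 9198, only the general idea.

```python
import statistics


def delay_quartiles(delays_ms):
    """Summarize delay samples by their quartiles instead of mean/variance."""
    q1, median, q3 = statistics.quantiles(delays_ms, n=4)
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}


if __name__ == "__main__":
    # Invented RTT samples in ms; the last one mimics a heavy-tail outlier.
    delays_ms = [20, 21, 21, 22, 22, 23, 24, 25, 30, 900]

    summary = delay_quartiles(delays_ms)
    mean = statistics.fmean(delays_ms)

    # A single extreme sample drags the mean far above the bulk of the
    # data, while the quartiles stay close to the typical delay.
    print(f"mean = {mean:.1f} ms")
    print(f"quartiles = {summary}")
```

Comparing such quartile summaries taken at different load levels is one simple way to see how the delay distribution shifts with congestion.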
I attach some measurements done during the pandemic, showing the congestion as a function of time.
Best,
Ignacio
_______________________________________________________________
Dr. Ing. José Ignacio Alvarez-Hamelin
CONICET and Facultad de Ingeniería, Universidad de Buenos Aires
Av. Paseo Colón 850 - C1063ACV - Buenos Aires - Argentina
+54 (11) 5285 0716 / 5285 0705
e-mail: ihameli@cnet.fi.uba.ar
web: http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/
_______________________________________________________________
[-- Attachment #2: PastedGraphic-1.pdf --]
[-- Type: application/pdf, Size: 851704 bytes --]
[-- Attachment #3: Type: text/plain, Size: 2 bytes --]
[-- Attachment #4: PastedGraphic-2.pdf --]
[-- Type: application/pdf, Size: 1000303 bytes --]
[-- Attachment #5: Type: text/plain, Size: 6926 bytes --]
> On 23 Oct 2022, at 08:57, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Glenn,
>
>
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>
>> As a classic dyed-in-the-wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>
>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>
>> I.e. my download speed sucks. Measure the download speed.
>>
>> My isp doesn't fix my problem. Measure what? How?
>>
>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>
>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>
>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>
> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important depending on subjective ranking) is how does latency change under "load". Absolute latency is also important albeit static high latency can be worked around within limits so the change under load seems more relevant.
> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do Waveform's bufferbloat test and even to a degree Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM (basically the inverse of the latency under load calculated on a per minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency under load numbers where smaller is better).
> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. They report the base RTT which can be converted to RPM. As an example:
>
> macbook:~ user$ networkQuality -v
> ==== SUMMARY ====
> Upload capacity: 24.341 Mbps
> Download capacity: 91.951 Mbps
> Upload flows: 20
> Download flows: 16
> Responsiveness: High (2123 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:44:39
> End: 10/23/22, 13:44:53
> OS Version: Version 12.6 (Build 21G115)
>
> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>
> Without competent AQM/shaping I get:
> ==== SUMMARY ====
> Upload capacity: 15.101 Mbps
> Download capacity: 97.664 Mbps
> Upload flows: 20
> Download flows: 12
> Responsiveness: Medium (427 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:51:50
> End: 10/23/22, 13:52:06
> OS Version: Version 12.6 (Build 21G115)
> latency under load: 60000/427 = 140.52 ms
> base RPM: 60000/16 = 3750 RPM
> reduction RPM: 100-100*427/3750 = 88.6%
>
>
> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>
> Regards
> Sebastian
>
>
>
> *) Seemingly, quite a few ISPs operate their own speedtest servers in their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislature seems to agree, see EU Regulation 2015/2120).
>
> **) Most do by creating load themselves and measuring throughput at the same time, bounceback IIUC will focus on the latency measurement and leave the load generation optional (so offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>
>
>>
>>
>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>> One of the best talks I've ever seen on how to measure customer
>> satisfaction properly just went up after the P99 Conference.
>>
>> It's called Misery Metrics.
>>
>> After going through a deep dive as to why and how we think and act on
>> percentiles, bins, and other statistical methods as to how we use the
>> web and internet are *so wrong* (well worth watching and thinking
>> about if you are relying on or creating network metrics today), it
>> then points to the real metrics that matter to users and the ultimate
>> success of an internet business: Timeouts, retries, misses, failed
>> queries, angry phone calls, abandoned shopping carts and loss of
>> engagement.
>>
>> https://www.p99conf.io/session/misery-metrics-consequences/
>>
>> The ending advice was - don't aim to make a specific percentile
>> acceptable, aim for an acceptable % of misery.
>>
>> I enjoyed the p99 conference more than any conference I've attended in years.
>>
>> --
>> This song goes out to all the folk that thought Stadia would work:
>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>> Dave Täht CEO, TekLibre, LLC
>>
>> --
>> You received this message because you are subscribed to the Google Groups "discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
>> To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>> _______________________________________________
>> Rpm mailing list
>> Rpm@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/rpm
>
> _______________________________________________
> ippm mailing list
> ippm@ietf.org
> https://www.ietf.org/mailman/listinfo/ippm
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [mat-wg] [ippm] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-25 15:50 ` [Starlink] [ippm] " J Ignacio Alvarez-Hamelin
@ 2022-10-26 8:14 ` Dave Taht
0 siblings, 0 replies; 30+ messages in thread
From: Dave Taht @ 2022-10-26 8:14 UTC (permalink / raw)
To: J Ignacio Alvarez-Hamelin
Cc: Sebastian Moeller, Dave Taht via Starlink, tsvwg IETF list,
IETF IPPM WG, Rpm, Glenn Fishbine,
Measurement Analysis and Tools Working Group, discuss
To add some context to this enormous cross-post of mine, and to try to
get folk here to think further outside the box:
the FCC is due to announce how "consumer broadband labels" are
supposed to work on Nov 15th. Nobody knows
what they are going to announce.
Their first attempt was laughable; the laughs in this attempt (which
is admittedly much better) are subtler.
https://www.benton.org/blog/consumer-driven-broadband-label-design
I liked how they leveraged Waveform's categories in this one.
I (cynically) loved how the delay and loss both grew in this example,
where loss should increase with less delay in most circumstances.
Perhaps consumers can be trained to look for high loss and low delay,
but I doubt it.
--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [ippm] [Rpm] [M-Lab-Discuss] misery metrics & consequences
2022-10-23 11:57 ` [Starlink] [Rpm] " Sebastian Moeller
` (2 preceding siblings ...)
2022-10-25 15:50 ` [Starlink] [ippm] " J Ignacio Alvarez-Hamelin
@ 2022-10-25 16:07 ` J Ignacio Alvarez-Hamelin
2022-10-25 17:02 ` [Starlink] [Rpm] [ippm] " rjmcmahon
3 siblings, 1 reply; 30+ messages in thread
From: J Ignacio Alvarez-Hamelin @ 2022-10-25 16:07 UTC (permalink / raw)
To: Sebastian Moeller
Cc: Glenn Fishbine, Rpm, tsvwg IETF list, IETF IPPM WG,
Dave Taht via Starlink,
Measurement Analysis and Tools Working Group, discuss
Dear all,
After some time in silence on the IPPM list, I would like to make some comments here. As we presented in draft-ietf-ippm-route-00 (now RFC 9198), the main problem is that traffic follows heavy-tailed distributions when seen from the end-to-end points: this is the origin of most of the issues in that video. Therefore, treating it as a parametric distribution is not possible, unless you are dealing with a complex distribution like the Stable distribution:
• B. Mandelbrot, “New methods in statistical economics,” Journal of political economy, vol. 71, no. 5, pp. 421–440, 1963.
• ——, “The variation of certain speculative prices,” The journal of business, vol. 36, no. 4, pp. 394–419, 1963.
(and so it would be extremely complex, with high computing demands.)
This is why we propose using quartiles to characterize delays in RFC 9198. I am now doing some research to understand how the delay changes with network load, using the quartiles.
You can see some measurements done during the pandemic, showing the congestion as a function of the time (24 hours maximum):
https://cnet.fi.uba.ar/ignacio.alvarez-hamelin/RIPE-Atlas-measurement-24681441_m_win_data_world_map.html
[you can zoom in and out and pan; clicking on the Xs closes the dialogs, and clicking on the link reopens them]
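To illustrate why quartiles are attractive for heavy-tailed delays, here is a minimal sketch. The delay samples are invented for the example (they are not taken from the RIPE Atlas measurement), and the helper name is hypothetical; it only shows that a single extreme sample moves the mean a lot while the quartiles stay put.

```python
import statistics


def delay_quartiles(delays_ms):
    """Return (Q1, Q2, Q3) of a list of delay samples in milliseconds."""
    q1, q2, q3 = statistics.quantiles(delays_ms, n=4, method="inclusive")
    return q1, q2, q3


# Hypothetical samples: eight ordinary round trips and one tail event.
delays_ms = [10, 11, 12, 13, 14, 15, 16, 17, 500]

q1, q2, q3 = delay_quartiles(delays_ms)
print("quartiles (ms):", q1, q2, q3)
print("mean (ms):", round(statistics.mean(delays_ms), 2))
```

The quartiles need only a sort (or a streaming estimate), which is far cheaper than fitting the parameters of a Stable distribution.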
Best,
Ignacio
___________________________________
_______________________________________________________________
Dr. Ing. José Ignacio Alvarez-Hamelin
CONICET and Facultad de Ingeniería, Universidad de Buenos Aires
Av. Paseo Colón 850 - C1063ACV - Buenos Aires - Argentina
+54 (11) 5285 0716 / 5285 0705
e-mail: ihameli@cnet.fi.uba.ar
web: http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/
_______________________________________________________________
> On 23 Oct 2022, at 08:57, Sebastian Moeller <moeller0@gmx.de> wrote:
>
> Hi Glenn,
>
>
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>
>> As a classic dyed-in-the-wool empiricist: granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>
>> We can measure download, upload, ping, and jitter pretty much without user intervention. For the measurements you hypothesize, how do you automatically extract those indices without subjective user contamination?
>>
>> I.e. my download speed sucks. Measure the download speed.
>>
>> My isp doesn't fix my problem. Measure what? How?
>>
>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>
>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>
>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>
> [SM] So it seems that outside of seemingly simple-to-test throughput numbers*, the next most important quality number (or the most important, depending on subjective ranking) is how latency changes under "load". Absolute latency is also important, although static high latency can be worked around within limits, so the change under load seems more relevant.
> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's bounceback test offer methods to assess latency change under load**, as do Waveform's bufferbloat test and even, to a degree, Ookla's speedtest.net. IMHO a good candidate is something like latency increase under load or Apple's responsiveness measure RPM (basically the inverse of the latency under load calculated on a per-minute basis, so it scales in the typical higher-numbers-are-better way, unlike raw latency-under-load numbers where smaller is better).
> IMHO what networkQuality is missing ATM is measuring and reporting the unloaded RPM as well as the loaded one: the first gives a measure of the static latency, the second of how well things keep working if capacity gets tight. It reports the base RTT, which can be converted to RPM. As an example:
>
> macbook:~ user$ networkQuality -v
> ==== SUMMARY ====
> Upload capacity: 24.341 Mbps
> Download capacity: 91.951 Mbps
> Upload flows: 20
> Download flows: 16
> Responsiveness: High (2123 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:44:39
> End: 10/23/22, 13:44:53
> OS Version: Version 12.6 (Build 21G115)
>
> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>
> Without competent AQM/shaping I get:
> ==== SUMMARY ====
> Upload capacity: 15.101 Mbps
> Download capacity: 97.664 Mbps
> Upload flows: 20
> Download flows: 12
> Responsiveness: Medium (427 RPM)
> Base RTT: 16
> Start: 10/23/22, 13:51:50
> End: 10/23/22, 13:52:06
> OS Version: Version 12.6 (Build 21G115)
> latency under load: 60000/427 = 140.52 ms
> base RPM: 60000/16 = 3750 RPM
> reduction RPM: 100-100*427/3750 = 88.6%
>
>
> I understand apple's desire to have a single reported number with a single qualifier medium/high/... because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact)
>
> Regards
> Sebastian
>
>
>
> *) Seemingly simple, as quite a few ISPs operate their own speedtest servers inside their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislature seems to agree, see EU Regulation 2015/2120).
>
> **) Most do by creating load themselves and measuring throughput at the same time, bounceback IIUC will focus on the latency measurement and leave the load generation optional (so offers a mode to measure responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>
>
>>
>>
>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>> One of the best talks I've ever seen on how to measure customer
>> satisfaction properly just went up after the P99 Conference.
>>
>> It's called Misery Metrics.
>>
>> After going through a deep dive as to why and how we think and act on
>> percentiles, bins, and other statistical methods as to how we use the
>> web and internet are *so wrong* (well worth watching and thinking
>> about if you are relying on or creating network metrics today), it
>> then points to the real metrics that matter to users and the ultimate
>> success of an internet business: Timeouts, retries, misses, failed
>> queries, angry phone calls, abandoned shopping carts and loss of
>> engagement.
>>
>> https://www.p99conf.io/session/misery-metrics-consequences/
>>
>> The ending advice was - don't aim to make a specific percentile
>> acceptable, aim for an acceptable % of misery.
>>
>> I enjoyed the p99 conference more than any conference I've attended in years.
>>
>> --
>> This song goes out to all the folk that thought Stadia would work:
>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>> Dave Täht CEO, TekLibre, LLC
>>
>> --
>> You received this message because you are subscribed to the Google Groups "discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@measurementlab.net.
>> To view this discussion on the web visit https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>> _______________________________________________
>> Rpm mailing list
>> Rpm@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/rpm
>
> _______________________________________________
> ippm mailing list
> ippm@ietf.org
> https://www.ietf.org/mailman/listinfo/ippm
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [ippm] [M-Lab-Discuss] misery metrics & consequences
2022-10-25 16:07 ` [Starlink] " J Ignacio Alvarez-Hamelin
@ 2022-10-25 17:02 ` rjmcmahon
2022-10-25 20:17 ` J Ignacio Alvarez-Hamelin
0 siblings, 1 reply; 30+ messages in thread
From: rjmcmahon @ 2022-10-25 17:02 UTC (permalink / raw)
To: J Ignacio Alvarez-Hamelin
Cc: Sebastian Moeller, Dave Taht via Starlink, tsvwg IETF list,
IETF IPPM WG, Rpm, Glenn Fishbine,
Measurement Analysis and Tools Working Group, discuss
I don't understand the information in the link. It looks like lines on a
map to some form of dials which are too small to read.
One can sample and create a Gaussian per the central limit theorem (CLT)
if the underlying process probability density functions converge, i.e.
can be integrated to 1. With that said, normalizing does lose
information and doesn't say much about tails, outliers, etc.
We should be careful in assuming only the tails matter and that all
traffic follows heavy-tailed distributions. With bufferbloat it's the
minimum of the latency PDF that shifts. Codel watches a minimum.
"Jacobson suggested that average queue length actually contains no
information at all about packet demand or network load.[3][5] He
suggested that a better metric might be the minimum queue length during
a sliding time window."
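A minimal sketch of that sliding-window minimum idea, using a count-based window over hypothetical delay samples. This is only an illustration and not CoDel's actual implementation (CoDel works on packet sojourn times over a time interval); the function name and the sample values are made up.

```python
from collections import deque


def sliding_min(samples, window):
    """Yield the minimum of the most recent `window` samples at each step."""
    candidates = deque()  # (index, value) pairs, values strictly increasing
    for i, value in enumerate(samples):
        # Drop older candidates that can never be the minimum again.
        while candidates and candidates[-1][1] >= value:
            candidates.pop()
        candidates.append((i, value))
        # Drop the front candidate once it has left the window.
        if candidates[0][0] <= i - window:
            candidates.popleft()
        yield candidates[0][1]


# Hypothetical queue delays in ms: the minimum stays high only while
# every sample in the window is high, i.e. while a standing queue exists.
delays_ms = [5, 3, 8, 9, 10, 2]
print(list(sliding_min(delays_ms, window=3)))
```

A window minimum that stays well above the idle delay indicates a standing queue, which is the signal an average would blur.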
We need statistical tools that also allow for the analysis of
non-parametric distributions. Hotelling T2 assumes the multivariate
distributions are Gaussian. Kolmogorov-Smirnov tests can be used for
non-parametric distributions. We find both are needed for the SPC
(statistical process control) used by our automation systems.
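For readers unfamiliar with the Kolmogorov-Smirnov approach, here is a small sketch of the two-sample KS statistic: the largest gap between the two empirical CDFs, with no assumption about the shape of either distribution. The function name is hypothetical and this is not the code used in our automation; in practice a library routine such as scipy.stats.ks_2samp also supplies the p-value.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max distance between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical latency samples (ms) from two test runs.
baseline = [1, 2, 3, 4]
shifted = [3, 4, 5, 6]
print(ks_statistic(baseline, shifted))
```

A statistic near 0 means the two runs look alike; a value near 1 means they barely overlap.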
Sample subgroups of one really don't give sufficient information about
any type of distribution, parametric or non-parametric.
Bob
> Dear all,
>
> After some time in silence on the IPPM list, I like to make some
> comments here. As we presented in the draft-ietf-ippm-route-00 (now
> the RFC9198), the main problem is the traffic follows heavy-tailed
> distributions when it is seen from the end-to-end points: the origin
> of most of the issues in that video. Therefore, treating it as
> parametric distribution is not possible, unless you are dealing with a
> complex distribution like the Stable distribution:
>
> • B. Mandelbrot, “New methods in statistical economics,” Journal of
> political economy, vol. 71, no. 5, pp. 421–440, 1963.
>
> • ——, “The variation of certain speculative prices,” The journal of
> business, vol. 36, no. 4, pp. 394–419, 1963.
>
> (and so it will be extremely complex a high computing demands.)
> This is why we propose to use quartiles to characterize delays in the
> RFC9198. Then, I am doing some research to understand how the delay
> can change with network load, using the quartiles.
> You can see some measurements done during the pandemic, showing the
> congestion as a function of the time (24 hours maximum):
>
> https://cnet.fi.uba.ar/ignacio.alvarez-hamelin/RIPE-Atlas-measurement-24681441_m_win_data_world_map.html
>
> [you can zoom in and out, pan it, clicking on the Xs you can close
> dialogs, to reopen them click on the link]
>
>
>
> Best,
>
> Ignacio
>
>
> ___________________________________
>
>
> _______________________________________________________________
>
> Dr. Ing. José Ignacio Alvarez-Hamelin
> CONICET and Facultad de Ingeniería, Universidad de Buenos Aires
> Av. Paseo Colón 850 - C1063ACV - Buenos Aires - Argentina
> +54 (11) 5285 0716 / 5285 0705
> e-mail: ihameli@cnet.fi.uba.ar
> web: http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/
> _______________________________________________________________
>
>
>
>> On 23 Oct 2022, at 08:57, Sebastian Moeller <moeller0@gmx.de> wrote:
>>
>> Hi Glenn,
>>
>>
>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm
>>> <rpm@lists.bufferbloat.net> wrote:
>>>
>>> As a classic died in the wool empiricist, granted that you can
>>> identify "misery" factors, given a population of 1,000 users, how do
>>> you propose deriving a misery index for that population?
>>>
>>> We can measure download, upload, ping, jitter pretty much without
>>> user intervention. For the measurements you hypothesize, how you you
>>> automatically extract those indecies without subjective user
>>> contamination.
>>>
>>> I.e. my download speed sucks. Measure the download speed.
>>>
>>> My isp doesn't fix my problem. Measure what? How?
>>>
>>> Human survey technology is 70+ years old and it still has problems
>>> figuring out how to correlate opinion with fact.
>>>
>>> Without an objective measurement scheme that doesn't require human
>>> interaction, the misery index is a cool hypothesis with no way to
>>> link to actual data. What objective measurements can be made?
>>> Answer that and the index becomes useful. Otherwise it's just
>>> consumer whining.
>>>
>>> Not trying to be combative here, in fact I like the concept you
>>> support, but I'm hard pressed to see how the concept can lead to
>>> data, and the data lead to policy proposals.
>>
>> [SM] So it seems that outside of seemingly simple to test throughput
>> numbers*, the next most important quality number (or the most
>> important depending on subjective ranking) is how does latency change
>> under "load". Absolute latency is also important albeit static high
>> latency can be worked around within limits so the change under load
>> seems more relevant.
>> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's
>> bounceback test offer methods to asses latency change under load**, as
>> do waveforms bufferbloat tests and even to a degree Ookla's
>> speedtest.net. IMHO something like latency increase under load or
>> apple's responsiveness measure RPM (basically the inverse of the
>> latency under load calculated on a per minute basis, so it scales in
>> the typical higher numbers are better way, unlike raw latency under
>> load numbers where smaller is better).
>> IMHO what networkQuality is missing ATM is to measure and report the
>> unloaded RPM as well as the loaded the first gives a measure over the
>> static latency the second over how well things keep working if
>> capacity gets tight. They report the base RTT which can be converted
>> to RPM. As an example:
>>
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
>>
>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load,
>> while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on
>> this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a
>> reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM
>> and scheduling on the router.
>>
>> Without competent AQM/shaping I get:
>> ==== SUMMARY ====
>> Upload capacity: 15.101 Mbps
>> Download capacity: 97.664 Mbps
>> Upload flows: 20
>> Download flows: 12
>> Responsiveness: Medium (427 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:51:50
>> End: 10/23/22, 13:52:06
>> OS Version: Version 12.6 (Build 21G115)
>> latency under load: 60000/427 = 140.52 ms
>> base RPM: 60000/16 = 3750 RPM
>> reduction RPM: 100-100*427/3750 = 88.6%
>>
>>
>> I understand apple's desire to have a single reported number with a
>> single qualifier medium/high/... because in the end a link is only
>> reliably usable if responsiveness under load stays acceptable, but
>> with two numbers it is easier to see what one's ISP could do to help.
>> (I guess some ISPs might already be unhappy with the single number, so
>> this needs some diplomacy/tact)
>>
>> Regards
>> Sebastian
>>
>>
>>
>> *) Seemingly as quite some ISPs operate their own speedtest servers in
>> their network and ignore customers not reaching the contracted rates
>> into speedtest-servers located in different ASs. As the product is
>> called internet access I a inclined to expect that my ISP maintains
>> sufficient peering/transit capacity to reach the next tier of AS at my
>> contracted rate (the EU legislative seems to agree, see EU directive
>> 2015/2120).
>>
>> **) Most do by creating load themselves and measuring throughput at
>> the same time, bounceback IIUC will focus on the latency measurement
>> and leave the load generation optional (so offers a mode to measure
>> responsiveness of a live network with minimal measurement traffic).
>> @Bob, please correct me if this is wrong.
>>
>>
>>>
>>>
>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>>> One of the best talks I've ever seen on how to measure customer
>>> satisfaction properly just went up after the P99 Conference.
>>>
>>> It's called Misery Metrics.
>>>
>>> After going through a deep dive as to why and how we think and act on
>>> percentiles, bins, and other statistical methods as to how we use the
>>> web and internet are *so wrong* (well worth watching and thinking
>>> about if you are relying on or creating network metrics today), it
>>> then points to the real metrics that matter to users and the ultimate
>>> success of an internet business: Timeouts, retries, misses, failed
>>> queries, angry phone calls, abandoned shopping carts and loss of
>>> engagement.
>>>
>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>
>>> The ending advice was - don't aim to make a specific percentile
>>> acceptable, aim for an acceptable % of misery.
>>>
>>> I enjoyed the p99 conference more than any conference I've attended
>>> in years.
>>>
>>> --
>>> This song goes out to all the folk that thought Stadia would work:
>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>> Dave Täht CEO, TekLibre, LLC
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it,
>>> send an email to discuss+unsubscribe@measurementlab.net.
>>> To view this discussion on the web visit
>>> https://groups.google.com/a/measurementlab.net/d/msgid/discuss/CAA93jw4w27a1EO_QQG7NNkih%2BC3QQde5%3D_7OqGeS9xy9nB6wkg%40mail.gmail.com.
>>> _______________________________________________
>>> Rpm mailing list
>>> Rpm@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/rpm
>>
>> _______________________________________________
>> ippm mailing list
>> ippm@ietf.org
>> https://www.ietf.org/mailman/listinfo/ippm
>
> _______________________________________________
> Rpm mailing list
> Rpm@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/rpm
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Starlink] [Rpm] [ippm] [M-Lab-Discuss] misery metrics & consequences
2022-10-25 17:02 ` [Starlink] [Rpm] [ippm] " rjmcmahon
@ 2022-10-25 20:17 ` J Ignacio Alvarez-Hamelin
0 siblings, 0 replies; 30+ messages in thread
From: J Ignacio Alvarez-Hamelin @ 2022-10-25 20:17 UTC (permalink / raw)
To: rjmcmahon
Cc: Sebastian Moeller, Dave Taht via Starlink, tsvwg IETF list,
IETF IPPM WG, Rpm, Glenn Fishbine,
Measurement Analysis and Tools Working Group, discuss
Dear Bob,
I have answered inline. Thank you for your comments.
P.S. It seems that I have no rights to send emails to the Measurement Analysis and Tools Working Group: how can I join it?
_______________________________________________________________
Dr. Ing. José Ignacio Alvarez-Hamelin
CONICET and Facultad de Ingeniería, Universidad de Buenos Aires
Av. Paseo Colón 850 - C1063ACV - Buenos Aires - Argentina
+54 (11) 5285 0716 / 5285 0705
e-mail: ihameli@cnet.fi.uba.ar
web: http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/
_______________________________________________________________
> On 25 Oct 2022, at 14:02, rjmcmahon <rjmcmahon@rjmcmahon.com> wrote:
>
> I don't understand the information in the link. It looks like lines on a map to some form of dials which are too small to read.
I suppose that depends on the zoom of your browser; try setting it to 100% (Command-+ zooms in on Chrome). You will see a 24-hour clock where some hours are green (no congestion), yellow (some congestion) or red (congestion). But all of that is just an example.
>
> One can sample and create a Gaussian per the central limit theorem (CLT) if the underlying process probability density functions converge, i.e. can be integrated to 1. With that said, normalizing does lose information and doesn't say much about tails, outliers, etc.
The Stable distribution is a generalization over all of these distributions. Consider that the Gaussian has finite moments while heavy-tailed distributions do not; this is where the Stable distribution enters as a generalization.
>
> We should be careful in assuming only the tails matter and that all traffic follows heavy-tailed distributions. With bufferbloat it's the minimum of the latency PDF that shifts. Codel watches a minimum. "Jacobson suggested that average queue length actually contains no information at all about packet demand or network load.[3][5] He suggested that a better metric might be the minimum queue length during a sliding time window."
Well, it depends on which point of the network you are looking at, but there are a lot of points where this is true. On the other hand, non-heavy-tailed distributions do not cause problems if you use statistical tools that support any distribution. For example, the median (Q2) converges to the mean (first statistical moment) for Poisson distributions.
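As a small check of the Poisson remark, the following sketch computes the median directly from the cumulative distribution and compares it with the mean (the rate parameter). The function name is made up for the example, and the rates are arbitrary.

```python
import math


def poisson_median(rate):
    """Smallest k with P(X <= k) >= 0.5 for a Poisson variable of given rate."""
    pmf = math.exp(-rate)  # P(X = 0)
    cdf = pmf
    k = 0
    while cdf < 0.5:
        k += 1
        pmf *= rate / k  # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k


# For a Poisson variable the mean equals the rate.
for rate in (1, 4, 100):
    print(f"mean = {rate}, median = {poisson_median(rate)}")
```

For a heavy-tailed distribution no such agreement can be expected, since the mean may be dominated by rare large values or may not exist at all, while the median is always defined.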
>
> We need statistical tools that also allow for the analysis of non-parametric distributions too. Hotelling T2 assumes the multivariate distributions are Gaussian. Kolmogorov-Smirnov tests can be used for non-parametric distributions. We find both are needed for SPC used by our automation systems.
Yes, I did a lot of this kind of test, and I found very interesting things, even on very important links (an important ISP with an important CDN), verifying Stable distributions with certain parameters (even some with heavy tails). The problem is how to do this at low computational cost, and this is why you can use quartiles.
>
> Sample subgroups of one really don't give sufficient information about any type of distribution, parametric or non-parametric.
Here I do not understand: which are the subgroups?
>
> Bob
>> Dear all,
>> After some time in silence on the IPPM list, I like to make some
>> comments here. As we presented in the draft-ietf-ippm-route-00 (now
>> the RFC9198), the main problem is the traffic follows heavy-tailed
>> distributions when it is seen from the end-to-end points: the origin
>> of most of the issues in that video. Therefore, treating it as
>> parametric distribution is not possible, unless you are dealing with a
>> complex distribution like the Stable distribution:
>> • B. Mandelbrot, “New methods in statistical economics,” Journal of
>> political economy, vol. 71, no. 5, pp. 421–440, 1963.
>> • ——, “The variation of certain speculative prices,” The journal of
>> business, vol. 36, no. 4, pp. 394–419, 1963.
>> (and so it will be extremely complex a high computing demands.)
>> This is why we propose to use quartiles to characterize delays in the
>> RFC9198. Then, I am doing some research to understand how the delay
>> can change with network load, using the quartiles.
>> You can see some measurements done during the pandemic, showing the
>> congestion as a function of the time (24 hours maximum):
>> https://cnet.fi.uba.ar/ignacio.alvarez-hamelin/RIPE-Atlas-measurement-24681441_m_win_data_world_map.html
>> [you can zoom in and out, pan it, clicking on the Xs you can close
>> dialogs, to reopen them click on the link]
>> Best,
>> Ignacio
>> ___________________________________
>> _______________________________________________________________
>> Dr. Ing. José Ignacio Alvarez-Hamelin
>> CONICET and Facultad de Ingeniería, Universidad de Buenos Aires
>> Av. Paseo Colón 850 - C1063ACV - Buenos Aires - Argentina
>> +54 (11) 5285 0716 / 5285 0705
>> e-mail: ihameli@cnet.fi.uba.ar
>> web: http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/
>> _______________________________________________________________
>>> On 23 Oct 2022, at 08:57, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> Hi Glenn,
>>>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm <rpm@lists.bufferbloat.net> wrote:
>>>> As a classic died in the wool empiricist, granted that you can identify "misery" factors, given a population of 1,000 users, how do you propose deriving a misery index for that population?
>>>> We can measure download, upload, ping, jitter pretty much without user intervention. For the measurements you hypothesize, how you you automatically extract those indecies without subjective user contamination.
>>>> I.e. my download speed sucks. Measure the download speed.
>>>> My isp doesn't fix my problem. Measure what? How?
>>>> Human survey technology is 70+ years old and it still has problems figuring out how to correlate opinion with fact.
>>>> Without an objective measurement scheme that doesn't require human interaction, the misery index is a cool hypothesis with no way to link to actual data. What objective measurements can be made? Answer that and the index becomes useful. Otherwise it's just consumer whining.
>>>> Not trying to be combative here, in fact I like the concept you support, but I'm hard pressed to see how the concept can lead to data, and the data lead to policy proposals.
>>> [SM] So it seems that outside of seemingly simple to test throughput numbers*, the next most important quality number (or the most important depending on subjective ranking) is how does latency change under "load". Absolute latency is also important albeit static high latency can be worked around within limits so the change under load seems more relevant.
>>> All of flent's RRUL test, apple's networkQuality/RPM, and iperf2's bounceback test offer methods to asses latency change under load**, as do waveforms bufferbloat tests and even to a degree Ookla's speedtest.net. IMHO something like latency increase under load or apple's responsiveness measure RPM (basically the inverse of the latency under load calculated on a per minute basis, so it scales in the typical higher numbers are better way, unlike raw latency under load numbers where smaller is better).
>>> IMHO what networkQuality is missing ATM is to measure and report the unloaded RPM as well as the loaded the first gives a measure over the static latency the second over how well things keep working if capacity gets tight. They report the base RTT which can be converted to RPM. As an example:
>>> macbook:~ user$ networkQuality -v
>>> ==== SUMMARY ====
>>> Upload capacity: 24.341 Mbps
>>> Download capacity: 91.951 Mbps
>>> Upload flows: 20
>>> Download flows: 16
>>> Responsiveness: High (2123 RPM)
>>> Base RTT: 16
>>> Start: 10/23/22, 13:44:39
>>> End: 10/23/22, 13:44:53
>>> OS Version: Version 12.6 (Build 21G115)
>>> Here RPM 2123 corresponds to 60000/2123 = 28.26 ms latency under load, while the Base RTT of 16 ms corresponds to 60000/16 = 3750 RPM, so on this link load reduces the responsiveness by 3750-2123 = 1627 RPM, a reduction by 100-100*2123/3750 = 43.4%, and that is with competent AQM and scheduling on the router.
>>> Without competent AQM/shaping I get:
>>> ==== SUMMARY ====
>>> Upload capacity: 15.101 Mbps
>>> Download capacity: 97.664 Mbps
>>> Upload flows: 20
>>> Download flows: 12
>>> Responsiveness: Medium (427 RPM)
>>> Base RTT: 16
>>> Start: 10/23/22, 13:51:50
>>> End: 10/23/22, 13:52:06
>>> OS Version: Version 12.6 (Build 21G115)
>>> latency under load: 60000/427 = 140.52 ms
>>> base RPM: 60000/16 = 3750 RPM
>>> reduction RPM: 100-100*427/3750 = 88.6%
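To derive the second number automatically, one could parse the summary text and compute the unloaded RPM next to the reported one. The following is a rough sketch that assumes the "Key: value" layout and the field names shown in the output quoted in this message; other versions of networkQuality may print something different, and the function names are made up for illustration.

```python
import re

# Sample text copied from the run without AQM/shaping quoted above.
SUMMARY = """\
==== SUMMARY ====
Upload capacity: 15.101 Mbps
Download capacity: 97.664 Mbps
Upload flows: 20
Download flows: 12
Responsiveness: Medium (427 RPM)
Base RTT: 16
"""


def parse_summary(text: str) -> dict:
    """Split 'Key: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()  # tolerate mail quoting prefixes
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


def derive_metrics(fields: dict) -> dict:
    """Compute loaded latency, unloaded RPM and the relative RPM reduction."""
    match = re.search(r"\((\d+) RPM\)", fields["Responsiveness"])
    if match is None:
        raise ValueError("no RPM value found in the Responsiveness field")
    loaded_rpm = int(match.group(1))
    base_rtt_ms = float(fields["Base RTT"])  # assumed to be in milliseconds
    base_rpm = 60_000.0 / base_rtt_ms
    return {
        "loaded_rpm": loaded_rpm,
        "loaded_latency_ms": 60_000.0 / loaded_rpm,
        "base_rpm": base_rpm,
        "reduction_percent": 100.0 - 100.0 * loaded_rpm / base_rpm,
    }


metrics = derive_metrics(parse_summary(SUMMARY))
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting `base_rpm` and `loaded_rpm` side by side separates the static path latency from the part that is caused by queueing under load.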
>>> I understand Apple's desire to have a single reported number with a single qualifier (medium/high/...), because in the end a link is only reliably usable if responsiveness under load stays acceptable, but with two numbers it is easier to see what one's ISP could do to help. (I guess some ISPs might already be unhappy with the single number, so this needs some diplomacy/tact.)
>>> Regards
>>> Sebastian
>>> *) "Seemingly", because quite a few ISPs operate their own speedtest servers inside their network and ignore customers not reaching the contracted rates to speedtest servers located in different ASs. As the product is called internet access, I am inclined to expect that my ISP maintains sufficient peering/transit capacity to reach the next tier of AS at my contracted rate (the EU legislature seems to agree, see EU regulation 2015/2120).
>>> **) Most do so by creating load themselves and measuring throughput at the same time; bounceback, IIUC, will focus on the latency measurement and leave the load generation optional (so it offers a mode to measure the responsiveness of a live network with minimal measurement traffic). @Bob, please correct me if this is wrong.
>>>> On Fri, Oct 21, 2022, 5:20 PM Dave Taht <dave.taht@gmail.com> wrote:
>>>> One of the best talks I've ever seen on how to measure customer
>>>> satisfaction properly just went up after the P99 Conference.
>>>> It's called Misery Metrics.
>>>> After going through a deep dive as to why and how we think and act on
>>>> percentiles, bins, and other statistical methods as to how we use the
>>>> web and internet are *so wrong* (well worth watching and thinking
>>>> about if you are relying on or creating network metrics today), it
>>>> then points to the real metrics that matter to users and the ultimate
>>>> success of an internet business: Timeouts, retries, misses, failed
>>>> queries, angry phone calls, abandoned shopping carts and loss of
>>>> engagement.
>>>> https://www.p99conf.io/session/misery-metrics-consequences/
>>>> The ending advice was - don't aim to make a specific percentile
>>>> acceptable, aim for an acceptable % of misery.
>>>> I enjoyed the p99 conference more than any conference I've attended in years.
>>>> --
>>>> This song goes out to all the folk that thought Stadia would work:
>>>> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
>>>> Dave Täht CEO, TekLibre, LLC
>>>> _______________________________________________
>>>> Rpm mailing list
>>>> Rpm@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/rpm
>>> _______________________________________________
>>> ippm mailing list
>>> ippm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/ippm