From: Aaron Wood
Date: Tue, 11 Jan 2022 09:50:59 -0800
To: Christoph Paasch
Cc: Dave Taht, Rpm
Subject: Re: [Rpm] apm metric - annoyance per minute
List-Id: revolutions per minute - a new metric for measuring responsiveness

I read it as the number of events per minute (not say the number of frames longer than 20ms, but the number of events that took at least 20ms longer than the base RTT, which I think is what Dave meant by "latency excursion").

E.g. if you're doing DNS queries, and they usually return in 50ms, and 3 of them take 83, 94, and 106 ms respectively, in a given minute, then that would be an APM of 3? (or maybe an APM rate of 30% if you were doing 10/minute)
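
A minimal sketch of those two readings (a count versus a rate), assuming a fixed base RTT of 50 ms and the 20 ms threshold. The function name `annoyances` and the seven "normal" sample values are made up for illustration; only 83, 94, and 106 ms come from the example above.

```python
from typing import Sequence


def annoyances(rtts_ms: Sequence[float], base_rtt_ms: float,
               threshold_ms: float = 20.0) -> int:
    """Count samples that took at least threshold_ms longer than the base RTT."""
    return sum(1 for rtt in rtts_ms if rtt - base_rtt_ms >= threshold_ms)


# Ten DNS queries in one minute. The seven values near 50 ms are
# hypothetical filler; the three slow ones are from the example.
samples = [50, 51, 49, 83, 52, 94, 50, 106, 48, 50]

count = annoyances(samples, base_rtt_ms=50)  # event count per minute
rate = count / len(samples)                  # fraction of events in that minute

print(count, f"{rate:.0%}")  # prints: 3 30%
```

The count only means something alongside the number of queries issued in that minute, which is why the rate is worth reporting next to it.
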
I've found the tricky thing for metrics is sorting out the event-count vs. event-rate differences. A lot of tests that are isochronous give a constant event-rate to base on (e.g. ping's default of once per second), but other tests, like the UDP and ICMP pings in flent (at least in the past), have a rate that's based on the RTT, so as RTT goes up, the rate of events goes down, which means that it oversamples the "fast" events, and undersamples "slow" events.
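
To make the sampling bias concrete, here is a small sketch under made-up assumptions: a hypothetical link that alternates between 50 ms RTT for 30 s and 500 ms RTT for 30 s, one prober that sends once per second, and one that sends its next probe as soon as the previous reply arrives. This is not how flent or ping are implemented; it only illustrates the effect.

```python
from typing import List


def rtt_at(t_ms: int) -> int:
    """Hypothetical link: 50 ms RTT for 30 s, then 500 ms RTT for 30 s, repeating."""
    return 50 if (t_ms % 60_000) < 30_000 else 500


def isochronous(duration_ms: int, interval_ms: int = 1000) -> List[int]:
    """Probe at a fixed interval, regardless of how long replies take."""
    return [rtt_at(t) for t in range(0, duration_ms, interval_ms)]


def rtt_paced(duration_ms: int) -> List[int]:
    """Send the next probe as soon as the previous reply arrives."""
    samples, t = [], 0
    while t < duration_ms:
        rtt = rtt_at(t)
        samples.append(rtt)
        t += rtt
    return samples


def slow_fraction(samples: List[int]) -> float:
    """Fraction of samples taken while the link was in its slow state."""
    return sum(1 for s in samples if s == 500) / len(samples)


iso = isochronous(60_000)
paced = rtt_paced(60_000)
print(f"isochronous: {len(iso)} samples, {slow_fraction(iso):.0%} slow")
print(f"rtt-paced:   {len(paced)} samples, {slow_fraction(paced):.0%} slow")
```

The link is slow half the time, and the fixed-interval prober reports exactly that (30 of 60 samples). The RTT-paced prober collects 600 fast samples and only 60 slow ones, so about 9% of its samples are slow.
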
Further, retries for failed events muddy the waters, as the events aren't independent measurements. If a momentary drop in connectivity causes retries to happen, and each failed retry is counted, is that N failures? Or just 1 failure? I've split those out as separate metrics in some systems I've built, so that I can tease them apart. I've also done things like the distribution (histogram) of "attempts before success" or "attempts before operation failed". Usually those are dominated by "1 attempt before success", and "N attempts before operation failed" where N is the number of total attempts before just giving up.
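
One way to keep those metrics apart, as a rough sketch rather than the code from any of those systems: record each operation as its list of per-attempt outcomes, then count failed attempts and failed operations separately and build the two histograms. The name `summarize` and the sample data (which assumes giving up after 3 attempts) are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def summarize(operations: Iterable[List[bool]]) -> Tuple[Counter, Counter, int, int]:
    """Each operation is its list of per-attempt outcomes (True = success)."""
    attempts_before_success: Counter = Counter()
    attempts_before_failure: Counter = Counter()
    failed_attempts = 0
    for outcomes in operations:
        failed_attempts += outcomes.count(False)
        if outcomes and outcomes[-1]:
            # The last attempt succeeded: the operation as a whole succeeded.
            attempts_before_success[len(outcomes)] += 1
        else:
            # Every attempt failed and the caller gave up.
            attempts_before_failure[len(outcomes)] += 1
    failed_operations = sum(attempts_before_failure.values())
    return (attempts_before_success, attempts_before_failure,
            failed_attempts, failed_operations)


# Hypothetical minute: 7 first-try successes, 1 success on the second
# attempt, and 2 operations abandoned after 3 failed attempts each.
ops = [[True]] * 7 + [[False, True]] + [[False, False, False]] * 2
ok_hist, fail_hist, failed_attempts, failed_operations = summarize(ops)

print(dict(ok_hist))      # {1: 7, 2: 1}
print(dict(fail_hist))    # {3: 2}
print(failed_attempts)    # 7
print(failed_operations)  # 2
```

In this sample the same minute reads as 7 failures if every failed attempt counts, or 2 if only abandoned operations count, which is the N-versus-1 ambiguity described above.
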

On Tue, Jan 11, 2022 at 9:34 AM Christoph Paasch via Rpm <rpm@lists.bufferbloat.net> wrote:
> Hi Dave!
>
> > On Jan 9, 2022, at 6:57 PM, Dave Taht via Rpm <rpm@lists.bufferbloat.net> wrote:
> >
> > or gpm - glitch per minute
> >
> > defined as a latency excursion of more than 20ms.
>
> I kinda find that interesting :) Can you give an example? Would it count the number of times we miss a "20ms-deadline"? So, if the RTT is 100ms, GPM would be 5 ?
>
>
> Christoph
>
> >
> > ?
> >
> >
> > --
> > I tried to build a better future, a few times:
> > https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
> >
> > Dave Täht CEO, TekLibre, LLC
> > _______________________________________________
> > Rpm mailing list
> > Rpm@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/rpm
