From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <CAA93jw6XoFWcRS6FyNyncXykfr=V1Pmy2tn20B7LHbHJZoXa4w@mail.gmail.com>
 <BBC12E1D-0661-4112-9F67-9E60BBF587C2@apple.com>
In-Reply-To: <BBC12E1D-0661-4112-9F67-9E60BBF587C2@apple.com>
From: Aaron Wood <woody77@gmail.com>
Date: Tue, 11 Jan 2022 09:50:59 -0800
Message-ID: <CALQXh-OzB7wzD2imA=_gXOKmOXRfrTEopVfbNaUwa441DZDAJw@mail.gmail.com>
To: Christoph Paasch <cpaasch@apple.com>
Cc: Dave Taht <dave.taht@gmail.com>, Rpm <rpm@lists.bufferbloat.net>
Content-Type: multipart/alternative; boundary="0000000000002c71ae05d55219cc"
Subject: Re: [Rpm] apm metric - annoyance per minute
X-BeenThere: rpm@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: revolutions per minute - a new metric for measuring responsiveness
 <rpm.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/rpm>,
 <mailto:rpm-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/rpm>
List-Post: <mailto:rpm@lists.bufferbloat.net>
List-Help: <mailto:rpm-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/rpm>,
 <mailto:rpm-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Tue, 11 Jan 2022 17:51:10 -0000

--0000000000002c71ae05d55219cc
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

I read it as the number of events per minute: not, say, the number of
frames longer than 20ms, but the number of events that took at least 20ms
longer than the base RTT, which I think is what Dave meant by "latency
excursion".

E.g. if you're doing DNS queries, and they usually return in 50ms, and 3 of
them take 83, 94, and 106 ms respectively in a given minute, then that
would be an APM of 3?  (Or maybe an APM rate of 30% if you were doing
10/minute.)
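
A rough sketch of that arithmetic in Python, assuming an "excursion" is any
sample at least 20ms over the baseline.  The function name, the threshold
default, and the seven filler samples at 50ms are made up for illustration;
only the 50ms baseline and the 83, 94, and 106 ms samples come from the
example above.

```python
def apm(samples_ms, base_ms, threshold_ms=20.0):
    """Return (count, fraction) of samples that exceed the baseline
    by at least threshold_ms. Names and defaults are hypothetical."""
    if not samples_ms:
        return 0, 0.0
    count = sum(1 for s in samples_ms if s - base_ms >= threshold_ms)
    return count, count / len(samples_ms)


# One minute of DNS lookups at 10/minute: seven at the 50ms baseline
# (made-up filler) plus the three slow ones from the example.
minute = [50.0] * 7 + [83.0, 94.0, 106.0]
count, rate = apm(minute, base_ms=50.0)
print(count, rate)
```

The count answers "how many annoyances happened this minute", while the
fraction only means something once you know how many events were attempted.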

I've found the tricky thing with metrics is sorting out the event-count vs.
event-rate difference.  A lot of tests are isochronous and give a constant
event rate to base the count on (e.g. ping's default of once per second),
but other tests, like the UDP and ICMP pings in flent (at least in the
past), have a rate that's based on the RTT: as RTT goes up, the rate of
events goes down, which means the test oversamples the "fast" events and
undersamples the "slow" ones.
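
To put rough numbers on that sampling bias, here is a small sketch.  It
assumes a made-up minute (30s at 50ms RTT, then 30s at 250ms RTT) and a
hypothetical probe that sends its next request as soon as the previous
reply arrives; it is not meant to describe how flent actually paces its
pings.

```python
def sample_counts(phases, interval_ms=None):
    """Count probes completed in each phase.

    phases: list of (duration_ms, rtt_ms) tuples.
    interval_ms: fixed send interval for an isochronous probe, or None
    for a hypothetical probe that sends the next request as soon as
    the previous reply arrives (so its rate is 1/RTT).
    """
    counts = []
    for duration_ms, rtt_ms in phases:
        gap = interval_ms if interval_ms is not None else rtt_ms
        counts.append(int(duration_ms // gap))
    return counts


# Made-up minute: 30s at 50ms RTT, then 30s at 250ms RTT.
phases = [(30_000, 50), (30_000, 250)]
fixed = sample_counts(phases, interval_ms=1000)
paced = sample_counts(phases)
slow_share_fixed = fixed[1] / sum(fixed)
slow_share_paced = paced[1] / sum(paced)
print(fixed, paced, slow_share_fixed, slow_share_paced)
```

With these numbers the slow half of the minute supplies half of the
fixed-interval samples but only a sixth of the RTT-paced ones.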

Further, retries for failed events muddy the waters, as the events aren't
independent measurements.  If a momentary drop in connectivity causes
retries to happen, and each failed retry is counted, is that N failures?
Or just 1 failure?  I've split those out as separate metrics (failed
attempts vs. failed operations) in some systems I've built, so that I can
tease them apart.  I've also tracked the distribution (histogram) of
"attempts before success" and "attempts before operation failed".  Usually
those are dominated by "1 attempt before success" and "N attempts before
operation failed", where N is the total number of attempts before just
giving up.
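
A minimal sketch of that bookkeeping, assuming each logical operation is
logged as an (attempts, succeeded) pair.  The record layout, the dictionary
keys, and the sample data (a limit of 4 attempts) are all hypothetical.

```python
from collections import Counter


def summarize(operations):
    """operations: list of (attempts, succeeded) pairs, one per
    logical operation. Returns failed-attempt and failed-operation
    counts plus the two attempt histograms."""
    failed_attempts = 0
    failed_operations = 0
    before_success = Counter()
    before_failure = Counter()
    for attempts, succeeded in operations:
        if succeeded:
            # Every attempt except the last one failed.
            failed_attempts += attempts - 1
            before_success[attempts] += 1
        else:
            # All attempts failed and the operation gave up.
            failed_attempts += attempts
            failed_operations += 1
            before_failure[attempts] += 1
    return {
        "failed_attempts": failed_attempts,
        "failed_operations": failed_operations,
        "attempts_before_success": dict(before_success),
        "attempts_before_failure": dict(before_failure),
    }


# Made-up data: eight first-try successes, one success on the third
# try, and one operation abandoned after 4 attempts.
ops = [(1, True)] * 8 + [(3, True), (4, False)]
result = summarize(ops)
print(result)
```

Counting per attempt gives 6 failures here, while counting per operation
gives 1, which is the N-versus-1 question above.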

On Tue, Jan 11, 2022 at 9:34 AM Christoph Paasch via Rpm <
rpm@lists.bufferbloat.net> wrote:

> Hi Dave!
>
> > On Jan 9, 2022, at 6:57 PM, Dave Taht via Rpm
> > <rpm@lists.bufferbloat.net> wrote:
> >
> > or gpm - glitch per minute
> >
> > defined as a latency excursion of more than 20ms.
>
> I kinda find that interesting :) Can you give an example? Would it count
> the number of times we miss a "20ms-deadline"? So, if the RTT is 100ms,
> GPM would be 5 ?
>
>
> Christoph
>
> >
> > ?
> >
> >
> > --
> > I tried to build a better future, a few times:
> > https://wayforward.archive.org/?site=3Dhttps%3A%2F%2Fwww.icei.org
> >
> > Dave T=C3=A4ht CEO, TekLibre, LLC
> > _______________________________________________
> > Rpm mailing list
> > Rpm@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/rpm
>

--0000000000002c71ae05d55219cc--