From: Herbert Wolverson
Date: Mon, 31 Oct 2022 16:19:53 -0500
Cc: libreqos
Subject: Re: [LibreQoS] Integration system, aka fun with graph theory

I'd agree with color coding (when it exists - no rush, IMO) being configurable.

From the "how much delay are we adding" discussion earlier, I thought I'd do a little bit of profiling of the BPF programs themselves. This is with the latest round of performance updates (https://github.com/thebracket/cpumap-pping/issues/2), so it's not measuring anything in production. I simply added a call to get the clock at the start and again at the end, and logged the difference, for both the XDP and TC BPF programs. (Execution goes: packet arrives -> XDP cpumap sends it to the right CPU -> egress -> TC sends it to the right classifier on the correct CPU and measures RTT latency.) This adds two clock checks and a debug log entry to execution time, so the act of measuring is itself slowing things down.
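For context, the instrumentation boils down to something like the sketch below. This is only a minimal illustration of the "clock at entry, clock at exit, log the delta" idea, not the actual cpumap-pping code; the section name, program name and log format are assumptions.

// profile_sketch.bpf.c - minimal sketch of timing a BPF program with two
// clock reads. Hypothetical names; not the real cpumap-pping source.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_timed(struct xdp_md *ctx)
{
    __u64 start = bpf_ktime_get_ns();            /* clock check at entry */

    /* ... the program's real work would happen here ... */

    __u64 elapsed = bpf_ktime_get_ns() - start;  /* clock check at exit */
    bpf_printk("xdp path: %llu ns", elapsed);    /* debug log -> trace_pipe */

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

bpf_printk in particular is comparatively expensive (it writes to the kernel trace_pipe), which is part of why the numbers below should be read as an upper bound.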
The results are interesting, and mostly tell me to try a different measurement system. I'm seeing a pretty wide variance. Hammering it with an iperf session and a queue capped at 5 gbit/s: most of the TC timings were 40 nanoseconds - not a packet that requires extra tracking, already in cache, so proceed. When the TCP RTT tracker fired and recorded a performance event, it peaked at 5,900 nanoseconds. So the TC program seems to be adding a worst case of 0.0059 ms to packet times. The XDP side of things is typically in the 300-400 nanosecond range; I saw a handful of worst-case numbers in the 3,400 nanosecond range, so the XDP side is adding at worst about 0.0034 ms. Assuming worst case (and keeping the overhead added by the not-so-great monitoring), we're adding *0.0093 ms* to packet transit time with the BPF programs.

With a much more sedate queue (ceiling 500 mbit/s), I saw much more consistent numbers. The vast majority of XDP timings were in the 75-150 nanosecond range, and TC was a consistent 50-55 nanoseconds when it didn't have an update to perform - peaking very occasionally at 1,500 nanoseconds. Only adding about 0.0015 ms to packet times is pretty good.

It definitely performs best on long streams, probably because the previous lookups are all in cache. This is also making me question the answer I found to "how long does it take to read the clock?" I'd seen ballpark estimates of 53 nanoseconds - but these measurements bracket the code with two clock reads, and some of the TC timings still came in at 40-55 nanoseconds total, so that estimate can't be right for the in-kernel clock read. (I'm *really* not sure how to measure that one - see the rough sketch at the end of this message.)

Again - not a great test (I'll have to learn the perf system to do this properly, which in turn opens up the potential for flame graphs and some proper tracing). Interesting ballpark, though.

On Mon, Oct 31, 2022 at 10:56 AM dan wrote:

> On Sun, Oct 30, 2022 at 8:21 PM Dave Taht via LibreQoS <
> libreqos@lists.bufferbloat.net> wrote:
>
>> How about the idea of "metaverse-ready" metrics, with one table that is
>> preseem-like and another that's
>>
>> blue = < 8ms
>> green = < 20ms
>> yellow = < 50ms
>> orange = < 70ms
>> red = > 70ms
>>
>
> These need to be configurable. There are a lot of WISPs that would have
> everything orange/red. We're considering anything under 100ms good on the
> rural plans. Also keep in mind that if you're tracking latency via pping
> etc, then you need some buffer in there for the internet at large. <70ms
> to Amazon is one thing, they're very well connected, but <70ms to most of
> the internet probably isn't very realistic and would make most charts look
> like poop.
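(An aside on the "how long does it take to read the clock" question above: one rough, entirely hypothetical way to ballpark the in-kernel cost is to do two back-to-back reads in the same kind of sketch and log the gap. This isn't from the thread, just an assumption about how one might measure it.)

// clock_cost_sketch.bpf.c - hypothetical sketch for ballparking the cost
// of an in-kernel clock read: two back-to-back reads, log the difference.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_clock_cost(struct xdp_md *ctx)
{
    __u64 t0 = bpf_ktime_get_ns();
    __u64 t1 = bpf_ktime_get_ns();

    /* t1 - t0 approximates one clock read plus call overhead; averaging
     * many packets (ideally into a per-CPU map rather than bpf_printk,
     * which is itself slow) would give a steadier number. */
    bpf_printk("back-to-back clock reads: %llu ns", t1 - t0);

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";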