From: dan
Date: Mon, 31 Oct 2022 17:31:59 -0600
To: Robert Chacón
Cc: Herbert Wolverson, libreqos
Subject: Re: [LibreQoS] Integration system, aka fun with graph theory

Preseem's numbers are 0-74 green, 75-124 yellow, 125-200 red, and they just
consolidate everything >200 to 200, basically so there's no 'terrible'
color, lol. I think these numbers are reasonable for standard internet
service these days, for a 'default' value anyway. >100 ms isn't bad service
for most people, and most WISPs will have a LOT of traffic coming through
with >100 ms from the far reaches of the internet.

Maybe just use reasonable defaults like Preseem does for the integrated
'generic' tracking, but then have a separate graph hitting some target
services, i.e. try to get game servers on there, AWS, Cloudflare, Azure,
Google Cloud. Show a radar graphic or similar.
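For what it's worth, a minimal sketch of that banding logic (illustrative
only - the function name and sample values are made up, and this isn't
Preseem's or LibreQoS's actual code) could look like:

    /* Bucket a measured RTT into Preseem-like bands:
     * 0-74 ms green, 75-124 yellow, 125-200 red, everything >200 clamped
     * to 200. Hypothetical sketch, not real Preseem/LibreQoS code. */
    #include <stdio.h>

    static const char *rtt_band(double rtt_ms)
    {
        if (rtt_ms > 200.0)
            rtt_ms = 200.0;   /* clamp: no 'terrible' band above red */
        if (rtt_ms < 75.0)
            return "green";
        if (rtt_ms < 125.0)
            return "yellow";
        return "red";
    }

    int main(void)
    {
        double samples[] = { 12.0, 88.0, 140.0, 450.0 };
        for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
            printf("%.0f ms -> %s\n", samples[i], rtt_band(samples[i]));
        return 0;
    }

The clamp at 200 just reproduces the "consolidate everything >200 to 200"
behaviour described above, rather than adding a 'terrible' tier.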
On Mon, Oct 31, 2022 at 3:57 PM Robert Chacón via LibreQoS
<libreqos@lists.bufferbloat.net> wrote:

> > I'd agree with color coding (when it exists - no rush, IMO) being
> > configurable.
>
> Thankfully it will be configurable, and easily, through the InfluxDB
> interface.
> Any operator will be able to click the Gear icon above the tables and set
> the thresholds to whatever is desired.
> I've set it to include both a standard table and a "metaverse-ready" table
> based on Dave's threshold recommendations.
>
> - Standard (Preseem-like)
>   - green = < 75 ms
>   - yellow = < 100 ms
>   - red = > 100 ms
> - Metaverse-Ready
>   - blue = < 8 ms
>   - green = < 20 ms
>   - yellow = < 50 ms
>   - orange = < 70 ms
>   - red = > 70 ms
>
> Are the defaults here reasonable at least? Should we change the Standard
> table thresholds a bit?
>
> > Only adding 0.00155 ms to packet times is pretty good.
>
> Agreed! That's excellent. Great work on this so far; it looks like you're
> making tremendous progress.
>
> On Mon, Oct 31, 2022 at 3:20 PM Herbert Wolverson via LibreQoS
> <libreqos@lists.bufferbloat.net> wrote:
>
>> I'd agree with color coding (when it exists - no rush, IMO) being
>> configurable.
>>
>> From the "how much delay are we adding" discussion earlier, I thought I'd
>> do a little bit of profiling of the BPF programs themselves. This is with
>> the latest round of performance updates
>> (https://github.com/thebracket/cpumap-pping/issues/2), so it's not
>> measuring anything in production. I simply added a call to get the clock
>> at the start, and again at the end, and logged the difference, measuring
>> both the XDP and TC BPF programs. (Execution goes: packet arrives -> XDP
>> cpumap sends it to the right CPU -> egress -> TC sends it to the right
>> classifier, on the correct CPU, and measures RTT latency.) This adds
>> about two clock checks and a debug log entry to execution time, so
>> measuring it is slowing it down.
>>
>> The results are interesting, and mostly tell me to try a different
>> measurement system. I'm seeing a pretty wide variance. Hammering it with
>> an iperf session and a queue capped at 5 gbit/s: most of the TC timings
>> were 40 nanoseconds - not a packet that requires extra tracking, already
>> in cache, so proceed. When the TCP RTT tracker fired and recorded a
>> performance event, it peaked at 5,900 nanoseconds. So the TC program
>> seems to be adding a worst case of 0.0059 ms to packet times. The XDP
>> side of things is typically in the 300-400 nanosecond range; I saw a
>> handful of worst-case numbers in the 3,400 nanosecond range, so the XDP
>> side is adding 0.00349 ms. So - assuming worst case (and keeping the
>> overhead added by the not-so-great monitoring) - we're adding *0.0093 ms*
>> to packet transit time with the BPF programs.
>>
>> With a much more sedate queue (ceiling 500 mbit/s), I saw much more
>> consistent numbers. The vast majority of XDP timings were in the 75-150
>> nanosecond range, and TC was a consistent 50-55 nanoseconds when it
>> didn't have an update to perform - peaking very occasionally at 1,500
>> nanoseconds. Only adding 0.00155 ms to packet times is pretty good.
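A minimal sketch of the self-timing approach described above - read the
clock when the program starts, read it again at the end, and log the
difference - might look like the following. This is illustrative only; it
assumes libbpf, and the program and section names are hypothetical, not the
actual cpumap-pping code:

    /* Hypothetical self-timing sketch for an XDP program (not the real
     * cpumap-pping code): two clock reads plus a debug log entry. */
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("xdp")
    int timed_xdp_prog(struct xdp_md *ctx)
    {
        __u64 start = bpf_ktime_get_ns();           /* clock check #1 */

        /* ... the program's real work would go here:
           classification, cpumap redirect, RTT tracking ... */

        __u64 elapsed = bpf_ktime_get_ns() - start; /* clock check #2 */
        bpf_printk("xdp took %llu ns", elapsed);    /* debug log entry */

        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";

bpf_printk() writes to the kernel trace buffer
(/sys/kernel/debug/tracing/trace_pipe), which is convenient for this kind
of ad-hoc measurement but adds its own overhead, as noted above.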
>> It definitely performs best on long streams, probably because the
>> previous lookups are all in cache. This is also making me question the
>> answer I found to "how long does it take to read the clock?" I'd seen
>> ballpark estimates of 53 nanoseconds. Given that this reads the clock
>> twice, that can't be right. (I'm *really* not sure how to measure that
>> one.)
>>
>> Again - not a great test (I'll have to learn the perf system to do this
>> properly - which in turn opens up the potential for flame graphs and some
>> proper tracing). Interesting ballpark, though.
>>
>> On Mon, Oct 31, 2022 at 10:56 AM dan <dandenson@gmail.com> wrote:
>>
>>> On Sun, Oct 30, 2022 at 8:21 PM Dave Taht via LibreQoS
>>> <libreqos@lists.bufferbloat.net> wrote:
>>>
>>>> How about the idea of "metaverse-ready" metrics, with one table that
>>>> is preseem-like and another that's
>>>>
>>>> blue = < 8 ms
>>>> green = < 20 ms
>>>> yellow = < 50 ms
>>>> orange = < 70 ms
>>>> red = > 70 ms
>>>
>>> These need to be configurable. There are a lot of WISPs that would have
>>> everything orange/red. We're considering anything under 100 ms good on
>>> the rural plans. Also keep in mind that if you're tracking latency via
>>> pping etc., then you need some buffer in there for the internet at
>>> large. <70 ms to Amazon is one thing, they're very well connected, but
>>> <70 ms to most of the internet probably isn't very realistic and would
>>> make most charts look like poop.
>
> --
> Robert Chacón
> CEO | JackRabbit Wireless LLC
> Dev | LibreQoS.io