From: Dave Taht
Date: Mon, 31 Oct 2022 16:45:01 -0700
To: dan
Cc: Robert Chacón, libreqos
Subject: Re: [LibreQoS] Integration system, aka fun with graph theory

On Mon, Oct 31, 2022 at 4:32 PM dan via LibreQoS wrote:
>
> preseem's numbers are 0-74 green, 75-124 yellow, 125-200 red, and
> they just consolidate everything >200 to 200, basically so there's
> no 'terrible' color lol.

I am sorry to hear those numbers are considered to be good. My
numbers are based on human factors research, some of which is cited
here:

https://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/
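For concreteness, the banding dan describes above could be sketched
like this (hypothetical names, not Preseem's actual code):

    /* Map a measured RTT to the Preseem-style color bands described
     * above. Everything over 200ms is reported as 200, so there is
     * no separate "terrible" band. */
    enum band { GREEN, YELLOW, RED };

    static enum band band_for_rtt(double rtt_ms)
    {
        if (rtt_ms < 75.0)
            return GREEN;   /* 0-74 ms */
        if (rtt_ms < 125.0)
            return YELLOW;  /* 75-124 ms */
        return RED;         /* 125-200 ms, clamped above 200 */
    }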
> I think these numbers are reasonable for standard internet service
> these days, for a 'default' value anyway. >100ms isn't bad service
> for most people, and most WISPs will have a LOT of traffic coming
> through with >100ms from the far reaches of the internet.

I'm puzzled, actually. Given the rise of CDNs I would expect most
internet connections to the ISP to have far less than 60ms latency at
this point. Google, for example, is typically 2ms away from most
fiber in the EU. Very few transactions go to the far reaches of the
planet anymore, though I lack real-world data on that.

> Maybe just reasonable defaults like preseem uses for integrated
> 'generic' tracking, but then have a separate graph hitting some
> target services. ie, try to get game servers on there, AWS,
> Cloudflare, Azure, Google Cloud. Show a radar graphic or similar.

My thought for slices of the data (2nd tier support and CTO level)
would be:

ISP infrastructure - aquamarine, less than 3ms
First hop infrastructure - blue, less than 8ms
ISP -> customer - 10-20ms (green) for wired, much worse for wifi
Customer -> world - ideally, sub 50ms

I can certainly agree that the metaverse metrics are scary given the
state of things you describe, but the 8ms figure is the bare minimum
for an acceptable experience in that virtual world.

> On Mon, Oct 31, 2022 at 3:57 PM Robert Chacón via LibreQoS wrote:
>>
>> > I'd agree with color coding (when it exists - no rush, IMO) being configurable.
>>
>> Thankfully it will be configurable, and easily, through the
>> InfluxDB interface. Any operator will be able to click the Gear
>> icon above the tables and set the thresholds to whatever is
>> desired. I've set it to include both a standard table and a
>> "metaverse-ready" table based on Dave's threshold recommendations.
>>
>> Standard (Preseem-like):
>>
>> green = < 75 ms
>> yellow = < 100 ms
>> red = > 100 ms
>>
>> Metaverse-Ready:
>>
>> aquamarine = <= 3ms
>> blue = < 8ms
>> green = < 20ms
>> yellow = < 50ms
>> orange = < 70ms
>> red = > 70ms
>> mordor-red = > 100ms
>>
>> Are the defaults here reasonable at least? Should we change the
>> Standard table thresholds a bit?

Following exactly Preseem's current breakdown seems best for the
"preseem" table. Calling it "standard" kind of requires actual
standards.

>> > Only adding 0.00155 ms to packet times is pretty good.
>>
>> Agreed! That's excellent. Great work on this so far; it's looking
>> like you're making tremendous progress.
>>
>> On Mon, Oct 31, 2022 at 3:20 PM Herbert Wolverson via LibreQoS wrote:
>>>
>>> I'd agree with color coding (when it exists - no rush, IMO) being
>>> configurable.
>>>
>>> From the "how much delay are we adding" discussion earlier, I
>>> thought I'd do a little bit of profiling of the BPF programs
>>> themselves. This is with the latest round of performance updates
>>> (https://github.com/thebracket/cpumap-pping/issues/2), so it's not
>>> measuring anything in production. I simply added a call to get the
>>> clock at the start, and again at the end, and logged the
>>> difference, measuring both the XDP and TC BPF programs. (Execution
>>> goes (packet arrives) -> (XDP cpumap sends it to the right CPU) ->
>>> (egress) -> (TC sends it to the right classifier, on the correct
>>> CPU, and measures RTT latency).) This adds about two clock checks
>>> and a debug log entry to execution time, so measuring it is
>>> slowing it down.
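>>>
>>> Roughly the shape of that instrumentation, as a sketch (assuming
>>> bpf_ktime_get_ns() and bpf_printk() as the clock and log calls -
>>> this is not the actual cpumap-pping patch):
>>>
>>>     #include <linux/bpf.h>
>>>     #include <bpf/bpf_helpers.h>
>>>
>>>     SEC("xdp")
>>>     int xdp_prog_timed(struct xdp_md *ctx)
>>>     {
>>>         /* Read the clock on entry... */
>>>         __u64 start = bpf_ktime_get_ns();
>>>
>>>         /* ...the real program's cpumap dispatch work goes here... */
>>>
>>>         /* ...read the clock again on exit and log the delta. The
>>>          * two clock reads and the printk are themselves overhead,
>>>          * so this overstates the real per-packet cost. */
>>>         bpf_printk("xdp took %llu ns", bpf_ktime_get_ns() - start);
>>>         return XDP_PASS;
>>>     }
>>>
>>>     char _license[] SEC("license") = "GPL";
>>>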
>>> The results are interesting, and mostly tell me to try a different
>>> measurement system. I'm seeing a pretty wide variance. Hammering
>>> it with an iperf session and a queue capped at 5 gbit/s: most of
>>> the TC timings were 40 nanoseconds - not a packet that requires
>>> extra tracking, already in cache, so proceed. When the TCP RTT
>>> tracker fired and recorded a performance event, it peaked at 5,900
>>> nanoseconds. So the TC program seems to be adding a worst case of
>>> 0.0059 ms to packet times. The XDP side of things is typically in
>>> the 300-400 nanosecond range; I saw a handful of worst-case
>>> numbers in the 3,400 nanosecond range, so the XDP side is adding
>>> at most 0.0034 ms. So - assuming worst case (and keeping the
>>> overhead added by the not-so-great monitoring), we're adding
>>> 0.0093 ms to packet transit time with the BPF programs.
>>>
>>> With a much more sedate queue (ceiling 500 mbit/s), I saw much
>>> more consistent numbers. The vast majority of XDP timings were in
>>> the 75-150 nanosecond range, and TC was a consistent 50-55
>>> nanoseconds when it didn't have an update to perform - peaking
>>> very occasionally at 1,500 nanoseconds. Only adding 0.00155 ms to
>>> packet times is pretty good.
>>>
>>> It definitely performs best on long streams, probably because the
>>> previous lookups are all in cache. This is also making me question
>>> the answer I found to "how long does it take to read the clock?" -
>>> I'd seen ballpark estimates of 53 nanoseconds, and given that this
>>> reads the clock twice, that can't be right. (I'm *really* not sure
>>> how to measure that one.)
>>>
>>> Again - not a great test (I'll have to learn the perf system to do
>>> this properly, which in turn opens up the potential for flame
>>> graphs and some proper tracing). Interesting ballpark, though.
>>>
>>> On Mon, Oct 31, 2022 at 10:56 AM dan wrote:
>>>>
>>>> On Sun, Oct 30, 2022 at 8:21 PM Dave Taht via LibreQoS wrote:
>>>>>
>>>>> How about the idea of "metaverse-ready" metrics, with one table
>>>>> that is preseem-like and another that's:
>>>>>
>>>>> blue = < 8ms
>>>>> green = < 20ms
>>>>> yellow = < 50ms
>>>>> orange = < 70ms
>>>>> red = > 70ms
>>>>
>>>> These need to be configurable. There are a lot of WISPs that
>>>> would have everything orange/red. We're considering anything
>>>> under 100ms good on the rural plans. Also keep in mind that if
>>>> you're tracking latency via pping etc., then you need some buffer
>>>> in there for the internet at large. <70ms to Amazon is one thing
>>>> - they're very well connected - but <70ms to most of the internet
>>>> probably isn't very realistic and would make most charts look
>>>> like poop.
>>
>> --
>> Robert Chacón
>> CEO | JackRabbit Wireless LLC
>> Dev | LibreQoS.io

--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz

Dave Täht
CEO, TekLibre, LLC