From: Sebastian Moeller
To: Dave Täht
Cc: rjmcmahon, Rpm, Ruediger.Geib@telekom.de, ippm@ietf.org
Date: Wed, 2 Nov 2022 22:13:10 +0100
Message-Id: <57B43486-BB05-4D38-8EDB-07478C433B31@gmx.de>
Subject: Re: [Rpm] [ippm] lightweight active sensing of bandwidth and buffering
List-Id: revolutions per minute - a new metric for measuring responsiveness

> On Nov 2, 2022, at 20:44, Dave Taht via Rpm wrote:
>
> On Wed, Nov 2, 2022 at 12:29 PM rjmcmahon via Rpm wrote:
>>
>> Most tools measuring bloat ignore the queue build-up phase and instead
>> start taking measurements after the bottleneck queue is in a standing state.
>
> +10. It's the slow start transient that is holding things back.
[SM] From my naive perspective, slow start has a few facets:

a) It is conceptually the right approach: with no reliable prior knowledge of the capacity, the best we can do is probe for it by increasing the sending rate over time (and the current exponential growth is already plenty aggressive*).
b) Since this needs feedback from the remote endpoint, at best we can figure out one RTT later whether we sent at an acceptable rate.
c) We want to go as fast as reasonable,
d) but not any faster ;)
e) The trickiest part is really deciding when to leave slow start's aggressive per-RTT rate-increase regime, so that the >= 1 RTT "blind" spot does not lead to too big a transient queue spike.
f) The slow start phase can be relatively short, so the less averaging we need to do to decide when to drop out of slow start the better, as averaging costs time.

IMHO the best way forward would be to switch from bit-banging the queue state over multiple packets (as in L4S' design) to sending a multi-bit queue occupancy signal per packet. The sender should then be able to figure out the rate of queueing change as a function of sending-rate change and predict a reasonable time to switch to congestion avoidance without having to wait for a drop (assuming the remote end reflects these queue occupancy signals back to the sender in a timely fashion)...

(This is really just transferring the ideas from Arslan, Serhat, and Nick McKeown, 'Switches Know the Exact Amount of Congestion', in Proceedings of the 2019 Workshop on Buffer Sizing, 1-6, 2019, to slow start.)
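To make the idea concrete, here is a minimal sketch of what such a sender-side exit decision could look like, assuming the receiver echoes back a multi-bit queue occupancy (in bytes) once per RTT; the function name, units, and thresholds are illustrative assumptions, not a protocol proposal:

```python
# Hypothetical sketch: exit slow start based on per-packet multi-bit
# queue-occupancy feedback, instead of waiting for loss or a CE mark.

def should_exit_slow_start(samples, queue_limit_bytes, growth_factor=2.0):
    """samples: list of (send_rate_bps, queue_bytes) pairs, one per RTT,
    as reflected by the receiver. Returns True when one more RTT of
    exponential growth is predicted to overflow the bottleneck queue."""
    if len(samples) < 2:
        return False
    (r0, q0), (r1, q1) = samples[-2], samples[-1]
    if r1 == r0:
        # No rate change to learn from; fall back to the raw occupancy.
        return q1 >= queue_limit_bytes
    # Estimated queue growth per unit of added sending rate (the
    # "rate of queueing change as a function of sending rate change").
    dq_drate = (q1 - q0) / (r1 - r0)
    # Predict queue depth one RTT ahead if we keep multiplying the rate.
    predicted_q = q1 + dq_drate * (growth_factor * r1 - r1)
    return predicted_q >= queue_limit_bytes
```

The point of the prediction step is precisely the >= 1 RTT blind spot from e): the sender commits to the next rate before seeing its effect, so it has to extrapolate rather than react.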
*) Two factors to play with are the size of the starting batch (aka the initial window) and the actual factor of increase per RTT; people have started playing with the first, but so far seem reasonable enough not to touch the second ;)

> If we could, for example, open up the 110+ objects and flows web pages
> require all at once, and let 'em rip, instead of 15 at a time, without
> destroying the network, web PLT would get much better.
>
>> My opinion, the best units for bloat are packets for UDP or bytes for
>> TCP. Min delay is a proxy measurement.
>
> bytes, period. bytes = time. Sure, most udp today is small packets, but
> quic and videoconferencing change that.
>
>>
>> Little's law allows one to compute this, though it does assume the network
>> is in a stable state over the measurement interval. In the real world,
>> this is probably rarely true. So we, in test & measurement engineering,
>> force the standing state with some sort of measurement co-traffic and
>> call it "working conditions" or equivalent. ;)
>
> There was an extremely long, nuanced debate about little's law and
> where it applies, last year, here:
>
> https://lists.bufferbloat.net/pipermail/cake/2021-July/005540.html
>
> I don't want to go into it, again.
>
>>
>> Bob
>>> Bob, Sebastian,
>>>
>>> not being active on your topic, just to add what I observed on
>>> congestion:
>>> - it starts with an increase of jitter, but measured minimum delays still
>>> remain constant. Technically, a queue builds up some of the time, but
>>> it isn't present permanently.
>>> - buffer fill reaches a "steady state", called bufferbloat on access I
>>> think; technically, OWD increases also for the minimum delays, and jitter
>>> now decreases (what you've described as "the delay magnitude"
>>> decreases or "minimum CDF shift" respectively, if I'm correct).
>>> I'd expect packet loss to occur once the buffer fill is in steady state,
>>> but loss might be randomly distributed and could be of a low
>>> percentage.
>>> - a sudden, rather long load burst may cause a jump-start to
>>> "steady-state" buffer fill. The above holds for a slow but steady load
>>> increase (where the measurement frequency determines the timescale
>>> qualifying "slow").
>>> - in the end, max-min delay or delay distribution/jitter likely isn't
>>> an easy-to-handle single metric to identify congestion.
>>>
>>> Regards,
>>>
>>> Ruediger
>>>
>>>
>>>> On Nov 2, 2022, at 00:39, rjmcmahon via Rpm
>>>> wrote:
>>>>
>>>> Bufferbloat shifts the minimum of the latency or OWD CDF.
>>>
>>> [SM] Thank you for spelling this out explicitly; I had only worked on a
>>> vague implicit assumption along those lines. However, what I want to
>>> avoid is using delay magnitude itself as a classifier between high- and
>>> low-load conditions, as it seems statistically uncouth to then show
>>> that the delay differs between the two classes ;).
>>> Yet, your comment convinced me that my current load threshold (at
>>> least for the high-load condition) is probably too small, exactly
>>> because the "base" of the high-load CDFs coincides with the base of
>>> the low-load CDFs, implying that the high-load class contains too many
>>> samples with decent delay (which, after all, is one of the goals of the
>>> whole autorate endeavor).
>>>
>>>
>>>> A suggestion is to disable x-axis auto-scaling and start from zero.
>>>
>>> [SM] Will reconsider. I started with starting at zero, and then switched
>>> to an x-range that starts at the delay corresponding to 0.01% for
>>> the reflector/condition with the lowest such value and stops at 97.5%
>>> for the reflector/condition with the highest delay value.
>>> My rationale
>>> is that the base delay/path delay of each reflector is not all that
>>> informative* (and it can still be read off the x-axis);
>>> the long tail > 50%, however, is where I expect most differences, so I
>>> want to emphasize this, and finally I wanted to avoid the actually
>>> "curvy" part getting compressed so much that all lines more or less
>>> coincide. As I said, I will reconsider this.
>>>
>>>
>>> *) We also maintain individual baselines per reflector, so I could
>>> just plot the differences from baseline, but that would essentially
>>> equalize all reflectors, and I think having a plot that easily shows
>>> reflectors with outlying base delay can be informative when selecting
>>> reflector candidates. However, once we actually switch to OWDs, baseline
>>> correction might be required anyway, as due to clock differences ICMP
>>> type 13/14 data can have massive offsets that are mostly indicative of
>>> unsynchronized clocks**.
>>>
>>> **) This is why I would prefer to use NTP servers as reflectors with
>>> NTP requests; my expectation is that all of these should be reasonably
>>> synced by default, so that offsets should be in a sane range...
>>>
>>>
>>>>
>>>> Bob
>>>>> For about 2 years now the cake w-adaptive bandwidth project has been
>>>>> exploring techniques for lightweight sensing of bandwidth and
>>>>> buffering problems. One of my favorites was their discovery that ICMP
>>>>> type 13 got them working OWD from millions of ipv4 devices!
>>>>> They've also explored leveraging ntp and multiple other methods, and
>>>>> have scripts available that do a good job of compensating for 5g and
>>>>> starlink's misbehaviors.
>>>>> They've also pioneered a whole bunch of new graphing techniques,
>>>>> which I do wish were used more than single-number summaries,
>>>>> especially in analyzing the behaviors of new metrics like rpm,
>>>>> samknows, ookla, and
>>>>> RFC9097 - to see what is being missed.
>>>>> There are thousands of posts about this research topic; a new post on
>>>>> OWD just went by here:
>>>>> https://forum.openwrt.org/t/cake-w-adaptive-bandwidth/135379/793
>>>>> and of course, I love flent's enormous graphing toolset for
>>>>> simulating and analyzing complex network behaviors.
>>>> _______________________________________________
>>>> Rpm mailing list
>>>> Rpm@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/rpm
>>>
>>> _______________________________________________
>>> ippm mailing list
>>> ippm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/ippm
>
> --
> This song goes out to all the folk that thought Stadia would work:
> https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
> Dave Täht CEO, TekLibre, LLC