From: Dave Taht
Date: Thu, 13 Oct 2022 07:17:11 -0700
To: Maximilian Bachl
Cc: bloat
Subject: Re: [Bloat] Fair queuing detection for congestion control

On Wed, Oct 12, 2022 at 10:35 AM Maximilian Bachl wrote:
>
> Building upon the ideas and advice I received, I simplified the whole concept and updated the preprint (https://arxiv.org/abs/2206.10561). The new approach is somewhat similar to what you propose in point 3). The true negative rate (correctly detecting the absence of FQ) is now >99%; the true positive rate (correctly detecting the presence of FQ, i.e. fq_codel and fq) is >95%. It can also detect if the bottleneck link changes during a flow from FQ to non-FQ and vice versa.

That is really marvelous detection work, worth leveraging.

> A new concept is that each application can choose its maximum allowed delay independently if there's FQ. A cloud gaming application might choose not to allow more than 5 ms to keep latency minimal, while a video chat application might allow 25 ms to achieve higher throughput. Thus, each application can choose its own tradeoff between throughput and delay. Also, applications can measure how large the base delay is and, if the base delay is very low (because the other host is close by), they can allow more queuing delay. For example, if the base delay between two hosts is just 5 ms, it could be ok to add another 45 ms of queuing to have a combined delay of 50 ms. Because the allowed queuing delay is quite high, throughput is maximized.
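Concretely, the budget arithmetic in the quoted paragraph amounts to something like the sketch below. The function name and the numbers are made up for illustration; this is not the preprint's mechanism.

# Hypothetical sketch of a per-application queuing-delay budget: each
# application either states a queuing cap directly, or derives one from a
# total-delay target once the base (no-queue) path delay has been measured.

def allowed_queuing_delay(max_queuing_s, base_delay_s, total_delay_budget_s=None):
    """Return the standing-queue delay (seconds) this flow will tolerate."""
    if total_delay_budget_s is not None:
        # e.g. a 50 ms total budget on a 5 ms path leaves up to 45 ms of queuing
        return max(0.0, total_delay_budget_s - base_delay_s)
    return max_queuing_s

gaming_budget = allowed_queuing_delay(0.005, base_delay_s=0.020)                   # 5 ms cap
chat_budget   = allowed_queuing_delay(0.025, base_delay_s=0.020)                   # 25 ms cap
lan_budget    = allowed_queuing_delay(0.025, 0.005, total_delay_budget_s=0.050)    # 45 ms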
As promising as this addition is to quic, I have to take umbrage with the idea that "an application can pick the right amount of buffering."

First: the ideal amount of network buffering is... zero. Why would an application want to have excess buffering? There isn't much of a tradeoff between throughput and delay. FQ, nowadays (nearly) everywhere, makes it possible for delay-based transports to "just work". Once FQ is found... an application can quickly probe for the right rate and then just motor along at some rate (well) below that. A VR or AR application, especially, becomes immune to the jitter and latency induced by other flows on the link, and mostly immune to the sudden bandwidth changes you can get from wireless links.

You can probe for more bandwidth periodically via a flow you don't care about.

There's a pretty big knee in the bandwidth curve for wifi, I'll admit (aggregation is responsible for 60% or so of the bandwidth), but even then you only need an extra 5ms... and if your application doesn't need all that bandwidth, it's better to target 0.

Secondly, the AQM in fq_codel and cake aims for a 5ms target. It's presently a bit larger in the wifi implementations (20ms in the field, 8ms in testing), so if you aim for more buffering than that, you will get drops or marks from those algorithms starting 100ms after you consistently exceed the target.

You can (and probably should) be using lossless ECN marks instead, which (if you really want buffering for some reason) will just send an ever-increasing number of marks back to the sender while it exceeds the locally configured target. That is, I guess, a useful signal, and at least it doesn't drop packets.

The circumstances where an application might want more than 5ms of delay from a FQ'd network seem few. It's putting the cart before the hearse.

https://www.linkedin.com/posts/maxiereynolds_capacityeurope-datacenters-subseaconnectivity-activity-6986319233676713984-LwY3?utm_source=share&utm_medium=member_desktop
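Roughly, the "probe, then motor along below the found rate" loop is small enough to sketch. The following is a toy illustration with made-up constants; it is not BBR and not anything shipped in a real stack.

# Toy delay-based sender on an FQ'd bottleneck: keep the rate just below
# the point where the flow starts standing in its own queue, and treat a
# CE mark from the AQM as the same "back off" signal. All names and
# constants here are illustrative assumptions only.

class DelayBasedPacer:
    def __init__(self, allowed_queuing_delay_s=0.0):
        self.base_rtt_s = float("inf")   # lowest RTT seen ~= path delay with no queue
        self.rate_bps = 1_000_000        # start modestly; FQ isolates other flows from us
        self.allowed_s = allowed_queuing_delay_s

    def on_ack(self, rtt_sample_s, ce_marked):
        self.base_rtt_s = min(self.base_rtt_s, rtt_sample_s)
        queuing_s = rtt_sample_s - self.base_rtt_s
        if ce_marked or queuing_s > self.allowed_s:
            self.rate_bps *= 0.9    # we built a queue (or the AQM says we did): back off
        else:
            self.rate_bps *= 1.02   # no standing queue: probe gently for more bandwidth
        return self.rate_bps

Periodic probing for extra bandwidth could then be done on a separate throwaway flow, as suggested above, rather than by pushing this flow's own delay up.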
>
>
> On Sun, Jul 3, 2022 at 4:49 PM Dave Taht wrote:
>>
>> Hey, good start to my Saturday!
>>
>> 1) Apple's fq_"codel" implementation did not actually implement the codel portion of the algorithm when I last checked last year. Doesn't matter what you set the target to.
>>
>> 2) fq_codel has a detectable (IMHO, have not tried) phase where the "sparse flow optimization" allows non-queue-building flows to bypass the queue-building flows entirely. See attached. fq-pie, also. Cake also has this, but with the addition of per-host FQ.
>>
>> However, detecting it requires sending packets on an interval smaller than the codel quantum. Most (all!?) TCP implementations, even the paced ones, send two 1514-byte packets back to back, so you get an ack back on servicing either the first or the second one. Sending individual TCP packets paced, and bunching them up selectively, should also oscillate around the queue width (width = number of queue-building flows; depth = the depth of the queue). The codel quantum defaults to 1514 bytes but is frequently autoscaled to less at low bandwidths.
>>
>> 3) It is also possible (IMHO) to send a small secondary flow isochronously as a "clock" and observe the width and depth of the queue that way.
>>
>> 4) You can use an fq_codel RFC3168-compliant implementation to send back a CE, which is (presently) a fairly reliable signal of fq_codel on the path. A reduction in *pacing* different from the RFC3168 behavior (reduction by half) would be interesting.
>>
>> Thx for this today! A principal observation of the BBR paper was that you cannot measure for latency and bandwidth *at the same time* in a single flow, and you showing, in a FQ'd environment, that you can is something I don't remember seeing elsewhere (but I'm sure someone will correct me).
>>
>> On Sun, Jul 3, 2022 at 7:16 AM Maximilian Bachl via Bloat wrote:
>> >
>> > Hi Sebastian,
>> >
>> > Thank you for your suggestions.
>> >
>> > Regarding
>> > a) I slightly modified the algorithm to make it work better with the small 5 ms threshold. I updated the paper on arXiv; it should be online by Tuesday morning Central European Time. Detection accuracy for Linux's fq_codel is quite high (high 90s), but it doesn't work that well with small bandwidths (<=10 Mbit/s).
>> > b) That's a good suggestion. I'm thinking about how to do it best, since every experiment with every RTT/bandwidth was also repeated, and I'm not sure how to make a CDF that includes the RTTs/bandwidths and the repetitions.
>> > c) I guess for every experiment with pfifo the resulting accuracy is a true negative rate, while for every experiment with fq* the resulting accuracy is a true positive rate. I updated the paper to include these terms to make it clearer. Summarizing, the true negative rate is 100%, the true positive rate for fq is >=95%, and for fq_codel it's also in that range except for low bandwidths.
>> >
>> > In case you're interested in reliable FQ detection but not in the combination of FQ detection and congestion control, I co-authored another paper which uses a different FQ detection method that is more robust but has the disadvantage of causing packet loss (Detecting Fair Queuing for Better Congestion Control, https://arxiv.org/abs/2010.08362).
>> >
>> > Regards,
>> > Max
>> > _______________________________________________
>> > Bloat mailing list
>> > Bloat@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/bloat
>>
>>
>> --
>> FQ World Domination pending: https://blog.cerowrt.org/post/state_of_fq_codel/
>> Dave Täht CEO, TekLibre, LLC

--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC
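For anyone who wants to poke at point 3) of the quoted mail, the "clock flow" comparison can be sketched as below. The threshold and function name are illustrative assumptions; this is not the detector from either arXiv paper.

# Rough sketch of the clock-flow idea: run a bulk flow next to a small
# isochronous "clock" flow and compare how much queuing delay each sees.
# Under FQ the sparse clock flow keeps near-base delay even while the bulk
# flow stands in its own queue; behind a single FIFO both delays grow together.

def looks_like_fq(bulk_rtts_s, clock_rtts_s, margin_s=0.005):
    """Return True if the delay pattern suggests per-flow queuing at the bottleneck."""
    bulk_queuing = sum(bulk_rtts_s) / len(bulk_rtts_s) - min(bulk_rtts_s)
    clock_queuing = sum(clock_rtts_s) / len(clock_rtts_s) - min(clock_rtts_s)
    # FQ signature: the bulk flow queues noticeably, the clock flow barely does.
    return bulk_queuing > margin_s and clock_queuing < margin_s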