From: Dave Taht
Date: Sat, 26 Feb 2022 12:13:26 -0500
To: Bob Briscoe
Cc: "De Schepper, Koen (Nokia - BE/Antwerp)", tsvwg IETF list, codel@lists.bufferbloat.net
Subject: Re: [Codel] FQ-CoDel response to unresponsive traffic (was: Related to "Non-L4S traffic abusing the L-queue" discussion during the interim)
List-Id: CoDel AQM discussions

At one level you are interpreting an observed behavior as "tail drop",
which may well be possible somewhere in the stack, but it's not clear
whether you were running a post-2016 kernel, which is what added the
drop_batch facility:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=9d18562a227

This drops from the head, not the tail.

I was not satisfied with this solution, btw, and in a later patch added
an increment to the codel count in drop_batch so as to pass "bad things
are happening elsewhere" back over to the main portion of the algorithm.
I'm still very unsatisfied with the concept of a fixed, user-configurable
drop_batch length rather than something that autotunes.

Elsewhere, in the fq_codel_fast repo, I experimented with eliminating the
queue search. Whether it is better to accept that small but constant CPU
overhead in order to optimize for what is perceived to be (and may not
be!) a rarely hit condition, or to accept the cost of the search when it
happens, remains to be seen.
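
A minimal sketch of the overflow behavior described above, in Python
rather than the kernel's C. It only models the idea: find the flow with
the largest backlog, drop a batch of packets from its head, and add the
number dropped to that flow's codel count. The names Flow, drop_batch and
codel_count, and the batch size of 64, are illustrative assumptions, not
the kernel code, which differs in detail.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Flow:
    """One flow-queue: packet sizes in bytes, oldest packet at the left (head)."""
    packets: deque = field(default_factory=deque)
    codel_count: int = 0  # hypothetical stand-in for CoDel's per-flow drop count

    @property
    def backlog(self) -> int:
        return sum(self.packets)


def drop_batch(flows, batch_size=64):
    """On overflow, drop up to batch_size packets from the head of the fattest flow."""
    fattest = max(flows, key=lambda f: f.backlog)  # the queue search
    dropped = 0
    while fattest.packets and dropped < batch_size:
        fattest.packets.popleft()  # head drop: the oldest packet goes first
        dropped += 1
    # Pass "bad things are happening elsewhere" back to the CoDel state.
    fattest.codel_count += dropped
    return fattest, dropped


# One fat unresponsive-looking flow and one thin flow.
flows = [Flow(deque([1500] * 100)), Flow(deque([100] * 5))]
victim, n = drop_batch(flows, batch_size=64)
print(n, len(victim.packets), victim.codel_count)
```

The thin flow is untouched; only the flow with the largest backlog loses
packets, and it loses them from the front of its queue.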

So, while trying to disregard your conclusion that this was tail drop, I
am happy that you have clearly identified (with a kernel version) and
described a test (yay!) that tickles a count caching problem, and
proposed some solutions here:

https://bobbriscoe.net/projects/latency/CoDel-delta-bug.pdf

cc-ing the codel list.

On Sat, Feb 26, 2022 at 8:45 AM Bob Briscoe wrote:
>
> Dave,
>
> I will keep reminding everyone that this shift of topic to FQ-CoDel is
> distracting from the task at hand:
> "Is Jonathan going to confirm that his 'throughput bonus' and 'fast
> lane' accusations against DualQ are baseless because his experiment was
> broken?"
>
> Nonetheless, response on FQ-CoDel is below, tagged [BB]...
>
> On 25/02/2022 21:06, Dave Taht wrote:
> > while I do not want to spend much time nitpicking this document...
> >
> > "causing most of the time tail-drop" stood out. codel, fq_codel, cake
> > all do head drop, and always have.
>
> [BB] For the list, we're talking about Figure 5 here:
> https://l4steam.github.io/overload-results/
>
> I'm nearly certain that the cap at 600 ms is tail drop.
> Cause: The control law increases head drop so slowly that the flow-queue
> containing the unresponsive flow eventually fills the buffer allocated
> to the whole qdisc. Then I believe it moves into what Jonathan calls
> 'tallest sunflower' drop mode (tail drop focused on the longest flow-queue).
>
> To help prove this, here's an experiment Asad ran for me last Oct on
> FQ-CoDel with an unresponsive flow rate just greater than the link rate.
> https://bobbriscoe.net/projects/latency/CoDel-delta-bug.pdf#page=4
> We were testing very slight overload, so it would stay in head drop
> mode, without hitting the need for tail drop. The plot shows a similar
> series of humps in the queue, but without the cut-off due to tail drop.
> So it's fairly conclusive that Koen's Fig 5 is showing tail drop.
>
> I'll answer your question (on the SANE list) about why the humps repeat,
> but that's a trivial bug compared to the time CoDel takes in the first
> place.
> It's a design flaw, not a bug.
> The so-called 'control' law never even measures the queue it is meant to
> be controlling.
> Here's some history:
>
> * On 12-Nov-2013 I reported that to Kathie and Van as CoDel designers,
>   cc the AQM list:
>   https://mailarchive.ietf.org/arch/msg/aqm/l4H1QdRl8B-E5FWpJh4w50B_nQE/
> * No response by anyone for over 18 months, until...
> * 07-Jun-2015: Toke confirmed my analysis empirically (see it, via same
>   thread above)
>   Toke's plot:
>   https://kau.toke.dk/ietf/codel-drop-rate/codel-drop-rate.svg
> * On 30-Sep-2015 you (DaveT) said "cake uses a better curve for CoDel
>   but we still need to do more testing in the lab"
>   As far as I understand it, that missed the point: CAKE's curve is
>   still extremely slow, but somewhat faster than CoDel.
>   But, CAKE's control law still never measures the queue it is meant
>   to be controlling.
> * 25-Feb-2022: You say you don't want to spend much time nitpicking
>   Koen's experiment.
>   If not you, someone needs to grasp this nettle, given FQ-CoDel is
>   the default qdisc in the Linux mainline.
>
>
>
> Bob
>
> --
> ________________________________________________________________
> Bob Briscoe http://bobbriscoe.net/
>

-- 
I tried to build a better future, a few times:
https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org

Dave Täht CEO, TekLibre, LLC
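
For readers following the quoted point that the control law ramps up head
drop slowly, here is a rough sketch of the arithmetic. It assumes the
CoDel rule that, while dropping, the next drop is scheduled
interval/sqrt(count) after the previous one, with the usual 100 ms
interval. It also assumes the queue never falls below target, so it
ignores entering and leaving the dropping state and any reuse of count.
The excess rates of 100 and 1000 packets/s are hypothetical inputs, not
figures from either experiment.

```python
import math

INTERVAL = 0.100  # seconds; CoDel's usual default interval (assumed here)


def time_to_reach_drop_rate(target_drops_per_s, interval=INTERVAL):
    """Step through successive drops spaced interval/sqrt(count) apart.

    Returns (elapsed seconds, count) at the first point where the
    instantaneous drop rate sqrt(count)/interval reaches the target.
    """
    t, count = 0.0, 1
    while math.sqrt(count) / interval < target_drops_per_s:
        t += interval / math.sqrt(count)  # gap until the next drop
        count += 1
    return t, count


# Hypothetical amounts by which an unresponsive flow exceeds the link rate.
for excess in (100, 1000):
    t, count = time_to_reach_drop_rate(excess)
    print(f"{excess} pkt/s excess: about {t:.1f} s, count={count}")
```

Under these assumptions the drop rate grows roughly linearly with time,
at about 1/(2 * interval^2) drops/s per second, so the time needed scales
with the size of the overload.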