From: Dave Taht
To: Jonathan Morton
Cc: Kevin Darbyshire-Bryant, moeller0, cake@lists.bufferbloat.net
Date: Wed, 23 Dec 2015 15:58:26 +0100
Subject: Re: [Cake] second system syndrome
List-Id: Cake - FQ_codel the next generation

On Wed, Dec 23, 2015 at 2:06 PM, Jonathan Morton wrote:
>
>> On 23 Dec, 2015, at 14:41, Dave Taht wrote:
>>
>> Are you actually testing your codel changes at longer RTTs?
>
> The latest set has, so far, only been tested on an ordinary Internet link. It produced an immediately noticeable improvement on ingress there, implying that it’s controlling the upstream queue better.
>
> I’m reasonably confident, however, that it’ll stand up to more varied tests as well. Fundamentally, it returns to the standard Codel trigger mechanism under most circumstances, since that’s been shown to work well.
>
> It also adds a new wrinkle that only appears when the queue is growing very quickly, to trigger the signalling early when it’s abundantly clear that it *will* inevitably trigger later. I think this will particularly help cope with TCP slow-start or, on a long-RTT path, with RTT-independent TCPs like CUBIC.
>
> Feel free to run your own tests. I need to sleep.

The testbed is closed for the holidays, and unless Toke logs in, lab testing will not resume until Jan 12th. I might do a bit today on your latest code, but I am mostly trying to finish the server moves before taking off for the holidays myself.

It does strike me that you are perhaps overly concerned about slow start behavior in general. The load spike exhibited by many of the "rrul"-derived tests in the flent suite is an artifact of the side effects of starting too many flows at almost exactly the same time. In a normal congested scenario we would have a saturated link, with stabilized codel values for a set of flows, and short flows coming and going on a regular basis.

We have a few tests that do something saner, like the tcp_2up_delay test, as well as the web tests, which take some setup to get running, as does the rrul_voip test.
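
To put a rough number on the synchronized-start point, here is a toy model. It is not flent or netperf code, and every name and parameter in it (`flow_window`, `offered_load`, `max_step`, the initial window of 10 packets, the cap of 160) is a hypothetical chosen for illustration. It assumes idealized slow start (the window doubles once per RTT), no loss or ECN feedback, and a fixed per-flow cap standing in for whatever eventually ends slow start. It compares the largest single-RTT jump in aggregate offered load for four flows started together against the same four flows started five RTTs apart.

```python
def flow_window(age, initial_window=10, cap=160):
    """Window (packets per RTT) of one flow, `age` RTTs after it started.

    Assumption: pure slow start, doubling each RTT until a fixed cap.
    A flow that has not started yet (age < 0) sends nothing.
    """
    if age < 0:
        return 0
    return min(cap, initial_window * 2 ** age)


def offered_load(start_times, ticks, initial_window=10, cap=160):
    """Aggregate packets offered per RTT tick by all flows."""
    return [
        sum(flow_window(t - s, initial_window, cap) for s in start_times)
        for t in range(ticks)
    ]


def max_step(load):
    """Largest increase in offered load between two consecutive RTTs."""
    return max(b - a for a, b in zip(load, load[1:]))


# Four flows started at the same instant (roughly what an rrul-style test does)
simultaneous = offered_load([0, 0, 0, 0], ticks=25)
# The same four flows, each starting after the previous one left slow start
staggered = offered_load([0, 5, 10, 15], ticks=25)

print("simultaneous: max per-RTT jump =", max_step(simultaneous))
print("staggered:    max per-RTT jump =", max_step(staggered))
```

Both cases end at the same aggregate load; only the steepness of the ramp differs. In this model the synchronized start produces a jump four times larger than the staggered one, which is the kind of spike an AQM rarely sees when short flows come and go on an already saturated link.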

Some of the lab results are colored by using an older version of netperf which, after a loss, does not restart the udp flows for 250ms - the voip tests are a better indicator of what loss is like for isochronous flows, and honestly I wish we used isochronous rather than ping-based tests for all of the rrul stuff.

Certainly there are issues with doing the massive overload tests like the 50down one, notably with admission control (some tests simply can't start with that much congestion), and with ecn (ecn clogs up the pipe way worse and makes it even harder to start a new flow as the various aqm algorithms scale up to a higher "drop" rate than desirable - ecn has mass, I've always said).

I put out the list of existing flent servers earlier (which are a bit underconfigured), in the hope that more people would do measurements rather than go by feel. Also of use are the new queue depth things which Toke and I put into flent over the last month or so.

> - Jonathan Morton