Date: Thu, 03 Oct 2019 13:52:15 -0400
From: "Justin Kilpatrick" <justin@althea.net>
To: "Jonathan Morton"
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] Fighting bloat in the face of uncertainty

I've developed a rough version of this and put it into production on
Monday. After a few tweaks we're seeing a ~10x reduction in the
magnitude of latency spikes at high-usage times.
https://github.com/althea-net/althea_rs/blob/master/rita/src/rita_common/network_monitor/mod.rs#L288

The average and standard deviation of latency to a given neighbor are
scraped from Babel, and when the standard deviation exceeds 10x the
average we reduce the throughput of the connection by 20% (a rough
sketch of this rule is below).

It's not theoretically sound yet, because I still need to expose
one-way latency in Babel rather than only round-trip time. Bloat
caused by the other side of the link currently causes connections to
be reduced all the way down to the throughput minimum unnecessarily.

It would also be advantageous to observe the throughput we've recorded
over the last 5 seconds and apply a threshold there (also sketched
below): rather than doing any probing ourselves, we can simply check
whether the user was actually saturating the connection or whether the
spike was a transient radio problem.

If anyone else is interested in using this, I can split it off from
our application into a standalone (if somewhat bulky) binary without
much trouble.
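Here's a minimal sketch of that detection rule, for anyone who wants
the gist without reading through rita. All the names here
(NeighborStats, Shaper, and so on) are made up for illustration; the
real code is at the GitHub link above and takes its samples from
Babel.

/// Rolling latency samples for one neighbor, as reported by Babel.
struct NeighborStats {
    /// Recent round-trip latency samples, in milliseconds.
    latency_samples: Vec<f64>,
}

impl NeighborStats {
    fn mean(&self) -> f64 {
        self.latency_samples.iter().sum::<f64>() / self.latency_samples.len() as f64
    }

    fn std_dev(&self) -> f64 {
        let mean = self.mean();
        let variance = self
            .latency_samples
            .iter()
            .map(|s| (s - mean).powi(2))
            .sum::<f64>()
            / self.latency_samples.len() as f64;
        variance.sqrt()
    }
}

/// Hypothetical handle to the shaper on one link.
struct Shaper {
    /// Currently configured throughput, in kbit/s.
    throughput_kbps: u32,
    /// Floor we never shape below.
    minimum_kbps: u32,
}

impl Shaper {
    /// Back off by 20%, never going below the configured floor.
    fn reduce_by_20_percent(&mut self) {
        let reduced = (self.throughput_kbps as f64 * 0.8) as u32;
        self.throughput_kbps = reduced.max(self.minimum_kbps);
    }
}

/// The core rule: if the standard deviation of latency to a neighbor
/// exceeds 10x its average, treat the link as bloated and back off.
fn check_for_bloat(stats: &NeighborStats, shaper: &mut Shaper) {
    if stats.latency_samples.is_empty() {
        return;
    }
    if stats.std_dev() > 10.0 * stats.mean() {
        shaper.reduce_by_20_percent();
    }
}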
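The 5-second throughput check would slot in before that back-off,
along these lines. The window length and the 80% threshold are
illustrative guesses, not values we've measured:

/// Decide whether a latency spike is likely bufferbloat. If the user
/// was pushing the link near its shaped rate, back off; if the link
/// was mostly idle, the spike was probably a transient radio problem
/// and the shaper should be left alone.
fn spike_is_probably_bloat(
    recent_throughput_kbps: &[u32], // one sample per second, last 5 seconds
    shaped_rate_kbps: u32,
) -> bool {
    let peak = recent_throughput_kbps.iter().copied().max().unwrap_or(0);
    peak as f64 >= shaped_rate_kbps as f64 * 0.8
}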
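And for the explicit bandwidth probe you describe below, the sending
side might look something like this. The packet count, size, and 10ms
spacing come straight from your numbers (100 x 1500 bytes over one
second is 1.2 Mbit, hence a bit over 1 Mbps and on the order of 10ms
of added queue on a saturated 100Mbps link); the address handling is a
placeholder, and the receiving side, which would timestamp each packet
and count the other traffic arriving between them, is left out.

use std::net::UdpSocket;
use std::time::{Duration, Instant};

/// Send 100 packets of 1500 bytes, paced evenly over one second
/// (~1.2 Mbps), bypassing the shaper. `target` is a placeholder
/// "host:port" string.
fn send_probe(target: &str) -> std::io::Result<()> {
    const PACKETS: u32 = 100;
    const PACKET_SIZE: usize = 1500;
    let interval = Duration::from_millis(10); // 100 packets over 1 second

    let socket = UdpSocket::bind("0.0.0.0:0")?;
    let payload = [0u8; PACKET_SIZE];

    let start = Instant::now();
    for i in 0..PACKETS {
        socket.send_to(&payload, target)?;
        // Sleep until the next scheduled send time, so pacing stays
        // even regardless of how long send_to itself took.
        let next = interval * (i + 1);
        if let Some(remaining) = next.checked_sub(start.elapsed()) {
            std::thread::sleep(remaining);
        }
    }
    Ok(())
}

If both the per-packet delays and the concurrent goodput look
favourable, the shaper can be opened by about 1 Mbps and another probe
scheduled, as you suggest.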
--
Justin Kilpatrick
justin@althea.net

On Sun, Sep 8, 2019, at 1:27 PM, Jonathan Morton wrote:
> >> You could also set it back to 'internet' and progressively reduce
> >> the bandwidth parameter, making the Cake shaper into the actual
> >> bottleneck. This is the correct fix for the problem, and you
> >> should notice an instant improvement as soon as the bandwidth
> >> parameter is correct.
> >
> > Hand tuning this one link is not a problem. I'm searching for a set
> > of settings that will provide generally good performance across a
> > wide range of devices, links, and situations.
> >
> > From what you've indicated so far, there's nothing as effective as
> > a correct bandwidth estimate if we consider the antenna (link) a
> > black box. Expecting the user to input expected throughput for
> > every link, and then managing that information, is essentially a
> > non-starter.
> >
> > Radio tuning provides some improvement, but until Ubiquiti starts
> > shipping with Codel on non-router devices I don't think there's a
> > good solution here.
> >
> > Any way to have the receiving device detect bloat and insert an
> > ECN mark?
>
> That's what the qdisc itself is supposed to do.
>
> > I don't think the time spent in the intermediate device is
> > detectable at the kernel level, but we keep track of latency for
> > routing decisions and could detect bloat with some accuracy; the
> > problem is how to respond.
>
> As long as you can detect which link the bloat is on (and in which
> direction), you can respond by reducing the bandwidth parameter on
> that half-link by a small amount. Since you have a cooperating
> network, maintaining a time standard on each node sufficient to
> observe one-way delays seems feasible, as is establishing a normal
> baseline latency for each link.
>
> The characteristics of the bandwidth parameter being too high are
> easy to observe. Not only will the one-way delay go up, but the
> received throughput in the same direction at the same time will be
> lower than configured. You might use the latter as a hint as to how
> far you need to reduce the shaped bandwidth.
>
> Deciding when and by how much to *increase* bandwidth, which is
> presumably desirable when link conditions improve, is a more
> difficult problem when the link hardware doesn't cooperate by
> informing you of its status. (This is something you could reasonably
> ask Ubiquiti to address.)
>
> I would assume that link characteristics will change slowly, and run
> an occasional explicit bandwidth probe to see if spare bandwidth is
> available. If that probe comes through without exhibiting bloat,
> *and* the link is otherwise loaded to capacity, then increase the
> shaper by an amount within the probe's capacity of measurement - and
> schedule a repeat.
>
> A suitable probe might be 100x 1500-byte packets paced out over a
> second, bypassing the shaper. This will occupy just over 1Mbps of
> bandwidth, and can be expected to induce 10ms of delay if injected
> into a saturated 100Mbps link. Observe the delay experienced by each
> packet *and* the quantity of other traffic that appears between them.
> Only if both are favourable can you safely open the shaper, by 1Mbps.
>
> Since wireless links can be expected to change their capacity over
> time, due to e.g. weather and tree growth, this seems to be more
> generally useful than a static guess. You could deploy a new link
> with a conservative "guess" of say 10Mbps, and just probe from there.
>
> - Jonathan Morton