From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <chromatix99@gmail.com>
Received: from mail-la0-x236.google.com (mail-la0-x236.google.com
	[IPv6:2a00:1450:4010:c03::236])
	(using TLSv1 with cipher RC4-SHA (128/128 bits))
	(Client CN "smtp.gmail.com",
	Issuer "Google Internet Authority G2" (verified OK))
	by huchra.bufferbloat.net (Postfix) with ESMTPS id 7466021F224
	for <codel@lists.bufferbloat.net>; Mon, 30 Mar 2015 10:28:18 -0700 (PDT)
Received: by lagg8 with SMTP id g8so109514839lag.1
	for <codel@lists.bufferbloat.net>; Mon, 30 Mar 2015 10:28:15 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=content-type:mime-version:subject:from:in-reply-to:date:cc
	:content-transfer-encoding:message-id:references:to;
	bh=8UAxDQ5IQXBypRYWh0YQ6dfPG7RbG3uggICa43hieks=;
	b=jDCfvN1Zk45TH0jzPkkLZE1ArmXzSp/5jjTTEkLNG5huDRn9JTZ1OiHa5w7mN7UJhS
	aawFnNq5BTTZgIDMqUndLhcDYovTVQpWrjn6u2Et6WqXpP5RcaICIW7h0xkdDxkaQ5kz
	LYNk8uYEgIYB+1QD/YvqswEsZU4t6qxrWgaNObP8rgpyd0O1Rcr48RmtgcdR69keswla
	MqGkFgWO3LrW9oiFHH0f00FBPD3KJgFnHClUOCa6iwoYEexzUZyTS/76Z83mJzAs3Ayw
	YiJeMui5qzsdbZ5u2DRL3RRQstuzMrgrXw9/yNafMNoRvY2Ph829scoo4oUEHrpPCu0d
	ugmg==
X-Received: by 10.152.45.37 with SMTP id j5mr27883636lam.31.1427736495661;
	Mon, 30 Mar 2015 10:28:15 -0700 (PDT)
Received: from bass.home.chromatix.fi (87-95-163-98.bb.dnainternet.fi.
	[87.95.163.98])
	by mx.google.com with ESMTPSA id dz1sm2140312lbc.47.2015.03.30.10.28.12
	(version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
	Mon, 30 Mar 2015 10:28:13 -0700 (PDT)
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Jonathan Morton <chromatix99@gmail.com>
In-Reply-To: <51665FD7-629A-4B8C-B258-2E9AF8E3B5D0@gmx.de>
Date: Mon, 30 Mar 2015 20:28:11 +0300
Content-Transfer-Encoding: quoted-printable
Message-Id: <D9E4FCCD-3281-4556-8402-90D9A828BE30@gmail.com>
References: <7081A75C-899A-4DB7-8D77-935A37B362D8@gmail.com>
	<5509957B.30600@pollere.com>
	<491C7497-BE3E-452B-A797-C7FC1102E7ED@gmail.com>
	<CAA93jw7JVjvj9QYCLYT+UHOLDPSmdYB4uAp8wf+B=TL9pOrxrA@mail.gmail.com>
	<CAA93jw4PZ2ARfbeVhoGJTXodu1qy+q11W-Pug3P5OLd=feTzkg@mail.gmail.com>
	<B6540AAC-023D-433B-92B6-B435AA165874@gmx.de>
	<750FA673-E20D-4D48-9386-097D32CD31FB@gmail.com>
	<51665FD7-629A-4B8C-B258-2E9AF8E3B5D0@gmx.de>
To: Sebastian Moeller <moeller0@gmx.de>
X-Mailer: Apple Mail (2.2070.6)
Cc: "codel@lists.bufferbloat.net" <codel@lists.bufferbloat.net>
Subject: Re: [Codel] The next slice of cake
X-BeenThere: codel@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: CoDel AQM discussions <codel.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/codel>,
	<mailto:codel-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/codel>
List-Post: <mailto:codel@lists.bufferbloat.net>
List-Help: <mailto:codel-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/codel>,
	<mailto:codel-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Mon, 30 Mar 2015 17:28:47 -0000

I made a lot of progress over the weekend, and got cake3 running on a =
sacrificial box today.  No crashes yet!

The set-associative hashing is in, currently hard-coded to 8-way, which =
should be more than adequate for typical home workloads (especially if =
BitTorrent is marked as background).  I=92m now adding statistics =
outputs so that we can all see how well it works under real loads, =
distinguishing between direct hits (the fast path, where the hash points =
directly to a correctly allocated queue), indirect hits (where a search =
through the set locates an allocated queue), misses (where an empty =
queue was allocated) and collisions (where all the queues were already =
occupied by other flows).  Note that =93allocation=94 in this case =
simply means updating the tag on the queue to match the packet, not =
allocating memory.  These statistics should help with tuning the number =
of flow queues in each class.

The new Diffserv logic is also in and apparently working well.  I=92m =
able to observe both the priority-balanced and bandwidth-balanced =
behaviours under appropriate circumstances.  It is, however, fairly =
sensitive to the baseline quantum value from which the outer DRR weights =
are derived.  Setting it to the MTU lets the high-priority classes burst =
too much, temporarily starving the lower classes.  It seems to work a =
lot better using 256 as a baseline, but I hope the overhead of going =
around the DRR loop several times per packet isn=92t too high.  It might =
be sane to vary the baseline quantum based on the bandwidth setting, but =
we can experiment with that later.

> We keep track of how many bytes of system memory are consumed by a =
packet in 'skb->truesize'.

The buffer-occupancy queue limit is in, using the field quoted above.  =
Previously, it would just count packets.  As with previous versions of =
cake (and fq_codel), it checks the limit immediately after enqueuing =
each packet, and drops one packet from the head of the longest queue (by =
byte count) if it=92s violated.

Calculating the appropriate size to use here was an interesting problem. =
 I start by estimating the BDP (from the shaper bandwidth and the Codel =
interval), quadrupling that and enforcing a lower limit of 64KB.  If =
it=92s set to =93unlimited bandwidth=94, I set the size to an arbitrary =
1MB.  Then I look at the packet limit, multiply that by the MTU and =
clamp the resulting size down to that - providing a straightforward =
mechanism to set a limit on memory usage.

This all comes with a *fifth* almost-copy of codel.h, which incorporates =
the more efficient dequeue callback from codel4.h, but puts back the =
separate treatment of interval and target.  Cake auto-tunes both of them =
to related values, but the relationship isn=92t as simple as scaling one =
from the other, and I=92m not comfortable with the mental contortions =
involved in the new reasoning.

I also added a lookup table for the first few inverse square roots, =
because I found (using a spreadsheet) that these are very inaccurate if =
you only do one Newton iteration on them - something like 1.0, 0.500, =
0.563, 0.488 instead of 1.0, 0.707, 0.577, 0.500.  Nebulous problems =
with =93insufficiently tight control=94 might possibly be traced to =
this.  With the first fifteen entries calculated up-front using four =
iterations and stored, accuracy is vastly improved and, in practice, =
almost all square-root calculations will now reduce to a simple table =
lookup.  The memory cost is trivial, and Newton is good enough for =
exceptionally large drop counts.

Along with the hashing statistics, I=92m also adding three measures of =
latency to replace the one from the first version.  This no longer =
relies on data provided by Codel (so there=92s no space penalty per flow =
for collecting it, and the inner loop in the stats dump is also =
eliminated), but performs the necessary calculation just after dequeuing =
each packet, storing the aggregated values per class instead of per =
flow.  Each measure is an EWMA (using optimised weights which reduce to =
shifts and adds), but two of them are biased - one towards high =
latencies (so it=92ll mostly track the fattest flows), and one towards =
low latencies (so it=92ll mostly track sparse flows).

Still lots of little things to do=85

 - Jonathan Morton=