From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lb0-x232.google.com (mail-lb0-x232.google.com [IPv6:2a00:1450:4010:c04::232]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by huchra.bufferbloat.net (Postfix) with ESMTPS id 3911921F205 for ; Mon, 30 Mar 2015 20:23:03 -0700 (PDT) Received: by lbdc10 with SMTP id c10so3064164lbd.2 for ; Mon, 30 Mar 2015 20:23:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=content-type:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=qlCG9wb+FU/2r/XT2gZHlVM/jW+ma+4wSFCNw8cZjNA=; b=jnLaRq6Mqj2Nt45Aquk1vvdcGRDd7IGX5mXKeeBmQkMKjJnErfxuhSBI5iLjx54JnW oFvQUXlD2m4+98v6f5A6F2KH0y+idVvrj+pF056s9fq/DmbU/TZV2vwnfEZTicRmqoMX 0P7cov+sx9JQ4Ousr0R5gzYG9XRg9LW4rpOmJAd+muhJQj4kvw03Dah3HmhXP2HxMen2 SajXRDfDWgTu9yslc4cCjsi7tfrhPbU1yEW6v4Yw3ZPriFcqn6L2vM2LE0Wi6yKiyQDS 5BMFqLfDLNtjJUFtB9zl6JYcK27L46FHUXSNIZuJ143sMWdulXF5q+noMaGZuNKYLmoC rpXg== X-Received: by 10.112.132.35 with SMTP id or3mr17323589lbb.85.1427772181801; Mon, 30 Mar 2015 20:23:01 -0700 (PDT) Received: from bass.home.chromatix.fi (188-67-15-150.bb.dnainternet.fi. [188.67.15.150]) by mx.google.com with ESMTPSA id h10sm2359122laa.41.2015.03.30.20.22.59 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 30 Mar 2015 20:23:00 -0700 (PDT) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) From: Jonathan Morton In-Reply-To: Date: Tue, 31 Mar 2015 06:22:58 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: <8753FC87-8C86-4EFE-A4B6-6EDB82AE5D26@gmail.com> References: <7081A75C-899A-4DB7-8D77-935A37B362D8@gmail.com> <5509957B.30600@pollere.com> <491C7497-BE3E-452B-A797-C7FC1102E7ED@gmail.com> <750FA673-E20D-4D48-9386-097D32CD31FB@gmail.com> <51665FD7-629A-4B8C-B258-2E9AF8E3B5D0@gmx.de> To: Sebastian Moeller X-Mailer: Apple Mail (2.2070.6) Cc: "codel@lists.bufferbloat.net" Subject: Re: [Codel] The next slice of cake X-BeenThere: codel@lists.bufferbloat.net X-Mailman-Version: 2.1.13 Precedence: list List-Id: CoDel AQM discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Mar 2015 03:23:32 -0000 > On 30 Mar, 2015, at 21:23, Sebastian Moeller wrote: >=20 >>> We keep track of how many bytes of system memory are consumed by a = packet in 'skb->truesize'. >>=20 >> The buffer-occupancy queue limit is in, using the field quoted above. = Previously, it would just count packets. As with previous versions of = cake (and fq_codel), it checks the limit immediately after enqueuing = each packet, and drops one packet from the head of the longest queue (by = byte count) if it=92s violated. >=20 > Does this mean the largest size the queue actually occupies is limit = plus the last packet=92s true size temporarily? What about the case a = full MTU packet comes in while the eligible packet at the head of the = longest queue is say 64 byte? Yes - I could make it capable of dropping multiple packets to cope = better with the latter case. The former case is not a problem really, since the packet is already in = memory when we=92re given it to enqueue; to fix it we=92d have to = anticipate the largest possible arriving packet and pre-emptively leave = space for it. > I thought the main argument was, that interval needs to be roughly in = the right range and target is supposed to be between 5 to 10 percent of = interval. Yes, but why? I have seen other arguments, both recently and otherwise, which modify = target to accommodate the MAC grant latency of some medium, while = interval is still described as a rough estimate of the RTT. The existence of both explanations gives me epic cognitive-dissonance = headaches, which I can=92t resolve *unless* I am able to vary target and = interval independently. Since the cost to do so appears to be small, I = put that facility back in. >>> We keep track of how many bytes of system memory are consumed by a = packet in 'skb->truesize'. >>=20 >> The buffer-occupancy queue limit is in, using the field quoted above. = Previously, it would just count packets. As with previous versions of = cake (and fq_codel), it checks the limit immediately after enqueuing = each packet, and drops one packet from the head of the longest queue (by = byte count) if it=92s violated. >=20 > did this include the overload protection fallback to dropping rather > than ecn-ing packets? Codel already has such a protection mechanism, but I haven=92t yet = converted it to byte mode. > We explored the sqrt cache idea in the very first implementations of > codel which you can revisit in the earliest, exciting days of the > existence of the codel mailing list. >=20 > I am not allergic to trying it again. What we had found then was the > count variable was increasing without bound (with the original > published ns2 version), for which we substituted another modification > that works well at low bandwidths but less well at higher ones. >=20 > My sole mathematical contribution to the invsqrt approximation in > linux codel was the discovery that newton's method had much larger > errors than 2% when run in reverse (eg, when decreasing count by more > than 1, as various forms of the algorithm have done while trying to > come up with a better estimate for the the resumption portion of the > algorithm) A quick check in that spreadsheet suggests that this inaccuracy only = occurs for low values of count, in basically the same places as running = it forward is also inaccurate. So having the pre-calculated lookup = table for those first values works, and Newton is still used directly = for larger counts if they occur. > Note that I tried hard in codel4 (I think it was) to reduce the > size of the codel stats to a single 32byte cache-line on 32 bit arches > - down from 48 or so bytes - and succeeded with one word to spare. I > am not allergic to keeping lots of statistics but we are also aiming > for speed on lame processors here=85 I=92ve managed to add these statistics without increasing the size of = the per-flow data, only the per-class data. The weakness of the CPUs = involved is why I=92m aiming for shifts and adds (which an ARM in = particular can do extremely efficiently, but a MIPS isn=92t hurt much = either) rather than multiplies in the fast path. There=92s a stray = multiply in the hash function which I still want to excise. It might also be possible, later on, to add a kernel config switch to = drop out the stats computations. It probably won=92t make very much = difference, but when every little helps... - Jonathan Morton