From: Toke Høiland-Jørgensen
To: Jonathan Morton
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] Long-RTT broken again
Date: Tue, 03 Nov 2015 18:05:12 +0100
Message-ID: <87a8qvc8tz.fsf@toke.dk>
In-Reply-To: <50C2A7B7-1B81-41E1-B534-CA449296FE77@gmail.com> (Jonathan Morton's message of "Tue, 3 Nov 2015 18:43:12 +0200")
References: <87pozspckj.fsf@toke.dk> <6A2609D9-7747-487B-9484-ECC69C50DE96@gmx.de> <874mh3pai9.fsf@toke.dk> <50C2A7B7-1B81-41E1-B534-CA449296FE77@gmail.com>

Jonathan Morton writes:

> Cake does the queue accounting in bytes, and calculates 15MB (by
> default) as the upper limit. It’s *not* meant to be a packet buffer.

Ah, good.

> note the different behaviour of the upload and download streams in the
> results given.

This is not a result of ingress/egress shaping, though; the upstream and
downstream shaping is done on each side of the bottleneck on separate
boxes.

> The only way this could behave like a “packet buffer” instead of a
> byte-accounted queue is if there is a fixed size allocation per
> packet, regardless of the size of said packet. There are hints that
> this might actually be the case, and that the allocation is a hugely
> wasteful (for an ack) 2KB. (This also means that it’s not a 10240
> packet buffer, but about 7500.)

Right, well, in that case fixing the calculation to use the actual
packet size would make sense in any case?

> But in a bidirectional TCP scenario with ECN, only about a third of
> the packets should be acks (ignoring the relatively low number of ICMP
> and UDP probes); ECN causes an ack to be sent immediately, but then
> normal delayed-ack processing should resume. This makes 6KB allocated
> per ~3KB transmitted. The effective buffer size is thus 7.5MB, which
> is still compliant with the traditional rule of thumb (BDP /
> sqrt(flows)), given that there are four bulk flows each way.
>
> This effect is therefore not enough to explain the huge deficit Toke
> measured. The calculus also changes by only a small factor if we
> ignore delayed acks, making 8KB allocated per 3KB transmitted.

Well, there are also several UDP measurement flows and a ping, which all
send really small packets; so it's not just the acks.

> So, again - what’s going on? Are there any clues in packet traces
> with sequence analysis?

Not really; the qdisc stats show ~6000 packets dropped over the 300
second test; and lots and lots of overlimits. So my guess is that the
queue is in fact overflowing. Will post a trace when I get a chance
tomorrow.

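
For reference, the allocation arithmetic from the quoted paragraphs can
be sketched as below. This is only a back-of-the-envelope model, not
anything taken from the Cake source: it assumes decimal units (15MB =
15,000,000 bytes, 2KB = 2,000 bytes), takes the suspected fixed 2KB
per-packet allocation at face value, and the function names are made up
for illustration.

```python
# Rough model of a byte limit that is charged a fixed allocation per packet.
# Assumptions (not from the Cake source): decimal units, and the suspected
# fixed 2KB allocation per packet regardless of the packet's real size.
LIMIT_BYTES = 15 * 1000 * 1000
ALLOC_PER_PACKET = 2 * 1000


def packets_until_full(limit_bytes, alloc_per_packet):
    """Number of packets that fit if each one is charged a fixed allocation."""
    return limit_bytes // alloc_per_packet


def effective_buffer(limit_bytes, alloc_per_packet, wire_bytes, packets):
    """Bytes of real traffic that fit before the limit is reached.

    wire_bytes is what a repeating group of `packets` packets actually
    carries; the group is charged alloc_per_packet * packets.
    """
    allocated = alloc_per_packet * packets
    return limit_bytes * wire_bytes // allocated


if __name__ == "__main__":
    # "not a 10240 packet buffer, but about 7500"
    print(packets_until_full(LIMIT_BYTES, ALLOC_PER_PACKET))
    # Delayed acks: two data packets plus one ack, ~3KB carried, 6KB charged.
    print(effective_buffer(LIMIT_BYTES, ALLOC_PER_PACKET, 3000, 3))
    # No delayed acks: two data packets plus two acks, ~3KB carried, 8KB charged.
    print(effective_buffer(LIMIT_BYTES, ALLOC_PER_PACKET, 3000, 4))
```

With these round figures the delayed-ack case comes out at the 7.5MB
quoted above. The more of the traffic mix is small packets (acks, UDP
probes, pings), the further the effective figure drops below the
nominal limit.
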
> I’ll put in a configurable memory limit anyway, but I really do want
> to understand why this is happening.

As I said before: a configurable limit is not a fix for this; we need
the default behaviour to be sane.

-Toke