From: Jonathan Morton
Date: Thu, 24 Mar 2011 14:40:27 +0200
To: Juliusz Chroboczek
Cc: bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Thoughts on Stochastic Fair Blue

On 24 Mar, 2011, at 3:03 am, Juliusz Chroboczek wrote:

> (I'm the original author of sch_sfb.)
>
>> Having read some more documents and code, I have some extra insight into
>> SFB that might prove helpful. Note that I haven't actually tried it
>> yet, but it looks good anyway. In control-systems parlance, this is
>> effectively a multichannel I-type controller, where RED is
>> a single-channel P-type controller.
>
> Methinks that it would be worthwhile to implement plain BLUE, in order to
> see how it compares. (Of course, once Jim comes down from Mount Sinai
> and hands us RED-Lite, it might also be worth thinking about SFRed.)

I'd be interested to see if you can make a BLUE implementation which doesn't throw a wobbler with lossy child qdiscs. Because there's only one queue, you should be able to query the child's queue length instead of maintaining it internally.
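Something like this sketch, say - illustrative C only, with made-up names throughout (none of these types or functions come from the kernel, and real BLUE also rate-limits its probability updates with a freeze time, which I've omitted):

    /* Illustrative BLUE enqueue path that treats the child qdisc as
     * the sole authority on queue length.  Packets the child drops
     * internally are already reflected in child->qlen, so BLUE's
     * view of the queue can never drift out of sync with reality. */

    #define P_INCREMENT 256     /* illustrative fixed-point steps */
    #define P_DECREMENT 16

    struct packet;

    struct qdisc {
        unsigned int qlen;      /* current queue length (packets)  */
        unsigned int limit;     /* configured capacity             */
        unsigned int p_mark;    /* BLUE mark/drop probability      */
        struct qdisc *child;
        int (*enqueue)(struct qdisc *sch, struct packet *pkt);
    };

    enum { BLUE_QUEUED = 0, BLUE_DROPPED = 1 };

    static int blue_enqueue(struct qdisc *sch, struct packet *pkt)
    {
        struct qdisc *child = sch->child;

        if (child->qlen >= child->limit) {
            sch->p_mark += P_INCREMENT;  /* overflow: raise p      */
            return BLUE_DROPPED;
        }
        if (child->qlen == 0 && sch->p_mark >= P_DECREMENT)
            sch->p_mark -= P_DECREMENT;  /* queue empty: lower p   */

        return child->enqueue(child, pkt);
    }

The point is simply that there is no private length counter anywhere for a lossy child to invalidate.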
I'd *also* be interested in an SFB implementation which has the packet-reordering characteristics of SFQ built in, so that applying child qdiscs to it would be unnecessary. I'm just about to try putting this combination together manually on a live network.

Finally, it might be interesting and useful to add bare-bones ECN support to the existing "dumb" qdiscs, such as SFQ and the FIFO family: simply start marking (and dropping packets of non-supporting flows) when the queue is more than half full.

>> My first thought after reading just the paper was that unconditionally
>> dropping the packets which increase the marking probability was suspect.
>> It should be quite possible to manage a queue using just ECN, without
>> any packet loss, in simple cases such as a single bulk TCP flow. Thus
>> I am pleased to see that the SFB code in the debloat tree has separate
>> thresholds for increasing the marking rate and tail-dropping. They are
>> fairly close together, but they are at least distinct.
>
> I hesitated for a long time before doing that, and would dearly like to
> see some conclusive experimental data that this is a good idea. The
> trouble is -- when the drop rate is too low, we risk receiving a burst
> of packets from a traditional TCP sender. Having the drop threshold
> larger than the increase threshold will get such bursts into our
> buffer. I'm not going to explain on this particular list why such
> bursts are ungood ;-)

Actually, we *do* need to support momentary bursts of packets, although with ECN we should expect these excursions to be smaller and less frequent than without it. The primary cause of a large packet burst is presumably packet-loss recovery, although some broken TCPs can produce bursts with no provocation.

At the bare minimum, you need to support ECN-marking the first triggering packet rather than dropping it. The goal here is to have functioning congestion control without packet loss (a concept which should theoretically please the Cisco crowd). With BLUE as described in the paper, a packet would always be dropped before ECN marking started, and that is what I was concerned about. With even a small extra buffer on top, the TCP has some chance to back off before loss occurs.

With packet reordering as in SFQ, the effects of a burst are (mostly) isolated to the flow that produced it. I think it's better to accommodate bursts than to squash them, especially as dropping packets will lead to more bursts as the sending TCP tries to compensate and recover.

> The other, somewhat unrelated, issue you should be aware of is that
> ECN marking has some issues in highly congested networks [1]; this is
> the reason why sch_sfb will start dropping after the mark probability
> has gone above 1/2.

I haven't had time to read the paper thoroughly, but I don't argue with this - if the marking probability goes above 1/2, you probably have an unresponsive flow anyway. I can't imagine any sane TCP responding so aggressively to the equivalent of a 50% packet loss.
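Putting those pieces together, the decision logic we're converging on might look like this sketch - again purely illustrative, with invented names and constants rather than anything lifted from sch_sfb:

    /* Illustrative per-bucket mark/drop decision: a marking threshold
     * below the drop threshold, ECN marking instead of dropping for
     * the first triggering packets, and a fallback to dropping once
     * the probability exceeds 1/2.  Fixed-point, SCALE == 1.0. */

    #define SCALE           (1 << 16)
    #define P_STEP          (SCALE / 1024)
    #define INCREASE_THRESH 20   /* packets: start raising p here   */
    #define DROP_THRESH     24   /* packets: tail-drop above this   */

    enum verdict { PASS, MARK, DROP };

    static enum verdict sfb_decide(unsigned int qlen, unsigned int *p,
                                   int ecn_capable, unsigned int rnd)
    {
        if (qlen >= DROP_THRESH)
            return DROP;                 /* hard limit: always drop */

        if (qlen >= INCREASE_THRESH && *p < SCALE)
            *p += P_STEP;                /* queue building: raise p */
        else if (qlen == 0 && *p >= P_STEP)
            *p -= P_STEP;                /* queue drained: lower p  */

        if (rnd % SCALE < *p) {
            /* Past p = 1/2, ECN feedback evidently isn't slowing
             * the flow down, so drop instead; non-ECN flows are
             * always dropped. */
            if (!ecn_capable || *p > SCALE / 2)
                return DROP;
            return MARK;                 /* congestion signal, no loss */
        }
        return PASS;
    }

The gap between the two thresholds is exactly the "small extra buffer on top" I mentioned: room for the TCP to back off before loss starts.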
>> the length of the queue - which does not appear to be self-tuned by
>> the flow rate. However, the default values appear to be sensible.
>
> Please clarify.

The consensus seems to be that queue length should depend on bandwidth: if we assume that link latency is negligible, then the RTT is dominated by the general Internet, assumed constant at 100 ms. On the other hand, there is another school of thought which says that queue length must *also* depend on the number of flows, with a greater number of flows calling for a shorter queue (because the bandwidth, and thus the burst size, of an individual flow is smaller).

But tuning the queue length might not actually be necessary, provided the qdisc is sufficiently sophisticated in other ways. We shall see.

>> The major concern with this arrangement is the incompatibility with
>> qdiscs that can drop packets internally, since this is not necessarily
>> obvious to end-user admins.
>
> Agreed. More generally, Linux' qdisc setup is error-prone, and
> certainly beyond the abilities of the people we're targeting; we need to
> get a bunch of reasonable defaults into distributions. (Please start
> with OpenWRT, whose qos-scripts package [2] is used by a fair number of
> people.)

Something better than pfifo_fast is definitely warranted by default, except on the tiniest embedded devices which cannot cope with the memory requirements - but those are always a corner case.

>> I also thought of a different way to implement the hash rotation.
>> Instead of shadowing the entire set of buckets, simply replace the hash
>> on one row at a time. This requires that the next-to-minimum values for
>> q_len and p_mark are used, rather than the strict minima. It is still
>> necessary to calculate two hash values for each packet, but the memory
>> requirements are reduced, at the expense of effectively removing one row
>> from the Bloom filter.
>
> Interesting idea.

- Jonathan