From: Jonathan Morton
Date: Fri, 20 May 2016 15:18:11 +0300
To: moeller0
Cc: cake@lists.bufferbloat.net, codel@lists.bufferbloat.net
Subject: Re: [Cake] Proposing COBALT

>> One of the major reasons why Codel fails on UDP floods is that its drop schedule is time-based. This is the correct behaviour for TCP flows, which respond adequately to one congestion signal per RTT, regardless of the packet rate. However, it means it is easily overwhelmed by high-packet-rate unresponsive (or anti-responsive, as with TCP acks) floods, which an attacker or lab test can easily produce on a high-bandwidth ingress, especially using small packets.
>
> In essence I agree, but want to point out that the protocol itself does not really matter, but rather the observed behavior of a flow.
> Civilized UDP applications (that expect their data to be carried over the best-effort internet) will also react to drops similarly to decent TCP flows, and crappy TCP implementations might not. I would guess that, given the maturity of TCP stacks, misbehaving TCP flows will be rarer than misbehaving UDP flows (which might, for example, be well-behaved fixed-rate isochronous flows that simply should never have been sent over the internet).

Codel properly handles both actual TCP flows and other flows implementing TCP-friendly congestion control. The intent of COBALT is for BLUE to activate whenever Codel clearly cannot cope, rather than on a protocol-specific basis. This happens to dovetail neatly with the way BLUE works anyway.

>> BLUE's up-trigger should be on a packet drop due to overflow (only), targeting the individual subqueue managed by that particular BLUE instance. It is not correct to trigger BLUE globally when an overall overflow occurs. Note also that BLUE has a timeout between triggers, which I think should be scaled according to the estimated RTT.
>
> That sounds nice in that no additional state is required. But with the current fq_codel, I believe the packet causing the memory-limit overrun is not necessarily from the flow that actually caused the problem to begin with; and doesn't fq_codel actually search for the fattest flow and drop from there? But I guess that selection procedure could be run with BLUE as well.

Yes, both fq_codel and Cake search for the longest extant queue and drop packets from that on overflow. It is this longest queue which would receive the BLUE up-trigger at that point, which is not necessarily the queue for the arriving packet.

>> BLUE's down-trigger is on the subqueue being empty when a packet is requested from it, again on a timeout. To ensure this occurs, it may be necessary to retain subqueues in the DRR list while BLUE's drop probability is nonzero.
>
> Question: doesn't this mean the affected flow will be throttled quite harshly? Will BLUE slowly decrease the drop probability p if the flow behaves? If so, could BLUE just disengage if p drops below a threshold?

Given that, within COBALT, BLUE will normally only trigger on unresponsive flows, an aggressive up-trigger response from BLUE is in fact desirable. Codel is far too meek to handle this situation; we should not seek to emulate it when designing a scheme to work around its limitations.

BLUE's down-trigger decreases the drop probability by a smaller amount (say 1/4000) than the up-trigger increases it (say 1/400). These figures are the best-performing configuration from the original paper, which is very readable, and behaviour doesn't seem to be especially sensitive to the precise values (though only highly-aggregated traffic was considered, and probably on a long timescale). For an actual implementation, I would choose convenient binary fractions, such as 1/256 up and 1/4096 down, and a relatively short trigger timeout.

If the relative load from the flow decreases, BLUE's action will begin to leave the subqueue empty when serviced, causing BLUE's drop probability to fall off gradually, potentially until it reaches zero. At this point the subqueue is naturally reset and will react normally to subsequent traffic using it.
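Putting the above together, a rough sketch of the per-subqueue BLUE state might look something like this. The names, the fixed-point format and the 10 ms trigger timeout are purely illustrative placeholders of mine, not actual Cake or fq_codel code:

/* Illustrative sketch only -- names and the 10 ms timeout are assumptions,
 * not actual Cake or fq_codel code. */

#include <stdint.h>
#include <stdbool.h>

#define BLUE_P_ONE      (1u << 16)           /* fixed-point 1.0                */
#define BLUE_P_INC      (BLUE_P_ONE / 256)   /* up-trigger step (1/256)        */
#define BLUE_P_DEC      (BLUE_P_ONE / 4096)  /* down-trigger step (1/4096)     */
#define BLUE_TIMEOUT_NS (10 * 1000000ULL)    /* trigger timeout; 10 ms assumed */

struct cobalt_blue {
    uint32_t p;            /* drop probability, 16-bit fixed point */
    uint64_t last_trigger; /* time of the last up- or down-trigger */
};

/* Up-trigger: this subqueue suffered a drop due to buffer overflow
 * (i.e. it was the longest queue at that moment). */
static void blue_overflow_trigger(struct cobalt_blue *b, uint64_t now)
{
    if (now - b->last_trigger < BLUE_TIMEOUT_NS)
        return;
    b->last_trigger = now;
    b->p = (b->p + BLUE_P_INC < BLUE_P_ONE) ? b->p + BLUE_P_INC : BLUE_P_ONE;
}

/* Down-trigger: the subqueue was found empty at dequeue time.  For this
 * to fire, the subqueue must stay on the DRR list while p > 0. */
static void blue_empty_trigger(struct cobalt_blue *b, uint64_t now)
{
    if (now - b->last_trigger < BLUE_TIMEOUT_NS)
        return;
    b->last_trigger = now;
    b->p = (b->p > BLUE_P_DEC) ? b->p - BLUE_P_DEC : 0;
}

/* Random drop decision: prng16 is a uniform 16-bit random value supplied
 * by the caller.  Once p decays to zero the subqueue behaves normally. */
static bool blue_should_drop(const struct cobalt_blue *b, uint32_t prng16)
{
    return prng16 < b->p;
}

Codel would continue to run on the same subqueue as usual; in this sketch BLUE only adds drops on top of it while its probability is nonzero, which matches the intent of having it take over when Codel clearly cannot cope.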
The BLUE paper: http://www.eecs.umich.edu/techreports/cse/99/CSE-TR-387-99.pdf

>> Note that this does nothing to improve the situation regarding fragmented packets. I think the correct solution in that case is to divert all fragments (including the first) into a particular queue dependent only on the host pair, by assuming zero for src and dst ports and a "special" protocol number.
>
> I believe the RFC recommends using the SRC IP, DST IP, Protocol, Identity tuple, as otherwise all fragmented flows between a host pair will hash into the same bucket…

I disagree with that recommendation, because the Identity field will be different for each fragmented packet, even if many such packets belong to the same flow. This would spread these packets across many subqueues and give them an unfair advantage over normal flows, which is the opposite of what we want.

Normal traffic does not include large numbers of fragmented packets (I would expect a mere handful, from certain one-shot request-response protocols which can produce large responses), so it is better to shunt them to a single queue per host pair.

 - Jonathan Morton
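P.S. Purely to illustrate the fragment handling described above, here is a hypothetical sketch (not actual Cake code; the names and the reserved FRAG_PROTO value are placeholders) of building the flow key so that all fragments between a host pair share one subqueue:

/* Hypothetical sketch of fragment-aware flow-key construction; names and
 * the FRAG_PROTO value are placeholders, not actual Cake code. */

#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/ip.h>

#define FRAG_PROTO 0xff  /* placeholder "special" protocol number */

struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

static void fill_flow_key(const struct iphdr *iph, struct flow_key *key)
{
    uint16_t frag = ntohs(iph->frag_off);

    key->saddr = iph->saddr;
    key->daddr = iph->daddr;

    if (frag & (IP_MF | IP_OFFMASK)) {
        /* Any fragment, including the first (MF set, offset zero): key on
         * the host pair only, with zeroed ports and the special protocol
         * number, so all fragments land in the same subqueue. */
        key->sport = 0;
        key->dport = 0;
        key->proto = FRAG_PROTO;
    } else {
        /* Unfragmented packet: normal 5-tuple (port extraction from the
         * transport header elided here). */
        key->sport = 0; /* fill from transport header */
        key->dport = 0; /* fill from transport header */
        key->proto = iph->protocol;
    }
}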