From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Taht
Date: Mon, 17 Jun 2019 21:32:10 -0700
To: "David P. Reed"
Cc: ECN-Sane
Subject: Re: [Ecn-sane] I think a defense of fq_x and co-design of new transports might be good
In-Reply-To: <1560630743.930819555@apps.rackspace.com>
References: <1560630743.930819555@apps.rackspace.com>
List-Id: Discussion of explicit congestion notification's impact on the Internet

On Sat, Jun 15, 2019 at 1:32 PM David P. Reed wrote:
>
> Most web servers I see (like the recommended NGINX configurations) do not
> seem to be in slow start much of the time.

I don't quite understand what you mean?

>
> I'd like to see some actual data, rather than hand waving or references
> to 10-year-old papers.

This paper contains a realistic survey of what the major CDN folk are
doing and is well worth a read:

https://arxiv.org/pdf/1905.07152.pdf

I also do track this stuff - requests per page essentially plateaued
in 2016:
https://httparchive.org/reports/page-weight?start=2016_03_01&end=latest&view=list#reqTotal

Total kilobytes per page is WAY below what the first cablelabs (2011)
study predicted for this era (6MB):

https://httparchive.org/reports/page-weight?start=2016_03_01&end=latest&view=list#bytesTotal

>
> Google is moving rapidly to protocols that run on UDP and have vestigial
> congestion control, if any. (And AFAICT, no research whatever regarding
> congestion behavior under load that saturates the last mile link.)
>
> It bugs the heck out of me that the congestion control community doesn't
> look at the "real world", just simulations and benchmarks that are of
> dubious reality.
>
> On Saturday, June 15, 2019 12:57pm, "Dave Taht" said:
>
> > it would be a good paper to write. This is a draft of points I'd like
> > to cover, not an attempt at a more formal email; I just needed to get
> > this much out of my system, on the ecn-sane list.
> >
> > # about fq_x
> >
> > fq_x (presently fq_codel, fq_pie, sch_cake) all share pretty much the
> > same fq algorithm. It has one new characteristic compared to all the
> > prior FQ ones - truly sparse flows see no queue at all; otherwise the
> > observed queue size is f, where f = the number of queue-building flows.
> > If each such flow has 3 full-size packets queued, you see 3f. No
> > transport currently takes advantage of this fairly tiny difference
> > between "no queue" and "f queue".
> >
> > We also use bytes, rather than packets, in our calculations, as that
> > translates to time.
> >
> > I'm perpetually throwing around a statistic like "95% of all flows
> > never get out of slow start"; most are sender-limited, and so on, and
> > thus (especially if paced) get 0 delay all the time in FQ_x, or "0 for
> > the first packet + pf" for a burst of packets.
> >
> > This is an essential, fine difference in measurement that can be
> > tracked receiver-side, and it is unique to fq_x.
> >
> > ... whereas all it takes with a single queue, with AQM on, is one
> > greedy flow to induce L latency on all flows, which in the case of
> > pie/codel is >16ms / >5ms respectively - with plenty of jitter until
> > things settle down. (I wish there was a way to express in a variable
> > that it has a bounded range of some sort; "~16ms" isn't good, and
> > neither is ">16ms" or "16+ms".)
> >
> > dualpi retains that >16ms characteristic for normal flows, and claims
> > 1ms for its L4S queue, which is... IMHO simply impossible in a wide
> > range of circumstances, but I'd just as soon focus on improving FQ_x
> > and co-designed transports in a more ideal world for a while, on this
> > thread.
> >
> > For purposes of exposition, let's assume that fq_x is the dominant AQM
> > algorithm in the world, the only one with a proven, often enabled, and
> > *deterministic* RFC3168 CE response on overload, where a loss is
> > assumed equivalent to a mark.
> >
> > In terms of co-designing a transport for it, a transport can then
> > assume that a CE mark is coming from FQ_x. Knowing that, there are new
> > curves that can be followed in the various phases of a flow's
> > evolution.
> >
> > Abstractly:
> >
> > 0 delay - we have capacity to spare, grow the window
> > "some delay" - we have a queue of "f", and thus a thinner setpoint
> >   observable
> > mild jitter between a recent arrival and the rest of the burst (the
> >   sparse flow optimization)
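To make the "no queue at all" vs "queue of f" distinction above concrete,
here is a toy python model of the DRR++ scheduling that fq_codel, fq_pie
and cake share. It is deliberately simplified - no 5-tuple hashing, no
codel/pie running per flow, and none of the anti-gaming handling of flows
that drain while still on the new list - so read it as a sketch of the
idea, not as the kernel code:

from collections import deque

QUANTUM = 1514                  # bytes: one quantum ~ one full-size packet

class Flow:
    def __init__(self):
        self.pkts = deque()     # this flow's own FIFO (packet sizes, bytes)
        self.deficit = 0        # DRR deficit counter, in bytes

flows = {}                      # flow hash -> Flow
new_flows = deque()             # sparse / recently-idle flows: served first
old_flows = deque()             # queue-building flows: round-robined

def enqueue(fid, nbytes):
    f = flows.setdefault(fid, Flow())
    f.pkts.append(nbytes)
    if f not in new_flows and f not in old_flows:
        # an idle flow re-enters via new_flows and will be served before
        # any old flow: this is why a truly sparse flow sees ~no queue
        f.deficit = QUANTUM
        new_flows.append(f)

def dequeue():
    while new_flows or old_flows:
        lst = new_flows if new_flows else old_flows
        f = lst[0]
        if f.deficit <= 0:
            # flow used up its quantum: rotate it to the back of old_flows,
            # so the f queue-building flows take turns, one quantum each
            f.deficit += QUANTUM
            lst.popleft()
            old_flows.append(f)
        elif not f.pkts:
            # flow drained: drop it from the scheduler; if it sends again
            # later it comes back in as a "new" flow
            lst.popleft()
        else:
            pkt = f.pkts.popleft()
            f.deficit -= pkt
            return f, pkt
    return None

The thing to notice is that a queue-building flow only ever waits behind
the other f-1 queue-building flows, one quantum each, while a flow that
stays under a quantum per round is always served next and so observes
essentially no queue.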
> > # Benefits of FQ_x
> >
> > FQ_x is robust against abuse. A single flow cannot overwhelm it. Some
> > level of service is guaranteed for the vast majority of flows
> > (excepting hash collisions), up to the number of flows configured.
> > FQ_x is also robust against different treatments of drop (bbr without
> > ecn) and CE (l4s).
> > FQ_x allows delay-based and hybrid delay-based transports (like BBR)
> > to "just work", without any ecn support at all. The additional support
> > in "x" pushes queue lengths for drop-based algorithms back to where
> > the most common TCPs can shift back into classic slow start and
> > congestion avoidance modes, instead of being bound (as they often are
> > today) by rwnd, etc.
> > FQ_x is (add more)
> >
> > # Some observations regarding a CE mark
> >
> > Packet loss is a weak signal of a variety of events.
> >
> > A CE mark is currently a strong signal that you are in FQ_x - the odds
> > are good this will be the event that kicks the transport out of slow
> > start. Knowing you got a CE mark gives you a chance to optimize,
> > knowing that your queue length is not a FIFO's, but relative to "f".
> > In BBR's case in particular, resetting the bandwidth and pacing rate
> > to the lowest recently observed (in the last 100 ms) "RTT - a little"
> > is better than the classic RFC3168 response of halving.
> >
> > One thing that bugs me about RTT-based measurements is when the return
> > path is inflated - in FQ_x it's a decent assumption that both sides of
> > the path have FQ, so the ack return path is far less inflated, but in
> > pie/dualpi/codel it certainly can be, for a variety of reasons. This
> > is why the rrul test exists. Ack thinning also helps. The amount of
> > potential jitter in the return path is enormous, and that's a
> > benchmark I've not yet seen from anyone on that side.
> >
> > Moving sideways:
> >
> > I happen to like (in terms of determinism) an even stronger signal
> > than RFC3168: "loss and mark", where a combination of loss and marks
> > is even more meaningful than either alone, and thus the sender should
> > back off even harder (or the receiver should pretend it got CE in two
> > different RTTs). When we have queue sizes elsewhere measured in
> > seconds, and a colossal bufferbloat mess in general, anything that
> > moves a link below capacity would be great. The deterministic "loss
> > and mark" feature was in cake until a year or two back, but I never
> > got around much to mucking with a transport's interpretation of it.
> >
> > # The SCE concept in addition to that
> >
> > With or without SCE, just that much - just that normal CE signal - is
> > enough to evolve a transport towards more sensitive delay-based
> > signalling. It could be added to cubic, for example...
> >
> > Anyway...
> >
> > We have two public implementations of SCE under test - the cake one
> > uses a ramp, the fq_codel_fast one just uses a setpoint where we have
> > a consistently measurable queue (1ms), and that setpoint is different
> > for wifi (1-2 TXOPs).
> >
> > SCE (presently) kicks in almost immediately upon building a queue -
> > often immediately, with IW10 at low bandwidths (without initial
> > spreading, pacing or chirping). There is also the bulkiness of
> > draining the oft-large rx ring and the effect of NAPI interrupt
> > mitigation to deal with, which is usually around 1ms.
> >
> > Thus it is an extremely strong signal both that there is a queue and
> > that fq_x is present. SCE requires support at the receiver - not the
> > sender - in order to work at all. The receiver can decide what to do
> > with it. My own first experimental preference was to kick tcp out of
> > slow start on receipt of any SCE mark, but afterwards, in congestion
> > avoidance, to treat it as a much more gradual signal, or even to
> > ignore it entirely. I'm grumpy enough about IW10 to still consider
> > that, but as the current sch_fq code does indeed pace the next burst,
> > perhaps ignoring SCE on the first few packets of a connection is also
> > worth considering.
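In rough python pseudocode, that policy might look something like the
following at the sender, once the receiver has fed the marks back. This
is entirely hypothetical - the connection object, the IW10 forgiveness
and the 0.9 back-off factor are placeholders for illustration, not
anything implemented or measured:

IW_PACKETS = 10        # forgive SCE marks on the (now paced) initial window
SCE_BETA   = 0.9       # a gentle, per-RTT multiplicative decrease

def on_ack_feedback(conn, acked_pkts, sce_marked, ce_marked):
    # conn is a hypothetical connection object; reduced_this_rtt is
    # assumed to be cleared once per RTT by the caller.
    if ce_marked:
        # CE keeps its classic RFC3168 meaning: a real overload signal
        conn.cwnd = max(2, conn.cwnd // 2)
        conn.in_slow_start = False
        return
    if sce_marked:
        if conn.delivered <= IW_PACKETS:
            return                       # ignore marks on the first paced burst
        if conn.in_slow_start:
            conn.in_slow_start = False   # first SCE: stop doubling right away
        elif not conn.reduced_this_rtt:
            conn.cwnd = max(2, int(conn.cwnd * SCE_BETA))
            conn.reduced_this_rtt = True
        return
    # no marks at all: normal growth
    if conn.in_slow_start:
        conn.cwnd += acked_pkts               # exponential growth
    else:
        conn.cwnd += acked_pkts / conn.cwnd   # ~ +1 packet per RTT, Reno-ish

The per-RTT latch is what turns SCE into the gradual, delay-like signal
in congestion avoidance, while CE keeps its classic meaning.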
> > There is plenty of work on all the congestion avoidance mode stuff
> > (reusing the nonce sum, accecn, etc.), but the key point (for me) was
> > the signalling, and thinking hard about the fact that fq_x is present
> > and that f governs the behavior of the queues. Knowing this, growth
> > and signalling patterns such as ELR, dctcp, etc. can change.
> >
> > # Benefits of SCE
> >
> > * Plenty of stuff to write here that has been written elsewhere
> >
> > * Backward compatible
> > * gradual upgrade
> > * easy change to fq_x
> > * SCE re-enables the possibility of low priority congestion control
> >   for background tcp flows
> >
> > --
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-205-9740
> > _______________________________________________
> > Ecn-sane mailing list
> > Ecn-sane@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/ecn-sane
>
>

--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740