[Ecn-sane] I think a defense of fq_x and co-design of new transports might be good

Dave Taht dave.taht at gmail.com
Tue Jun 18 00:32:10 EDT 2019


On Sat, Jun 15, 2019 at 1:32 PM David P. Reed <dpreed at deepplum.com> wrote:
>
> Most web servers I see (like NGINX configurations recommended) do not seem to be in slow start much of the time.

I don't quite understand what you mean?

>
>
> I'd like to see some actual data, rather than hand waving or references to 10 year old papers.

This paper contains a realistic survey of what the major CDN folk are
doing and is well worth a read:

https://arxiv.org/pdf/1905.07152.pdf

I also do track this stuff - requests per page essentially plateaued in 2016.

https://httparchive.org/reports/page-weight?start=2016_03_01&end=latest&view=list#reqTotal

Total kilobytes is WAY below what the first cablelabs (2011) study
predicted for this era (6MB):
https://httparchive.org/reports/page-weight?start=2016_03_01&end=latest&view=list#bytesTotal


>
>
>
> Google is moving rapidly to protocols that run on UDP and have vestigial congestion control, if any. (and AFAICT, no research whatever regarding congestion behavior under load that saturates the last mile link.)
>
>
>
> It bugs the heck out of me that the congestion control community doesn't look at the "real world", just simulations and benchmarks that are of dubious reality.
>
>
>
>
>
> On Saturday, June 15, 2019 12:57pm, "Dave Taht" <dave.taht at gmail.com> said:
>
> > it would be a good paper to write. This is a draft of points I'd like
> > to cover, not an attempt at a more formal email,
> > I just needed to get this much out of my system, on the ecn-sane list.
> >
> > # about fq_x
> >
> > fq_x (presently fq_codel, fq_pie, sch_cake) have pretty much the same
> > fq algorithm. It has one new characteristic
> > compared to all the prior FQ ones - truly sparse flows see no queue at
> > all, otherwise the observed queue size is f,
> > where f = the number of queue building flows. If you have 3 full size
> > packets queued, you have 3f. No transport currently takes advantage of
> > this fairly tiny difference between "no queue" and "f queue".
> >
> > We use bytes, rather than packets, also, in our calculations as that
> > translates to time.
> >
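To make the bytes-to-time point concrete, here is a minimal sketch (not
taken from any fq_x implementation) of how queued bytes translate into
delay. The function name, the 1500 byte packet size and the 10 Mbit/s
link rate are all assumptions picked for illustration, as is the
simplification that each queue-building flow has exactly one full size
packet ahead of the arriving one.

```python
def queue_delay_ms(queued_bytes, link_bps):
    """Time needed to drain queued_bytes at link_bps, in milliseconds."""
    return queued_bytes * 8 * 1000 / link_bps

PACKET_BYTES = 1500      # assumed full size packet
LINK_BPS = 10_000_000    # assumed 10 Mbit/s bottleneck

# A truly sparse flow finds nothing queued ahead of it.
sparse_delay = queue_delay_ms(0, LINK_BPS)

# A queue-building flow waits behind one packet from each of the
# f queue-building flows (simplifying assumption, see above).
f = 3
bulk_delay = queue_delay_ms(f * PACKET_BYTES, LINK_BPS)

print(sparse_delay, bulk_delay)
```

With these assumed numbers the sparse flow sees 0 ms and the
queue-building flows see about 3.6 ms, which is the "no queue" versus
"f queue" gap described above.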
> > I'm perpetually throwing around a statistic like "95% of all flows
> > never get out of slow start", that most are sender limited,
> > and so on, and thus (especially if paced) get 0 delay all the time in
> > FQ_x, or "0 first packet + pf" for the burst of packets.
> >
> > This is an essential, fine difference in measurement, unique to
> > fq_x, that can be tracked receiver side.
> >
> > ... where all it takes with a single queue, with AQM on, is one greedy
> > flow, to induce L latency on all flows, which in the case of pie/codel
> > is > 16/5ms - with plenty of jitter until things settle down. ( I wish
> > there was a way to express in a variable that it has a bounded range
> > of some sort, a ~16ms isn't good, >16ms or 16+ms neither )
> >
> > dualpi retains that >16ms characteristic for normal flows, and a
> > claimed 1ms for dualpi, which is... IMHO simply impossible in a wide
> > range of circumstances, but I'd just as soon try to focus on improving
> > FQ_x and co-designed transports in a more ideal world for a while, on
> > this thread.
> >
> > For purposes of exposition, let's assume that fq_x is the dominant AQM
> > algorithm in the world, the only one with
> > a proven and oft enabled, and *deterministic*, RFC3168 CE response on
> > overload, where a loss is assumed equivalent to a mark.
> >
> > In terms of co-designing a transport for it, a transport can then
> > assume that a CE mark is coming from FQ_x. Knowing that,
> > there are new curves that can be followed in various phases of the
> > evolution of a flow.
> >
> > Abstractly:
> >
> > 0 delay - we have capacity to spare, grow the window
> > "some delay" - we have a queue of "f", and thus a thinner setpoint observable.
> > mild jitter between a recent arrival and the rest of the burst (the
> > sparse flow optimization)
> >
> > # Benefits of FQ_x
> >
> > * FQ_x is robust against abuse. A single flow cannot overwhelm it. Some
> >   level of service is guaranteed for the vast majority of flows
> >   (excepting collisions) in the number of flows configured.
> > * FQ_x is also robust against different treatments of drop (bbr without
> >   ecn) and CE (l4s).
> > * FQ_x allows for delay based and hybrid delay based (like BBR)
> >   transports to "just work", without any ecn support at all. The
> >   additional support in "x" pushes queue lengths for drop based
> >   algorithms back to where the most common TCPs can shift back into
> >   classic slow start and congestion avoidance modes, instead of being
> >   bound (as they are often today) in rwnd, etc.
> > * FQ_x is (add more)
> >
> > # Some observations regarding a CE mark
> >
> > Packet loss is a weak signal of a variety of events.
> >
> > A CE mark is currently a strong signal you are in FQ_x - the odds
> > are good, this will be the event that kicks the transport out of slow
> > start. Knowing you got a CE mark gives you a chance to optimize,
> > knowing that your queue is not a fifo, but has a length relative to "f". In
> > BBR's case in particular, resetting the bandwidth and pacing rate to
> > the lowest recently observed (in the last 100 ms) "RTT - a little" is
> > better than the classic RFC3168 response of halving.
> >
> > One thing that bugs me about RTT based measurements is when the return
> > path is inflated - in FQ_x it's a decent assumption that both sides of
> > the path have FQ, so the ack return path is far less inflated, but in
> > pie/dualpi/codel it certainly can be for a variety of reasons. This is
> > why the rrul test exists. Ack thinning does help also. The amount of
> > potential jitter in the return path is enormous, and it is one
> > benchmark I've not yet seen from anyone on that side.
> >
> > moving sideways:
> >
> > I happen to like (in terms of determinism) an even stronger signal
> > than RFC3168, "loss and mark", where a combination of loss and marks
> > is even more meaningful than either, and thus the sender should back
> > off even harder (or, the receiver pretend it got CE in two different
> > RTTs). When we have queue sizes elsewhere measured in seconds, and a
> > colossal bufferbloat mess in general, anything that moves a link below
> > capacity would be great. The deterministic "loss and mark" feature was
> > in cake until a year or two back but I never got around much to
> > mucking with a transport's interpretation of it.
> >
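A rough sketch of what a sender-side interpretation of "loss and mark"
could look like. Nothing here is from cake or from any shipping
transport: the function name, the minimum window and the 0.25 factor
are my assumptions. The 0.25 simply mirrors the "pretend it got CE in
two different RTTs" idea, i.e. two successive halvings.

```python
def next_cwnd(cwnd, saw_loss, saw_ce, min_cwnd=2.0):
    """Return the congestion window after one RTT's worth of signals."""
    if saw_loss and saw_ce:
        factor = 0.25   # assumption: equivalent to halving twice
    elif saw_loss or saw_ce:
        factor = 0.5    # classic RFC3168-style halving
    else:
        return cwnd     # no congestion signal: growth is left to the caller
    return max(min_cwnd, cwnd * factor)

print(next_cwnd(40.0, saw_loss=False, saw_ce=True))
print(next_cwnd(40.0, saw_loss=True, saw_ce=True))
```

A window of 40 packets drops to 20 on a mark alone and to 10 when a
loss and a mark arrive together.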
> > # The SCE concept in addition to that
> >
> > With or without SCE, just that much, just that normal CE signal, is
> > enough to evolve a transport towards more sensitive
> > delay based signaling. It could be added to cubic, for example...
> >
> > Anyway...
> >
> > We have two public implementations of SCE under test - the cake one
> > uses a ramp, the fq_codel_fast one just uses
> > a setpoint where we have a consistently measurable queue (1ms), and
> > that setpoint is different
> > for wifi (1-2 TXOPs)
> >
> > SCE (presently) kicks in almost immediately upon building a queue.
> > Often, immediately, with IW10 at low bandwidths (without initial
> > spreading, pacing or chirping). There is also the bulkiness of
> > draining the oft-large rx ring and the effects
> > of NAPI interrupt mitigation to deal with - which is usually around 1ms.
> >
> > Thus it is an extremely strong signal both that there is a queue, and
> > that fq_x is present. SCE requires support at the receiver - not the
> > sender - in order to work at all. The receiver can decide what to do
> > with it. My own first experimental preference was to kick tcp out of
> > slow start on receipt of any SCE mark, but afterwards, in congestion
> > avoidance, to treat it as a much more gradual signal, or even ignore
> > it entirely. I'm grumpy enough about IW10 to still consider that, but
> > as the current sch_fq code does indeed pace the next burst, perhaps
> > ignoring SCE on the first few packets of a connection is useful to
> > consider, also.
> >
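The SCE policy described above can be sketched as a small state
machine. This is purely illustrative and is not the code in either
public SCE implementation: the Flow class, the "ignore the first 10
packets" threshold (roughly one IW10 burst) and the 0.98 backoff factor
are hypothetical values I made up for the example, and it glosses over
how the receiver feeds the mark back to the sender.

```python
from dataclasses import dataclass

IGNORE_FIRST_PACKETS = 10   # hypothetical: about one IW10 burst
CA_BACKOFF = 0.98           # hypothetical gentle reduction per SCE mark
MIN_CWND = 2.0

@dataclass
class Flow:
    cwnd: float                 # congestion window, in packets
    in_slow_start: bool = True
    packets_seen: int = 0

def on_packet(flow, sce_marked):
    """Apply the policy to one packet and return the action taken."""
    flow.packets_seen += 1
    if not sce_marked:
        return "none"
    if flow.packets_seen <= IGNORE_FIRST_PACKETS:
        return "ignored"             # too early in the connection
    if flow.in_slow_start:
        flow.in_slow_start = False   # any SCE mark ends slow start
        return "exit_slow_start"
    # In congestion avoidance, treat SCE as a much more gradual signal.
    flow.cwnd = max(MIN_CWND, flow.cwnd * CA_BACKOFF)
    return "gentle_backoff"

flow = Flow(cwnd=20.0)
actions = [on_packet(flow, sce_marked=True) for _ in range(12)]
print(actions[0], actions[10], actions[11], flow.cwnd)
```

Here the first ten marked packets are ignored, the eleventh ends slow
start, and the twelfth shaves the window slightly.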
> > There is plenty of work on all the congestion avoidance mode stuff
> > (reusing nonce sum, accecn, etc), but the key point
> > (for me) was signalling and thinking hard about the fact that fq_x was
> > present and that f governed the behavior of the queues. Knowing this,
> > growth and signalling patterns such as ELR, dctcp etc, can change.
> >
> > # Benefits of SCE
> >
> > * Plenty of stuff to write here that has been written elsewhere
> >
> > * Backward compatible
> > * gradual upgrade
> > * easy change to fq_x
> > * SCE re-enables the possibility of low priority congestion control
> > for background tcp flows
> >
> >
> > --
> >
> > Dave Täht
> > CTO, TekLibre, LLC
> > http://www.teklibre.com
> > Tel: 1-831-205-9740
> >



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

