[Ecn-sane] rfc3168 sec 6.1.2

Dave Taht dave.taht at gmail.com
Thu Aug 29 15:10:43 EDT 2019


On Thu, Aug 29, 2019 at 7:42 AM Jonathan Morton <chromatix99 at gmail.com> wrote:
>
> > On 29 Aug, 2019, at 4:51 pm, Dave Taht <dave.taht at gmail.com> wrote:
> >
> > I am leveraging hazy memories of old work from a few years back where I pounded 50? 100? flows through a 100Mbit ethernet
>
> At 100 flows, that gives you 1Mbps per flow fair share, so 80pps or 12.5ms between packets on each flow, assuming they're all saturating.  This also means you have a minimum sojourn time (for saturating flows) of 12.5ms, which is well above the Codel target, so Codel will always be in dropping-state and will continuously ramp up its signalling frequency (unless some mitigation is in place for this very situation, which there is in Cake).
>
> Both Cake and fq_codel should still be able to prioritise sparse flows to sub-millisecond delays under these conditions.  They'll be pretty strict about what counts as "sparse" though.  Your individual keystrokes and echoes should get through quickly, but output from programs may end up waiting.
>
> > A) fq_codel with drop had MUCH lower RTTs - and would trigger RTOs etc
>
> RTOs are bad.  They indicate that the steady flow of traffic has broken down on that flow due to tail loss, which is a particular danger at very small cwnds.

They indicate that traffic has broken down for any of a zillion
reasons. RTOs, for example, are what get TCP restarted after babel
does its circuit breaker thing on this test and restores the route.

RTOs are Good. :)
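
To make that concrete: RFC 6298-style RTO backoff doubles on every
timeout, which is how a flow survives a ~16 second outage like the
babel one above. A rough sketch (the 1 second initial RTO is an
assumed value for illustration, not a measured one):

# Rough sketch of RFC 6298-style exponential RTO backoff.
def rto_schedule(initial_rto=1.0, max_backoffs=6):
    """Cumulative times (seconds) at which retransmissions fire."""
    rto, elapsed, times = initial_rto, 0.0, []
    for _ in range(max_backoffs):
        elapsed += rto          # wait one RTO, then retransmit
        times.append(elapsed)
        rto *= 2                # double the RTO on every timeout
    return times

print(rto_schedule())  # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]

So after a ~16 second route retraction it's one of those backed-off
retransmits that gets the flow moving again.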

> Cake tries to avoid them by not dropping the last queued packet from any given flow.  Fq_codel doesn't have that protection, so in non-ECN mode it will drop way too many packets in a desperate (and misguided) attempt to maintain the target sojourn time.

We are trying to encourage others to stop editorializing so much. As
the author of this behavior in fq_codel, my reasoning at the time was
that under conditions of overload there were usually packets "in the
network", and keeping the last packet in the queue scaled badly in
terms of total RTT. Saying "go away, come back later" was a totally
reasonable response, baked into TCPs since the very beginning.
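
To make the difference concrete, here's a minimal sketch of the two
drop policies as described in this thread - a paraphrase, not the
actual kernel code, and Flow/qlen are made-up stand-ins for the real
per-flow state:

from collections import namedtuple

# Hypothetical per-flow queue state; the real code keeps much more.
Flow = namedtuple("Flow", "qlen")

def codel_should_drop(flow, sojourn_ms, target_ms=5.0, keep_last=False):
    if sojourn_ms <= target_ms:
        return False        # under target: never drop
    if keep_last and flow.qlen <= 1:
        return False        # cake-style: spare the flow's last packet
    return True             # fq_codel-style: drop even the last one,
                            # i.e. "go away, come back later"

print(codel_should_drop(Flow(qlen=1), 12.5))                  # True
print(codel_should_drop(Flow(qlen=1), 12.5, keep_last=True))  # False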

I'm glad that cake and fq_codel have a different response curve here.
It's interesting. Categorizing the differences between approaches is
good.

As best I can recall, I put this behavior into fq_codel after some
very similar testing back in 2012.


> What you need to understand here is that dropped packets increase *application* latency, even if they also reduce the delay to individual packets.  ECN doesn't incur that problem.

Well, let me point at my data here:
http://blog.cerowrt.org/post/ecn_fq_codel_wifi_airbook/

We need to be clear about what we consider an "application". I tend
to think about things more as "human facing" or not, and optimize for
humans first.

In this case dropped packets on a 2 second flow account for a maximum
16ms increase in FCT. Imperceptible. Compared to that, making room
for other packets from other flows at the point of contention is a
win for those other flows.
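
Back-of-envelope, using Jonathan's numbers from above (100 flows on
100Mbit => ~12.5ms between packets per flow), and assuming a drop is
recovered by fast retransmit in roughly one flow RTT:

fair_share_bps = 100e6 / 100              # ~1 Mbit/s per flow
pkt_bits = 1500 * 8
spacing_ms = pkt_bits / fair_share_bps * 1000
print(f"per-flow packet spacing ~{spacing_ms:.0f} ms")
# One fast-retransmit recovery costs on the order of one RTT, so
# ~12-16 ms added onto a 2000 ms FCT -- under 1%, i.e. imperceptible.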

In particular (and perhaps we can show this with a heavy load test),
having shorter RTTs from drop makes it faster for new or existing
flows to grab back bandwidth when part of that load exits.
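
A sketch of why, assuming Reno-style congestion avoidance (cwnd grows
about one segment per RTT, so ramp-up time scales linearly with RTT;
the RTT values below are illustrative, drawn from this thread):

def ramp_time_ms(segments_to_gain, rtt_ms):
    # Reno-style congestion avoidance: ~1 segment gained per RTT.
    return segments_to_gain * rtt_ms

for rtt_ms in (12.5, 40):   # roughly drop-mode vs ecn-mode RTTs here
    print(f"RTT {rtt_ms} ms: +10 segments takes "
          f"~{ramp_time_ms(10, rtt_ms):.0f} ms")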

I've long bought the argument that ECN is good for human-interactive
flows that need a reliable transport - as we did in mosh. But (being
chicken) I'm not so sold on doing it to everything.

Anyway, the cwnd 1 + retransmit (or pacing!) idea would hopefully
reduce the ecn'd RTTs to something more comparable to the drop RTTs
in this particular test, which would be a step forward.
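
For reference, a quick check of the queue-delay arithmetic in the
reply quoted below, assuming ~12.5ms of per-packet service time per
flow at the 1Mbit fair share:

service_ms = 12.5   # per-packet spacing at ~1 Mbit/s fair share
for label, pkts in (("cwnd 2", 2), ("cwnd 1", 1),
                    ("cwnd 2 + 40% pacing", 0.8)):
    print(f"{label}: ~{pkts * service_ms:.0f} ms of queue delay")
# cwnd 2 -> ~25 ms, cwnd 1 -> ~12 ms, 40% pacing -> ~10 ms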

I'll get to your other points below, later.

> > B) cake (or fq_codel with ecn) hit, I don't remember, 40ms tcp delays.
>
> A delay of 40ms suggests about 3 packets per flow are in the queue.  That's pretty close to the minimum cwnd of 2.  One would like to do better than that, of course, but options for doing so become limited.
>
> I would expect SCE to do better at staying *at* the minimum cwnd in these conditions.  That by itself would reduce your delay to 25ms.  Combined with setting the CA pacing scale factor to 40%, that would also reduce the average packets per flow in the queue to 0.8.  I think that's independent of whether the receiver still acks only every other segment.  The delay on each flow would probably go down to about 10ms on average, but I'm not going to claim anything about the variance around that value.
>
> Since 10ms is still well above the normal Codel target, SCE will be signalling 100% to these flows, and thus preventing them from increasing the cwnd from 2.
>
> > C) The workload was such that the babel protocol (1000?  routes - 4
> > packet non-ecn'd udp bursts) would eventually fail - dramatically, by
> > retracting the route I was on and thus acting as a circuit breaker on
> > all traffic, so I'd lose connectivity for 16 sec
>
> That's a problem with Babel, not with ECN.  A robust routing protocol should not drop the last working route to any node, just because the link gets congested.  It *may* consider that link as non-preferred and seek alternative routes that are less congested, but it *must* keep the route open (if it is working at all) until such an alternative is found.
>
> But you did find that turning on ECN for the routing protocol helped.  So the problem wasn't latency per se, but packet loss from the AQM over-reacting to that latency.
>
> > Anyway, 100 flows, no delays, straight ethernet, and babel with 1000+ routes is easy to setup as a std test, and I'd love it if y'all could have that in your testbed.
>
> Let's put it on the todo list.  Do you have a working script we can just use?
>
>  - Jonathan Morton
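
Something like this, off the top of my head - a sketch, not the
original script. Hostname and flow count are placeholders, and the
babel side (injecting ~1000 routes) isn't shown:

import subprocess

HOST, FLOWS, DURATION = "testhost", 100, 60   # placeholders

# N parallel netperf TCP_STREAM flows to saturate the 100Mbit link...
flows = [subprocess.Popen(
             ["netperf", "-H", HOST, "-t", "TCP_STREAM",
              "-l", str(DURATION)],
             stdout=subprocess.DEVNULL)
         for _ in range(FLOWS)]

# ...with a concurrent ping to watch what the sparse flows see.
ping = subprocess.Popen(["ping", "-c", str(DURATION), HOST])

for p in flows:
    p.wait()
ping.wait()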



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

