[Ecn-sane] rfc3168 sec 6.1.2

Dave Taht dave.taht at gmail.com
Thu Aug 29 09:51:25 EDT 2019


On Thu, Aug 29, 2019 at 1:02 AM Jonathan Morton <chromatix99 at gmail.com> wrote:
>
> > On 29 Aug, 2019, at 5:08 am, Dave Taht <dave.taht at gmail.com> wrote:
> >
> > It would explain a lot if this was not actually implemented in Linux.
> > I'm afraid to look. cwnd reduction is capped to 2. 1 should put you
> > in quickack mode AND to go lower seemingly it's supposed to
> > then rely on the retransmit timer.
>
> Nowadays the same effect can be obtained from the pacing timer.  Just set the CA scale factor to 50% to get an effective minimum cwnd of 1, or lower still if needed.

But that wouldn't trigger quickacks from the other side unless it's cwnd 1?

> In most of our SCE testing, we're now setting the SS scale factor to 100% (the default is 200%, which means the cwnd is sent over half an RTT and the other half is idle) and the CA scale factor to 40% (default is 120%, so the effective minimum cwnd is actually 2.4 from a packet-pair standpoint).  See the last substantive slide in the IETF-105 SCE deck.
>
>  - Jonathan Morton
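
For anyone following the arithmetic above, here is a rough sketch of how a pacing scale factor maps onto an "effective" cwnd. It assumes the pacing rate is simply ratio% of cwnd/RTT, and that the SS and CA scale factors are the Linux net.ipv4.tcp_pacing_ss_ratio and net.ipv4.tcp_pacing_ca_ratio sysctls (defaults 200 and 120); the function names are made up for illustration.

```python
MIN_CWND = 2  # cwnd floor after reduction, as described in this thread


def rate_equivalent_cwnd(cwnd_packets, pacing_ratio_percent):
    """Window (in packets) that the paced sending rate corresponds to.

    Assumes pacing rate = ratio% * cwnd / RTT. Above 100% this is the
    instantaneous rate while sending (the rest of the RTT is idle);
    below 100% pacing, not cwnd, limits what goes out per RTT.
    """
    return cwnd_packets * pacing_ratio_percent / 100.0


def send_fraction_of_rtt(pacing_ratio_percent):
    """Fraction of an RTT spent sending one cwnd at the paced rate (max 1)."""
    return min(1.0, 100.0 / pacing_ratio_percent)


if __name__ == "__main__":
    for label, ratio in [("CA default", 120), ("CA 50%", 50), ("CA 40%", 40)]:
        print(label, rate_equivalent_cwnd(MIN_CWND, ratio))
    for label, ratio in [("SS default", 200), ("SS 100%", 100)]:
        print(label, send_fraction_of_rtt(ratio))
```

With the floor of 2, the 120% default works out to 2.4, 50% to 1, and 40% to 0.8, which matches the numbers quoted above; the 200% slow-start default sends the window in half an RTT.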

I am leveraging hazy memories of old work a few years back where I
pounded 50? 100? flows through a 100Mbit ethernet bottleneck, with a
variety of aqm and tcp ccs. I never got around to writing it up,
but what I observed was along the lines of:

A) fq_codel with drop had MUCH lower RTTs - and would trigger RTOs etc
- and interactive ssh sessions kept  working - which made me happier
than
B) cake (or fq_codel with ecn) hit, I don't remember, 40ms tcp delays.
More than double that of drop is the stat I remember
C) The workload was such that the babel protocol (1000? routes - 4
packet non-ecn'd udp bursts) would eventually fail - dramatically, by
retracting the route I was on and thus acting as a circuit breaker on
all traffic, so I'd lose connectivity for 16 sec - and failed much
more often in the latter case
D) The packet caps were capped at cwnd 2 (or 4 with BBR) and I didn't
know enough about that until today
E) Head drop really was remarkably better at keeping all the flows going
F) Pie hit its targets as did codel with drop, but with ecn.... ugh....

And at the time of all this carnage I basically said "ecn scares me
yet again", patched my babel daemons to use it, filed a bug on how we
think about micro flows wrongly here:
https://github.com/tohojo/flent/issues/148 - wrote the ecn-sane
manifesto - and went off to play with my kid and boat.

Anyway, 100 flows, no delays, straight ethernet, and babel with 1000+
routes is easy to set up as a std test,
and I'd love it if y'all could have that in your testbed.

And:

cwnd 1 + pacing might help in these extreme scenarios. This last bit of
how rfc3168 ecn should be better handled was not in my head, I had
assumed until now that new research into subpacket windows was required.

Leveraging the retransmit timer, btw, would have lightened the load on
the network a LOT, way back in 2001
when it was first thunk up. Sally was a genius.
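
A very rough sketch of the sec 6.1.2 behavior I mean, as I read the RFC: react to ECE by cutting cwnd, and once cwnd is already 1, reset the retransmit timer and send nothing new until it expires. The Sender class, the plain halving, and the floor parameter are all made up for illustration (the caller is assumed to invoke this at most once per RTT); floor=2 stands in for the cap discussed in this thread, and none of this is taken from any real stack.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    cwnd: int = 10              # congestion window, in segments
    wait_for_rto: bool = False  # hold new data until the retransmit timer fires


def on_ece(s: Sender, floor: int = 1) -> Sender:
    """React to one ECN-Echo ACK (hypothetical helper, once per RTT)."""
    if s.cwnd > floor:
        # Normal case: halve the window, but never go below the floor.
        s.cwnd = max(floor, s.cwnd // 2)
    elif s.cwnd == 1:
        # Already at one segment: reset the retransmit timer and
        # send a new packet only when it expires.
        s.wait_for_rto = True
    return s


if __name__ == "__main__":
    rfc = Sender(cwnd=4)
    capped = Sender(cwnd=4)
    for _ in range(3):
        on_ece(rfc, floor=1)
        on_ece(capped, floor=2)
    print(rfc)     # reaches cwnd 1, then falls back to the timer
    print(capped)  # sits at cwnd 2 and never backs off further
```

With the floor at 2 the sender never reaches the timer-driven state at all, which is the gap being discussed here.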
-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

