[Ecn-sane] Exploring L4S and SCE possible pathologies with ubuntu .debs

Dave Taht dave.taht at gmail.com
Tue Aug 6 00:05:02 EDT 2019


This build is a collection of possible pathological interactions
between the L4S and SCE concepts - as well as being primarily a means
for me to catch up on the state of the code.... I wouldn't run it on
the internet, nor do I plan to run it much longer than to get a few
tests done and a few packet captures to look at.

http://www.taht.net/~d/l4svsSCE/

It is based on net-next of commit: 9e8fb25254f76cb483303d8e9a97ed80a65418fe

In terms of *others* thinking up a repeatable test matrix and tools
for automating one, maybe this will "help".

A ton of this code is totally untested by me, may have other bugs,
besides the logical ones, and if it breaks, you get to keep both
pieces.
Might not even boot!

* sch_fq has been modified to turn ce_marks into sce_marks via ce_threshold
* sch_fq_codel has both regular ecn and ce_threshold as ce_threshold
* sch_cake is sce with ramp and regular ecn
* sch_fq_codel_fast has sce threshold and also splits GSO/GRO packets
* dualpi is dualpi from the L4Sforupstream repo
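
For anyone trying to keep the two marking behaviours straight, here is a
minimal Python sketch of the difference between a stock ce_threshold and
one turned into an SCE threshold. It is not the patched kernel code: the
function name and the microsecond arguments are hypothetical, and it
assumes the SCE convention of rewriting ECT(0) to ECT(1) where the stock
threshold would set CE.

```python
# ECN field values (low two bits of the IP TOS / traffic class byte), per RFC 3168.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def mark_on_threshold(ecn, sojourn_us, threshold_us, sce=False):
    """Return the (possibly rewritten) ECN field for one dequeued packet.

    Hypothetical helper for illustration only; not a kernel API.
    """
    # Packets that did not negotiate ECN, or that waited no longer than the
    # threshold, pass through untouched.
    if ecn == NOT_ECT or sojourn_us <= threshold_us:
        return ecn
    if sce:
        # Assumed SCE behaviour: ECT(0) becomes ECT(1); CE and ECT(1) stay as-is.
        return ECT_1 if ecn == ECT_0 else ecn
    # Stock ce_threshold behaviour: any ECN-capable packet becomes CE.
    return CE


print(mark_on_threshold(ECT_0, 2000, 1000))            # stock threshold
print(mark_on_threshold(ECT_0, 2000, 1000, sce=True))  # SCE-style threshold
```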

dctcp is *stock* which I hope now responds to loss properly
dctcp-sce is Jon's rewrite - but do we need a BSD client?
cubic is still stock cubic with RFC3168 CE
cubic-sce is Jon's SCE mods with RFC3168 still alpha testing
reno-sce is reno with SCE and RFC3168 responses enabled.
bbr is bbrv1 - no ecn support is in that at all - however if you
negotiate ecn, interesting things might happen
bbrv2 is bbrv2. dctcp-style ECN CE is available via a module parameter.

As per the IETF conference, IW10 is paced, not burst. I thought about
making it IW4 as a burst. Certainly we see lots of IW10 or worse bursts.
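
To make the paced-versus-burst distinction concrete, a small Python
sketch follows. The function and its parameters are hypothetical, the
12 Mbit/s pacing rate is only an example value, and the burst case
ignores serialization time and treats all packets as leaving at once.

```python
def send_times(n_packets, packet_bytes, pacing_bps=None):
    """Departure offsets in seconds for an initial window.

    pacing_bps=None models a back-to-back burst; otherwise packets are
    spaced evenly at the given pacing rate (bits per second).
    """
    if pacing_bps is None:
        return [0.0] * n_packets
    gap = packet_bytes * 8 / pacing_bps  # seconds between packet starts
    return [i * gap for i in range(n_packets)]


burst = send_times(10, 1500)               # IW10 sent as one burst
paced = send_times(10, 1500, 12_000_000)   # IW10 paced at an assumed 12 Mbit/s
print(burst[-1], paced[-1])
```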

But wait! There's more! The mac80211 wifi drivers actually chose dynamically
whether codel ecn was on or off as a function of the rate, which is, well, dynamic,
and used really outrageous targets in the case of really low rates or
lots of stations.

I made 'em be ecn always and reduced the target back to 6ms from 20ms,
and enabled sce_threshold.

Wheeee!

I was going to layer in fq_pie here also just for fun. Might do that if I feel
an urge to build another release.

These conflicting options can be combined in a whole ton of
bad ways and a few good ones, tied together with network namespaces,
run over your net, on your laptop, over your wifi, in a vm or three,
whatever.
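
As a starting point for a repeatable test matrix, here is a rough Python
sketch that enumerates every bottleneck qdisc against every unordered
pair of competing congestion controls. The names are copied from the
lists in this mail and are an assumption: the strings that tc and the
kernel actually accept in this build may differ. It only enumerates
cases; it does not set up namespaces or run anything.

```python
from itertools import combinations_with_replacement, product

# Labels as written in this mail; real tc / sysctl names may differ.
QDISCS = ["fq", "fq_codel", "cake", "fq_codel_fast", "dualpi"]
CCS = ["dctcp", "dctcp-sce", "cubic", "cubic-sce", "reno-sce", "bbr", "bbrv2"]


def test_matrix(qdiscs=QDISCS, ccs=CCS):
    """Yield (qdisc, cc_a, cc_b) for each qdisc and unordered pair of flows."""
    pairs = combinations_with_replacement(ccs, 2)
    for qdisc, (cc_a, cc_b) in product(qdiscs, pairs):
        yield qdisc, cc_a, cc_b


matrix = list(test_matrix())
print(len(matrix), "cases; first:", matrix[0])
```

Even this simple two-flow matrix is large before adding rates, RTTs,
wifi versus wired, or GSO on and off.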

Also cpu impact should be measurable - in my case, prior to all this
starting, I'd been working on a faster version of fq_codel that was
more O(1), and wanted to measure it. Might as well measure dualpi &
cake too.

I wouldn't draw any conclusions about bandwidth or latency from this
code without being very, very careful. In fact, if I were
you I'd not download, install it, and play with any of these options
at all. But I'm not you! Have at it.

In wading through all this code I did reach two conclusions.

1) If ce_threshold is being used in the field as a way of moderating
self-congestion, it's interesting. No matter which response to
self-congestion is used, it does lower the load. I really wish we knew
how many people use this today.

As much as I wanted to retire ce_threshold I now feel we need a
separate API for sce. Which is OK, assuming the ramp thing is a thing.
It is just massively cleaner codewise to make sce an explicit thing.

2) I think the interaction between GSO'd behavior and non-GSO'd
behavior is going to be interesting. sch_fq releases 2 packets at a
time, the fq_codel deployment is often quantum 300, and cake dynamically
sets the quantum. And then there's ack-filtering and wifi!
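
To show why the quantum matters here, a sketch of textbook deficit
round robin for a single backlogged flow, in Python. This is the
classroom algorithm, not the kernel code, and the real qdiscs may
account for deficit differently in detail. The packet sizes, the
3000-byte quantum standing in for "2 packets at a time", and the
64000-byte GSO superpacket are all illustrative assumptions.

```python
from collections import deque


def drr_rounds(packet_sizes, quantum):
    """Return how many packets one flow sends in each DRR round.

    Textbook DRR: each round adds `quantum` bytes of credit, and the flow
    sends head-of-line packets while they fit within the credit.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queue = deque(packet_sizes)
    deficit = 0
    rounds = []
    while queue:
        deficit += quantum
        sent = 0
        while queue and queue[0] <= deficit:
            deficit -= queue.popleft()
            sent += 1
        rounds.append(sent)
    return rounds


print(drr_rounds([1500] * 4, 3000))      # large quantum: two packets per round
print(drr_rounds([1500] * 2, 300))       # quantum 300: several rounds per packet
print(len(drr_rounds([64000], 300)))     # rounds before one GSO superpacket goes
```

With a small quantum a single unsplit GSO packet waits many rounds and
then leaves as one lump, which is where splitting GSO/GRO packets changes
the picture.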

Anyway:

https://github.com/dtaht/tc-adv has the sch_cake and fq_codel_fast
support; the dualpi repo has the relevant tc support.


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

