[Ecn-sane] [tsvwg] Comments on L4S drafts
Jonathan Morton
chromatix99 at gmail.com
Fri Jul 5 04:26:09 EDT 2019
> On 4 Jul, 2019, at 8:54 pm, Bob Briscoe <ietf at bobbriscoe.net> wrote:
> You are assuming that the one thing we haven't done yet (fall-back to TCP-friendly on detection of classic ECN) won't work, whereas all the problems you have not addressed yet with SCE will work.
This is whataboutism. Please don't.
We have a complete end-to-end implementation of SCE, which not only works but is safe-by-design in today's Internet, as outlined not only in the I-Ds we submitted this week, but also below.
> I believe that using this to enable fine-grained congestion control would still rely on the semantics of the SCE style of signalling. Correct?
Yes, although the fine detail of these semantics has changed since the first I-D in light of implementation experience. I do suggest reading the new version.
> • Q1. Does SCE require per-flow scheduling?
SCE does not require per-flow scheduling.
It does work *better* with per-flow scheduling, but that's also true of most types of existing traffic.
> • If so, how do you expect it to be supported on L2 links, where not even the IP header is accessible, let alone L4?
While this question is moot, may I ask how you expect the ECN field to be used when the IP header is inaccessible? I'm sure either DCTCP or SCE-like principles can be applied to an L2 flow, but it would not be through ECN per se.
> • If not, how does it work?
In the first place, SCE flows work transparently with existing dumb and CE-marking infrastructure, and behave in an RFC-3168 compliant manner in that case. So no special preparations in the network are required merely to allow SCE endpoints to be deployed safely. We consider this one of SCE's key advantages over L4S.
We have now implemented and at least briefly tested a way to mark SCE in a single-queue bottleneck while retaining fairness versus non-SCE traffic. It requires only an adjustment to a detail of the way SCE marking is done at that node - that is, altering the relationship between CE and SCE marking - and does not increase implementation complexity even there. The tradeoff is that SCE's benefit is diluted because SCE flows may receive unnecessary CE marks, but it does achieve fairness (for example) between plain Reno and Reno-SCE.
You might wish to read the submitted draft outlining our initial test results. They do in fact focus on single-queue behaviour, both with single flows and with two similar or dissimilar flows competing, and should thus answer additional questions you may have on this topic. We are still refining this, of course.
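To make the backward-compatibility point above concrete, here is a minimal sketch of how the two-bit ECN field reads to an RFC-3168 endpoint versus an SCE-aware one. The codepoint values are those of RFC 3168. The function names and the returned strings are illustrative assumptions only, and are not taken from the drafts or from our implementation.

```python
# ECN field: the two low-order bits of the IPv4 TOS / IPv6 Traffic Class byte.
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECT(1); read as "Some Congestion Experienced" by SCE endpoints
ECT_0 = 0b10    # ECT(0)
CE = 0b11       # Congestion Experienced


def ecn_codepoint(tos_byte: int) -> int:
    """Extract the ECN codepoint from a TOS / Traffic Class byte."""
    return tos_byte & 0b11


def congestion_signal(codepoint: int, sce_aware: bool) -> str:
    """Hypothetical helper: the signal an endpoint derives from one packet."""
    if codepoint == CE:
        return "ce"    # both kinds of endpoint respond to CE
    if codepoint == ECT_1 and sce_aware:
        return "sce"   # only an SCE-aware endpoint sees the finer signal
    return "none"      # RFC 3168 treats ECT(1) and ECT(0) alike


if __name__ == "__main__":
    for cp in (NOT_ECT, ECT_1, ECT_0, CE):
        print(format(cp, "02b"),
              congestion_signal(cp, sce_aware=False),
              congestion_signal(cp, sce_aware=True))
```

The only row in which the two endpoint types differ is ECT(1), which is why an SCE sender on a path with no SCE marking behaves as an ordinary RFC-3168 sender.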
> • Q2. How do you address the lack of ECT(1) feedback in TCP, given no-one is implementing the AccECN TCP option? And even if they did, do you have measurements on how few middleboxes / proxies, etc will allow traversal?
Our experimental reference implementation uses the former NS bit in the TCP header as an ESCE feedback mechanism. NS is unused because Nonce Sum (RFC 3540) was never deployed; since it was nevertheless specified in an RFC, we expect the bit to traverse the Internet quite well. Additionally, the reuse of NS in another role also associated with ECT(1) seems poetic. Controlled tests have been carried out successfully over public Internet paths and, more extensively, in lab conditions.
Disruption of either SCE or ESCE signals is tolerated by design, because in extremis SCE flows still respond to CE marks and packet drops appropriately for effective congestion control.
We expect to publish an I-D covering the above shortly.
Cursory examination of QUIC indicates that it already has a mechanism specified for detailed ECN feedback, and naturally this can also support SCE.
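To illustrate the TCP side of the answer above, the following sketch reads the former NS bit out of a raw TCP header. The bit position (the flag immediately above CWR in the 16-bit data-offset/flags word) follows RFC 3540. The helper names `build_header` and `has_esce`, and the port, sequence and window values, are hypothetical; this is not code from the reference implementation.

```python
import struct

# Flag bits in the 16-bit data-offset/flags word at byte offset 12.
FLAG_ACK = 0x010
FLAG_ECE = 0x040
FLAG_CWR = 0x080
FLAG_NS = 0x100   # former Nonce Sum bit, reused here as ESCE


def tcp_flags(header: bytes) -> int:
    """Return the nine flag bits (NS through FIN) of a TCP header."""
    (word,) = struct.unpack_from("!H", header, 12)
    return word & 0x1FF


def has_esce(header: bytes) -> bool:
    """True if the former NS bit is set on this segment."""
    return bool(tcp_flags(header) & FLAG_NS)


def build_header(flags: int) -> bytes:
    """Hypothetical helper: a 20-byte option-less header for demonstration."""
    data_offset = 5  # header length in 32-bit words
    return struct.pack("!HHIIHHHH",
                       12345, 80,          # source and destination ports
                       1000, 2000,         # sequence and acknowledgement numbers
                       (data_offset << 12) | flags,
                       65535, 0, 0)        # window, checksum, urgent pointer


if __name__ == "__main__":
    print(has_esce(build_header(FLAG_ACK | FLAG_NS)))
    print(has_esce(build_header(FLAG_ACK | FLAG_ECE)))
```

A middlebox that clears the bit merely removes the ESCE feedback; the ECE and CWR bits are untouched, so the RFC-3168 feedback loop continues to operate.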
> • Q3. How do you address all the tunnel decapsulators that will black-hole ECT(1) marking of the outer? Do you have measurements of how much of a blockage to progress this will be?
I imagine a blackhole of ECT(1) would also be problematic for L4S. I would consider such tunnels RFC-ignorant (i.e. buggy) because ECT(1) is expressly permitted by RFC-3168 in the same circumstances where ECT(0) is. We have not encountered any such problems ourselves.
In any case, the precise effects will depend on the nature of the blackhole. If the tunnel rewrites ECT(1) to ECT(0) or Not-ECT, then SCE flows will not receive SCE information and will therefore behave as RFC-3168 flows do. If the affected packets are dropped instead, then TCP should be able to recover from that.
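The cases above can be written out as a small model. This is a sketch under stated assumptions: the three behaviour names ("to_ect0", "to_not_ect", "drop") are labels invented here for the blackhole variants just described, the outcome strings are illustrative, and the model assumes the tunnel interferes only with ECT(1) while leaving CE intact.

```python
from typing import Optional

NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def decapsulate(codepoint: int, behaviour: str) -> Optional[int]:
    """Codepoint seen after a misbehaving tunnel, or None if the packet is dropped.

    Assumption: only ECT(1) packets are affected.
    """
    if codepoint != ECT_1:
        return codepoint
    if behaviour == "to_ect0":
        return ECT_0
    if behaviour == "to_not_ect":
        return NOT_ECT
    if behaviour == "drop":
        return None
    raise ValueError(f"unknown behaviour: {behaviour}")


def sce_flow_outcome(received: Optional[int]) -> str:
    """What an SCE flow is left with once the packet has crossed the tunnel."""
    if received is None:
        return "loss: recovered by retransmission"
    if received == CE:
        return "CE: RFC-3168 response"
    if received == ECT_1:
        return "SCE: fine-grained response"
    return "no signal: behaves as an RFC-3168 flow"


if __name__ == "__main__":
    for behaviour in ("to_ect0", "to_not_ect", "drop"):
        print(behaviour, "->", sce_flow_outcome(decapsulate(ECT_1, behaviour)))
    print("CE via any tunnel ->", sce_flow_outcome(decapsulate(CE, "to_ect0")))
```

In none of the three cases is the flow left without a congestion signal it can act on, provided CE marks and drops still get through.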
> • Q4. How do you address the interaction of the two timescale dynamics in the SCE congestion control?
Which two timescale dynamics are you referring to?
> • Q5. Can out-of-order tolerance be relaxed on links supporting SCE? (not a problem as such, but a lack of one of L4S's advantages)
We consider that aspect of L2 link design to be orthogonal to SCE. Most transports currently deployed should be able to cope with microsecond-level reordering on multi-millisecond Internet paths without triggering unnecessary retransmissions.
> {Note 1}: Implementation complexity is only a small part of the objections to FQ.
We are still waiting for a good explanation of these objections. So far, we are aware only of the well-known vulnerability to "gaming" by employing more flows than necessary - but we also have defences against that, which we plan to add to a future version of the LFQ draft. These defences are semantically similar to the dual host-flow fairness currently deployed in Cake, but with a more hardware-friendly algorithm.
- Jonathan Morton