[Bloat] [Ecn-sane] sce materials from ietf
Jonathan Morton
chromatix99 at gmail.com
Sat Nov 30 09:32:52 EST 2019
>> 2: Accurate ECN Feedback.
>>
>> We use a spare bit in the header of TCP acks to feed back SCE marks, and the existing ECE/CWR mechanism from RFC-3168 unchanged for CE marks. The SCE feedback is "accurate" but not "reliable", because it can tolerate large errors (as much as 100% relative) without departing the control loop. The scheme is very simple and straightforward to implement at the receiver, and interpret at the sender.
>>
>> L4S uses AccECN to give CE mark feedback that is both "accurate" and "reliable". It is a somewhat complex specification which takes over three TCP header bits, including the two used for RFC-3168 feedback.
>
> Question: How feasible would it be for any SCE-aware transport protocol to evaluate AccECN? This might make sense if viewed not from a technical but from an IETF politics perspective?
> I personally believe that if the ECN feedback were really important, it should be packaged into TCP data, as the payload has some delivery guarantees, while ACKs are effectively best effort (tangent: this is why I consider ACK filtering/compression to be abominations which should be counted against any guarantee the contract with the traffic-carrier entails, not that this helps end customers).
It would be *possible* to use AccECN for SCE feedback, but only because the distinction between ECT(0) and ECT(1) is fed back in a TCP option. SCE also has no use for the "accurate" CE feedback for which the ECE/CWR bits are replaced; if that three-bit field lay somewhere else, it could conceivably have been used for SCE feedback instead.
There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them. This would be a deployment hazard for SCE, which merely using a spare header flag avoids. So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.
>> 4: TCP-friendly response to RFC-3168 CE marking.
>>
>> SCE does this by design, retaining the existing feedback mechanism for CE marks and implementing an RFC-8511 (ABE) compliant response in each of the TCP algorithms presented so far. We can do this easily because CE and SCE information from the network is unambiguous.
>>
>> L4S presently does not do this, largely because CE marks from RFC-3168 AQMs are not easily distinguished from CE marks from an L4S AQM. They seem to be working on some sort of solution, but it has not yet been demonstrated to work, and their paper describing it leaves a lot of open questions (including tuning constants). That we saw no demonstration of it at IETF-106 (indeed they even skipped over their planned talk on it in a side session dedicated to L4S) suggests to me that they found flaws that were difficult to overcome at short notice, and possibly even managed to look bad next to our demonstration of jitter tolerance at the Hackathon.
>
> I fear that they will come up with something that in reality will a) be opt-out, that is, they will assume L4S-style feedback until reluctantly convinced that the bottleneck marker is RFC-3168-compliant, and hence will b) trigger too late and c) trigger too rarely to be actually helpful in reality, but might show a good enough effort to push L4S past issue #16.
I'm sure they will, and we will of course point out these shortcomings as they occur, so as to count them against issue #16. Conversely, if they do manage to make it fail-safe, it is highly likely that their scheme will give false positives on real Internet paths and fail to switch into L4S mode, impairing their performance in other ways.
>> 5: Reduced RTT dependence
>>
>> This is a mathematically interesting requirement which, at present, neither L4S nor SCE meets.
>>
>> Fundamentally, any two flows following the same congestion-signal response which makes average cwnd dependent solely on marking probability, and which share the same bottleneck queue and AQM and therefore experience the same marking probability, will converge to the same average cwnd and their relative throughputs will therefore be inversely proportional to their RTTs. This adequately describes both the pure AIMD response of Reno, and the so-called 1/p response of DCTCP (which TCP Prague apes slavishly).
>>
>> The steady-state cwnd formula for CUBIC, however, is a function of both p(CE) and RTT, such that its throughput should be proportional to the reciprocal quartic root of RTT, rather than linearly reciprocal. This assumes that CUBIC is not in its Reno compatibility regime, of course. So CUBIC is the standard to beat, or at least match, for this requirement.
>
> "Funny" story: looking at figure 6 of Høiland-Jørgensen T, Hurtig P, Brunstrom A (2015) The Good, the Bad and the WiFi: Modern AQMs in a residential setting. Computer Networks 89:90–106 shows clearly that single-queue PIE (the AQM L4S inflicts upon at least the standards-compliant traffic) causes worse RTT dependence than pfifo_fast, and that fq_codel actually does (mostly) better. So by avoiding FQ like the devil, the L4S team shoots itself in the foot.
Right, and we can easily explain why this happens. A dumb FIFO adds a more-or-less constant delay to both competing flows, effectively reducing their RTT ratio towards unity. Even at the short effective queue lengths proposed by L4S, the example they give in the Prague Requirements is of a 100ms versus 1ms baseline path, lengthened to 101ms versus 2ms by a 1ms queue. This reduces a 100:1 ratio to 50.5:1.
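For anyone who wants to play with the numbers, here is a rough Python sketch of that arithmetic. The function names are made up for illustration, and the two exponents are simply the idealised steady-state models quoted above (throughput going as 1/RTT for the Reno/DCTCP style response, and as the reciprocal fourth root of RTT for CUBIC outside its Reno-compatibility regime). It is not a simulation of any real AQM or TCP stack.

```python
def rtt_ratio(base_long_ms, base_short_ms, queue_ms):
    """Ratio of effective RTTs when both flows share a queue adding queue_ms."""
    return (base_long_ms + queue_ms) / (base_short_ms + queue_ms)


def throughput_advantage(ratio, exponent):
    """Short-RTT flow's throughput advantage, assuming throughput ~ RTT**(-exponent).

    exponent=1.0 is the idealised 1/RTT model; exponent=0.25 is the idealised
    reciprocal-fourth-root model. Both are assumptions taken from the text.
    """
    return ratio ** exponent


# 100 ms versus 1 ms baseline paths, with progressively deeper shared queues.
for queue_ms in (0.0, 1.0, 10.0, 100.0):
    r = rtt_ratio(100.0, 1.0, queue_ms)
    print(f"queue {queue_ms:6.1f} ms: RTT ratio {r:6.2f}:1, "
          f"1/RTT advantage {throughput_advantage(r, 1.0):6.2f}x, "
          f"RTT^(-1/4) advantage {throughput_advantage(r, 0.25):5.2f}x")
```

The 1 ms row reproduces the 50.5:1 figure, and the deeper queues show how a bloated FIFO hides RTT unfairness simply by swamping the baseline delays.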
The FQ example is, however, of the network enforcing fairness, rather than informing the endpoints of the corrections they need to make to resolve unfairness. We really like FQ, of course, but it's not feasible to deploy it everywhere, so we have to ensure reasonable competition between flows sharing a single queue. We've already started testing one such idea…
- Jonathan Morton