[Ecn-sane] ECN CE that was ECT(0) incorrectly classified as L4S

Thu Jun 13 12:48:14 EDT 2019

[I'm sending this to ecn-sane 'cos that's where I detect that this 
concern is still rumbling.
I'm also sending to tcpm at ietf 'cos there's a question for TCP experts 
just before the quoted text below.
And tsvwg at ietf is where it ought to be discussed.]

Now that the IPR issue with L4S has been put to bed, one by one I am 
going through the other concerns that have been raised about L4S.

In the IETF draft that records all the pros and cons of different 
identifiers to use for L4S, under the "ECT(1) and CE" choice (which is 
currently the one adopted at the IETF) there was already an explanation 
of why there would be vanishingly low risk of any harmful consequences 
from CE that was originally ECT(0) being classified into the L4S queue:
https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-06#page-32
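
To make the concern concrete, here is a minimal sketch of the classification step, assuming the "ECT(1) and CE" identifier. The 2-bit codepoint values are those of the ECN field in RFC 3168; the function names classify and classic_aqm_mark are hypothetical, chosen only for illustration, and are not taken from any DualQ implementation.

```python
# ECN field codepoints (RFC 3168): the two low-order bits of the
# IPv4 TOS / IPv6 Traffic Class octet.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def classify(ecn_field):
    """Hypothetical classifier for the 'ECT(1) and CE' identifier:
    ECT(1) and CE go to the L4S queue, everything else to Classic."""
    return "L4S" if ecn_field in (ECT1, CE) else "Classic"


def classic_aqm_mark(ecn_field):
    """Hypothetical upstream classic ECN AQM that signals congestion:
    an ECN-capable packet is rewritten to CE, and Not-ECT is left
    alone (a real AQM would drop it instead)."""
    return CE if ecn_field in (ECT0, ECT1) else ecn_field


# An unmarked ECT(0) packet is queued as Classic...
print(classify(ECT0))
# ...but once an upstream classic AQM has marked it CE, its origin is
# no longer visible and it lands in the L4S queue.
print(classify(classic_aqm_mark(ECT0)))
```

The second call is the case discussed in the rest of this message: the original ECT(0) codepoint is overwritten by the CE mark, so the downstream classifier cannot tell which queue the packet belonged in.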

Re-reading that, I have found some things unstated that I had thought 
were obvious. So I've spelled it all out long-hand in the text below, 
which is now in my local copy of the draft and will be in the next 
revision unless people suggest improvements/corrections here.

*Q#1:* If this glosses over any concerns you have, please explain.
Otherwise I will continue to consider that this is effectively a 
non-issue, which is the conclusion everyone in the TCP community came to 
at the time the L4S identifier was chosen back in 2015.

*Q#2:* The last couple of lines are the only part I am not sure of. Do 
most of today's TCP implementations recover the reduction in congestion 
window when they discover later that a fast retransmit was spurious? 
There's a note at the end of the intro to rfc4015 saying there was 
insufficient consensus to standardize this behaviour, but that most 
likely means it's done in different ways, rather than it isn't done at all.

Bob

======================================

    Risk of reordering classic CE packets:  Classifying all CE packets
       into the L4S queue risks any CE packets that were originally
       ECT(0) being incorrectly classified as L4S.  If there were delay
       in the Classic queue, these incorrectly classified CE packets
       would arrive early, which is a form of reordering.  Reordering can
       cause TCP senders (and senders of similar transports) to
       retransmit spuriously.  However, the risk of spurious
       retransmissions would be extremely low for the following reasons:

       1.  It is quite unusual to experience queuing at more than one
           bottleneck on the same path (the available capacities have to
           be identical).

       2.  In only a subset of these unusual cases would the first
           bottleneck support classic ECN marking while the second
           supported L4S ECN marking.  That is the only scenario in
           which some ECT(0) packets could be CE-marked by a non-L4S
           AQM while the remainder experience further delay through the
           Classic side of a subsequent L4S DualQ AQM.

       3.  Even then, when a few packets are delivered early, it takes
           very unusual conditions to cause a spurious retransmission, in
           contrast to when some packets are delivered late.  The first
           bottleneck has to apply CE-marks to at least N contiguous
           packets and the second bottleneck has to inject an
           uninterrupted sequence of at least N of these packets between
           two packets earlier in the stream (where N is the reordering
           window that the transport protocol allows before it considers
           a packet is lost).

              For example, consider N=3 and the sequence of packets
              100, 101, 102, 103,...  Imagine that packets 150, 151, 152
              from later in the flow are injected as follows:
              100, 150, 151, 101, 152, 102, 103...  If this were late
              reordering, even one packet arriving 50 packets out of
              sequence would trigger a spurious retransmission, but there
              is no spurious retransmission here, because packet 101
              moves the cumulative ACK counter forward before 3 packets
              have arrived out of order.  Later, when packets 148, 149,
              153... arrive, even though there is a 3-packet hole, there
              will be no problem, because the packets to fill the hole
              are already in the receive buffer.

       4.  Even with TCP's currently recommended duplicate ACK threshold
           (N=3), spurious retransmissions will be unlikely for all the
           above reasons.
           As RACK [I-D.ietf-tcpm-rack] is becoming widely deployed, it
           tends to adapt its reordering window to a larger value of N,
           which will make the chance of a contiguous sequence of N early
           arrivals vanishingly small.

       5.  Even a run of 2 CE marks within a classic ECN flow is
           unlikely, given FQ-CoDel is the only known widely deployed AQM
           that supports classic ECN marking and it takes great care to
           separate out flows and to space any markings evenly along each
           flow.

       It is extremely unlikely that the above set of 5 eventualities
       that are each unusual in themselves would all happen
       simultaneously.  But, even if they did, the consequences would
       hardly be dire: the odd spurious fast retransmission.  Admittedly
       TCP reduces its congestion window when it deems there has been a
       loss, but even this can be recovered once the sender detects that
       the retransmission was spurious.
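
The early-versus-late asymmetry in point 3 can be checked with a small simulation. This is only a sketch under simplifying assumptions: a receiver that sends one cumulative ACK per arriving packet, a sender that fast-retransmits after N duplicate ACKs, and no SACK, DSACK or RACK. The names max_dupacks and spurious_fast_retransmit are hypothetical.

```python
def max_dupacks(arrivals, next_expected):
    """Longest run of duplicate ACKs that a simple cumulative-ACK
    receiver would emit for the given packet arrival order.
    Assumes the ACK for next_expected has already been sent once."""
    buffered = set()          # out-of-order packets held by the receiver
    last_ack = next_expected  # most recent cumulative ACK value sent
    run = longest = 0
    for seq in arrivals:
        if seq >= next_expected:
            buffered.add(seq)
        # Advance the cumulative ACK point over any filled holes.
        while next_expected in buffered:
            buffered.remove(next_expected)
            next_expected += 1
        if next_expected == last_ack:
            run += 1                  # ACK point did not move: duplicate ACK
            longest = max(longest, run)
        else:
            last_ack = next_expected  # new data ACKed: run is reset
            run = 0
    return longest


def spurious_fast_retransmit(arrivals, next_expected, dupthresh=3):
    """True if the duplicate ACK run reaches the sender's threshold N."""
    return max_dupacks(arrivals, next_expected) >= dupthresh


# Early arrival, as in the example: 150, 151, 152 are injected among
# 100..103, then the rest of the flow arrives in order.
early = [100, 150, 151, 101, 152, 102, 103] + list(range(104, 150)) + [153, 154]
# Late arrival: packet 100 turns up 50 packets out of sequence.
late = list(range(101, 151)) + [100]

print(spurious_fast_retransmit(early, 100))  # False: longest dupACK run is 2
print(spurious_fast_retransmit(late, 100))   # True: 50 dupACKs precede packet 100
```

In the early case, packet 101 advances the cumulative ACK after only two duplicates, and when 149 arrives the ACK jumps straight to 153 because 150..152 are already buffered. Only an uninterrupted run of at least N early packets, for example 100, 150, 151, 152, 101, would reach the threshold.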

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
