[Bloat] Packet reordering and RACK (was The "Some Congestion Experienced" ECN codepoint)

Sun Mar 17 07:45:13 EDT 2019

Hi Carsten,

> On Mar 17, 2019, at 11:23, Carsten Bormann <cabo at tzi.org> wrote:
> 
> On Mar 14, 2019, at 22:43, Sebastian Moeller <moeller0 at gmx.de> wrote:
>> 
>> if a specific link technology is prone to introduce reordering due to retransmit it might as well try to clean up after itself
> 
> The end-to-end argument applies:  Ultimately, there needs to be resequencing at the end anyway, so any reordering in the network would be a performance optimization.  It turns out that keeping packets lying around in some buffer somewhere in the network just to do resequencing before they exit an L2 domain (or a tunnel) is a pessimization, not an optimization.

	I do not buy the end-to-end argument here, because taken to the extreme, why do ARQ on individual links at all? We could just leave ARQ to the end-points, which TCP does anyway. The point is transport-ARQ allows the use of link technologies that otherwise would not be acceptable at all. So doing ARQ on the individual links already indicates that some things are more efficient when not done only e2e. I just happen to think that re-ordering falls into the same category, at least for users stuck behind a slow link, as is typical at the edge of the internet.

To put numbers to my example, assume I am on a 1/1 Mbps link and receive TCP data at 1 Mbps in MTU-1500 packets (I am going to keep the numbers approximate). Say I get a burst of 10 packets containing 10 individual messages for my application, each giving the position of an object in 3D space.

Each packet is going to "hog" the link for: 1000 ms/s * (1500 * 8 b/packet) / (1000 * 1000 b/s) = 12 ms.
So I get access to messages/new positions every 12 ms and I can display this smoothly.

Now if the first packet gets reordered to be last, I either drop that packet and accept a 12 ms gap or, if that is not an option, I get to wait 9 * 12 = 108 ms before positions can be updated. That IMHO shows why re-ordering is terrible even if TCP were more tolerant of it. Especially in the context of L4S, something like this seems totally unacceptable if ultra-low latency is supposed to be anything more than marketing.
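
For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic above. The function names are made up for illustration, and it assumes packets arrive back to back at link rate and ignores header overhead, so treat it as a back-of-the-envelope aid rather than a model of any real link.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time one packet occupies ("hogs") the link, in milliseconds."""
    return 1000.0 * packet_bytes * 8 / link_bps


def head_of_line_wait_ms(burst_packets: int, packet_bytes: int, link_bps: float) -> float:
    """Wait before in-order delivery can resume when the first packet
    of a burst is reordered to arrive last (hypothetical worst case)."""
    # The other (burst_packets - 1) packets must cross the link first.
    return (burst_packets - 1) * serialization_delay_ms(packet_bytes, link_bps)


if __name__ == "__main__":
    per_packet = serialization_delay_ms(1500, 1_000_000)
    stall = head_of_line_wait_ms(10, 1500, 1_000_000)
    print(f"per-packet: {per_packet:.0f} ms, stall: {stall:.0f} ms")
```

With the 1 Mbps link, 1500-byte packets, and 10-packet burst from the example, this gives 12 ms per packet and a 108 ms stall; the stall shrinks in proportion as the link rate goes up, which is why this mostly hurts users behind slow edge links.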

> 
> For three decades now, we have acted as if there is no cost for in-order delivery from L2 — not because that is true, but because deployed transport protocol implementations were built and tested with simple links that don’t reorder.  

	Well, that is similar to the argument for performing non-aligned loads fast in hardware. Yes, this comes with a considerable cost in complexity, and it is harder to make this go fast than to just allow aligned loads and fix up unaligned loads by trapping to software, but from a user perspective the fast hardware beats the fickle "only make aligned loads go fast" approach any old day.

> Techniques for ECMP (equal-cost multi-path) have been developed that appease that illusion, but they actually also are pessimizations at least in some cases.

	Sure, but if I understand correctly, this is partly due to the fact that transport people opted not to do the re-sorting on a flow-by-flow basis; that would solve the blocking issue from the transport perspective. Sure, the affected flow would still suffer from some increased delay, but as I tried to show above, that might still be smaller than the delay incurred by doing the re-sorting after the bottleneck link. What is wrong with my analysis?

> 
> The question at hand is whether we can make the move back to end-to-end resequencing techniques that work well,

	But we cannot. We can make TCP more robust, but I predict that if RACK allows for 100 ms of delay, transports will take this as the new goal and keep pushing against that limit, all in the name of bandwidth over latency.

> at least within some limits that we still have to find.
>  That probably requires some evolution at the end-to-end transport implementation layer.  We are in a better position to make that happen than we have been for a long time.

	Probably true, but also not very attractive from an end-user perspective.... unless this will allow transport innovations that will allow massively more bandwidth at a smallish latency cost.

Best Regards
	Sebastian

> 
> Grüße, Carsten
>