[Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss)

Fred Baker fred at cisco.com
Sun May 15 16:49:22 EDT 2011


On May 15, 2011, at 11:28 AM, Jonathan Morton wrote:
> The fundamental thing is that the sender must be able to know when sent frames can be flushed from the buffer because they don't need to be retransmitted.  So if there's a NACK, there must also be an ACK - at which point the ACK serves the purpose of the NACK, as it does in TCP.  The only alternative is a wall-time TTL, which is doable on single hops but requires careful design.

To a point. NORM holds a frame for possible retransmission for a stated period of time, and if retransmission isn't requested in that interval forgets it. So the ack isn't actually necessary; what is necessary is that the retention interval be long enough that a nack has a high probability of succeeding in getting the message through. A 100 Gbit interface can handle 97656 128-byte frames per millisecond (100G/(8*128*1000)). We're looking at something on the order of 18 bits (4 ms to retransmit without falling back to TCP) for a rational sequence number at 100 Gbps; 16 bits would be enough at 10 Gbps, and 12 bits would be enough at 1 Gbps.
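
A quick sanity check of those numbers, assuming the 128-byte frames and 4 ms retention window from the calculation above (the exact ceiling works out to 19 bits at 100 Gbps, consistent with "on the order of 18"):

    # Back-of-envelope check: sequence-number bits needed to cover a
    # retention window, assuming 128-byte frames (per 100G/(8*128*1000)).
    import math

    FRAME_BYTES = 128   # assumed frame size from the calculation above
    HOLD_MS = 4         # retention window before falling back to TCP

    for label, bits_per_sec in [("1 Gbps", 1e9), ("10 Gbps", 10e9), ("100 Gbps", 100e9)]:
        frames_per_ms = bits_per_sec / (8 * FRAME_BYTES * 1000)
        frames_in_window = frames_per_ms * HOLD_MS
        seq_bits = math.ceil(math.log2(frames_in_window))
        print(f"{label}: {frames_per_ms:.0f} frames/ms, "
              f"{seq_bits} sequence bits for a {HOLD_MS} ms window")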

> ...recent versions of Ethernet *do* support a throttling feedback mechanism, and this can and should be exploited to tell the edge host or router that ECN *might* be needed.  Also, with throttling feedback throughout the LAN, the Ethernet can for practical purposes be treated as almost-reliable.  This is *better* in terms of packet loss than ARQ or NACK, although if the Ethernet's buffers are large, it will still increase delay.  (With small buffers, it will just decrease throughput to the capacity, which is fine.)

It increases the delay anyway. It just pushes the retention buffer to another place. What do you think the packet is doing during the "don't transmit" interval?

Throughput never exceeds capacity. If I have a 10 Gbps link, I will never get more than 10 Gbps through it. Buffer fill rate is statistically predictable. With small buffers, the buffer reaches the top sooner; they increase the probability that the buffers are full, which is to say the drop probability. Which puts us back to end-to-end retransmission, which is the worst case of what you were worried about.
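
A crude illustration of that point, using a textbook M/M/1/K queue rather than anything specific to Ethernet: at a fixed offered load, shrinking the buffer raises the probability that an arriving packet finds it full.

    # Illustrative only: drop (blocking) probability of an M/M/1/K queue at
    # offered load rho, for a few buffer sizes K. Smaller buffers mean a
    # higher probability that an arriving packet finds the buffer full.
    def mm1k_drop_probability(rho: float, k: int) -> float:
        # P(buffer full) = (1 - rho) * rho**k / (1 - rho**(k + 1)), rho != 1
        if rho == 1.0:
            return 1.0 / (k + 1)
        return (1 - rho) * rho**k / (1 - rho**(k + 1))

    for k in (4, 16, 64, 256):
        print(f"buffer = {k:3d} packets, "
              f"drop probability ~ {mm1k_drop_probability(0.95, k):.2%}")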

I'm not going to argue against letting retransmission go end to end; it's an endless debate. I'll simply note that several link layers, including but not limited to those you mention, find that applications using them work better if there is a high probability of retransmission in an interval on the order of the link RTT as opposed to the end to end RTT. You brought up data centers (aka variable delays in LAN networks); those have been heavily the province of Fibre Channel, which is a link layer protocol with retransmission. Think about it.

