[Bloat] What is a good burst? -- AQM evaluation guidelines

Jonathan Morton chromatix99 at gmail.com
Sun Dec 15 07:26:00 EST 2013


On 15 Dec, 2013, at 7:35 am, Naeem Khademi wrote:

> the question remains: "what is a good burst (size) that AQMs should allow?"

The ideal size of a TCP congestion window - which limits the size of a burst on a TCP flow - is equal to the natural bandwidth-delay product for the flow: the available bandwidth multiplied by the natural RTT, i.e. the RTT without added queueing delay.
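To put rough numbers on that, here is a trivial sketch in Python - the figures are illustrative, not measurements from anywhere:

    # Bandwidth-delay product (BDP): the ideal congestion window, and
    # hence the largest "good" burst, for a single flow.
    # Illustrative figures only.

    def bdp_bytes(bandwidth_bps, rtt_s):
        """Ideal window in bytes: available bandwidth x natural RTT."""
        return bandwidth_bps / 8 * rtt_s

    # Typical overland Internet path: 10 Mbit/s at a 100 ms natural RTT.
    print(bdp_bytes(10e6, 0.100))   # 125000 bytes, ~83 full-size packets

    # Datacentre path: 10 Gbit/s at a 100 us natural RTT.
    print(bdp_bytes(10e9, 100e-6))  # also 125000 bytes - see below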

Codel operates on this basis: it assumes a typical RTT, and permits queue residency (sojourn time) to rise to that value temporarily without initiating marking.  A larger burst is evidence of a congestion window that is too large, or of an overall sending rate that exceeds the bandwidth of the link the Codel queue controls.  A persistent queue is always taken as evidence of the latter.
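A heavily simplified sketch of that control law follows.  This is not the real Codel state machine - the actual algorithm also spaces successive drops ever closer together, as interval/sqrt(count) - but it shows the burst tolerance:

    TARGET = 0.005    # 5 ms: tolerable standing queue (sojourn time)
    INTERVAL = 0.100  # 100 ms: the assumed typical Internet RTT

    class SimpleCodel:
        """Core of Codel's test, minus the drop-rate escalation."""
        def __init__(self):
            self.first_above = None  # when sojourn first exceeded TARGET

        def should_mark(self, sojourn, now):
            if sojourn < TARGET:
                self.first_above = None  # queue drained: burst, not congestion
                return False
            if self.first_above is None:
                self.first_above = now   # start the clock on this excursion
                return False
            # Mark only once the queue has stayed above TARGET for a
            # whole INTERVAL: a standing queue, not a transient burst.
            return now - self.first_above >= INTERVAL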

In a datacentre or on a LAN, natural RTT delays are much shorter (microseconds) than on the general Internet (milliseconds); conversely, available bandwidth is typically much higher (Gbps vs. Mbps).  The two factors approximately cancel out, so the bandwidth-delay product remains roughly the same in typical cases - although, of course, atypical cases such as satellite links (seconds of latency) and major backbones (extreme aggregate bandwidth and Internet-scale delays) also exist.  However, RTT is more consistent between installations than bandwidth is - link speeds span a factor of ten across the typical range of ADSL installations, and a factor of a hundred across WiFi - so Codel regulates on a time basis rather than a byte-count basis, and is by default tuned for typical overland Internet latencies.
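To see why a time basis transfers between installations where a byte count would not, consider what a fixed 5 ms sojourn target corresponds to in bytes at different link speeds (illustrative figures again):

    # A fixed sojourn-time target maps to wildly different byte counts
    # depending on link speed - which is why Codel regulates in time.
    TARGET = 0.005  # 5 ms standing-queue target

    for name, bps in [("ADSL, 2 Mbit/s", 2e6),
                      ("ADSL, 20 Mbit/s", 20e6),
                      ("WiFi, 200 Mbit/s", 200e6)]:
        print(f"{name}: 5 ms of queue = {bps / 8 * TARGET:.0f} bytes")

    # Output: 1250, 12500 and 125000 bytes respectively.  A byte
    # threshold tuned for one link is off by 10-100x on another; the
    # time target is meaningful on all of them.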

Fq_codel, like other FQ-type qdiscs, tends to improve pacing when multiple flows are present, by interleaving packets from different queued bursts.  Pacing is the general absence of bursts; it can be implemented at source by a TCP sender that spreads the packets of a congestion window across an interval of time corresponding to the measured RTT.  AFAIK, very few TCP implementations actually do this, probably due to a desire to avoid interrupt overheads (the CPU would have to be woken by a timer for each packet).  It strikes me as feasible for NIC hardware to take on some of this burden.
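The idea of sender-side pacing fits in a few lines - a sketch only, with a hypothetical send() standing in for the actual transmit path:

    import time

    def paced_send(packets, rtt_s, send):
        """Spread a congestion window's packets evenly across one RTT.

        The per-packet sleep is exactly the timer wakeup (and hence the
        interrupt overhead) that real TCP stacks tend to avoid, and
        that NIC hardware could plausibly take on instead.
        """
        gap = rtt_s / len(packets)  # inter-packet spacing
        for pkt in packets:
            send(pkt)               # hypothetical transmit hook
            time.sleep(gap)         # real stacks: high-resolution timer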

 - Jonathan Morton



