[Ecn-sane] [tsvwg] ECN CE that was ECT(0) incorrectly classified as L4S

Jonathan Morton chromatix99 at gmail.com
Mon Aug 5 06:59:47 EDT 2019


> [JM] A progressive narrowing of effective link capacity is very common in consumer Internet access.  Theoretically you can set up a chain of almost unlimited length of consecutively narrowing bottlenecks, such that a line-rate burst injected at the wide end will experience queuing at every intermediate node.  In practice you can expect typically three or more potentially narrowing points:
> 
> [RG] deleted. Please read https://tools.ietf.org/html/rfc5127#page-3 , first two sentences. That's a sound starting point, and I don't think much has changed since 2005. 

As I said, that reference is *usually* true for *responsible* ISPs.  Not all ISPs, however, are responsible vis-à-vis their subscribers, as opposed to their shareholders.  There have been some high-profile incidents of *deliberately* inadequate peering arrangements in the USA (often involving Netflix versus major cable networks), and consumer ISPs in the UK *typically* have diurnal cycles of general congestion due to under-investment in the high-speed segments of their networks.

To say nothing of what goes on in Asia Minor and Africa, where demand routinely far outstrips supply.  In those areas, solutions to make the best use of limited capacity would doubtless be welcomed.

> [RG] About the bursts to expect, it's probably worth noting that today's most popular application generating traffic bursts is watching video clips streamed over the Internet. Viewers dislike it when the movie stalls. My impression is, all major CDNs are aware of that and try their best to avoid this situation. In particular, I don't expect streaming bursts to overwhelm access link shaper buffers by design. And that, I think, limits the burst sizes of the majority of traffic.

In my personal experience with YouTube, to pick a major video streaming service not-at-all at random, the bursts last several seconds and are essentially ack-clocked.  It's just a high/low watermark system in the receiving client's buffer; when it's full, it tells the server to stop sending, and after it drains a bit it tells the server to start again.  When traffic is flowing, it's no different from any other bulk flow (aside from the use of BBR instead of CUBIC or Reno) and can be managed in the same way.
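For concreteness, that control loop amounts to simple hysteresis on the client's buffer fill level.  Here is a minimal sketch in Python; the thresholds and names are my own illustration, not taken from any real player:

    # Minimal sketch of the high/low watermark behaviour described above.
    # Thresholds and names are illustrative, not taken from any real player.
    class PlaybackBuffer:
        def __init__(self, low_s=10.0, high_s=30.0):
            self.low_s = low_s        # resume fetching below this many seconds buffered
            self.high_s = high_s      # stop fetching above this many seconds buffered
            self.buffered_s = 0.0
            self.fetching = True      # whether the client is asking the server to send

        def on_chunk_received(self, duration_s):
            self.buffered_s += duration_s
            if self.fetching and self.buffered_s >= self.high_s:
                self.fetching = False     # buffer full: tell the server to stop

        def on_playback_tick(self, elapsed_s):
            self.buffered_s = max(0.0, self.buffered_s - elapsed_s)
            if not self.fetching and self.buffered_s <= self.low_s:
                self.fetching = True      # buffer has drained a bit: resume fetching

While fetching is on, the transfer itself is just an ordinary ack-clocked bulk flow, which is the point.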

The timescale I'm talking about, on the other hand, is sub-RTT.  Packet intervals may be counted in microseconds at origin, then gradually spaced out into the millisecond range as they traverse the successive bottlenecks en route.  As I mentioned, there are several circumstances when today's servers emit line-rate bursts of traffic; these can also result from aggregation in certain link types (particularly wifi), and hardware offload engines which try to treat multiple physical packets from the same flow as one.  This then results in transient queuing delays as the next bottleneck spaces them out again.
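To put rough, purely illustrative numbers on that (the link rates and burst size below are assumptions, not measurements), the back-of-envelope calculation looks like this:

    # Back-of-envelope: how a line-rate burst gets spaced out, and how much
    # transient queuing that costs, at each successively narrower bottleneck.
    # Link rates and burst size are illustrative assumptions.

    PACKET_BITS = 1500 * 8          # roughly one full-size Ethernet frame

    def spacing_us(rate_bps):
        """Serialization time of one packet at a given link rate, in microseconds."""
        return PACKET_BITS / rate_bps * 1e6

    def transient_queue_ms(burst_pkts, in_bps, out_bps):
        """Approximate extra queuing delay seen by the last packet of a burst
        arriving spaced at in_bps and draining through a narrower link at out_bps."""
        per_pkt_us = max(spacing_us(out_bps) - spacing_us(in_bps), 0.0)
        return (burst_pkts - 1) * per_pkt_us / 1000.0

    # A 32-packet offload/aggregation burst leaving a 10 Gbps server NIC,
    # crossing a 1 Gbps link and then a 50 Mbps access link:
    print(spacing_us(10e9))                      # ~1.2 us between packets at origin
    print(spacing_us(50e6))                      # ~240 us between packets on the access link
    print(transient_queue_ms(32, 10e9, 1e9))     # ~0.33 ms transient queue at 1 Gbps
    print(transient_queue_ms(32, 1e9, 50e6))     # ~7.1 ms transient queue at 50 Mbps

That is exactly the microseconds-to-milliseconds progression described above, and it appears and drains again within a single RTT.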

When several such bursts coincide at a single bottleneck, moreover, the queuing required to accommodate them may be as much as their sum.  This "incast effect" is particularly relevant in datacentres, which routinely produce synchronised bursts of traffic as responses to distributed queries, but can also occur in ordinary web traffic when multiple servers are involved in a single page load.  IW10 does not mean you only need 10 packets of buffer space, and many CDNs are in fact using even larger IWs as well.
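Again with illustrative numbers only (the flow count, IW and link rate below are assumptions):

    # Rough sizing of the incast effect: several IW bursts from different
    # servers landing on one bottleneck at about the same time.
    # Flow count, IW and link rate are illustrative assumptions.

    MTU_BYTES = 1500

    def incast_burst_bytes(flows, iw_packets):
        """Sum of the initial-window bursts if they all arrive together."""
        return flows * iw_packets * MTU_BYTES

    def drain_time_ms(burst_bytes, link_bps):
        """Time the bottleneck needs to drain that standing burst."""
        return burst_bytes * 8 / link_bps * 1000.0

    # A page load pulling objects from 20 servers, each starting with IW10,
    # converging on a 50 Mbps access link within one RTT:
    burst = incast_burst_bytes(flows=20, iw_packets=10)
    print(burst // MTU_BYTES)              # 200 packets arriving at once
    print(drain_time_ms(burst, 50e6))      # ~48 ms to drain at 50 Mbps

A buffer sized for a single flow's IW would tail-drop much of that, while a buffer sized to absorb it imposes tens of milliseconds of transient delay on everything else sharing the link.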

These effects really do exist; we have measured them in the real world, reproduced them in lab conditions, and designed qdiscs to accommodate them as cleanly as possible.  The question is to what extent they are relevant to the design of a particular technology or deployment; some will be much more sensitive than others.  The only way to be sure of the answer is to be aware, and do the appropriate maths.

> [RG] Other people use their equipment to communicate and play games

These are examples of traffic that would be sensitive to the delay from transient queuing caused by other traffic.  The most robust answer here is to implement FQ at each such queue.  Other solutions may also exist.
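To illustrate why FQ is the robust answer, here is a toy deficit-round-robin scheduler in that spirit; it is a bare-bones sketch of the FQ idea, not fq_codel or any deployed qdisc:

    # Toy deficit round robin (DRR): one queue per flow, byte-based fairness.
    # A bare-bones sketch of the FQ idea, not fq_codel or any deployed qdisc.

    from collections import deque

    class DRRScheduler:
        def __init__(self, quantum=1514):
            self.quantum = quantum
            self.queues = {}           # flow_id -> deque of packet sizes (bytes)
            self.deficit = {}          # flow_id -> byte credit remaining this round
            self.active = deque()      # backlogged flows, in service order

        def enqueue(self, flow_id, pkt_bytes):
            q = self.queues.setdefault(flow_id, deque())
            if not q:
                self.active.append(flow_id)
                self.deficit.setdefault(flow_id, 0)
            q.append(pkt_bytes)

        def dequeue(self):
            """Return (flow_id, pkt_bytes) of the next packet to send, or None."""
            while self.active:
                fid = self.active[0]
                if self.deficit[fid] <= 0:
                    # flow has spent its share for this round: top up and rotate
                    self.deficit[fid] += self.quantum
                    self.active.rotate(-1)
                    continue
                q = self.queues[fid]
                pkt = q.popleft()
                self.deficit[fid] -= pkt
                if not q:
                    self.active.popleft()    # flow emptied: leave the rotation
                    self.deficit[fid] = 0
                return fid, pkt
            return None

    # One small game packet enqueued behind a 30-packet bulk burst goes out
    # within the first round instead of after the whole burst:
    sched = DRRScheduler()
    for _ in range(30):
        sched.enqueue("bulk", 1500)
    sched.enqueue("game", 100)
    print([sched.dequeue()[0] for _ in range(5)])   # ['bulk', 'bulk', 'game', 'bulk', 'bulk']

The delay-sensitive packet waits behind at most one quantum's worth of each competing flow, rather than behind the whole burst.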

> Any solution for Best Effort service which is TCP-friendly and supports communication expecting no congestion at the same time should be easy to deploy and come with obvious benefits.

Well, obviously.  Although not everyone remembers this at design time.

> [RG] I found Sebastian's response sound. I think, there are people interested in avoiding congestion at their access.

> the access link is the bottleneck, that's what's to be expected.

It is typically *a* bottleneck, but there can be more than one from the viewpoint of a line-rate burst.

> [RG] I'd like to repeat again what's important to me: no corner case engineering. Is there something to be added to Sebastian's scenario?

He makes an essentially similar point to mine, from a different perspective.  Hopefully the additional context provided above is enlightening.

 - Jonathan Morton

