> Thanks for the explanation.  I had heard most of those technology buzzwords
> but your message puts them into context together.  Prioritization and queue
> management certainly helps, but I still don't understand how the system
> behaves when it hits capacity somewhere deep inside -- the "thought
> experiments" I described with a bottleneck somewhere deep inside the
> Internet as two flows collide.

The active queue management I describe needs to take place on the sending side
of any congested link. (there is a limited amount of tweaking that can be done
on the receiving side by not acking, to force a sender to slow down, but it
works FAR better to manage on the sending side)

It's also not that common for the links deep in the network to be the
bottleneck. It can happen, but most ISPs watch link capacity and keep
significant headroom (by adding parallel links in many cases). In practice,
it's almost always the 'last mile' link that is the bottleneck.

> The scheme you describe also seems vulnerable to users' innovative tactics to
> get better service.  E.g., an "ISO download" using some scheme like Torrents
> would spread the traffic around a bunch of connections which may not all go
> through the same bottleneck and not be judged as low priority.  Also it seems
> prone to things like DDOS attacks, e.g., flooding a path with DNS queries
> from many bot sources that are judged as high priority.

First off, I'm sure that the core folks here who write the code will take
exception to my simplifications. I welcome corrections.

Cake and fq_codel are not the result of deep academic research (although they
have spawned quite a bit of it); they are the result of insights and tweaks
from looking at real-world behavior, with the approach being 'keep it as
simple as possible, but no simpler'. So some of this is heuristics, but the
heuristics have been shown to work, and to be hard to game, over many years.

It is hard to game things because connections are evaluated based on their
behavior, not based on port or by inspecting them to determine their protocol.
DNS queries are not given high priority because they are DNS; new connections
are given high priority until they start carrying a lot of data. Since DNS
tends to be a short query with a short response, those flows never transfer
enough data to be impacted. Torrent connections are each passing a significant
amount of data, so they are slowed. Since the torrent connections involve
different endpoints, they will take different paths, and only the ones that
actually pass through a given congested link compete with each other there.

Cake also adds a layer that fq_codel doesn't have, one that can evaluate
traffic at a host/network/customer level to provide fairness at those levels
rather than just at the flow level.

There are multiple queues, and sending rotates between them. Connections are
assigned to a queue based on various logic (connection data and the other
things cake can take into account), so you really only have contention within
a queue. Queues are kept small, and no sender is allowed to use too much of
the queue, so the latency for new data is kept small.

In addition, it has been shown that when you do something, there are a lot of
small flows that happen serially, so any per-connection latency gets
multiplied in terms of the end-user experience (think about loading a web
page: it references many other pages, each URL needs a DNS lookup, then a
check to see if the cached data for that site is still valid, and other
things, before the page can start to be rendered, and the image data can
actually arrive quite a bit later without bothering the user).
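To make the queue-rotation and 'drop from the biggest sender' behavior above
concrete, here is a minimal Python sketch of the flow-queueing idea. It is an
illustration of the concept only, not the actual fq_codel or cake code: the
class name and packet fields are invented for this example, and the real
qdiscs add the CoDel delay-based marking/dropping inside each flow queue (and,
for cake, the host/customer fairness tiers) that are left out here.

# Toy sketch of flow queueing: packets are hashed into per-flow queues,
# the scheduler rotates between queues when sending, and when the total
# buffer limit is exceeded a packet is dropped from the flow currently
# holding the most data.  Illustration only; not the real fq_codel/cake.

from collections import deque

class FlowQueueSketch:
    def __init__(self, num_queues=1024, total_limit=10240):
        self.queues = [deque() for _ in range(num_queues)]
        self.total_limit = total_limit      # max packets buffered overall
        self.buffered = 0
        self.rr_index = 0                   # round-robin pointer

    def _flow_queue(self, packet):
        # Assign a connection to a queue by hashing its 5-tuple, so packets
        # of one flow always land in the same queue and only contend there.
        key = (packet['src'], packet['sport'], packet['dst'],
               packet['dport'], packet['proto'])
        return self.queues[hash(key) % len(self.queues)]

    def enqueue(self, packet):
        self._flow_queue(packet).append(packet)
        self.buffered += 1
        if self.buffered > self.total_limit:
            # Over the limit: drop from the flow holding the most data,
            # so short/new flows (DNS lookups, connection setup) are spared.
            fattest = max(self.queues, key=len)
            fattest.popleft()
            self.buffered -= 1

    def dequeue(self):
        # Rotate between the queues so every flow gets a turn and no single
        # bulk transfer can monopolize the link.
        for _ in range(len(self.queues)):
            q = self.queues[self.rr_index]
            self.rr_index = (self.rr_index + 1) % len(self.queues)
            if q:
                self.buffered -= 1
                return q.popleft()
        return None  # nothing buffered

Because drops come out of the flow with the most buffered data, a bulk
transfer like a torrent connection is the one that gets the slow-down signal,
while a short DNS exchange never accumulates enough in its queue to be touched.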
> The behavior "packets get marked/dropped to signal the sender to slow down"
> seems essentially the same design as the "Source Quench" behavior defined in
> the early 1980s.  At the time, I was responsible for a TCP implementation,
> and had questions about what my TCP should do when it received such a "slow
> down" message.  It was especially unclear in certain situations - e.g., if my
> TCP sent a datagram to open a new connection and got a "slow down" response,
> what exactly should it do?

fq_codel and cake do not invent any new mechanisms to control the flow, they
just leverage the existing TCP backoff behavior (including the half-measure of
ECN tagging to signal the sender to slow down without requiring a retransmit).

> There were no good answers back then.  One TCP implementor decided that the
> best reaction on receiving a "slow down" message was to immediately
> retransmit the datagram that had just been confirmed to be discarded.  "Slow
> down" actually meant "Speed up, I threw away your last datagram."

But you first have to find out that the packet didn't arrive by not getting
the ack before the timeout, and in the meantime you don't send more than your
transmit window. When you retransmit the missing packet, your window stays
full until you get an ack for that packet (and you are supposed to shrink your
window size when a packet is lost or you get an ECN signal), so at a micro
level you are generating more traffic, but at a macro level you are slowing
down. Yes, misbehaving stacks can send too much, but that will just mean that
more packets on the offending connections get dropped.

In terms of packet generators, you can never get a perfect defense against
pure bandwidth flooding, but if you use cake-like mechanisms to ensure
fairness between IPs/customers/etc., you limit the damage.

> So, I'm still curious about the Internet behavior with the current mechanisms
> when the system hits its maximum capacity - the two simple scenarios I
> mentioned with bottlenecks and only two data flows involved that converged at
> the bottleneck.  What's supposed to happen in theory?  Are implementations
> actually doing what they're supposed to do?  What does happen in a real-world
> test?

As noted above, the vast majority of the time, the link that hits maximum
capacity is the last-mile hop to the user rather than some ISP <-> ISP hop out
in the middle of the Internet. fq_codel is pretty cheap to implement (cake is
a bit more expensive, so more suitable for the endpoints than for core
systems).

When trying to define what 'the right thing to do' should be, it's extremely
tempting for academic studies to fall into the trap of deciding what should
happen based on global knowledge of the entire network. fq_codel and cake work
by just looking at the data being fed to the congested link (well, cake at a
last-mile hop can take advantage of some categorization rules/lookups that
would not be available to core Internet routers).

But I think the short answer to your scenario is 'if it would exceed your
queue limits, drop a packet from the connection sending the most data'. A
little bit of buffering is a good thing; the key is to keep the buffers from
building up and affecting other connections.

David Lang