> Thanks for the explanation.  I had heard most of those technology buzzwords
> but your message puts them into context together.  Prioritization and queue
> management certainly helps, but I still don't understand how the system
> behaves when it hits capacity somewhere deep inside -- the "thought
> experiments" I described with a bottleneck somewhere deep inside the
> Internet as two flows collide.

The active queue management I describe needs to take place on the sending side
of any congested link. (there is a limited amount of tweaking that can be done
on the receiving side by not acking, to force a sender to slow down, but it
works FAR better to manage on the sending side)

It's also not that common for the links deep in the network to be the
bottleneck. It can happen, but most ISPs watch link capacity and keep
significant headroom (by adding parallel links in many cases). In practice,
it's almost always the 'last mile' link that is the bottleneck.

> The scheme you describe also seems vulnerable to users' innovative tactics to
> get better service.  E.g., an "ISO download" using some scheme like Torrents
> would spread the traffic around a bunch of connections which may not all go
> through the same bottleneck and not be judged as low priority.  Also it seems
> prone to things like DDOS attacks, e.g., flooding a path with DNS queries
> from many bot sources that are judged as high priority.

First off, I'm sure that the core folks here who write the code will take
exception to my simplifications. I welcome corrections.

Cake and fq_codel are not the result of deep academic research (although they
have spawned quite a bit of it); they are the result of insights and tweaks
from looking at real-world behavior, with the approach being 'keep it as
simple as possible, but no simpler'. So some of this is heuristics, but the
heuristics have been shown to work, and to be hard to game, over many years.

It is hard to game things because connections are evaluated based on their
behavior, not based on port or by inspecting them to determine their protocol.
DNS queries are not given high priority because they are DNS; new connections
are given high priority until they start carrying a lot of data. Since DNS
tends to be a short query with a short response, those flows never transfer
enough data to be impacted. Torrent connections are each passing a significant
amount of data, so they are slowed. Since the torrent connections involve
different endpoints, they will take different paths, and only the ones that
actually pass through a given congested link compete with each other there.

Cake also adds a layer that fq_codel doesn't have, one that can evaluate
traffic at a host/network/customer level to provide fairness at those levels
rather than just at the flow level.

There are multiple queues, and sending rotates between them. Connections are
assigned to a queue based on various logic (connection data and the other
things cake can take into account), so you really only have contention within
a queue. Queues are kept small, and no sender is allowed to use too much of
the queue, so the latency for new data is kept small.

In addition, it has been shown that when you do something, there are a lot of
small flows that happen serially, so any per-connection latency gets
multiplied in terms of the end-user experience (think about loading a web
page: it references many other pages, each URL needs a DNS lookup, then a
check to see if the cached data for that site is still valid, and other
things, before the page can start to be rendered, and the image data can
actually arrive quite a bit later without bothering the user).
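To make the queue-rotation and 'drop from the biggest sender' behavior above
concrete, here is a minimal Python sketch of the flow-queueing idea. It is an
illustration of the concept only, not the actual fq_codel or cake code: the
class name and packet fields are invented for this example, and the real
qdiscs add the CoDel delay-based marking/dropping inside each flow queue (and,
for cake, the host/customer fairness tiers) that are left out here.

# Toy sketch of flow queueing: packets are hashed into per-flow queues,
# the scheduler rotates between queues when sending, and when the total
# buffer limit is exceeded a packet is dropped from the flow currently
# holding the most data.  Illustration only; not the real fq_codel/cake.

from collections import deque

class FlowQueueSketch:
    def __init__(self, num_queues=1024, total_limit=10240):
        self.queues = [deque() for _ in range(num_queues)]
        self.total_limit = total_limit      # max packets buffered overall
        self.buffered = 0
        self.rr_index = 0                   # round-robin pointer

    def _flow_queue(self, packet):
        # Assign a connection to a queue by hashing its 5-tuple, so packets
        # of one flow always land in the same queue and only contend there.
        key = (packet['src'], packet['sport'], packet['dst'],
               packet['dport'], packet['proto'])
        return self.queues[hash(key) % len(self.queues)]

    def enqueue(self, packet):
        self._flow_queue(packet).append(packet)
        self.buffered += 1
        if self.buffered > self.total_limit:
            # Over the limit: drop from the flow holding the most data,
            # so short/new flows (DNS lookups, connection setup) are spared.
            fattest = max(self.queues, key=len)
            fattest.popleft()
            self.buffered -= 1

    def dequeue(self):
        # Rotate between the queues so every flow gets a turn and no single
        # bulk transfer can monopolize the link.
        for _ in range(len(self.queues)):
            q = self.queues[self.rr_index]
            self.rr_index = (self.rr_index + 1) % len(self.queues)
            if q:
                self.buffered -= 1
                return q.popleft()
        return None  # nothing buffered

Because drops come out of the flow with the most buffered data, a bulk
transfer like a torrent connection is the one that gets the slow-down signal,
while a short DNS exchange never accumulates enough in its queue to be touched.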
> The behavior "packets get marked/dropped to signal the sender to slow down"
> seems essentially the same design as the "Source Quench" behavior defined in
> the early 1980s.  At the time, I was responsible for a TCP implementation,
> and had questions about what my TCP should do when it received such a "slow
> down" message.  It was especially unclear in certain situations - e.g., if my
> TCP sent a datagram to open a new connection and got a "slow down" response,
> what exactly should it do?

fq_codel and cake do not invent any new mechanisms to control the flow, they
just leverage the existing TCP backoff behavior (including the half-measure of
ECN tagging to signal the sender to slow down without requiring a retransmit).

> There were no good answers back then.  One TCP implementor decided that the
> best reaction on receiving a "slow down" message was to immediately
> retransmit the datagram that had just been confirmed to be discarded.  "Slow
> down" actually meant "Speed up, I threw away your last datagram."

But you first have to find out that the packet didn't arrive by not getting
the ack before the timeout, and in the meantime you don't send more than your
transmit window. When you retransmit the missing packet, your window stays
full until you get an ack for that packet (and you are supposed to shrink your
window size when a packet is lost or you get an ECN signal), so at a micro
level you are generating more traffic, but at a macro level you are slowing
down. Yes, misbehaving stacks can send too much, but that will just mean that
more packets on the offending connections get dropped.

In terms of packet generators, you can never get a perfect defense against
pure bandwidth flooding, but if you use cake-like mechanisms to ensure
fairness between IPs/customers/etc., you limit the damage.

> So, I'm still curious about the Internet behavior with the current mechanisms
> when the system hits its maximum capacity - the two simple scenarios I
> mentioned with bottlenecks and only two data flows involved that converged at
> the bottleneck.  What's supposed to happen in theory?  Are implementations
> actually doing what they're supposed to do?  What does happen in a real-world
> test?

As noted above, the vast majority of the time, the link that hits maximum
capacity is the last-mile hop to the user rather than some ISP <-> ISP hop out
in the middle of the Internet. fq_codel is pretty cheap to implement (cake is
a bit more expensive, so more suitable for the endpoints than for core
systems).

When trying to define what 'the right thing to do' should be, it's extremely
tempting for academic studies to fall into the trap of deciding what should
happen based on global knowledge of the entire network. fq_codel and cake work
by just looking at the data being fed to the congested link (well, cake at a
last-mile hop can take advantage of some categorization rules/lookups that
would not be available to core Internet routers).

But I think the short answer to your scenario is 'if it would exceed your
queue limits, drop a packet from the connection sending the most data'. A
little bit of buffering is a good thing; the key is to keep the buffers from
building up and affecting other connections.

David Lang