Ingress mode works by counting dropped packets, not only delivered packets, against the shaped limit. When there's a large number of non-ECN flows and a low BDP per flow, a lot of packets are dropped to try to keep the intra-flow latency in line. So the goodput tends to decrease as the flow count increases, but this is necessary to control latency.

The modified failsafe ensures that at most a third of the total bandwidth will "go missing" this way. Previously, as much as three-quarters could. At that threshold, Cake stops counting dropped packets, trading some latency control for reasonable goodput. There is no more sophisticated heuristic that I can think of to achieve ingress mode's goals.
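As a rough illustration of what that cap means for goodput, here is a minimal sketch. It assumes a permanently backlogged shaper and models only the arithmetic described above, not Cake's actual code; the function name and parameters are hypothetical.

```python
from fractions import Fraction


def ingress_goodput_ceiling(shaped_rate, drop_fraction, max_missing=Fraction(1, 3)):
    """Upper bound on goodput for a permanently backlogged ingress shaper.

    shaped_rate   -- configured limit, in any rate unit (e.g. bit/s)
    drop_fraction -- share of arriving bytes dropped by the AQM (0..1)
    max_missing   -- failsafe cap on the share of shaped_rate lost to drops
    """
    drop_fraction = Fraction(drop_fraction)
    if not 0 <= drop_fraction <= 1:
        raise ValueError("drop_fraction must be between 0 and 1")
    # Below the cap, every dropped byte is charged against the limit.
    # Above the cap, further drops are no longer counted (assumed model).
    missing = min(drop_fraction, max_missing)
    return shaped_rate * (1 - missing)
```

With a 90 Mbit/s limit and half of the arriving bytes dropped, this gives a ceiling of 60 Mbit/s under the one-third cap, against 45 Mbit/s if `max_missing` is set to the old three-quarters.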

However, it might be worth revisiting an old question about fq_codel's use of a fixed set of Codel parameters regardless of the active flow count. The argument at the time was that the delay target doesn't depend on the flow count.

But when the flow count is high, a fixed delay target plus the baseline latency might end up requiring a lower BDP than the sender is able to select as a congestion window (typical TCPs have a hard lower limit of 4x MSS). In that case, packets are currently being dropped without any effect on the send rate. This wouldn't matter with ECN, of course.
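The arithmetic behind this can be sketched as follows. This is only an illustration: it assumes the link is shared equally between flows, an MSS of 1448 bytes, and the 4x MSS floor mentioned above; all function names are hypothetical.

```python
def per_flow_bdp_bytes(rate_bps, flows, base_rtt_ms, target_ms):
    """Bytes each flow may keep in flight if the link is shared equally."""
    return rate_bps * (base_rtt_ms + target_ms) / (8 * 1000 * flows)


def min_window_bytes(mss=1448, min_cwnd_segments=4):
    """Smallest congestion window the sender can select (assumed floor)."""
    return mss * min_cwnd_segments


def drops_can_slow_sender(rate_bps, flows, base_rtt_ms, target_ms,
                          mss=1448, min_cwnd_segments=4):
    """True if the per-flow BDP is still at or above the sender's window floor."""
    bdp = per_flow_bdp_bytes(rate_bps, flows, base_rtt_ms, target_ms)
    return bdp >= min_window_bytes(mss, min_cwnd_segments)
```

For example, 50 flows on a 10 Mbit/s link with a 20 ms baseline RTT and a 5 ms target leave 625 bytes per flow, well under the 5792-byte floor, so further drops cannot reduce the send rate.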

So a better fix might be to adjust the target latency according to the number of active bulk flows. Fortunately for performance, this should be a multiply, not a division.
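One possible shape for such a check is sketched below. It is an assumption for illustration, not an existing implementation: the names are hypothetical, and the factor of 4 segments per flow simply reuses the 4x MSS figure above.

```python
def over_target(sojourn_us, target_us, mtu_time_us, bulk_flows,
                segments_per_flow=4):
    """Decide whether a packet's sojourn time should count as over target.

    mtu_time_us -- time to serialise one full-size packet at the shaped rate
    bulk_flows  -- number of currently active bulk flows
    """
    # Time for every bulk flow to send its minimum window once:
    # a plain multiply, with no division on the fast path.
    floor_us = mtu_time_us * bulk_flows * segments_per_flow
    return sojourn_us > target_us and sojourn_us > floor_us
```

At 10 Mbit/s a 1500-byte packet takes 1200 us to serialise, so with 50 bulk flows the floor is 240 ms and a 6 ms sojourn time would no longer trigger drops, whereas with a single flow it still would.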

- Jonathan Morton