[Cake] Using cake to shape 1000’s of users.

Fri Jul 27 14:58:40 EDT 2018

> On 27 Jul, 2018, at 5:04 pm, Dan Siemon <dan at coverfire.com> wrote:
> 
> Obviously I can't speak for other potential users but we follow the
> upstream kernel very aggressively and have no interest in porting
> something like this to older kernels.

That's a useful data point.  Honestly it shouldn't be too difficult to install the latest kernels on a dedicated box in general.

> We have some deployments with multiple access technologies (eg DOCSIS,
> DSL and wireless) behind the same box so per customer overhead would be
> useful.

The design I presently have in mind would allow setting the overhead per speed tier.  Thus you could have "10Mbit DOCSIS" and "10Mbit ADSL" as separate tiers.  The point is that there are likely to be far fewer unique overhead settings than customers.
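
To make the tier idea concrete, here is a rough sketch in Python (not kernel code) of a tier table in which each tier carries its own rate and per-packet overhead.  The `Tier` class, the `tx_time_ns` helper and the overhead byte counts are all hypothetical placeholders, and ATM cell framing on ADSL is ignored for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    rate_bps: int        # shaped rate in bits per second
    overhead_bytes: int  # per-packet framing overhead (placeholder values)

# Hypothetical tier table: many subscribers would point at each entry.
TIERS = {
    "10Mbit DOCSIS": Tier(rate_bps=10_000_000, overhead_bytes=18),
    "10Mbit ADSL": Tier(rate_bps=10_000_000, overhead_bytes=44),
}

def tx_time_ns(tier: Tier, packet_len: int) -> int:
    """Time the shaper would charge for one packet, overhead included."""
    wire_bits = (packet_len + tier.overhead_bytes) * 8
    return wire_bits * 1_000_000_000 // tier.rate_bps

for name, tier in TIERS.items():
    print(name, tx_time_ns(tier, 1500), "ns per 1500-byte packet")
```

Two tiers with the same nominal rate then charge slightly different amounts of link time per packet, which is all that per-tier overhead needs to achieve.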

> I am interested in the diffserv class ideas in Cake but need to do some
> experimentation to see if that helps in this environment. It would be
> interesting to be able to target sub-classes (not sure what the proper
> Cake terminology is) based on an eBPF classifier.

The trick with Diffserv is classifying the traffic correctly, given that most traffic in the wild is not properly marked, and some actively tries to hide its nature.  As an ISP, applying your own rules could be seen as questionable according to some interpretations of Net Neutrality.  With Cake that's not an issue by default, since it relies entirely on the DSCP on the packet itself, but it does mean that a lot of traffic which *should* probably be classified other than best-effort is not.

Cake does cope well with unclassified latency-sensitive traffic, due to DRR++, so it's not actually necessary to specifically identify such applications.  What it has trouble with, in common with all other flow-isolating qdiscs, is applications which open many parallel connections for bulk transfer (e.g. BitTorrent, which is used by many software-update mechanisms as well as file-sharing).  Classifying this traffic as Bulk (CS1 DSCP) gets it out of the way of other applications while still permitting the bulk connections to use available capacity.
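
As an illustration of relying only on the DSCP carried by the packet, the sketch below (Python, for readability) pulls the DSCP out of the TOS / Traffic Class byte and sends CS1 to a bulk class.  It is deliberately reduced to two classes; the function names and class labels are made up for the example and this is not Cake's actual mapping table.

```python
CS1 = 8  # DSCP codepoint for Class Selector 1

def dscp_of(tos_byte: int) -> int:
    """DSCP is the upper six bits of the IPv4 TOS / IPv6 Traffic Class byte."""
    return (tos_byte >> 2) & 0x3F

def tin_for(tos_byte: int) -> str:
    """Simplified two-way split: CS1 goes to bulk, everything else is best-effort."""
    return "bulk" if dscp_of(tos_byte) == CS1 else "best-effort"

# 0x20 carries DSCP 8 (CS1); the low two bits are ECN and do not affect the result.
print(tin_for(0x20), tin_for(0x21), tin_for(0x00))
```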

I have some small hope that wider deployment of sensible, basic Diffserv at bottlenecks will encourage applications to start using it as intended, thereby solving a decades-long chicken-and-egg problem.  Presently Cake has such an implementation by default, but is not yet widely deployed enough to have any visible impact.

This is something we should probably discuss in more detail later.

> Re accounting, we currently count per-IP via eBPF not via qdisc counts.

Fine, that's a feature I'm happy to leave out.

> Each subscriber can have several IPs in which case the traffic for
> those IPs needs to go to a single class so that their entire traffic
> envelope is shaped to the desired plan rate. At present we do this via
> eBPF since it is essentially a map lookup operation.
> 
> A few of the above points touch on something that may be somewhat
> unique to service provider deployments vs homes: there are many situations
> where the classification logic has to be a map lookup and cannot be
> done just by looking at a given packet.

Yes, eBPF does seem to be a good fit for that.
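
For anyone following along, the lookup being described amounts to something like the sketch below, with a Python dict standing in for an eBPF hash map.  The addresses, subscriber IDs and the `subscriber_for` helper are invented for illustration; a real classifier would do this in the kernel.

```python
import ipaddress

# Stand-in for an eBPF hash map keyed by IP address (hypothetical data).
# Several addresses, v4 or v6, can point at the same subscriber.
IP_TO_SUBSCRIBER = {
    ipaddress.ip_address("192.0.2.10"): 1001,
    ipaddress.ip_address("192.0.2.11"): 1001,
    ipaddress.ip_address("2001:db8::10"): 1001,
    ipaddress.ip_address("192.0.2.20"): 1002,
}

def subscriber_for(src: str, dst: str, downstream: bool):
    """Key on dst for traffic towards the customer, src for traffic from them."""
    key = ipaddress.ip_address(dst if downstream else src)
    return IP_TO_SUBSCRIBER.get(key)  # None means no subscriber is known

print(subscriber_for("198.51.100.1", "192.0.2.11", downstream=True))
```

Because all of a subscriber's addresses resolve to one ID, everything after this point can shape the subscriber as a single envelope.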

So in summary, the logical flow of a packet should be:

1: Map dst or src IP to subscriber (eBPF).
2: Map subscriber to speed/overhead tier (eBPF).
3: (optional) Classify Diffserv (???).
4: Enqueue per flow, handle global queue overflow (rare!) by dropping from head of longest queue (like Cake).
--- enqueue/dequeue split ---
5: Wait for shaper on earliest-scheduled subscriber link with waiting traffic (borrow sch_fq's rbtree?).
6: Wait for shaper on aggregate backhaul link (congestion can happen here too).
7: Choose one of subscriber's queues, apply AQM and deliver a packet (mostly like Cake does).

If that seems reasonable, we can treat it as a baseline spec for others' input.
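
A toy model of steps 4 and 5 may help as a starting point for discussion.  It is a Python sketch under several simplifying assumptions: queue length is counted in packets rather than bytes, a binary heap stands in for the rbtree, the backhaul shaper (step 6) is left out, and step 7 is reduced to taking the first non-empty flow with no AQM.  The class and method names are hypothetical.

```python
import heapq
from collections import defaultdict, deque

class ShaperSketch:
    def __init__(self, limit_pkts, rates_bps):
        self.limit = limit_pkts            # global queue limit, in packets
        self.rates = rates_bps             # subscriber -> rate in bits/s
        self.flows = defaultdict(deque)    # (subscriber, flow) -> packet lengths
        self.backlog = 0
        self.next_tx = defaultdict(int)    # subscriber -> earliest send time (ns)
        self.heap = []                     # (send time, subscriber)
        self.scheduled = set()

    def enqueue(self, sub, flow, length):
        self.flows[(sub, flow)].append(length)
        self.backlog += 1
        if sub not in self.scheduled:
            heapq.heappush(self.heap, (self.next_tx[sub], sub))
            self.scheduled.add(sub)
        if self.backlog > self.limit:
            # Step 4: on global overflow, drop from the head of the longest queue.
            longest = max(self.flows, key=lambda k: len(self.flows[k]))
            self.flows[longest].popleft()
            self.backlog -= 1

    def dequeue(self, now_ns):
        # Step 5: serve the earliest-scheduled subscriber whose time has come.
        while self.heap and self.heap[0][0] <= now_ns:
            _, sub = heapq.heappop(self.heap)
            self.scheduled.discard(sub)
            queues = [q for (s, _), q in self.flows.items() if s == sub and q]
            if not queues:
                continue                   # its packets were all dropped
            length = queues[0].popleft()   # placeholder for step 7
            self.backlog -= 1
            cost_ns = length * 8 * 10**9 // self.rates[sub]
            self.next_tx[sub] = max(now_ns, self.next_tx[sub]) + cost_ns
            if any(q for (s, _), q in self.flows.items() if s == sub):
                heapq.heappush(self.heap, (self.next_tx[sub], sub))
                self.scheduled.add(sub)
            return sub, length
        return None                        # nothing is eligible to send yet
```

With one subscriber at 8 Mbit/s, a 1000-byte packet costs 1 ms of link time, so a second call to `dequeue()` before that time has elapsed returns `None`.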

 - Jonathan Morton