[Cake] cake exploration
dave.taht at gmail.com
Sat Apr 11 14:44:39 EDT 2015
Stuff on my backlog of researchy stuff.
1) cake_drop_monitor - I wanted a way to throw drop AND mark
notifications up to userspace, including the packet's time of entry
and the time of drop, as well as the IP headers and next hop
destination macaddr.
There are many use cases for this:
A) - testing the functionality of the algorithm and being able to
collect and analyze drops as they happen.
NET_DROP_MONITOR did not cut it but I have not looked at it in a year.
It drives me crazy to be dropping packets all over the system and to
not be able to track down where they happened.
This is the primary reason why I had switched back to 64 bit timestamps, btw.
B) Having the drop notifications might be useful in tuning or steering
traffic to different routes.
C) It is way easier to do a graph of the drop pattern with this info
thrown to userspace.
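A rough sketch of what the userspace side of item 1 could look like. The record layout, the DropEvent name and the summarize() helper are all made up for illustration; nothing like this exists in cake or NET_DROP_MONITOR as far as this post says. It only shows why carrying both timestamps plus the next hop macaddr makes the analysis easy.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DropEvent:
    """Hypothetical record for one drop or mark notification."""
    enqueue_ns: int    # 64 bit timestamp taken when the packet entered the queue
    event_ns: int      # 64 bit timestamp taken when it was dropped or marked
    marked: bool       # True for an ECN mark, False for a drop
    src: str           # source address from the IP header
    dst: str           # destination address from the IP header
    nexthop_mac: str   # next hop destination macaddr

    @property
    def sojourn_ms(self) -> float:
        """Time the packet spent queued before the drop or mark, in ms."""
        return (self.event_ns - self.enqueue_ns) / 1e6


def summarize(events):
    """Count drops and marks per next hop and find the worst sojourn time."""
    per_hop = Counter()
    worst_ms = 0.0
    for ev in events:
        kind = "mark" if ev.marked else "drop"
        per_hop[(ev.nexthop_mac, kind)] += 1
        worst_ms = max(worst_ms, ev.sojourn_ms)
    return per_hop, worst_ms
```

Feeding the (event_ns, sojourn_ms) pairs to any plotting tool gives the drop pattern graph from use case C.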
2) Dearly wanted to actually be doing the timestamping AND hashing in
the native skb struct on entry to the system itself, not the qdisc.
Measuring the latency from ingress from the wire to egress would
result in much better cpu overload behavior. I am totally aware of how
much mainline linux would not take this option, but things have
evolved over there, so leveraging the rxhash and skb->timestamp fields
seems a possibility...
I think this would let us get along better with netem also, but would
have to go look again.
Call that cake-rxhash. :)
3) In my benchmark of the latest cake3, ecn traffic was not as good as
expected, but that might have been an anomaly of the test. Need to
test ecn thoroughly this time, almost in preference to looking at drop
behavior. Toke probably has ecn off by default right now. On, after
this test run?
4) Testing higher rates and looking at cwnd for codel is important.
The dropoff toke noted in his paper is real. Also there is possibly
some ideal ratio between number of flows and bandwidth that makes more
sense than a fixed number of flows. Also I keep harping on the darn
resumption algo... but need to test with lousier TCPs like Windows.
5) Byte Mode-ish handling
Dropping a single 64 byte packet does little good. You will find in
the 50 flow tests that a ton of traffic is acks, not being dropped,
and pie does better in this case than does fq, as it shoots wildly at
everything, but usually misses the fat packets, where DRR will merrily
store up an entire MTU worth of useless acks when only one is needed.
So just trying to drop more little packets might be helpful in some cases.
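To put a number on the "MTU worth of acks" point in item 5, here is a toy DRR round. The 1514 byte quantum and 64 byte ack size are assumptions picked for the example, and the deficit is not carried between rounds, so this is a simplification and not how fq_codel or cake actually account for bytes.

```python
from collections import deque


def drr_round(queue, quantum=1514):
    """Serve one flow for one DRR round and return the packet sizes sent.

    Simplification: the deficit starts at `quantum` each round and any
    leftover is not carried over.
    """
    deficit = quantum
    sent = []
    while queue and queue[0] <= deficit:
        size = queue.popleft()
        deficit -= size
        sent.append(size)
    return sent


acks = deque([64] * 50)     # a flow that is all small acks
data = deque([1514] * 50)   # a flow that is all full size packets

print(len(drr_round(acks)))  # 23
print(len(drr_round(data)))  # 1
```

So one quantum is worth 23 of these acks, and dropping a single one of them frees about 4% of what dropping one full size packet would.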
6) Ack thinning. I gave what is conventionally called "stretch acks" a
new name, as stretch acks have a deserved reputation for sucking. Well,
they don't suck anymore in linux, and what I was mostly thinking was to
drop no more than 2 in a row...
One thing this would help with is in packing wifi aggregates - which
have hard limits on the number of packets in a TXOP (42), and a byte
limit on wireless n of 64k. Sending only the last 2 acks from one
flow, instead of all 41, seems like a big win on packing a TXOP.
(this is something eric proposed, and given the drop rates we now see
from wifi and the wild and wooly internet I am inclined to agree that
it is worth fiddling with)
(I am not huge on it, though)
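A minimal sketch of the "drop no more than 2 in a row" rule from item 6. The thin_acks() name and the list-of-ack-numbers input are made up for the example. It assumes every entry is a pure cumulative ack from one flow with nothing else riding on it (no data, no SACK blocks), which a real implementation would have to check. Note this is the conservative rule, not the "send only the last 2" case.

```python
def thin_acks(acks, max_run=2):
    """Return the acks to keep, dropping at most `max_run` in a row.

    `acks` is a list of cumulative ack numbers from one flow, oldest
    first. The newest ack is always kept, since it carries the latest
    state of the receiver.
    """
    kept = []
    dropped_in_a_row = 0
    last = len(acks) - 1
    for i, ack in enumerate(acks):
        if i != last and dropped_in_a_row < max_run:
            dropped_in_a_row += 1   # thin this one out
            continue
        kept.append(ack)            # keep it and reset the run counter
        dropped_in_a_row = 0
    return kept


queued = list(range(1, 42))         # 41 acks queued from one flow
print(len(thin_acks(queued)))       # 14
```

Even this cautious rule turns 41 slots in the aggregate into 14, leaving room for packets from other flows.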
7) Macaddr hashing on the nexthop instead of the 5tuple. When used on
an internal, switched network, it would be better to try and maximize
the port usage rather than the 5 tuple in some cases.
I have never got around to writing a mac hash I liked, my goal
originally was to write one that found a minimal perfect hash solution
eventually as mac addrs tend to be pretty stable on a network and
Warning: minimal perfect hash attempts are a wet paint thing! I really
want an FPGA solver for them.... don't go play with the code out there,
you will lose days to it... you have been warned.
I would like there to be a generic mac hashing thing in tc, actually.
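For item 7, a sketch of the boring version: hash the next hop macaddr straight to a queue index. This uses plain 32 bit FNV-1a, which is a swapped-in stand-in and NOT the minimal perfect hash described above, so collisions are possible. The function names and the 1024 queue default are assumptions for the example.

```python
FNV_OFFSET = 2166136261   # 32 bit FNV-1a offset basis
FNV_PRIME = 16777619      # 32 bit FNV prime


def mac_to_bytes(mac: str) -> bytes:
    """Turn 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def mac_hash(mac: str, n_queues: int = 1024) -> int:
    """Map a next hop macaddr to a queue index with 32 bit FNV-1a."""
    h = FNV_OFFSET
    for b in mac_to_bytes(mac):
        h ^= b
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h % n_queues
```

Every packet headed to the same next hop lands in the same queue regardless of its 5 tuple, which is the property wanted on a switched network.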
8) Parallel FIB lookup
If you assume that you have tons of queues routing packets from
ingress to egress, on tons of cpus, you can actually do the FIB lookup
in parallel also. There is some old stuff on virtualqueue
and virtual clock fqing which makes for tighter
9) Need a codel *library* that works at the mac80211 layer. I think
codel*.h suffices but am not sure. And for that matter, codel itself
seems like it would need a calculated target and a few other things to
work right on wifi.
As for the hashing...
Personally I do not think that the 8 way set associative hash is what
wifi needs for cake, I tend to think we need to "pack" aggregates with
as many different flows as possible, and randomize how we pack
them... I think.... maybe....
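One way to read the "pack aggregates with as many different flows as possible" idea, as a toy. The pack_aggregate() function and the dict-of-deques input are invented for illustration. The 42 packet limit is the figure quoted earlier in this mail, and 65535 is an assumed exact value for the "64k" byte limit. It takes one packet per flow per pass, in a freshly shuffled order each pass.

```python
import random
from collections import deque

MAX_PKTS = 42       # packets per TXOP, as quoted earlier in this mail
MAX_BYTES = 65535   # assumed exact value of the "64k" byte limit


def pack_aggregate(flows, rng=random):
    """Fill one aggregate, spreading slots across as many flows as possible.

    `flows` maps a flow id to a deque of packet sizes. Returns a list of
    (flow_id, size) tuples and removes those packets from the deques.
    """
    agg, total = [], 0
    active = [f for f, q in flows.items() if q]
    while active and len(agg) < MAX_PKTS:
        rng.shuffle(active)             # randomize the order on each pass
        progressed = False
        for f in list(active):
            if len(agg) >= MAX_PKTS:
                break
            q = flows[f]
            if total + q[0] > MAX_BYTES:
                continue                # head packet does not fit, skip flow
            size = q.popleft()
            agg.append((f, size))
            total += size
            progressed = True
            if not q:
                active.remove(f)
        if not progressed:
            break                       # nothing fits any more
    return agg
```

With two bulk flows and one short flow queued, the short flow gets all of its packets into the aggregate instead of waiting behind a full burst from one of the bulk flows.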
10) I really don't like BQL with multi-queued hardware queues. More
backpressure is needed in that case than we get.
11) GRO peeling
Let's make wifi fast, less jittery and reliable again!