[Cake] Advantages to tightly tuning latency

Maxime Bizon mbizon at freebox.fr
Thu Apr 23 08:33:29 EDT 2020


On Thursday 23 Apr 2020 at 13:57:25 (+0200), Toke Høiland-Jørgensen wrote:

Hello Toke,

> That is awesome! Please make sure you include the AQL patch for ath10k,
> it really works wonders, as Dave demonstrated:
> 
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html

Was it in 5.4? We try to stick to LTS kernels.

> We're working on that in kernel land - ever heard of XDP? On big-iron
> servers we have no issues pushing 10s and 100s of Gbps in software
> (well, the latter only given enough cores to throw at the problem :)).
> There's not a lot of embedded platforms support as of yet, but we do
> have some people in the ARM world working on that.
> 
> Personally, I do see embedded platforms as an important (future) use
> case for XDP, though, in particular for CPEs. So I would be very
> interested in hearing details about your particular platform, and your
> DPDK solution, so we can think about what it will take to achieve the
> same with XDP. If you're interested in this, please feel free to reach
> out :)

Last time I looked at XDP, its primary use cases were "early drop" and
"anti-DDoS".

In our case, each packet has to be routed and NATed, we have VLAN
tags, and we also have MAP-E for IPv4 traffic. So the vanilla
forwarding path does multiple rounds of RX/TX because of the
tunneling.

TBH, the hard work in our optimized forwarding code is figuring out
what modifications to apply to each packet. Whether the modifications
and TX are then done by XDP or by hand-written C code in the kernel is
more of a detail, even though using XDP is much cleaner of course.
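
To make that concrete, here is a minimal sketch (illustrative only,
not our forwarding code) of what "modifications + TX" looks like in
XDP: decrement the IPv4 TTL, patch the checksum incrementally, swap
the MACs and bounce the frame back out the same interface. A real
fast path would also parse VLAN tags and take the rewrite actions
from a lookup.

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int xdp_rewrite_tx(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;

	struct iphdr *iph = (void *)(eth + 1);
	if ((void *)(iph + 1) > data_end)
		return XDP_PASS;
	if (iph->ttl <= 1)
		return XDP_PASS;  /* let the stack send ICMP time exceeded */

	iph->ttl--;
	/* incremental checksum fixup for the TTL change, same trick as
	 * the kernel's ip_decrease_ttl() */
	__u32 check = iph->check;
	check += bpf_htons(0x0100);
	iph->check = (__u16)(check + (check >= 0xFFFF));

	/* swap MACs and transmit on the receiving interface */
	__u8 tmp[ETH_ALEN];
	__builtin_memcpy(tmp, eth->h_source, ETH_ALEN);
	__builtin_memcpy(eth->h_source, eth->h_dest, ETH_ALEN);
	__builtin_memcpy(eth->h_dest, tmp, ETH_ALEN);

	return XDP_TX;
}

char _license[] SEC("license") = "GPL";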

What the kernel always lacked is what DaveM once called the "grand
unified flow cache": the ability to do a single lookup and decide
what to do with the packet. Instead we have the bridge forwarding
table, the IP routing table (which used to be a cache), the netfilter
conntrack lookup, and multiple rounds of those if you do tunneling.

Once you have this "flow table" infrastructure, it becomes easy to
offload forwarding, either to real hardware or to software (for
example, a dedicated CPU core in polling mode).
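
Translated into XDP/BPF terms, that "single lookup" could be one hash
map keyed on the 5-tuple, whose value already carries everything the
fast path needs (NAT rewrites plus output port), populated by the
normal stack on a flow's first packet. Struct layout and names below
are made up for illustration:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct flow_key {
	__u32 saddr, daddr;
	__u16 sport, dport;
	__u8  proto;
	__u8  pad[3];
};

struct flow_action {
	__u32 new_saddr, new_daddr;  /* NAT rewrite */
	__u16 new_sport, new_dport;
	__u32 out_ifindex;           /* where to transmit */
};

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 65536);
	__type(key, struct flow_key);
	__type(value, struct flow_action);
} flow_table SEC(".maps");

SEC("xdp")
int xdp_flow_fastpath(struct xdp_md *ctx)
{
	struct flow_key key = {};

	/* header parsing into 'key' omitted for brevity; see the
	 * previous sketch for the bounds-checking pattern */

	struct flow_action *act = bpf_map_lookup_elem(&flow_table, &key);
	if (!act)
		return XDP_PASS;  /* miss: first packet, normal stack */

	/* apply act's rewrites and checksum fixups here, then forward */
	return bpf_redirect(act->out_ifindex, 0);
}

char _license[] SEC("license") = "GPL";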

The good news is that it seems nftables is building this:

https://wiki.nftables.org/wiki-nftables/index.php/Flowtable

I'm still using iptables, but it seems the features I was missing,
like TCPMSS, are now in nft too, so I will have a look.
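
For reference, an untested sketch going by the wiki page above (the
table name and eth0/eth1 are placeholders): established flows get
short-circuited through the flowtable, and the SYN rule is the nft
counterpart of iptables' TCPMSS clamping.

table inet fastpath {
	flowtable ft {
		hook ingress priority 0
		devices = { eth0, eth1 }
	}
	chain forward {
		type filter hook forward priority 0; policy accept;
		# established flows bypass the classic forwarding path
		ip protocol { tcp, udp } flow add @ft
		# nft counterpart of iptables' TCPMSS --clamp-mss-to-pmtu
		tcp flags syn tcp option maxseg size set rt mtu
	}
}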


> Setting aside the fact that those single-stream tests ought to die a
> horrible death, I do wonder if it would be feasible to do a bit of
> 'optimising for the test'? With XDP we do have the ability to steer
> packets between CPUs based on arbitrary criteria, and while it is not as
> efficient as hardware-based RSS it may be enough to achieve line rate
> for a single TCP flow?

You cannot do that kind of steering for a single TCP flow at those
rates: spreading one flow across CPUs reorders its packets, and
out-of-order delivery kills TCP performance.

I do not consider those single-stream tests unrealistic; this is
exactly what happens if, say, you buy a game on Steam and download it.

-- 
Maxime

