[Cake] de-natting & host fairness
kevin at darbyshire-bryant.me.uk
Mon Sep 26 09:02:59 EDT 2016
Hi Sebastian et al,
I'm feeling a bit unwell at the moment with an eye infection and I'm
working nights on some tennis coverage for TV so the brain cell is
It is indeed the missing puzzle piece and represents something of a holy
grail for my use case. A *lot* of credit has to go to 'tegularius' who
took the idea and ran with it after I'd given up. My only consolation
is that the methods are broadly similar; the current implementation is
so much neater and obviously written in a more kernel/conntrack
knowledgeable way (based on net/sched/cls_flow.c).
This really needs to be tested. As I mentioned the 'ingress' side of
things is harder work because the kernel hasn't filled in the conntrack
pointer for us. There are some remaining concerns over how reliable our
own lookup actually is. The conntrack entry 'direction' is apparently
determined by where it is seen first; two tuples are then created, in
the 'original' and 'reverse' directions. This made me think that a
connection initiated by the router vs a connection initiated from
outside into it (even if natted) would have the src & destination fields
swapped...however in my limited testing 'who started the connection'
appeared to make no difference. But conntrack makes my brain cell hurt.
I'm sure there are people on this list who are a) much cleverererer than
me and b) know conntrack upside down & backwards. Help is as ever
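
To make the two-tuple question concrete, here is a toy model in Python. It is not kernel code and not the sch_cake implementation; the class, the method names and all addresses are made up for illustration. It assumes each connection is stored as a pair of tuples (conntrack itself names them 'original' and 'reply') and that a lookup matches a packet against either one. Under that assumption, taking the source of the opposite tuple yields the pre-NAT destination no matter which side opened the connection, which would be consistent with the observation above.

```python
from collections import namedtuple

# Simplified tuple: no protocol field, ports as plain integers.
Tuple5 = namedtuple("Tuple5", "src sport dst dport")


class ConntrackModel:
    """Toy stand-in for a conntrack table (hypothetical, for illustration)."""

    def __init__(self):
        self._index = {}

    def add(self, original, reply):
        # Each connection is indexed under both of its tuples.
        self._index[original] = reply
        self._index[reply] = original

    def real_destination(self, pkt):
        # The packet as seen on the wire matches one tuple; the source of
        # the opposite tuple is where the packet ends up after NAT.
        other = self._index.get(pkt)
        return pkt.dst if other is None else other.src


ct = ConntrackModel()

# LAN host opens a connection out, masqueraded to the router's WAN address.
ct.add(Tuple5("192.168.1.10", 40000, "203.0.113.5", 443),
       Tuple5("203.0.113.5", 443, "198.51.100.1", 40000))

# Outside host opens a connection in, port-forwarded to a LAN host.
ct.add(Tuple5("203.0.113.9", 50000, "198.51.100.1", 8080),
       Tuple5("192.168.1.20", 80, "203.0.113.9", 50000))

# Ingress packets, both addressed to the WAN address before de-natting.
reply_pkt = Tuple5("203.0.113.5", 443, "198.51.100.1", 40000)
inbound_pkt = Tuple5("203.0.113.9", 50000, "198.51.100.1", 8080)
print(ct.real_destination(reply_pkt))
print(ct.real_destination(inbound_pkt))
```

In this model the ingress packet matches the 'reply' tuple in the first case and the 'original' tuple in the second, yet the same rule recovers the LAN address both times.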
Regarding IPv6 vs IPv4: As it currently stands the code does conntrack
lookups for both so if someone is translating IPv6 addresses then we
know about it. I'm now thinking about making IPv6 lookups a runtime
option (default off). From a flow/host fairness point of view I really
don't care if a one to one address translation has occurred...and if
someone really does implement a 'masquerading many hosts behind one IPv6
address' environment...and they still want per IP & per flow fairness
then unmentionable things should be done to them.
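
A minimal sketch of what such a runtime option could look like, written in Python rather than kernel C. The function name, the two flags and the `lookup` callable are all hypothetical and stand in for whatever the real code does; the only point is the decision order: skip the lookup entirely for IPv6 unless it has been explicitly enabled.

```python
import ipaddress


def host_key(addr, lookup, denat=True, denat_ipv6=False):
    """Pick the address used for host fairness.

    `lookup` is a hypothetical callable that returns the pre-NAT address
    for `addr`, or None when no translation is known.
    """
    if not denat:
        return addr
    if ipaddress.ip_address(addr).version == 6 and not denat_ipv6:
        # IPv6 lookups are off by default: use the address as seen.
        return addr
    translated = lookup(addr)
    return addr if translated is None else translated


# Made-up translation table standing in for a conntrack lookup.
nat = {
    "198.51.100.1": "192.168.1.10",
    "2001:db8::1": "fd00::10",
}

print(host_key("198.51.100.1", nat.get))
print(host_key("2001:db8::1", nat.get))
print(host_key("2001:db8::1", nat.get, denat_ipv6=True))
```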
I'm not a fan of de-natting by default. Per IP fairness is not the
default and requires at least one of the 'dual-???host' or
'triple-isolate' options to be relevant. I've also concerns on CPU usage.
CPU usage is difficult to quantify. As a rough guesstimate my Archer C7
used about 10% CPU per megabit. I'd say that has gone up by 2%
with this change, so it is heavy!
The code is out there, if you've an itch...scratch it :-) Fork it,
improve it etc but please don't think I'm any sort of kernel guru :-)
Incidentally, an obvious gaming of this: a host that has both IPv4 & v6
addresses can get at least double the bandwidth of a host with only
one of them; it's per-IP fairness really, not per host.
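
The arithmetic behind that can be sketched as follows. This is an idealised model, not a measurement: it assumes every address is busy at the same time and that the link is split equally per address. The host names, addresses and the 30 Mbit/s figure are invented for the example.

```python
from collections import Counter


def per_address_shares(addr_to_host, link_mbit):
    """Split the link equally per address, then total the shares per host."""
    share = link_mbit / len(addr_to_host)
    totals = Counter()
    for host in addr_to_host.values():
        totals[host] += share
    return dict(totals)


# One dual-stack laptop and one IPv4-only phone (made-up addresses).
addresses = {
    "192.168.1.10": "laptop",
    "2001:db8::10": "laptop",
    "192.168.1.11": "phone",
}

shares = per_address_shares(addresses, link_mbit=30)
print(shares)
```

With three busy addresses each gets 10 Mbit/s, so the laptop ends up with 20 and the phone with 10.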
On 26/09/16 09:54, moeller0 wrote:
> Hi Kevin,
> this is like the missing puzzle piece, if you solved this, most home users might end up deep in your debt (without them realizing it of course).
> Question, if I enable this on my link how will it deal with the typical differences between IPv4 and IPv6? I believe that the situation I have at home, NAT for IPv4 but no NAT for IPv6 (or if NAT, at least NAT with identifying last 64 bits of the IPv6 addresses, no port remapping games) is quite common nowadays. I assume it will do the right thing for IPv4 but will it still do the right thing for IPv6 flows as well? And what if for $DEITY’s sake someone would insist on using a port-remapping NAT on IPv6?
> If, as I assume, it does the right thing by default, I would vote for enabling this by default and introducing keywords to disable it if required (in line with what I assume to be one of cake’s main ideas: use reasonable defaults that in general do the right thing, but also allow crazy stuff if need be).
> Do you have any idea how expensive this is computationally? I realize that this is a tad hard to measure as cake will not simply reduce the available bandwidth when running out of CPU cycles but first will allow the latency to increase.
> Best Regards
>> On Sep 26, 2016, at 05:20 , Kevin Darbyshire-Bryant <kevin at darbyshire-bryant.me.uk> wrote:
>> A while back I started on a quest to make cake 'nat' aware as the lack of host fairness in a typical home router environment was the only thing that prevented cake from being the ultimate qdisc in my opinion. This involves dealing with conntrack which on egress is easy (the kernel fills in a data structure for us), ingress is less clear. I hacked something together but wasn't really happy with it.
>> Another github user 'tegularius' presented some beautifully crafted code that did the lookups in a much neater way. Originally it too had an 'ingress' lookup problem. This was worked on and I hacked some conditional 'denat' options into cake & tc.
>> For your 'delight' a denat cake https://github.com/kdarbyshirebryant/sch_cake/tree/natoptions along with a matching tc https://github.com/kdarbyshirebryant/tc-adv/tree/denat
>> Typically I use 'dual-srchost srcnat' options on the egress interface, with 'dual-dsthost dstnat' in the ingress ifb interface. In *brief* testing, bandwidth is shared fairly between hosts, and fairly by flow within each host. And it's not crashed yet.