[Cake] de-natting & host fairness
moeller0 at gmx.de
Mon Sep 26 09:28:20 EDT 2016
> On Sep 26, 2016, at 15:02 , Kevin Darbyshire-Bryant <kevin at darbyshire-bryant.me.uk> wrote:
> Hi Sebastian et al,
> I'm feeling a bit unwell at the moment with an eye infection and I'm working nights on some tennis coverage for TV so the brain cell is somewhat addled.
> It is indeed the missing puzzle piece and represents something of a holy grail for my use case. A *lot* of credit has to go to 'tegularius' who took the idea and ran with it after I'd given up. My only consolation is that the methods are broadly similar, the current implementation is so much neater and obviously written in a more kernel/conntrack knowledgeable way (based on net/sched/cls_flow.c)
> This really needs to be tested. As I mentioned the 'ingress' side of things is harder work because the kernel hasn't filled in the conntrack pointer for us. There are some remaining concerns over how reliable our own lookup actually is. The conntrack entry 'direction' is apparently determined by where it is seen first, there are then 2 tuples created in the 'original' and 'reverse' directions. This made me think that a connection initiated by the router vs a connection initiated from outside into it (even if natted) would have the src & destination fields swapped...however in my limited testing 'who started the connection' appeared to make no difference. But conntrack makes my brain cell hurt.
Does that mean the initial packet(s) of a flow will be “misclassified”? (Not really, since there is no record yet to snatch the translated IP from.) And do all those initially unclassified packets end up in the same bin? (I guess even if they do, it should not matter much except in extreme situations, and those merit extreme reactions anyway.)
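To make the lookup question concrete, here is a toy Python model of a conntrack table holding an 'original' and a 'reply' tuple per connection. It is only a sketch of how I understand the two tuples to relate under NAT, not the sch_cake code; `FlowTuple`, `ConnEntry`, `ConnTable` and all addresses are made up for illustration. A packet without a record falls through unchanged, which is the first-packet case asked about above.

```python
from typing import Dict, NamedTuple, Tuple


class FlowTuple(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int

    def inverted(self) -> "FlowTuple":
        # Swap the endpoints: the tuple of traffic going the other way.
        return FlowTuple(self.dst, self.dport, self.src, self.sport)


class ConnEntry(NamedTuple):
    original: FlowTuple  # packet as first seen, before translation
    reply: FlowTuple     # what return traffic is expected to look like


class ConnTable:
    def __init__(self) -> None:
        self._index: Dict[FlowTuple, Tuple[ConnEntry, str]] = {}

    def add(self, entry: ConnEntry) -> None:
        # Index the entry under both directions, as a lookup can hit either.
        self._index[entry.original] = (entry, "original")
        self._index[entry.reply] = (entry, "reply")

    def denat(self, pkt: FlowTuple) -> FlowTuple:
        """Return the packet's tuple as it looks on the LAN side."""
        hit = self._index.get(pkt)
        if hit is None:
            return pkt  # no record yet: use the addresses on the wire
        entry, direction = hit
        # The translated view is the inverse of the opposite direction's tuple.
        other = entry.reply if direction == "original" else entry.original
        return other.inverted()


table = ConnTable()
# LAN host opens a connection and is masqueraded behind the WAN address.
table.add(ConnEntry(
    original=FlowTuple("192.168.1.10", 40000, "203.0.113.5", 443),
    reply=FlowTuple("203.0.113.5", 443, "198.51.100.1", 50000),
))
# Outside host connects in through a port forward to another LAN host.
table.add(ConnEntry(
    original=FlowTuple("203.0.113.9", 55555, "198.51.100.1", 8080),
    reply=FlowTuple("192.168.1.20", 80, "203.0.113.9", 55555),
))
```

In this model the LAN-side destination of an ingress packet is recovered the same way whether the connection was opened from inside or forwarded in from outside, which would fit the observation that 'who started the connection' made no difference.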
> I'm sure there are people on this list who are a) much cleverererer than me and b) know conntrack upside down & backwards. Help is as ever gratefully received.
> Regarding IPv6 vs IPv4: As it currently stands the code does conntrack lookups for both so if someone is translating IPv6 addresses then we know about it. I’m now thinking about making IPv6 lookups a runtime option (default off)
That would make it easy to measure the cost of those lookups.
> From a flow/host fairness point of view I really don't care if a one to one address translation has occurred...and if someone really does implement a 'masquerading many hosts behind one IPv6 address' environment...and they still want per IP & per flow fairness then unmentionable things should be done to them.
> I'm not a fan of de-natting by default. Per IP fairness is not the default and requires at least one of the 'dual-???host' or 'triple-isolate' options to be relevant. I’ve also concerns on CPU usage.
Mmh, I would have thought that even for srchost and dsthost (note: no dual) it would make sense to allow deNATing? If we default to deNAT we might also default to triple-isolate, assuming that it actually works… Cake lets discerning users refine the hashing, but for everybody else we should pick defaults that work well. Cake not being upstream yet is a virtue, as we do not need to argue against the “no unexpected behavior change” policy that seems to be used in the kernel (no argument from my side, for the kernel that seems a good policy, but we can still try to upstream the most useful defaults for “my mom”).
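A small Python sketch of why I think plain srchost hashing also needs deNAT on a masquerading router: after NAT every egress packet carries the WAN address as its source, so all LAN hosts land in one bucket. The `NAT_MAP` dictionary merely stands in for a conntrack lookup, and the addresses, ports and function name are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

WAN_IP = "198.51.100.1"  # hypothetical WAN address of the router

# Hypothetical stand-in for conntrack: NAT source port -> LAN host.
NAT_MAP: Dict[int, str] = {
    50001: "192.168.1.10",
    50002: "192.168.1.11",
    50003: "192.168.1.12",
}


def srchost_buckets(packets: List[Tuple[str, int]], denat: bool) -> Counter:
    """Count packets per source-host bucket, with or without de-NATing."""
    counts: Counter = Counter()
    for src_ip, src_port in packets:
        # Without deNAT the qdisc only sees the address on the wire.
        host = NAT_MAP.get(src_port, src_ip) if denat else src_ip
        counts[host] += 1
    return counts


# Egress packets as seen after masquerading: (source IP, source port).
egress = [(WAN_IP, 50001), (WAN_IP, 50002), (WAN_IP, 50003), (WAN_IP, 50001)]
```

Here `srchost_buckets(egress, denat=False)` has a single bucket for the WAN address, while `denat=True` separates the three LAN hosts.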
> CPU usage is difficult to quantify. As a rough guesstimate my Archer C7 used about 10% cpu per megabit. I’d say that has gone up by 2% with this change, so it is heavy!
That is a tad high; maybe too high for making it a default, but it would still be nice to have a qdisc that by default does what naive users expect a(ll) qdisc(s) to do ;)
> The code is out there, if you’ve an itch...scratch it :-) Fork it, improve it etc but please don't think I'm any sort of kernel guru :-)
Yepp, I really need to get my own LEDE builds going so I can start playing around with that again. (I am slow with that as my typical use cases at home work pretty well with what we have right now, and I somehow don’t want to start with heavy bit-torrenting (how many Debian DVD images could I actually ever need?) or Windows 10 updates.)
> Incidentally, an obvious gaming of this: A host that has both IPv4 & v6 addresses can get at least double the bandwidth of a host with only one of them. It’s per IP fairness really, not per host.
That is pretty much our new IPv6 world. Per-IP fairness might not be the kind of guarantee we actually want, but I assume it is the only one we can expect to get (IPv6 privacy addressing alone will bestow a flock of IP addresses, one that changes over time, on each active host).
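The arithmetic behind that, as a minimal Python sketch. It assumes an idealized scheduler that splits the link equally between all active addresses and that every address is backlogged; the function name, the 30 Mbit/s rate and the addresses are made up for illustration.

```python
from fractions import Fraction
from typing import Dict, List


def per_ip_shares(link_rate: int, host_addrs: Dict[str, List[str]]) -> Dict[str, Fraction]:
    """Split link_rate equally per active address, then sum per host."""
    active = sum(len(addrs) for addrs in host_addrs.values())
    per_addr = Fraction(link_rate, active)
    return {host: per_addr * len(addrs) for host, addrs in host_addrs.items()}


shares = per_ip_shares(30, {
    "dual": ["192.168.1.10", "2001:db8::10"],  # IPv4 and IPv6 both in use
    "v4only": ["192.168.1.11"],
})
```

With three active addresses each gets 10 Mbit/s, so the dual-stack host ends up with 20 and the IPv4-only host with 10.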
> On 26/09/16 09:54, moeller0 wrote:
>> Hi Kevin,
>> this is like the missing puzzle piece, if you solved this, most home users might end up deep in your debt (without them realizing it of course).
>> Question, if I enable this on my link how will it deal with the typical differences between IPv4 and IPv6? I believe that the situation I have at home, NAT for IPv4 but no NAT for IPv6 (or if NAT, at least NAT with identifying last 64 bits of the IPv6 addresses, no port remapping games) is quite common nowadays. I assume it will do the right thing for IPv4 but will it still do the right thing for IPv6 flows as well? And what if for $DEITY’s sake someone would insist on using a port-remapping NAT on IPv6?
>> If, as I assume, it will do the right thing by default, I would vote for enabling this by default and introducing keywords to disable it if required (in line with what I assume to be one of cake’s main ideas: use reasonable defaults that in general do the right thing, but also allow crazy stuff if need be).
>> Do you have any idea how expensive this is computationally? I realize that this is a tad hard to measure as cake will not simply reduce the available bandwidth when running out of CPU cycles but first will allow the latency to increase.
>> Best Regards
>>> On Sep 26, 2016, at 05:20 , Kevin Darbyshire-Bryant <kevin at darbyshire-bryant.me.uk> wrote:
>>> A while back I started on a quest to make cake 'nat' aware as the lack of host fairness in a typical home router environment was the only thing that prevented cake from being the ultimate qdisc in my opinion. This involves dealing with conntrack which on egress is easy (the kernel fills in a data structure for us), ingress is less clear. I hacked something together but wasn't really happy with it.
>>> Another github user 'tegularius' presented some beautifully crafted code that did the lookups in a much neater way. Originally it too had an 'ingress' lookup problem. This was worked on and I hacked some conditional 'denat' options into cake & tc.
>>> For your 'delight' a denat cake https://github.com/kdarbyshirebryant/sch_cake/tree/natoptions along with a matching tc https://github.com/kdarbyshirebryant/tc-adv/tree/denat
>>> Typically I use 'dual-srchost srcnat' options on the egress interface, with 'dual-dsthost dstnat' in the ingress ifb interface. In *brief* testing, bandwidth is shared fairly between hosts, and fairly by flow within each host. And it's not crashed yet.
>>> Cake mailing list
>>> Cake at lists.bufferbloat.net