From: moeller0
Date: Mon, 26 Sep 2016 15:28:20 +0200
To: Kevin Darbyshire-Bryant
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] de-natting & host fairness
Message-Id: <3DE6B4CF-ABFF-4B7E-855F-CEEBF80C8EB2@gmx.de>
In-Reply-To: <76bde2de-b6da-60d1-6daa-b5346cc7c185@darbyshire-bryant.me.uk>
References: <3a99770e-6350-471f-72b6-b209d7d77d75@darbyshire-bryant.me.uk> <76bde2de-b6da-60d1-6daa-b5346cc7c185@darbyshire-bryant.me.uk>
List-Id: Cake - FQ_codel the next generation

Hi Kevin,

> On Sep 26, 2016, at 15:02 , Kevin Darbyshire-Bryant wrote:
>
> Hi Sebastian et al,
>
> I'm feeling a bit unwell at the moment with an eye infection and I'm working nights on some tennis coverage for TV so the brain cell is somewhat addled.
>
> It is indeed the missing puzzle piece and represents something of a holy grail for my use case. A *lot* of credit has to go to 'tegularius' who took the idea and ran with it after I'd given up. My only consolation is that the methods are broadly similar, the current implementation is so much neater and obviously written in a more kernel/conntrack knowledgeable way (based on net/sched/cls_flow.c)

Ah.

>
> This really needs to be tested. As I mentioned the 'ingress' side of things is harder work because the kernel hasn't filled in the conntrack pointer for us. There are some remaining concerns over how reliable our own lookup actually is. The conntrack entry 'direction' is apparently determined by where it is seen first, there are then 2 tuples created in the 'original' and 'reverse' directions. This made me think that a connection initiated by the router vs a connection initiated from outside into it (even if natted) would have the src & destination fields swapped...however in my limited testing 'who started the connection' appeared to make no difference. But conntrack makes my brain cell hurt.
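To make the two-tuple business concrete, here is a minimal Python sketch of one way to picture such a lookup. It rests on loud assumptions: `FlowTuple`, `ToyConntrack` and `across_nat` are hypothetical names for illustration, the addresses are documentation examples, and none of this is the kernel's conntrack API or the code in the natoptions branch. Each entry holds an 'original' and a 'reply' tuple, and a packet matching one of them is mapped to the inverse of the other, so in this toy model the internal host is found whichever side opened the connection.

```python
from typing import Dict, NamedTuple


class FlowTuple(NamedTuple):
    """Simplified connection tuple; a hypothetical stand-in for a conntrack tuple."""
    src: str
    sport: int
    dst: str
    dport: int


def invert(t: FlowTuple) -> FlowTuple:
    """Swap source and destination, giving the tuple of the opposite direction."""
    return FlowTuple(t.dst, t.dport, t.src, t.sport)


class ToyConntrack:
    """Toy table: every connection stores an 'original' and a 'reply' tuple."""

    def __init__(self) -> None:
        self._index: Dict[FlowTuple, FlowTuple] = {}

    def add(self, original: FlowTuple, reply: FlowTuple) -> None:
        # A packet matching one tuple looks like the inverse of the other
        # tuple on the far side of the NAT.
        self._index[original] = invert(reply)
        self._index[reply] = invert(original)

    def across_nat(self, pkt: FlowTuple) -> FlowTuple:
        # No entry (for instance, no record exists yet): return the packet unchanged.
        return self._index.get(pkt, pkt)


ct = ToyConntrack()

# Connection opened from the LAN and masqueraded to the WAN address.
ct.add(original=FlowTuple("192.168.1.10", 40000, "198.51.100.53", 53),
       reply=FlowTuple("198.51.100.53", 53, "203.0.113.5", 40000))

# Connection opened from outside and port-forwarded to a LAN server.
ct.add(original=FlowTuple("198.51.100.7", 5555, "203.0.113.5", 80),
       reply=FlowTuple("192.168.1.20", 80, "198.51.100.7", 5555))

# Both packets arrive on the WAN side; the internal host is the destination.
inbound_reply = ct.across_nat(FlowTuple("198.51.100.53", 53, "203.0.113.5", 40000))
inbound_new = ct.across_nat(FlowTuple("198.51.100.7", 5555, "203.0.113.5", 80))
unknown = ct.across_nat(FlowTuple("198.51.100.9", 1234, "203.0.113.5", 22))
```

In this model `inbound_reply.dst` and `inbound_new.dst` are the LAN addresses, while `unknown` still carries the WAN address because nothing was recorded for it.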
Does that mean the initial packet(s) of a flow will be “misclassified” (not really, since there is no record yet to snatch the translated IP from)? Do all those initially non-classified packets end up in the same bin? (I guess even if they do, that should not matter much except in extreme situations, and those merit extreme reactions anyway.)

>
> I'm sure there are people on this list who are a) much cleverererer than me and b) know conntrack upside down & backwards. Help is as ever gratefully received.
>
> Regarding IPv6 vs IPv4: As it currently stands the code does conntrack lookups for both so if someone is translating IPv6 addresses then we know about it. I’m now thinking about making IPv6 lookups a runtime option (default off)

That would allow us to easily measure the cost of those lookups.

> From a flow/host fairness point of view I really don't care if a one to one address translation has occurred...and if someone really does implement a 'masquerading many hosts behind one IPv6 address' environment...and they still want per IP & per flow fairness then unmentionable things should be done to them.
>
> I'm not a fan of de-natting by default. Per IP fairness is not the default and requires at least one of the 'dual-???host' or 'triple-isolate' options to be relevant. I’ve also concerns on CPU usage.

Mmh, I would have thought that even for srchost and dsthost (note no dual) it would make sense to allow deNAT? If we default to deNAT we might also default to triple-isolate, assuming that it actually works… Cake offers to refine the hashing for discerning users, but for everybody else we should pick well-working defaults.
Cake not being upstream yet is a virtue, as we will not need to argue against the “no unexpected surprise behavior change” policy that seems to be used in the kernel (no argument from my side, for the kernel that seems a good policy, but we can still try to upstream the most useful defaults for “my mom”).

>
> CPU usage is difficult to quantify. As a rough guesstimate my Archer C7 used about 10% cpu per megabit. I’d say that has gone up by 2% with this change, so it is heavy!

That is a tad high; maybe too high to make it a default, but it would still be nice to have a qdisc that by default does what naive users expect a(ll) qdisc(s) to do ;)

>
> The code is out there, if you’ve an itch...scratch it :-) Fork it, improve it etc but please don't think I'm any sort of kernel guru :-)

Yepp, I really need to get my own LEDE builds going so I can start playing around with that again. (I am slow with that as my typical use cases at home work pretty well with what we have right now; and I somehow don’t want to start with heavy bit-torrenting (how many Debian DVD images could I actually ever need?) or Windows 10 updates).

>
> Incidentally, an obvious gaming of this: A host that has both IPv4 & v6 addresses can get at least double the bandwidth of a host with only one of them; it’s per IP fairness really, not per host.

That is pretty much our new IPv6 world; per-IP fairness might not be the kind of guarantee we actually want, but I assume it is the only one we can expect to get (IPv6 privacy addressing alone will bestow a flock of IP addresses, changing over time, on each active host).

Best Regards
Sebastian

>
> Kevin
>
>
> On 26/09/16 09:54, moeller0 wrote:
>> Hi Kevin,
>>
>> this is like the missing puzzle piece, if you solved this, most home users might end up deep in your debt (without them realizing it of course).
>> Question, if I enable this on my link how will it deal with the typical differences between IPv4 and IPv6? I believe that the situation I have at home, NAT for IPv4 but no NAT for IPv6 (or, if NAT, at least NAT with the identifying last 64 bits of the IPv6 addresses intact, no port remapping games) is quite common nowadays. I assume it will do the right thing for IPv4 but will it still do the right thing for IPv6 flows as well? And what if for $DEITY’s sake someone would insist on using a port-remapping NAT on IPv6?
>> If, as I assume, it does the right thing by default, I would vote for enabling this by default and introducing keywords to disable it if required (following what I assume to be one of cake’s main ideas: use reasonable defaults that in general do the right thing, but also allow crazy stuff if need be).
>> Do you have any idea how expensive this is computationally? I realize that this is a tad hard to measure as cake will not simply reduce the available bandwidth when running out of CPU cycles but first will allow the latency to increase.
>>
>> Best Regards
>> Sebastian
>>
>>> On Sep 26, 2016, at 05:20 , Kevin Darbyshire-Bryant wrote:
>>>
>>> Greetings!
>>>
>>> A while back I started on a quest to make cake 'nat' aware as the lack of host fairness in a typical home router environment was the only thing that prevented cake from being the ultimate qdisc in my opinion. This involves dealing with conntrack which on egress is easy (the kernel fills in a data structure for us), ingress is less clear. I hacked something together but wasn't really happy with it.
>>>
>>> Another github user 'tegularius' presented some beautifully crafted code that did the lookups in a much neater way. Originally it too had an 'ingress' lookup problem. This was worked on and I hacked some conditional 'denat' options into cake & tc.
>>>
>>> For your 'delight' a denat cake https://github.com/kdarbyshirebryant/sch_cake/tree/natoptions along with a matching tc https://github.com/kdarbyshirebryant/tc-adv/tree/denat
>>>
>>> Typically I use 'dual-srchost srcnat' options on the egress interface, with 'dual-dsthost dstnat' in the ingress ifb interface. In *brief* testing, bandwidth is shared fairly between hosts, and fairly by flow within each host. And it's not crashed yet.
>>>
>>> Kevin
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
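The dual-stack 'gaming' discussed above can be put in numbers with a small Python sketch. It is only an illustration under stated assumptions: `per_ip_shares`, the host labels, the addresses and the 30 Mbit/s capacity are all hypothetical, every address is assumed to be saturating its share, and the model is a plain equal split per IP address rather than cake's actual scheduler.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def per_ip_shares(flows: List[Tuple[str, str]], capacity: float) -> Dict[str, float]:
    """Split capacity equally between active IP addresses, then sum per host.

    flows is a list of (host_name, ip_address) pairs; the host names are
    hypothetical labels that a per-IP shaper never gets to see.
    """
    owner = {ip: host for host, ip in flows}  # each address belongs to one host
    if not owner:
        return {}
    per_ip = capacity / len(owner)  # equal share for every distinct address
    shares: Dict[str, float] = defaultdict(float)
    for ip in sorted(owner):
        shares[owner[ip]] += per_ip
    return dict(shares)


flows = [
    ("laptop", "192.168.1.10"),   # IPv4 address of a dual-stack host
    ("laptop", "2001:db8::10"),   # IPv6 address of the same host
    ("tv", "192.168.1.11"),       # IPv4-only host
]
shares = per_ip_shares(flows, capacity=30.0)
```

With three active addresses each one gets a third of the capacity, so the dual-stack host ends up with twice the share of the IPv4-only one, which is the per-IP rather than per-host fairness described above.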