From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 23 May 2018 14:44:42 -0400 (EDT)
Message-Id: <20180523.144442.864194409238516747.davem@davemloft.net>
To: toke@toke.dk
Cc: netdev@vger.kernel.org, cake@lists.bufferbloat.net, netfilter-devel@vger.kernel.org
From: David Miller
In-Reply-To: <152699745846.21931.4558451708304709296.stgit@alrua-kau>
References: <152699741881.21931.11656377745581563912.stgit@alrua-kau> <152699745846.21931.4558451708304709296.stgit@alrua-kau>
Subject: Re: [Cake] [PATCH net-next v15 4/7] sch_cake: Add NAT awareness to packet classifier
List-Id: Cake - FQ_codel the next generation

From: Toke Høiland-Jørgensen
Date: Tue, 22 May 2018 15:57:38 +0200

> When CAKE is deployed on a gateway that also performs NAT (which is a
> common deployment mode), the host fairness mechanism cannot distinguish
> internal hosts from each other, and so fails to work correctly.
>
> To fix this, we add an optional NAT awareness mode, which will query the
> kernel conntrack mechanism to obtain the pre-NAT addresses for each packet
> and use that in the flow and host hashing.
>
> When the shaper is enabled and the host is already performing NAT, the cost
> of this lookup is negligible. However, in unlimited mode with no NAT being
> performed, there is a significant CPU cost at higher bandwidths. For this
> reason, the feature is turned off by default.
>
> Cc: netfilter-devel@vger.kernel.org
> Signed-off-by: Toke Høiland-Jørgensen

This is really pushing the limits of what a packet scheduler can
require for correct operation. And this creates an incredibly
ugly dependency.

I'd much rather you do something NAT method agnostic, like save
or compute the necessary information on ingress and then later
use it on egress.

Because what you have here will completely break when someone
does NAT using eBPF, act_nat, or similar.

There is even skb->rxhash, be creative :-)
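
The ingress/egress idea in the reply can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code and not the approach the patch series ended up with: `struct pkt`, `saved_host`, `ingress_save()`, `nat_masquerade()`, `egress_host_bucket()` and `naive_host_bucket()` are all hypothetical names invented for illustration, and `mix32()` is a toy mixer standing in for a real flow hash. The point it shows is that a host hash recorded before translation survives whatever mechanism rewrites the addresses later.

```c
#include <stdint.h>

/* Hypothetical user-space model of a packet; not a kernel structure. */
struct pkt {
	uint32_t src_ip;     /* source address, rewritten by NAT */
	uint32_t dst_ip;     /* destination address */
	uint32_t saved_host; /* host hash recorded on ingress, before NAT */
};

/* Toy 32-bit mixer (MurmurHash3 finalizer); stand-in for a real hash. */
static uint32_t mix32(uint32_t x)
{
	x ^= x >> 16;
	x *= 0x85ebca6bu;
	x ^= x >> 13;
	x *= 0xc2b2ae35u;
	x ^= x >> 16;
	return x;
}

/* Ingress step: record a hash of the original (pre-NAT) source address. */
static void ingress_save(struct pkt *p)
{
	p->saved_host = mix32(p->src_ip);
}

/* Any NAT mechanism: rewrites the source, leaves saved_host untouched. */
static void nat_masquerade(struct pkt *p, uint32_t public_ip)
{
	p->src_ip = public_ip;
}

/* Egress step: choose the host bucket from the value saved on ingress. */
static unsigned int egress_host_bucket(const struct pkt *p,
				       unsigned int nbuckets)
{
	return p->saved_host % nbuckets;
}

/* For comparison: hashing the post-NAT source, which NAT has flattened. */
static unsigned int naive_host_bucket(const struct pkt *p,
				      unsigned int nbuckets)
{
	return mix32(p->src_ip) % nbuckets;
}
```

In this model, two internal hosts that are masqueraded to the same public address always land in the same bucket with `naive_host_bucket()`, while `egress_host_bucket()` still keys on the per-host value stored at ingress, regardless of which component performed the rewrite.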