From: "Toke Høiland-Jørgensen" <toke@toke.dk>
To: netdev@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org, cake@lists.bufferbloat.net
Subject: [Cake] [PATCH net-next v19 5/8] sch_cake: Add NAT awareness to packet classifier
Date: Fri, 06 Jul 2018 17:37:19 +0200	[thread overview]
Message-ID: <153089143943.14813.7164191512811411579.stgit@alrua-x1> (raw)
In-Reply-To: <153089141290.14813.1010951705929696896.stgit@alrua-x1>
When CAKE is deployed on a gateway that also performs NAT (which is a
common deployment mode), the host fairness mechanism cannot distinguish
internal hosts from each other, and so fails to work correctly.
To fix this, we add an optional NAT awareness mode, which will query the
kernel conntrack mechanism to obtain the pre-NAT addresses for each packet
and use that in the flow and host hashing.
When the shaper is enabled and the host is already performing NAT, the cost
of this lookup is negligible. However, in unlimited mode with no NAT being
performed, there is a significant CPU cost at higher bandwidths. For this
reason, the feature is turned off by default.
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 net/sched/sch_cake.c |   51 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index 930096d46c4f..633ca1578114 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -71,6 +71,10 @@
 #include <net/tcp.h>
 #include <net/flow_dissector.h>
 
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+#include <net/netfilter/nf_conntrack_core.h>
+#endif
+
 #define CAKE_SET_WAYS (8)
 #define CAKE_MAX_TINS (8)
 #define CAKE_QUEUES (1024)
@@ -516,6 +520,29 @@ static bool cobalt_should_drop(struct cobalt_vars *vars,
 	return drop;
 }
 
+static void cake_update_flowkeys(struct flow_keys *keys,
+				 const struct sk_buff *skb)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+	struct nf_conntrack_tuple tuple = {};
+	bool rev = !skb->_nfct;
+
+	if (tc_skb_protocol(skb) != htons(ETH_P_IP))
+		return;
+
+	if (!nf_ct_get_tuple_skb(&tuple, skb))
+		return;
+
+	keys->addrs.v4addrs.src = rev ? tuple.dst.u3.ip : tuple.src.u3.ip;
+	keys->addrs.v4addrs.dst = rev ? tuple.src.u3.ip : tuple.dst.u3.ip;
+
+	if (keys->ports.ports) {
+		keys->ports.src = rev ? tuple.dst.u.all : tuple.src.u.all;
+		keys->ports.dst = rev ? tuple.src.u.all : tuple.dst.u.all;
+	}
+#endif
+}
+
 /* Cake has several subtle multiple bit settings. In these cases you
  *  would be matching triple isolate mode as well.
  */
@@ -543,6 +570,9 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
 	skb_flow_dissect_flow_keys(skb, &keys,
 				   FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
 
+	if (flow_mode & CAKE_FLOW_NAT_FLAG)
+		cake_update_flowkeys(&keys, skb);
+
 	/* flow_hash_from_keys() sorts the addresses by value, so we have
 	 * to preserve their order in a separate data structure to treat
 	 * src and dst host addresses as independently selectable.
@@ -1939,12 +1969,25 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 	if (err < 0)
 		return err;
 
+	if (tb[TCA_CAKE_NAT]) {
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+		q->flow_mode &= ~CAKE_FLOW_NAT_FLAG;
+		q->flow_mode |= CAKE_FLOW_NAT_FLAG *
+			!!nla_get_u32(tb[TCA_CAKE_NAT]);
+#else
+		NL_SET_ERR_MSG_ATTR(extack, tb[TCA_CAKE_NAT],
+				    "No conntrack support in kernel");
+		return -EOPNOTSUPP;
+#endif
+	}
+
 	if (tb[TCA_CAKE_BASE_RATE64])
 		q->rate_bps = nla_get_u64(tb[TCA_CAKE_BASE_RATE64]);
 
 	if (tb[TCA_CAKE_FLOW_MODE])
-		q->flow_mode = (nla_get_u32(tb[TCA_CAKE_FLOW_MODE]) &
-				CAKE_FLOW_MASK);
+		q->flow_mode = ((q->flow_mode & CAKE_FLOW_NAT_FLAG) |
+				(nla_get_u32(tb[TCA_CAKE_FLOW_MODE]) &
+					CAKE_FLOW_MASK));
 
 	if (tb[TCA_CAKE_RTT]) {
 		q->interval = nla_get_u32(tb[TCA_CAKE_RTT]);
@@ -2111,6 +2154,10 @@ static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
 	if (nla_put_u32(skb, TCA_CAKE_ACK_FILTER, q->ack_filter))
 		goto nla_put_failure;
 
+	if (nla_put_u32(skb, TCA_CAKE_NAT,
+			!!(q->flow_mode & CAKE_FLOW_NAT_FLAG)))
+		goto nla_put_failure;
+
 	return nla_nest_end(skb, opts);
 
 nla_put_failure:
next prev parent reply	other threads:[~2018-07-06 15:37 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-06 15:37 [Cake] [PATCH net-next v19 0/8] sched: Add Common Applications Kept Enhanced (cake) qdisc Toke Høiland-Jørgensen
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 7/8] sch_cake: Add overhead compensation support to the rate shaper Toke Høiland-Jørgensen
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 3/8] sch_cake: Add optional ACK filter Toke Høiland-Jørgensen
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 4/8] netfilter: Add nf_ct_get_tuple_skb global lookup function Toke Høiland-Jørgensen
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 6/8] sch_cake: Add DiffServ handling Toke Høiland-Jørgensen
2018-07-06 15:37 ` Toke Høiland-Jørgensen [this message]
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 8/8] sch_cake: Conditionally split GSO segments Toke Høiland-Jørgensen
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 1/8] sched: Add Common Applications Kept Enhanced (cake) qdisc Toke Høiland-Jørgensen
2018-07-06 15:37 ` [Cake] [PATCH net-next v19 2/8] sch_cake: Add ingress mode Toke Høiland-Jørgensen
2018-07-11  5:56 ` [Cake] [PATCH net-next v19 0/8] sched: Add Common Applications Kept Enhanced (cake) qdisc David Miller
2018-07-11 20:40   ` Toke Høiland-Jørgensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
  List information: https://lists.bufferbloat.net/postorius/lists/cake.lists.bufferbloat.net/
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=153089143943.14813.7164191512811411579.stgit@alrua-x1 \
    --to=toke@toke.dk \
    --cc=cake@lists.bufferbloat.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox