Cake - FQ_codel the next generation
From: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
To: "cake@lists.bufferbloat.net" <cake@lists.bufferbloat.net>
Subject: Re: [Cake] act_connmark + dscp
Date: Wed, 6 Mar 2019 18:40:20 +0000
Message-ID: <6B530473-971A-4265-B94B-3595D39D57AF@darbyshire-bryant.me.uk>
In-Reply-To: <875zsw110r.fsf@toke.dk>

[-- Attachment #1: Type: text/plain, Size: 2970 bytes --]



> On 6 Mar 2019, at 15:21, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> 
> Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
> 
>> Before I go too far down this road (and to avoid the horror of
>> actually trying to code it) here’s what I’m trying to achieve.
>> 
>> 
>> act_connmark + dscp is designed to copy a DSCP code to/from conntrack mark.  It uses 8 bits of the mark field, currently the most significant byte.
>> 
>> Bits 31-26: DSCP
>> Bit 25: Spare/Future
>> Bit 24: Valid DSCP set
>> 
>> The valid bit is set when the ‘getdscp’ function has written a DSCP
>> value into the conntrack (& hence skb) mark. This allows us & other
>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that
>> a DSCP value has been placed in the field. We cannot simply use a
>> non-zero DSCP because zero is a valid DSCP.
> 
> If someone installs the action, the field is supposedly always copied;
> so why do we need another flag?

I’m trying to limit the number of times expensive iptables mangle rules have to run.

Egress path:

Packet comes in (internal to device or forward)
iptables mangle - check fwmark ’set’ bit
if not set
	jump to a possibly extensive set of rules that mangle the DSCP
else
	do nothing

Packet arrives at act_connmark dscpset
looks at fwmark ’set’ bit
if not set
	copy DSCP to fwmark & set the ’set’ bit.
else
	do nothing
cake gets hold of it - selects a tin based on fwmark contained DSCP

Do the routine again for the next packet in the same connection and you’ll skip the iptables mangle rules but still have cake classify based on the fwmark stored DSCP.  Without that flag you’ll have to run the iptables mangle rules for every packet and update the fwmark too.
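
The egress steps above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct conn` and `struct pkt` are stand-ins for the conntrack entry and the skb, `run_mangle_rules` stands in for the iptables mangle chain, and all helper names and macros are hypothetical rather than taken from cake or act_connmark. The sketch also folds the iptables check and the act_connmark store into one function. The bit layout is the one described earlier in this thread (DSCP in bits 31-26, valid flag in bit 24).

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical names; layout as described in this thread:
 * bits 31-26 DSCP, bit 25 spare, bit 24 "valid DSCP" flag. */
#define MARK_DSCP_SHIFT 26
#define MARK_DSCP_VALID (UINT32_C(1) << 24)
#define MARK_TOP_BYTE   (UINT32_C(0xff) << 24)

struct conn { uint32_t mark; };	/* stand-in for ct->mark */
struct pkt  { uint8_t dscp; uint32_t mark; struct conn *ct; };

static unsigned int mangle_runs;	/* how often the "expensive" rules ran */

/* Stand-in for the iptables mangle chain: here it just sets CS1 (8). */
static void run_mangle_rules(struct pkt *p)
{
	mangle_runs++;
	p->dscp = 8;
}

/* Replace the top byte of the mark with DSCP + valid flag. */
static uint32_t mark_store_dscp(uint32_t mark, uint8_t dscp)
{
	mark &= ~MARK_TOP_BYTE;
	return mark | ((uint32_t)(dscp & 0x3f) << MARK_DSCP_SHIFT) | MARK_DSCP_VALID;
}

static bool mark_has_dscp(uint32_t mark)
{
	return (mark & MARK_DSCP_VALID) != 0;
}

static uint8_t mark_get_dscp(uint32_t mark)
{
	return (uint8_t)(mark >> MARK_DSCP_SHIFT);
}

/* One pass of the egress path; returns the DSCP cake would classify on. */
static uint8_t egress_packet(struct pkt *p)
{
	p->mark = p->ct->mark;		/* connmark restore */
	if (!mark_has_dscp(p->mark)) {
		run_mangle_rules(p);	/* only for unmarked connections */
		p->ct->mark = mark_store_dscp(p->ct->mark, p->dscp);
		p->mark = p->ct->mark;
	}
	return mark_get_dscp(p->mark);
}
```

Calling `egress_packet` twice on packets that share one `struct conn` runs the mangle stand-in only once, and a stored DSCP of zero is still recognised as valid because the flag bit is separate from the DSCP bits.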


I personally think that cake should also have the fwmark/DSCP decode routine on ingress. e.g.

Ingress

Packet arrives
act_connmark restores the fwmark
if fwmark/dscp set then optionally restores diffserv from fwmark
Cake looks for fwmark/dscp set bit
if true 
	use fwmark DSCP for tin select
else
	use diffserv field as before
Cake possibly washes


Without the ’set’ bit, act_connmark has to restore the diffserv field on every (ip) packet and cake possibly has to wash it out again.
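
The ingress decision above can be sketched the same way. Again this is a user-space illustration, not the cake code: `ingress_select`, `struct tin_choice` and the macros are hypothetical names, and the `restore` and `wash` parameters are assumed stand-ins for the optional diffserv restore and for cake's wash option. Wash taking precedence over restore is also an assumption of this sketch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical names; DSCP in bits 31-26, valid flag in bit 24. */
#define MARK_DSCP_SHIFT 26
#define MARK_DSCP_VALID (UINT32_C(1) << 24)

struct tin_choice {
	uint8_t dscp;		/* DSCP used for tin selection */
	uint8_t hdr_dscp;	/* DSCP left in the packet header afterwards */
};

/* Pick the DSCP to classify on and decide what the header ends up holding.
 * restore: copy the fwmark DSCP back into the header when the valid bit is set
 * wash:    clear the header DSCP after classification (takes precedence) */
static struct tin_choice ingress_select(uint32_t fwmark, uint8_t hdr_dscp,
					bool restore, bool wash)
{
	struct tin_choice r = { .dscp = hdr_dscp, .hdr_dscp = hdr_dscp };

	if (fwmark & MARK_DSCP_VALID) {
		/* fwmark carries a DSCP: classify on it instead of the header */
		r.dscp = (uint8_t)(fwmark >> MARK_DSCP_SHIFT);
		if (restore)
			r.hdr_dscp = r.dscp;
	}
	if (wash)
		r.hdr_dscp = 0;
	return r;
}
```

With the valid bit clear the header is left untouched and used as before, so the restore work is only done for connections that actually carry a stored DSCP.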



The reality is that I enjoyed doing this in the cake codebase.  I cannot say the same for act_connmark; in fact I hate it so much I’m stopping.  The mental effort for a non-programmer, and more importantly a non-kernel programmer, is exhausting & I’m completely disillusioned.  I really need to concentrate on the job that means I can pay the mortgage, which isn’t bashing my head against the kernel.


Anyway, 4 files: 2 are patches against current cake & tc, the 3rd is a ‘my_layer_cake’ qos script that’s ‘fwmark/cake’ aware, and the 4th is the start of a hack on act_connmark.  Do with them as you will, I never want to see the last one again.



[-- Attachment #2: 0001-Automagically-use-update-DSCP-contained-in-fwmark.patch --]
[-- Type: application/octet-stream, Size: 6679 bytes --]

From 2bf78196b242226f8778695fea922eb5341c0f21 Mon Sep 17 00:00:00 2001
From: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Date: Fri, 1 Mar 2019 11:35:49 +0000
Subject: [PATCH] Automagically use & update DSCP contained in fwmark

and then I had a really bad idea!

Which is - if we're in fwmark & egress mode & we don't have a mark so
we've fallen through to DSCP, set a mark based on the DSCP.
That way, reply packets will be automagically fw marked for us on their
return as tc act connmark will have restored the mark which we can then
use.

With the 'icing' keyword, copy the dscp coded into the fwmark into the
real packets (the opposite of wash) in ingress mode.

The fwmark encodes the dscp thus (we steal the top byte)

ct->mark &= 0x00ffffff; - mask out top byte
ct->mark |= (0x01 | dscp) << 24; or in the DSCP (held in top 6 bits) and
set lowest bit as a 'cake set this fwmark' flag

options from tc:

fwmark  - enable fwmark usage
getdscp - update conntrack mark with the DSCP if no existing CAKE mark;
          typically used with egress qdisc
setdscp - update DSCP on packets from CAKE mark; typically used with
          ingress qdisc

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
---
 pkt_sched.h |   2 +
 sch_cake.c  | 112 ++++++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 94 insertions(+), 20 deletions(-)

diff --git a/pkt_sched.h b/pkt_sched.h
index a2f570c..e3c2ac4 100644
--- a/pkt_sched.h
+++ b/pkt_sched.h
@@ -879,6 +879,8 @@ enum {
 	TCA_CAKE_ACK_FILTER,
 	TCA_CAKE_SPLIT_GSO,
 	TCA_CAKE_FWMARK,
+	TCA_CAKE_GETDSCP,
+	TCA_CAKE_SETDSCP,
 	__TCA_CAKE_MAX
 };
 #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
diff --git a/sch_cake.c b/sch_cake.c
index 052ac05..e27a8b1 100644
--- a/sch_cake.c
+++ b/sch_cake.c
@@ -269,7 +269,9 @@ enum {
 	CAKE_FLAG_INGRESS	   = BIT(2),
 	CAKE_FLAG_WASH		   = BIT(3),
 	CAKE_FLAG_SPLIT_GSO	   = BIT(4),
-	CAKE_FLAG_FWMARK	   = BIT(5)
+	CAKE_FLAG_FWMARK	   = BIT(5), /* USEFWMARK */
+	CAKE_FLAG_GETDSCP	   = BIT(6), /* STOREDSCP */
+	CAKE_FLAG_SETDSCP	   = BIT(7)  /* RESTOREDSCP */
 };
 
 /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
@@ -1616,19 +1618,36 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
 	return idx + (tin << 16);
 }
 
-static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
+static void cake_update_diffserv(struct sk_buff *skb, u8 dscp)
+{
+	switch (skb->protocol) {
+	case htons(ETH_P_IP):
+		if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp)
+			ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp);
+		break;
+	case htons(ETH_P_IPV6):
+		if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp)
+			ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp);
+		break;
+	default:
+		break;
+	}
+
+}
+
+static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash)
 {
 	u8 dscp;
 
 	switch (skb->protocol) {
 	case htons(ETH_P_IP):
-		dscp = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
+		dscp = ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK;
 		if (wash && dscp)
 			ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, 0);
 		return dscp;
 
 	case htons(ETH_P_IPV6):
-		dscp = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
+		dscp = ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK;
 		if (wash && dscp)
 			ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, 0);
 		return dscp;
@@ -1642,37 +1661,68 @@ static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
 	}
 }
 
+static void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
+{
+#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct;
+
+	ct = nf_ct_get(skb, &ctinfo);
+	if (!ct)
+		return;
+
+	ct->mark &= ~(0xff << 24);
+	ct->mark |= (0x01 | dscp) << 24;
+	nf_conntrack_event_cache(IPCT_MARK, ct);
+#endif
+}
+
 static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
 					     struct sk_buff *skb)
 {
 	struct cake_sched_data *q = qdisc_priv(sch);
-	u32 tin;
+	bool wash;
 	u8 dscp;
+	u8 tin;
 
-	/* Tin selection: Default to diffserv-based selection, allow overriding
-	 * using firewall marks or skb->priority.
-	 */
-	dscp = cake_handle_diffserv(skb,
-				    q->rate_flags & CAKE_FLAG_WASH);
+	wash = !!(q->rate_flags & CAKE_FLAG_WASH);
+
+	if (q->tin_mode == CAKE_DIFFSERV_BESTEFFORT) {
 
-	if (q->tin_mode == CAKE_DIFFSERV_BESTEFFORT)
 		tin = 0;
+		if (wash)
+			cake_update_diffserv(skb, 0);
 
-	else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
-		 skb->mark &&
-		 skb->mark <= q->tin_cnt)
-		tin = q->tin_order[skb->mark - 1];
+	} else if (TC_H_MAJ(skb->priority) == sch->handle && /* use priority */
+		   TC_H_MIN(skb->priority) > 0 &&
+		   TC_H_MIN(skb->priority) <= q->tin_cnt) {
 
-	else if (TC_H_MAJ(skb->priority) == sch->handle &&
-		 TC_H_MIN(skb->priority) > 0 &&
-		 TC_H_MIN(skb->priority) <= q->tin_cnt)
 		tin = q->tin_order[TC_H_MIN(skb->priority) - 1];
+		if (wash)
+			cake_update_diffserv(skb, 0);
 
-	else {
-		tin = q->tin_index[dscp];
+	} else if (q->rate_flags & CAKE_FLAG_FWMARK && /* use fw mark */
+		   skb->mark & (0x01 << 24)) {
+
+		dscp = skb->mark >> 24 & ~INET_ECN_MASK;
+		tin = q->tin_index[dscp >> 2];
+
+		if (wash)
+			cake_update_diffserv(skb, 0);
+		else if (q->rate_flags & CAKE_FLAG_SETDSCP)
+			cake_update_diffserv(skb, dscp);
+
+	} else { /* fallback to DSCP */
+		/* extract the Diffserv Precedence field, if it exists */
+		/* and clear DSCP bits if washing */
+		dscp = cake_handle_diffserv(skb, wash);
+		tin = q->tin_index[dscp >> 2];
 
 		if (unlikely(tin >= q->tin_cnt))
 			tin = 0;
+
+		if (q->rate_flags & CAKE_FLAG_FWMARK && q->rate_flags & CAKE_FLAG_GETDSCP)
+			cake_update_ct_mark(skb, dscp);
 	}
 
 	return &q->tins[tin];
@@ -2760,6 +2810,20 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt,
 			q->rate_flags &= ~CAKE_FLAG_FWMARK;
 	}
 
+	if (tb[TCA_CAKE_GETDSCP]) {
+		if (!!nla_get_u32(tb[TCA_CAKE_GETDSCP]))
+			q->rate_flags |= CAKE_FLAG_GETDSCP;
+		else
+			q->rate_flags &= ~CAKE_FLAG_GETDSCP;
+	}
+
+	if (tb[TCA_CAKE_SETDSCP]) {
+		if (!!nla_get_u32(tb[TCA_CAKE_SETDSCP]))
+			q->rate_flags |= CAKE_FLAG_SETDSCP;
+		else
+			q->rate_flags &= ~CAKE_FLAG_SETDSCP;
+	}
+
 	if (q->tins) {
 		sch_tree_lock(sch);
 		cake_reconfigure(sch);
@@ -2944,6 +3008,14 @@ static int cake_dump(struct Qdisc *sch, struct sk_buff *skb)
 			!!(q->rate_flags & CAKE_FLAG_FWMARK)))
 		goto nla_put_failure;
 
+	if (nla_put_u32(skb, TCA_CAKE_GETDSCP,
+			!!(q->rate_flags & CAKE_FLAG_GETDSCP)))
+		goto nla_put_failure;
+
+	if (nla_put_u32(skb, TCA_CAKE_SETDSCP,
+			!!(q->rate_flags & CAKE_FLAG_SETDSCP)))
+		goto nla_put_failure;
+
 	return nla_nest_end(skb, opts);
 
 nla_put_failure:
-- 
2.17.2 (Apple Git-113)


[-- Attachment #3: 0001-tc-cake-add-fwmark-getdscp-setdscp-options.patch --]
[-- Type: application/octet-stream, Size: 5830 bytes --]

From 5f5f9d02280616e1e85bc87456ca46e27cd28ca9 Mon Sep 17 00:00:00 2001
From: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Date: Wed, 27 Feb 2019 14:46:05 +0000
Subject: [PATCH] cake: add fwmark & getdscp & setdscp options

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
---
 include/uapi/linux/pkt_sched.h |  3 ++
 man/man8/tc-cake.8             | 19 +++++++++++
 tc/q_cake.c                    | 60 ++++++++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 89ee47c2..766cd9a1 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -991,6 +991,9 @@ enum {
 	TCA_CAKE_INGRESS,
 	TCA_CAKE_ACK_FILTER,
 	TCA_CAKE_SPLIT_GSO,
+	TCA_CAKE_FWMARK,
+	TCA_CAKE_GETDSCP,
+	TCA_CAKE_SETDSCP,
 	__TCA_CAKE_MAX
 };
 #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
diff --git a/man/man8/tc-cake.8 b/man/man8/tc-cake.8
index c62e5547..7db0f142 100644
--- a/man/man8/tc-cake.8
+++ b/man/man8/tc-cake.8
@@ -73,6 +73,12 @@ TIME |
 ]
 .br
 [
+.BR fwmark
+|
+.BR nofwmark*
+]
+.br
+[
 .BR split-gso*
 |
 .BR no-split-gso
@@ -623,6 +629,19 @@ override mechanism; if a host ID is assigned, it will be used as both source and
 destination host.
 
 
+.SH OVERRIDING CLASSIFICATION WITH NETFILTER CONNMARKS
+
+In addition to TC FILTER tin classification, firewall marks may also optionally
+be used.  The priority order (highest to lowest) for tin selection is TC filter,
+firewall mark and then DSCP.
+.PP
+.B fwmark
+
+.br
+	Enables CONNMARK based tin selection. Valid CONNMARKS range from 1 to the
+maximum number of tins i.e. 3 tins for diffserv3, 4 tins for diffserv4.
+Values outside the valid range are ignored and CAKE will fall back to using
+DSCP for tin selection.
 
 .SH EXAMPLES
 # tc qdisc delete root dev eth0
diff --git a/tc/q_cake.c b/tc/q_cake.c
index e827e3f1..9c892a3b 100644
--- a/tc/q_cake.c
+++ b/tc/q_cake.c
@@ -79,6 +79,9 @@ static void explain(void)
 "                  dual-srchost | dual-dsthost | triple-isolate* ]\n"
 "                [ nat | nonat* ]\n"
 "                [ wash | nowash* ]\n"
+"                [ fwmark | nofwmark* ]\n"
+"                [ getdscp | nogetdscp* ]\n"
+"                [ setdscp | nosetdscp* ]\n"
 "                [ split-gso* | no-split-gso ]\n"
 "                [ ack-filter | ack-filter-aggressive | no-ack-filter* ]\n"
 "                [ memlimit LIMIT ]\n"
@@ -106,6 +109,9 @@ static int cake_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 	int autorate = -1;
 	int ingress = -1;
 	int overhead = 0;
+	int getdscp = -1;
+	int setdscp = -1;
+	int fwmark = -1;
 	int wash = -1;
 	int nat = -1;
 	int atm = -1;
@@ -161,6 +167,18 @@ static int cake_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			split_gso = 1;
 		} else if (strcmp(*argv, "no-split-gso") == 0) {
 			split_gso = 0;
+		} else if (strcmp(*argv, "fwmark") == 0) {
+			fwmark = 1;
+		} else if (strcmp(*argv, "nofwmark") == 0) {
+			fwmark = 0;
+		} else if (strcmp(*argv, "getdscp") == 0) {
+			getdscp = 1;
+		} else if (strcmp(*argv, "nogetdscp") == 0) {
+			getdscp = 0;
+		} else if (strcmp(*argv, "setdscp") == 0) {
+			setdscp = 1;
+		} else if (strcmp(*argv, "nosetdscp") == 0) {
+			setdscp = 0;
 		} else if (strcmp(*argv, "flowblind") == 0) {
 			flowmode = CAKE_FLOW_NONE;
 		} else if (strcmp(*argv, "srchost") == 0) {
@@ -383,6 +401,15 @@ static int cake_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 	if (split_gso != -1)
 		addattr_l(n, 1024, TCA_CAKE_SPLIT_GSO, &split_gso,
 			  sizeof(split_gso));
+	if (fwmark != -1)
+		addattr_l(n, 1024, TCA_CAKE_FWMARK, &fwmark,
+			  sizeof(fwmark));
+	if (getdscp != -1)
+		addattr_l(n, 1024, TCA_CAKE_GETDSCP, &getdscp,
+			  sizeof(getdscp));
+	if (setdscp != -1)
+		addattr_l(n, 1024, TCA_CAKE_SETDSCP, &setdscp,
+			  sizeof(setdscp));
 	if (ingress != -1)
 		addattr_l(n, 1024, TCA_CAKE_INGRESS, &ingress, sizeof(ingress));
 	if (ack_filter != -1)
@@ -415,6 +442,9 @@ static int cake_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 	int overhead = 0;
 	int autorate = 0;
 	int ingress = 0;
+	int getdscp = 0;
+	int setdscp = 0;
+	int fwmark = 0;
 	int wash = 0;
 	int raw = 0;
 	int mpu = 0;
@@ -500,6 +530,18 @@ static int cake_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 	    RTA_PAYLOAD(tb[TCA_CAKE_SPLIT_GSO]) >= sizeof(__u32)) {
 		split_gso = rta_getattr_u32(tb[TCA_CAKE_SPLIT_GSO]);
 	}
+	if (tb[TCA_CAKE_FWMARK] &&
+	    RTA_PAYLOAD(tb[TCA_CAKE_FWMARK]) >= sizeof(__u32)) {
+		fwmark = rta_getattr_u32(tb[TCA_CAKE_FWMARK]);
+	}
+	if (tb[TCA_CAKE_GETDSCP] &&
+	    RTA_PAYLOAD(tb[TCA_CAKE_GETDSCP]) >= sizeof(__u32)) {
+		getdscp = rta_getattr_u32(tb[TCA_CAKE_GETDSCP]);
+	}
+	if (tb[TCA_CAKE_SETDSCP] &&
+	    RTA_PAYLOAD(tb[TCA_CAKE_SETDSCP]) >= sizeof(__u32)) {
+		setdscp = rta_getattr_u32(tb[TCA_CAKE_SETDSCP]);
+	}
 	if (tb[TCA_CAKE_RAW]) {
 		raw = 1;
 	}
@@ -532,6 +574,24 @@ static int cake_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 		print_string(PRINT_FP, NULL, "no-split-gso ", NULL);
 	print_bool(PRINT_JSON, "split_gso", NULL, split_gso);
 
+	if (fwmark)
+		print_string(PRINT_FP, NULL, "fwmark ", NULL);
+	else
+		print_string(PRINT_FP, NULL, "nofwmark ", NULL);
+	print_bool(PRINT_JSON, "fwmark", NULL, fwmark);
+
+	if (getdscp)
+		print_string(PRINT_FP, NULL, "getdscp ", NULL);
+	else
+		print_string(PRINT_FP, NULL, "nogetdscp ", NULL);
+	print_bool(PRINT_JSON, "getdscp", NULL, getdscp);
+
+	if (setdscp)
+		print_string(PRINT_FP, NULL, "setdscp ", NULL);
+	else
+		print_string(PRINT_FP, NULL, "nosetdscp ", NULL);
+	print_bool(PRINT_JSON, "setdscp", NULL, setdscp);
+
 	if (interval)
 		print_string(PRINT_FP, NULL, "rtt %s ",
 			     sprint_time(interval, b2));
-- 
2.17.2 (Apple Git-113)


[-- Attachment #4: my_layer_cake.qos --]
[-- Type: application/octet-stream, Size: 5131 bytes --]

#!/bin/sh
# Cero3 Shaper
# A cake shaper and AQM solution that allows several diffserv marking schemes
# for ethernet gateways

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
#       Copyright (C) 2012-5 Michael D. Taht, Toke Høiland-Jørgensen, Sebastian Moeller


#sm: TODO pass in the cake diffserv keyword

. ${SQM_LIB_DIR}/defaults.sh
QDISC=cake

# Default traffic classification is passed in INGRESS_CAKE_OPTS and EGRESS_CAKE_OPTS, now defined in defaults.sh

egress() {
    SILENT=1 $TC qdisc del dev $IFACE root
    $TC qdisc add dev $IFACE root handle cacf: $( get_stab_string ) cake \
        bandwidth ${UPLINK}kbit $( get_cake_lla_string ) ${EGRESS_CAKE_OPTS} ${EQDISC_OPTS}

}


ingress() {

    SILENT=1 $TC qdisc del dev $IFACE handle ffff: ingress
    $TC qdisc add dev $IFACE handle ffff: ingress

    SILENT=1 $TC qdisc del dev $DEV root

    [ "$IGNORE_DSCP_INGRESS" -eq "1" ] && INGRESS_CAKE_OPTS="$INGRESS_CAKE_OPTS besteffort"
    [ "$ZERO_DSCP_INGRESS" -eq "1" ] && INGRESS_CAKE_OPTS="$INGRESS_CAKE_OPTS wash"

    $TC qdisc add dev $DEV root handle cace: $( get_stab_string ) cake \
        bandwidth ${DOWNLINK}kbit $( get_cake_lla_string ) ${INGRESS_CAKE_OPTS} ${IQDISC_OPTS}

    $IP link set dev $DEV up

    # redirect all IP packets arriving on $IFACE to the IFB device ($DEV)
    # and restore the connmark, which may be used by cake+
    $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
	match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev $DEV

    # Configure iptables chain to mark packets
    ipt -t mangle -N QOS_MARK_${IFACE}

    # Change DSCP of relevant hosts/packets - this will be picked up by cake+ and placed in the firewall connmark
    # also the DSCP is used as the tin selector.

iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.219.5 -m comment --comment "Skybox DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
iptables -t mangle -A QOS_MARK_${IFACE} -p udp -s 192.168.219.5 -m comment --comment "Skybox DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.219.10 -m comment --comment "Bluray DSCP CS2 Video" -j DSCP --set-dscp-class CS2
iptables -t mangle -A QOS_MARK_${IFACE} -p udp -s 192.168.219.10 -m comment --comment "Bluray DSCP CS2 Video" -j DSCP --set-dscp-class CS2
iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.219.12 -m tcp --sport 6981 -m comment --comment "BT DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
iptables -t mangle -A QOS_MARK_${IFACE} -p udp -s 192.168.219.12 -m udp --sport 6981 -m comment --comment "BT DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.219.12 -m tcp --dport 4443 -m comment --comment "BT DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.219.12 -m tcp --dport 443 -m comment --comment "HTTPS uploads DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1

iptables -t mangle -A QOS_MARK_${IFACE} -m set --match-set Vid4 dst -j DSCP --set-dscp-class CS2 -m comment --comment "Vid CS2 ipset"

ip6tables -t mangle -A QOS_MARK_${IFACE} -p tcp -s ::c/::ffff:ffff:ffff:ffff -m tcp --sport 6981 -m comment --comment "BT DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
ip6tables -t mangle -A QOS_MARK_${IFACE} -p udp -s ::c/::ffff:ffff:ffff:ffff -m udp --sport 6981 -m comment --comment "BT DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
ip6tables -t mangle -A QOS_MARK_${IFACE} -p tcp -s ::c/::ffff:ffff:ffff:ffff -m tcp --dport 4443 -m comment --comment "BT DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1
ip6tables -t mangle -A QOS_MARK_${IFACE} -p tcp -s ::c/::ffff:ffff:ffff:ffff -m tcp --dport 443 -m comment --comment "HTTPS uploads DSCP CS1 Bulk" -j DSCP --set-dscp-class CS1

ip6tables -t mangle -A QOS_MARK_${IFACE} -m set --match-set Vid6 dst -j DSCP --set-dscp-class CS2 -m comment --comment "Vid CS2 ipset"

    # Send connections not yet marked by cake+ to the marking chain. Cake+ uses the
    # top byte of the fwmark as an "I've been marked, and here's the DSCP" placeholder:
    # the top 6 bits are the DSCP, the LSB is the 'DSCP is valid' flag.
    ipt -t mangle -A PREROUTING  -i $IFACE -m mark --mark 0x00/0x01000000 -g QOS_MARK_${IFACE}
    ipt -t mangle -A POSTROUTING -o $IFACE -m mark --mark 0x00/0x01000000 -g QOS_MARK_${IFACE}

}

sqm_start() {
    [ -n "$IFACE" ] || return 1
    do_modules
    verify_qdisc $QDISC "cake" || return 1
    sqm_debug "Starting ${SCRIPT}"

    [ -z "$DEV" ] && DEV=$( get_ifb_for_if ${IFACE} )

    if [ "${UPLINK}" -ne 0 ];
    then
        egress
        sqm_debug "egress shaping activated"
    else
        sqm_debug "egress shaping deactivated"
        SILENT=1 $TC qdisc del dev ${IFACE} root
    fi
    if [ "${DOWNLINK}" -ne 0 ];
    then
	verify_qdisc ingress "ingress" || return 1
        ingress
        sqm_debug "ingress shaping activated"
    else
        sqm_debug "ingress shaping deactivated"
        SILENT=1 $TC qdisc del dev ${DEV} root
        SILENT=1 $TC qdisc del dev ${IFACE} ingress
    fi

    return 0
}

[-- Attachment #5: 0001-start-of-act_connmark-hack.patch --]
[-- Type: application/octet-stream, Size: 3387 bytes --]

From 91f2c94d333fbd391dbde30eff48ee1e859bbdf8 Mon Sep 17 00:00:00 2001
From: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Date: Tue, 5 Mar 2019 14:48:30 +0000
Subject: [PATCH] start of act_connmark hack

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
---
 net/sched/act_connmark.c | 61 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c
index 8475913f2070..033e54dbee28 100644
--- a/net/sched/act_connmark.c
+++ b/net/sched/act_connmark.c
@@ -41,6 +41,7 @@ static int tcf_connmark_act(struct sk_buff *skb, const struct tc_action *a,
 	struct nf_conntrack_zone zone;
 	struct nf_conn *c;
 	int proto;
+	u8 dscp;
 
 	spin_lock(&ca->tcf_lock);
 	tcf_lastuse_update(&ca->tcf_tm);
@@ -62,6 +63,36 @@ static int tcf_connmark_act(struct sk_buff *skb, const struct tc_action *a,
 
 	c = nf_ct_get(skb, &ctinfo);
 	if (c) {
+		if (dscpops & GET_DSCP && proto && !(c->mark & 0x01000000)) {
+			/* mark does not contain DSCP so store DSCP bits into c->mark */
+			switch (proto) {
+			case NFPROTO_IPV4:
+				dscp = ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK;
+				break;
+			case NFPROTO_IPV6:
+				dscp = ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK;
+				break;
+			default:
+				dscp = 0;
+				break;
+			}
+			c->mark &= 0x00ffffff;
+			c->mark |= (0x01 | dscp) << 24;
+		} else if (dscpops & SET_DSCP && proto && (c->mark & 0x01000000)) {
+			/* mark contains DSCP so restore DSCP bits from c->mark into diffserv */
+			switch (proto) {
+			case NFPROTO_IPV4:
+				if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != (c->mark >> 24 & ~INET_ECN_MASK))
+					ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, c->mark >> 24 & ~INET_ECN_MASK);
+				break;
+			case NFPROTO_IPV6:
+				if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != (c->mark >> 24 & ~INET_ECN_MASK))
+					ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, c->mark >> 24 & ~INET_ECN_MASK);
+				break;
+			default:
+				break;
+			}
+		}
 		skb->mark = c->mark;
 		/* using overlimits stats to count how many packets marked */
 		ca->tcf_qstats.overlimits++;
@@ -82,6 +113,36 @@ static int tcf_connmark_act(struct sk_buff *skb, const struct tc_action *a,
 	c = nf_ct_tuplehash_to_ctrack(thash);
 	/* using overlimits stats to count how many packets marked */
 	ca->tcf_qstats.overlimits++;
+	if (dscpops & GET_DSCP && proto && !(c->mark & 0x01000000)) {
+		/* store the DSCP bits into c->mark */
+		switch (proto) {
+		case NFPROTO_IPV4:
+			dscp = ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK;
+			break;
+		case NFPROTO_IPV6:
+			dscp = ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK;
+			break;
+		default:
+			dscp = 0;
+			break;
+		}
+		c->mark &= 0x00ffffff;
+		c->mark |= (0x01 | dscp) << 24;
+	} else if (dscpops & SET_DSCP && proto && (c->mark & 0x01000000)) {
+		/* restore the DSCP bits from c->mark into diffserv */
+		switch (proto) {
+		case NFPROTO_IPV4:
+			if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != (c->mark >> 24 & ~INET_ECN_MASK))
+				ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, c->mark >> 24 & ~INET_ECN_MASK);
+			break;
+		case NFPROTO_IPV6:
+			if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != (c->mark >> 24 & ~INET_ECN_MASK))
+				ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, c->mark >> 24 & ~INET_ECN_MASK);
+			break;
+		default:
+			break;
+		}
+	}
 	skb->mark = c->mark;
 	nf_ct_put(c);
 
-- 
2.17.2 (Apple Git-113)

