* [Cake] Ingress classification
@ 2019-02-05 12:08 Kevin Darbyshire-Bryant
2019-02-05 13:38 ` John Sager
0 siblings, 1 reply; 8+ messages in thread
From: Kevin Darbyshire-Bryant @ 2019-02-05 12:08 UTC (permalink / raw)
To: Cake List
Hi Experts!
After a long self-imposed absence I’m back…and as confused as ever. I’m trying to use cake’s diffserv/DSCP interpretation abilities with my Internet-facing WAN link on openwrt.
The chances of getting DSCP across the Internet unmolested are nil, but I can use DSCP codes within my own LAN/router to select the relevant tins, and hence the bandwidth/latency limits, within cake, e.g. to de-prioritise certain host/port combinations (bittorrent into bulk).
Egress traffic is ‘easy’: simply use an iptables rule, e.g. as produced by fw3 on openwrt:
/etc/config/firewall:
config rule
	option name 'Bittorrent server DSCP BULK'
	option target 'dscp'
	option src 'lan'
	option family 'ipv4'
	option src_ip '192.168.218.12'
	option dest '*'
	option set_dscp 'CS1'
	option src_port '6989'
	option proto 'tcp udp'
Produces:
iptables -t mangle -A PREROUTING -p tcp -s 192.168.218.12/255.255.255.255 -m tcp --sport 6989 -m comment --comment "!fw3: Bittorrent server DSCP BULK" -j DSCP --set-dscp 0x08
iptables -t mangle -A PREROUTING -p udp -s 192.168.218.12/255.255.255.255 -m udp --sport 6989 -m comment --comment "!fw3: Bittorrent server DSCP BULK" -j DSCP --set-dscp 0x08
root@Router:~# iptables -t mangle -L
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
DSCP tcp -- Waldorf.lan.darbyshire-bryant.me.uk anywhere tcp spt:6989 /* !fw3: Bittorrent server DSCP BULK */ DSCP set 0x08
DSCP udp -- Waldorf.lan.darbyshire-bryant.me.uk anywhere udp spt:6989 /* !fw3: Bittorrent server DSCP BULK */ DSCP set 0x08
So that’s the egress (upload) sorted out ‘easy’. Of course the tcp ack packets on the return (ingress) path aren’t classified as BULK nor is any bittorrent download traffic.
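For reference, the 'CS1' in the fw3 rule and the 0x08 in the generated iptables rules are the same code point: class selector CSn is DSCP value n × 8. A minimal sketch of that arithmetic follows; the function name cs_to_dscp is made up for illustration and is not part of fw3 or iptables.

```shell
#!/bin/sh
# Convert a class selector name (CS0..CS7) to its DSCP value in hex.
# CSn occupies the top three bits of the 6-bit DSCP field, so DSCP = n * 8.
cs_to_dscp() {
    case "$1" in
        CS[0-7]) n=${1#CS} ;;
        *) echo "unknown class selector: $1" >&2; return 1 ;;
    esac
    printf '0x%02x\n' $(( n * 8 ))
}

cs_to_dscp CS1   # prints 0x08, the value in the rules above
cs_to_dscp CS6   # prints 0x30
```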
Ingress is harder, and my knowledge goes flaky at this point too. The qdisc runs at a point before de-NAT; this was the ‘flaw’ with host fairness in the NAT world before the ‘nat’ keyword. Similarly, I think any iptables rules that change DSCP run after that point, i.e. too late for cake to notice. Then Toke shone a light and gave cake tc filter knowledge, so for a while I had the following in a tweaked ‘layer_cake.qos’ script:
MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
# catch bittorrent incoming
$TC filter add dev $DEV parent $MAJOR protocol ip u32 \
match ip dport 6989 0xffff action skbedit priority ${MAJOR}1
This catches any traffic destined to port 6989 and tells cake to put it into the Bulk tin. So I have a solution that puts traffic coming from my bittorrent host on port 6989, or going to port 6989, into the Bulk tin. It’s not exactly scalable though.
Then I saw someone using connmark [1] and I’ve subsequently tweaked that into my own version and it appears to work but… I don’t really know/understand why - hence this email!
One of the things that openwrt’s sqm-scripts does to enable ingress QoS is to redirect ingress traffic on the WAN interface to an ifb interface…and you put the qdisc on the egress of the ifb to effectively create an ingress qdisc. The mirror/redirect function is implemented by ‘action mirred redirect dev target_dev’. Linux/TC also allows you to restore connection tracking marks at this stage as well, hence the following line:
# redirect all IP packets arriving in $IFACE to ifb0
# and restore connmark
$TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev $DEV
With the connmarks restored TC filters can be told to ‘do stuff’ based on those marks, in this case edit the skb’s priority field so now Cake stuffs the packet into the selected tin(1/3):
MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
$TC filter add dev $DEV parent $MAJOR protocol ip \
handle 0x01 fw action skbedit priority ${MAJOR}1 #handle mark 0x01
$TC filter add dev $DEV parent $MAJOR protocol ip \
handle 0x03 fw action skbedit priority ${MAJOR}3 #handle mark 0x03
As I understand it, this isn’t changing the DSCP field on the packet itself but just an skb priority that Cake understands (something to do with Major numbers matching)
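The 'major numbers matching' part can be made concrete. In the scripts above, the priority handed to skbedit is the qdisc handle (the major part) followed by a tin number starting at 1. A minimal sketch of building that value is shown below; the sample line is canned so it runs without a router, and the helper names qdisc_major and tin_priority are invented for illustration.

```shell
#!/bin/sh
# Extract the qdisc handle (third field, e.g. "809b:") from the first line
# of `tc qdisc show dev <dev>` output supplied on stdin.
qdisc_major() {
    awk 'NR == 1 { print $3 }'
}

# Build the skb priority "major:minor" that selects tin number $2.
tin_priority() {
    printf '%s%s\n' "$1" "$2"
}

# Canned sample output (an assumption); on a live system this would be
#   tc qdisc show dev "$DEV"
sample='qdisc cake 809b: root refcnt 2 bandwidth 78Mbit diffserv4'

MAJOR=$(printf '%s\n' "$sample" | qdisc_major)
tin_priority "$MAJOR" 1   # prints 809b:1
```

If the handle in the filter does not match the handle of the cake instance on that device, the priority is not treated as a tin selection, so it is worth reading the handle from the same device the filter is attached to.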
So now all we have to do is mark our connections as required. And here’s where my knowledge trouble really starts:
# Configure iptables chain to mark packets
ipt -t mangle -N QOS_MARK_${IFACE}
# If packets have DSCP 8 then mark that connection so the response packets are also sent to bulk
ipt -t mangle -A QOS_MARK_${IFACE} -p tcp -m dscp --dscp 8 -j MARK --set-mark 0x01/0xff
ipt -t mangle -A QOS_MARK_${IFACE} -p udp -m dscp --dscp 8 -j MARK --set-mark 0x01/0xff
#kdb high priority marking
ipt -t mangle -A QOS_MARK_${IFACE} -p tcp -m dscp --dscp 0x30 -j MARK --set-mark 0x03/0xff
ipt -t mangle -A QOS_MARK_${IFACE} -p udp -m dscp --dscp 0x30 -j MARK --set-mark 0x03/0xff
This makes sense so far, basically anything that has a matching dscp already, set the corresponding connmark.
Then the original script does this:
# You can add more rules here. For example, to mark incoming connections on port 9999 tcp/udp
# ipt -t mangle -A QOS_MARK_${IFACE} -i $IFACE -p tcp --dport 9999 -j MARK --set-mark 0x01/0xff
# ipt -t mangle -A QOS_MARK_${IFACE} -i $IFACE -p udp --dport 9999 -j MARK --set-mark 0x01/0xff
I’m confused at this point - see later.
# save the connmark - will be 0x00 unless any of the above changed it
ipt -t mangle -A QOS_MARK_${IFACE} -j CONNMARK --save-mark
That makes sense, I get that.
And finally after all that effort of making the chain, we’d better actually use it:
# Send unmarked connections to the marking chain
ipt -t mangle -A PREROUTING -i $IFACE -m mark --mark 0x00/0xff -g QOS_MARK_${IFACE}
ipt -t mangle -A POSTROUTING -o $IFACE -m mark --mark 0x00/0xff -g QOS_MARK_${IFACE}
And so, finally, we come to my point of confusion and my question: does setting a firewall mark on ingress actually work, and if so how? I didn’t think the PREROUTING mangle chain would have run before tc’s ‘action connmark action mirred’.
Somehow it does appear to be working, but I genuinely don’t understand how, say, an unsolicited ingress UDP packet (from a bittorrent peer) would have been marked.
Sorry for the ‘War & Peace’ length. I’ve tried reading iptables tutorials and a tc filters tutorial [2] but I can’t quite put it all together.
[1] https://github.com/dtaht/sch_cake/issues/97#issuecomment-412248518
[2] http://linux-ip.net/gl/tc-filters/tc-filters.html
Cheers,
Kevin D-B
012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
* Re: [Cake] Ingress classification
2019-02-05 12:08 [Cake] Ingress classification Kevin Darbyshire-Bryant
@ 2019-02-05 13:38 ` John Sager
2019-02-06 12:52 ` Kevin Darbyshire-Bryant
0 siblings, 1 reply; 8+ messages in thread
From: John Sager @ 2019-02-05 13:38 UTC (permalink / raw)
To: cake
As you say, an unsolicited incoming packet doesn't get marked. However it
creates a conntrack record with zero mark. What you then do is to mark the
conntrack record later so that all subsequent packets on that connection get
marked by 'action connmark'. So the first packet gets classified on ifb to
some low priority queue, but subsequent ones go where they should.
I do this for incoming ssh and VPN connections, though I'm using
htb/fq_codel rather than cake at the moment.
John
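The ordering described above can be modelled as a toy state machine: the tc connmark action reads the connection's mark before iptables has had a chance to set it, so only the first packet misses. This is a sketch under simplifying assumptions (one connection, one mark value, tin names chosen for illustration); nothing here touches the kernel.

```shell
#!/bin/sh
# CT_MARK stands in for one connection's conntrack mark.
CT_MARK=0   # a new conntrack entry starts with mark 0

# Stage 1: tc 'action connmark' on the WAN ingress hook copies the
# connection's current mark onto the packet; the qdisc on the ifb
# then picks a tin from that mark.
tc_ingress() {
    pkt_mark=$CT_MARK
    case "$pkt_mark" in
        1) echo "bulk" ;;
        *) echo "default" ;;
    esac
}

# Stage 2: the iptables mangle chain runs later in the path and saves
# a mark to the connection (here: mark 1).
iptables_mangle() {
    CT_MARK=1
}

# Each packet passes stage 1 first, then stage 2.
for n in 1 2 3; do
    tin=$(tc_ingress)
    iptables_mangle
    echo "packet $n -> $tin"
done
```

Run as written, the first packet is reported in the default tin and the following ones in bulk, which is the behaviour the reply describes.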
* Re: [Cake] Ingress classification
2019-02-05 13:38 ` John Sager
@ 2019-02-06 12:52 ` Kevin Darbyshire-Bryant
2019-02-06 13:54 ` Toke Høiland-Jørgensen
2019-02-06 16:19 ` Stephen Hemminger
0 siblings, 2 replies; 8+ messages in thread
From: Kevin Darbyshire-Bryant @ 2019-02-06 12:52 UTC (permalink / raw)
To: John Sager; +Cc: cake
> On 5 Feb 2019, at 13:38, John Sager <john@sager.me.uk> wrote:
>
> As you say, an unsolicited incoming packet doesn't get marked. However it
> creates a conntrack record with zero mark. What you then do is to mark the
> conntrack record later so that all subsequent packets on that connection get
> marked by 'action connmark'. So the first packet gets classified on ifb to
> some low priority queue, but subsequent ones go where they should.
>
> I do this for incoming ssh and VPN connections, though I'm using
> htb/fq_codel rather than cake at the moment.
>
Thank you John, that has confirmed my understanding that in essence it’s not possible in linux to mangle/mark the first packet on ingress and you ideally need the DSCP to be correct.
My router threw me another curve ball in that it was classifying incoming packets correctly but not outgoing acks. Since the (ingress) connmarks were based on DSCP values, I really couldn’t understand how the connection had been marked correctly for ingress while the egress was wrong.
This turned out to be fallout from openwrt’s software flow offload feature which bypasses some more of the stack. So ingress classification was based on connmarks whilst the egress was based on DSCP and because of the flow offloading the DSCP values weren’t being mangled after the first few packets.
At this stage I’m wondering if it’s possible to get tc/cake to classify egress based on connmarks instead of relying on DSCP, but my tc filter syntax is failing me at the moment :-)
Kevin D-B
012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
* Re: [Cake] Ingress classification
2019-02-06 12:52 ` Kevin Darbyshire-Bryant
@ 2019-02-06 13:54 ` Toke Høiland-Jørgensen
2019-02-10 21:54 ` Kevin Darbyshire-Bryant
2019-02-06 16:19 ` Stephen Hemminger
1 sibling, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-02-06 13:54 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant, John Sager; +Cc: cake
Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>> On 5 Feb 2019, at 13:38, John Sager <john@sager.me.uk> wrote:
>>
>> As you say, an unsolicited incoming packet doesn't get marked. However it
>> creates a conntrack record with zero mark. What you then do is to mark the
>> conntrack record later so that all subsequent packets on that connection get
>> marked by 'action connmark'. So the first packet gets classified on ifb to
>> some low priority queue, but subsequent ones go where they should.
>>
>> I do this for incoming ssh and VPN connections, though I'm using
>> htb/fq_codel rather than cake at the moment.
>>
>
> Thank you John, that has confirmed my understanding that in essence
> it’s not possible in linux to mangle/mark the first packet on ingress
> and you ideally need the DSCP to be correct.
Not with iptables, but you can do it with tc filters. Either by writing
a BPF filter, or by using the pedit action (which actually changes bytes
in the packet unlike skbedit).
-Toke
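Rewriting DSCP in the packet, by whatever mechanism, amounts to replacing the upper six bits of the IPv4 ToS byte while leaving the two ECN bits alone. The sketch below shows only that byte arithmetic, with an invented function name; it is not tc or BPF syntax.

```shell
#!/bin/sh
# Return the new ToS/DS byte after setting DSCP to $2 on old byte $1.
# DSCP is the upper 6 bits (mask 0xfc); ECN is the lower 2 bits (mask 0x03).
remark_tos() {
    old=$(( $1 ))
    dscp=$(( $2 ))
    printf '0x%02x\n' $(( (old & 0x03) | ((dscp << 2) & 0xfc) ))
}

remark_tos 0x00 8      # CS1, no ECN bits set     -> 0x20
remark_tos 0x03 8      # CS1, ECN bits preserved  -> 0x23
remark_tos 0xb8 0      # EF byte reset to CS0     -> 0x00
```

Any real rewrite of this byte also has to update the IPv4 header checksum, which this sketch does not model.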
* Re: [Cake] Ingress classification
2019-02-06 13:54 ` Toke Høiland-Jørgensen
@ 2019-02-10 21:54 ` Kevin Darbyshire-Bryant
2019-02-10 22:18 ` Toke Høiland-Jørgensen
0 siblings, 1 reply; 8+ messages in thread
From: Kevin Darbyshire-Bryant @ 2019-02-10 21:54 UTC (permalink / raw)
To: Toke Høiland-Jørgensen; +Cc: cake
> On 6 Feb 2019, at 13:54, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>
>>
>> Thank you John, that has confirmed my understanding that in essence
>> it’s not possible in linux to mangle/mark the first packet on ingress
>> and you ideally need the DSCP to be correct.
>
> Not with iptables, but you can do it with tc filters. Either by writing
> a BPF filter, or by using the pedit action (which actually changes bytes
> in the packet unlike skbedit).
>
> -Toke
It’s not so much about tweaking DSCP values but more about persuading packets to go into different cake tins for bandwidth allocation/latency target purposes. I’m assuming there’s a performance advantage in not tweaking the packet unless strictly necessary.
The previously mentioned attempt at getting egress tc filters to work *did* actually succeed. Toke may ‘appreciate’ the following hacked extract from an sqm-scripts layer_cake.qos
egress() {
    SILENT=1 $TC qdisc del dev $IFACE root
    $TC qdisc add dev $IFACE root $( get_stab_string ) cake \
        bandwidth ${UPLINK}kbit $( get_cake_lla_string ) ${EGRESS_CAKE_OPTS} ${EQDISC_OPTS}
    MAJOR=$( tc qdisc show dev $IFACE | head -1 | awk '{print $3}' )
    $TC filter add dev $IFACE parent $MAJOR protocol ip handle 0x01 fw action skbedit priority ${MAJOR}1
    $TC filter add dev $IFACE parent $MAJOR protocol ip handle 0x03 fw action skbedit priority ${MAJOR}3
    $TC filter add dev $IFACE parent $MAJOR protocol ip handle 0x04 fw action skbedit priority ${MAJOR}4
}
The ingress side being:
$TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev $DEV
MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
$TC filter add dev $DEV parent $MAJOR protocol all handle 0x01 fw action skbedit priority ${MAJOR}1
$TC filter add dev $DEV parent $MAJOR protocol all handle 0x03 fw action skbedit priority ${MAJOR}3
$TC filter add dev $DEV parent $MAJOR protocol all handle 0x04 fw action skbedit priority ${MAJOR}4
# Configure iptables chain to mark packets
ipt -t mangle -N QOS_MARK_${IFACE}
A variety of rules along these lines (to set the packet mark):
iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.218.5/255.255.255.255 -m comment \
--comment "Skybox DSCP CS1 Bulk" -j MARK --set-mark 0x01/0xff
# save the packet mark to connmark
ipt -t mangle -A QOS_MARK_${IFACE} -j CONNMARK --save-mark
# Send unmarked connections to the marking chain
ipt -t mangle -A PREROUTING -i $IFACE -m mark --mark 0x00/0xff -g QOS_MARK_${IFACE}
ipt -t mangle -A POSTROUTING -o $IFACE -m mark --mark 0x00/0xff -g QOS_MARK_${IFACE}
The vast majority of the egress stuff above being shamelessly stolen from a github entry I saw ;-)
I do wonder if there’s a more efficient way of doing it though. Setting CONNMARK directly instead of setting a packet mark and then copying that across to a connmark would appear sensible?
Cheers,
Kevin D-B
012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
* Re: [Cake] Ingress classification
2019-02-10 21:54 ` Kevin Darbyshire-Bryant
@ 2019-02-10 22:18 ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-02-10 22:18 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: cake
Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>> On 6 Feb 2019, at 13:54, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>>
>>>
>>> Thank you John, that has confirmed my understanding that in essence
>>> it’s not possible in linux to mangle/mark the first packet on ingress
>>> and you ideally need the DSCP to be correct.
>>
>> Not with iptables, but you can do it with tc filters. Either by writing
>> a BPF filter, or by using the pedit action (which actually changes bytes
>> in the packet unlike skbedit).
>>
>> -Toke
>
> It’s not so much about tweaking DSCP values but more about persuading
> packets to go into different cake tins for bandwidth
> allocation/latency target purposes. I’m assuming there’s a
> performance advantage in not tweaking the packet unless strictly necessary.
I very much doubt you would be able to measure any difference between
the two approaches. And actually remarking the packets would keep the
effect when they traverse the network (say, for WiFi links).
> I do wonder if there’s a more efficient way of doing it though.
> Setting CONNMARK directly instead of setting a packet mark and then
> copying that across to a connmark would appear sensible?
Depending on how many rules you have, my guess would be that the most
inefficient thing is traversing all of them. You could use ipset to
alleviate this, I guess. Or reimplement the whole thing as a single BPF
filter...
Or maybe just re-evaluate whether you really need that convoluted a
ruleset? ;)
-Toke
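One way to read the ipset suggestion is to collect the hosts that belong in a tin into a set, match the whole set with a single rule, and write the connmark directly. The sketch below only prints the commands it would run, so it needs no root. The set name qos_bulk, the interface name and the second host address are assumptions for illustration, and the result has not been tested against the setup in this thread.

```shell
#!/bin/sh
# Dry run: print commands instead of executing them.
run() { printf '%s\n' "$*"; }

IFACE=eth0                 # assumed WAN interface
SET=qos_bulk               # hypothetical ipset name
BULK_HOSTS="192.168.218.5 192.168.218.12"

emit_bulk_rules() {
    run ipset create "$SET" hash:ip
    for host in $BULK_HOSTS; do
        run ipset add "$SET" "$host"
    done
    # One rule for every host in the set, writing the connmark directly.
    run iptables -t mangle -A "QOS_MARK_${IFACE}" \
        -m set --match-set "$SET" src \
        -j CONNMARK --set-mark 0x01/0xff
}

emit_bulk_rules
```

Note that writing the connmark directly does not put the mark on the packet currently being processed, so an egress fw filter that matches the packet mark would probably still need the mark copied back onto the packet, for example with CONNMARK --restore-mark.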
* Re: [Cake] Ingress classification
2019-02-06 12:52 ` Kevin Darbyshire-Bryant
2019-02-06 13:54 ` Toke Høiland-Jørgensen
@ 2019-02-06 16:19 ` Stephen Hemminger
2019-02-07 16:28 ` Kevin Darbyshire-Bryant
1 sibling, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2019-02-06 16:19 UTC (permalink / raw)
To: cake
On Wed, 6 Feb 2019 12:52:22 +0000
Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
> > On 5 Feb 2019, at 13:38, John Sager <john@sager.me.uk> wrote:
> >
> > As you say, an unsolicited incoming packet doesn't get marked. However it
> > creates a conntrack record with zero mark. What you then do is to mark the
> > conntrack record later so that all subsequent packets on that connection get
> > marked by 'action connmark'. So the first packet gets classified on ifb to
> > some low priority queue, but subsequent ones go where they should.
> >
> > I do this for incoming ssh and VPN connections, though I'm using
> > htb/fq_codel rather than cake at the moment.
> >
>
> Thank you John, that has confirmed my understanding that in essence it’s not possible in linux to mangle/mark the first packet on ingress and you ideally need the DSCP to be correct.
>
> My router threw me another curve ball in that it was classifying incoming packets correctly but outgoing acks weren’t. Since (ingress) connmarks were based on DSCP values I really couldn’t understand how the connection had been marked correctly for ingress but the egress was wrong.
>
> This turned out to be fallout from openwrt’s software flow offload feature which bypasses some more of the stack. So ingress classification was based on connmarks whilst the egress was based on DSCP and because of the flow offloading the DSCP values weren’t being mangled after the first few packets.
>
> At this stage I’m wondering if it’s possible to get tc/cake to classify egress based on connmarks instead of relying on DSCP, but my tc filter syntax is failing me at the moment :-)
It is possible to use a tc ingress filter to remark DSCP.
* Re: [Cake] Ingress classification
2019-02-06 16:19 ` Stephen Hemminger
@ 2019-02-07 16:28 ` Kevin Darbyshire-Bryant
0 siblings, 0 replies; 8+ messages in thread
From: Kevin Darbyshire-Bryant @ 2019-02-07 16:28 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Cake List
> On 6 Feb 2019, at 16:19, Stephen Hemminger <stephen@networkplumber.org> wrote:
>
>>
>>
>> At this stage I’m wondering if it’s possible to get tc/cake to classify egress based on connmarks instead of relying on DSCP, but my tc filter syntax is failing me at the moment :-)
>
> It is possible to use a tc ingress filter to remark DSCP.
Hmmm. I wonder if this would be more efficient to re-mark the DSCP code and let cake use its normal DSCP to tin mapping instead of overriding via an skb priority edit?
My new problem is using fw marks with tc filters changing skb priority on *egress*. If I could get that to work then I think it would become compatible with software flow offloading.
I can’t get it to work and don’t know if I’m doing something wrong (likely) with my tc incantations or if it’s just not possible.
I think something like the following should work, assuming the cake instance handling egress on eth0 is 809a: as shown by
# tc qdisc show dev ifb4eth0
qdisc cake 809b: root refcnt 2 bandwidth 78Mbit diffserv4 dual-dsthost nat nowash ingress no-ack-filter split-gso rtt 100.0ms ptm overhead 26
So the following should change the skb priority field based on a firewall connmark of 0x01
tc filter add dev eth0 parent 809a: protocol all handle 0x01 fw action skbedit priority 809a:4
Cheers,
Kevin D-B
012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A