[Cake] act_connmark + dscp
Kevin Darbyshire-Bryant
kevin at darbyshire-bryant.me.uk
Thu Mar 7 10:56:51 EST 2019
> On 7 Mar 2019, at 10:10, Toke Høiland-Jørgensen <toke at redhat.com> wrote:
<snipping some context>
>>>>
>>>> The valid bit is set when the ‘getdscp’ function has written a DSCP
>>>> value into the conntrack (& hence skb) mark. This allows us & other
>>>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that
>>>> a DSCP value has been placed in the field. We cannot simply use a
>>>> non-zero DSCP because zero is a valid DSCP.
>>>
>>> If someone installs the action, the field is supposedly always copied;
>>> so why do we need another flag?
>>
>> I’m trying to limit the number of times expensive iptables mangle
>> rules have to run.
>
> Right, I see your point, but I'm worried that this can risk becoming a
> source of hard-to-debug bugs if this bit happens to get set by accident
> in other places. So, I would suggest to at least make it optional (and
> configurable). So how about the following configuration options:
Phew, I explained myself clearly enough (just) that time :-). There’s a compromise here to be had between setting/using a DSCP for connection(fwmark) vs setting/using a DSCP per packet. Mostly it’s a (improved) performance vs DSCP accuracy compromise and assumes DSCP isn’t going to change after the connection has been established - some people have rules that mangle DSCP based on amount of data transferred. Personal view: anything that permits some sort of classification restoration on ingress has to be an improvement on what we have now.
>
> - fwmark mask (32-bit value)
> specifies a mask to apply to the fwmark field before all operations
Yes - makes sense.
>
> - fwmark shift (valid values 0-32)
> specifies how many bits to left-shift the DSCP values before putting
> them into the fwmark (and how many bits to right-shift the value read
> from fwmark before writing it to the DSCP field); this could also be
> inferred from the mask rather than be a separate option (shift =
> lowest set bit of mask)
I think the inferred choice is best. I can see all manner of confusing behaviour with mismatches between mask & shift. Also it’s one less parameter to pass … and in my dream world have to add some of these parameters into cake as well so it can interpret the DSCP containing fwmark as well… the fewer the better :-)
>
> - get_dscp (boolean; cannot be set along with set_dscp)
> if set, the DSCP field will be copied to the fwmark field, subject to
> mask and shift
Yes
>
> - set_dscp (boolean; cannot be set along with get_dscp)
> if set, the value in fwmark will be copied to the DSCP field, subject
> to mask and shift
Yes
>
> - state mask (32-bit value; probably needs better name)
> if set: the get_dscp action will OR the resulting fwmark before
> storing it. the set_dscp value will AND the fwmark with this value
> before doing anything, and abort if the result is false.
Nearly: If set; the get_dscp action will AND the fwmark with this value and abort if true. If false it will OR the resulting fwmark before storing it. the set_dscp action will AND the fwmark with this value before doing anything and will abort if result is false.
The ‘get_dscp action AND with fwmark and abort if true’ is the function that allows DSCP values to ‘stick' into a connection, thus obviating the need for iptables rules to mangle the DSCP on every egress packet.
I can see a conflict between people who want the DSCP copied into fwmark no matter what and people such as myself who wish it to only happen if fwmark doesn’t have it set already.
Is the solution to have a ‘get_check_state’ option that enables the ‘get_dscp action and with fwmark ’state mask’ abort if true check? Maybe even simpler to have a ‘get_state mask’ and a ’set_state mask’ - AND the fwmark with the get_state mask, if false copy the dscp, else abort.
>
> I think this would allow you to implement what you described, without
> hard-coding any behaviour. Right?
Not quite - see above.
>
> Does anyone else have any opinions / objections to the above API?
>
>
>> The reality is that I enjoyed doing this in the cake codebase. I
>> cannot say the same for act_connmark in fact I hate it so much I’m
>> stopping. The mental effort for a non-programmer and more importantly
>> a non-kernel programmer is exhausting & I’m completely disillusioned.
>> I really need to concentrate on the job that means I can pay the
>> mortgage, which isn’t bashing my head against the kernel.
>
> Fair enough; no reason to do this if you're not enjoying it! We can
> iterate on the API, and I guess I can write the code at some point in
> the future if no one else beats me to it. No promises on when, though,
> so if someone else feels like tackling it, please go ahead :)
>
> -Toke
More information about the Cake
mailing list