From: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: "cake@lists.bufferbloat.net" <cake@lists.bufferbloat.net>
Subject: Re: [Cake] act_connmark + dscp
Date: Fri, 8 Mar 2019 11:13:33 +0000
Message-ID: <00E839ED-7FA4-4577-838F-775EC9A90608@darbyshire-bryant.me.uk>
In-Reply-To: <878sxq1t3e.fsf@toke.dk>

> On 7 Mar 2019, at 17:40, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>
>>> On 7 Mar 2019, at 10:10, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> <snipping some context>
>>>>>>
>>>>>> The valid bit is set when the ‘getdscp’ function has written a DSCP
>>>>>> value into the conntrack (& hence skb) mark. This allows us & other
>>>>>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that
>>>>>> a DSCP value has been placed in the field. We cannot simply use a
>>>>>> non-zero DSCP because zero is a valid DSCP.
>>>>>
>>>>> If someone installs the action, the field is supposedly always copied;
>>>>> so why do we need another flag?
>>>>
>>>> I’m trying to limit the number of times expensive iptables mangle
>>>> rules have to run.
>>>
>>> Right, I see your point, but I'm worried that this can risk becoming a
>>> source of hard-to-debug bugs if this bit happens to get set by accident
>>> in other places. So, I would suggest to at least make it optional (and
>>> configurable). So how about the following configuration options:
>>
>> Phew, I explained myself clearly enough (just) that time :-). There’s
>> a compromise here to be had between setting/using a DSCP for
>> connection(fwmark) vs setting/using a DSCP per packet. Mostly it’s a
>> (improved) performance vs DSCP accuracy compromise and assumes DSCP
>> isn’t going to change after the connection has been established - some
>> people have rules that mangle DSCP based on amount of data
>> transferred. Personal view: anything that permits some sort of
>> classification restoration on ingress has to be an improvement on what
>> we have now.
>
> So I don't think we in general can assume that all packets of the same
> flow have the same DSCP mark. If we want to enforce this, that is another
> matter entirely.
>
>>> - fwmark shift (valid values 0-32)
>>> specifies how many bits to left-shift the DSCP values before putting
>>> them into the fwmark (and how many bits to right-shift the value read
>>> from fwmark before writing it to the DSCP field); this could also be
>>> inferred from the mask rather than be a separate option (shift =
>>> lowest set bit of mask)
>>
>> I think the inferred choice is best. I can see all manner of confusing
>> behaviour with mismatches between mask & shift. Also it’s one less
>> parameter to pass … and in my dream world have to add some of these
>> parameters into cake as well so it can interpret the DSCP containing
>> fwmark as well… the fewer the better :-)
>
> Yeah, I tend to lean towards that as well. Which could also work for
> cake: simply interpret the FWMARK parameter as a mask, and shift by the
> lowest set bit.
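
A minimal sketch of the inferred-shift idea (shift = lowest set bit of
mask), for illustration only. The helper names and the example mask
0xFC000000 are assumptions made up for this sketch; they are not taken
from act_connmark or cake.

```python
FWMARK_BITS = 0xFFFFFFFF  # fwmark is a 32-bit value


def shift_from_mask(mask: int) -> int:
    """Return the index of the lowest set bit of mask (hypothetical helper)."""
    if mask == 0:
        raise ValueError("mask must have at least one bit set")
    # mask & -mask isolates the lowest set bit; bit_length() - 1 is its index.
    return (mask & -mask).bit_length() - 1


def dscp_to_fwmark(fwmark: int, dscp: int, mask: int) -> int:
    """Store dscp in the masked bits of fwmark, leaving other bits untouched."""
    shift = shift_from_mask(mask)
    cleared = fwmark & ~mask & FWMARK_BITS
    return cleared | ((dscp << shift) & mask)


def fwmark_to_dscp(fwmark: int, mask: int) -> int:
    """Read the DSCP back out of the masked bits of fwmark."""
    return (fwmark & mask) >> shift_from_mask(mask)


if __name__ == "__main__":
    mask = 0xFC000000                      # assumed: top six bits hold the DSCP
    mark = dscp_to_fwmark(0x1, 46, mask)   # 46 is the EF code point
    print(hex(mark), fwmark_to_dscp(mark, mask))
```

Deriving the shift from the mask means a mask/shift mismatch cannot be
configured in the first place, which is the confusion the inferred
option is meant to avoid.
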
>
>>> - get_dscp (boolean; cannot be set along with set_dscp)
>>> if set, the DSCP field will be copied to the fwmark field, subject to
>>> mask and shift
>>
>> Yes
>>
>>>
>>> - set_dscp (boolean; cannot be set along with get_dscp)
>>> if set, the value in fwmark will be copied to the DSCP field, subject
>>> to mask and shift
>>
>> Yes
>>
>>>
>>> - state mask (32-bit value; probably needs better name)
>>> if set: the get_dscp action will OR the resulting fwmark before
>>> storing it. the set_dscp value will AND the fwmark with this value
>>> before doing anything, and abort if the result is false.
>>
>> Nearly: If set; the get_dscp action will AND the fwmark with this
>> value and abort if true. If false it will OR the resulting fwmark
>> before storing it. the set_dscp action will AND the fwmark with this
>> value before doing anything and will abort if result is false.
>>
>> The ‘get_dscp action AND with fwmark and abort if true’ is the
>> function that allows DSCP values to ‘stick’ into a connection, thus
>> obviating the need for iptables rules to mangle the DSCP on every
>> egress packet.
>>
>> I can see a conflict between people who want the DSCP copied into
>> fwmark no matter what and people such as myself who wish it to only
>> happen if fwmark doesn’t have it set already.
>>
>> Is the solution to have a ‘get_check_state’ option that enables the
>> ‘get_dscp action: AND with fwmark state mask, abort if true’ check?
>> Maybe even simpler to have a ‘get_state mask’ and a ‘set_state mask’:
>> AND the fwmark with the get_state mask; if false, copy the DSCP, else
>> abort.
>
> Right, if you want to be able to do all those combinations of things we
> quickly run into combinatorial explosion, or we risk hard-coding a
> policy that some users won't like. However, if we do want to express
> these kinds of things, maybe this is better implemented as additional
> options to the existing iptables matches?
>
> I.e., we could add
>
> --set-mark-from-dscp MASK

Yes, that (I think) could work. I could wrap my own iptables-based ‘have
I copied the required DSCP to the fwmark’ rules around that by setting
my own flag bit in the fwmark.

but...

>
> to the MARK target, and
>
> --set-dscp-from-fwmark MASK
>
> to the DSCP target. Then you can implement whatever policy you want
> using iptables rules... Any reason that wouldn't work?

On its own I don’t think that would work for ingress traffic: iptables
happens too late. So on planet Kevin I still need some sort of flag held
in the fwmark that says ‘I hold a DSCP value’, so that cake can use it and
act_connmarkdscp can (optionally) restore it to the diffserv field.

I suspect we’re going around in circles about what I would like, which
is ‘a bit DSCP-fuzzy but lighter on CPU ’cos I don’t have to hit iptables
mangle rules as much’, versus what I think you would like, which is
‘update the fwmark DSCP every time, even though that also requires
iptables to mangle the DSCP for every packet’.

I think the options of get_mask & set_mask can accommodate both
behaviour choices.
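
To make the two behaviours concrete, here is a rough sketch of how
separate get and set state masks might behave. Everything in it is an
assumption for illustration: the function names, the bit layout
(DSCP_MASK, VALID_BIT), and the convention that a state mask of 0 means
‘no check’. It is not the act_connmark implementation.

```python
FWMARK_BITS = 0xFFFFFFFF
DSCP_MASK = 0xFC000000  # assumed layout: top six fwmark bits hold the DSCP
VALID_BIT = 0x02000000  # assumed 'I hold a DSCP value' flag bit


def _shift(mask: int) -> int:
    """Index of the lowest set bit of a non-zero mask."""
    return (mask & -mask).bit_length() - 1


def get_dscp(fwmark: int, dscp: int, dscp_mask: int, get_state_mask: int = 0) -> int:
    """Return the new fwmark after (possibly) copying the packet DSCP into it."""
    if get_state_mask and (fwmark & get_state_mask):
        return fwmark  # abort: a DSCP is already stored, so it 'sticks'
    cleared = fwmark & ~dscp_mask & FWMARK_BITS
    stored = cleared | ((dscp << _shift(dscp_mask)) & dscp_mask)
    return stored | get_state_mask  # flag the stored value as valid


def set_dscp(fwmark: int, dscp: int, dscp_mask: int, set_state_mask: int = 0) -> int:
    """Return the DSCP the packet should carry after (possibly) restoring it."""
    if set_state_mask and not (fwmark & set_state_mask):
        return dscp  # abort: fwmark holds no valid DSCP, leave the packet alone
    return (fwmark & dscp_mask) >> _shift(dscp_mask)


if __name__ == "__main__":
    # 'Sticky' behaviour: the first packet stores DSCP 46, later packets do not overwrite it.
    mark = get_dscp(0, 46, DSCP_MASK, VALID_BIT)
    mark = get_dscp(mark, 0, DSCP_MASK, VALID_BIT)
    print(set_dscp(mark, 0, DSCP_MASK, VALID_BIT))

    # 'Copy every time' behaviour: no state masks, so the latest DSCP always wins.
    mark = get_dscp(mark, 10, DSCP_MASK)
    print(set_dscp(mark, 0, DSCP_MASK))
```

With both state masks set to the same flag bit the DSCP is stored once
per connection; with both left at 0 every packet updates the fwmark.
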

Phew, this is hard. Come on, all you out there… I know you exist! What am
I missing or misunderstanding? Ryan on this list previously said “Perfect!
And to me, this functionality truly is the icing on (the) cake that makes
it the complete bufferbloat/QoS system I've been dreaming of for ingress.”
Thoughts, comments, or do Toke & myself make an amusing enough side show?
:-)

Kevin