From: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: "cake@lists.bufferbloat.net" <cake@lists.bufferbloat.net>
Subject: Re: [Cake] act_connmark + dscp
Date: Mon, 11 Mar 2019 10:51:24 +0000 [thread overview]
Message-ID: <1D3F7894-4A8F-4D57-BE27-18113FB14131@darbyshire-bryant.me.uk> (raw)
In-Reply-To: <87imwq4724.fsf@toke.dk>
I’m tired after the last 3 days working on $paidwork (4 jobs in 3 days) so can’t think straight, anyway...
> On 10 Mar 2019, at 23:56, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>
<some context snipped>
>>
>> Yes. I’d point out the (hopefully obvious) that the flag mask needs
>> to be one bit bigger than you might immediately think. e.g. diffserv4
>> needs to store 5 values (0-4), 3 bits. 0 is being used as an implied
>> ’tin is not set, fall back to DSCP’. One could store DSCP+1 of course
>> and use the same logic.
>
> Yeah, or one can just squat on a whole byte like I do in the example ;)
As do I for my ‘DSCP + set/not set flag’ :-)
>
>>
>> The ugliness of doing the diffserv parsing is what prompted the idea
>> of storing the DSCP directly and I felt the stored tin selection was
>> effectively abstracting the diffserv field anyway.
>
> Right, but that means that the CAKE interpretation of the fwmark would
> have to change from something that selects the tin, to something that is
> treated as a DSCP mark. I think this was the part that I was missing
> before. I don't think this is a good idea, as that means we tie the
> marks to one particular use case.
I did say it was incompatible with the existing tin storing idea right from day 1 and why I was so keen/worried by the fact it had already gone upstream.
>
>> Storing the DSCP is more compatible with differing egress v ingress
>> mappings (eg. egress diffserv4, ingress diffserv3 though I can’t
>> really think of a use case for that)
>
> I think that if someone wants to do something like that, we are way out
> of "simple use case that we want to actively support" territory, and can
> legitimately ask people to go write a BPF filter or something :)
>
>> Of course using fwmark as tin number selector in cake doesn’t preclude
>> some other mechanism of storing & recovering DSCP to/from firewall
>> mark e.g. the previously discussed act-connmark+dscp which would help
>> anyone who wanted to do such ‘link traversing’ DSCP shenanigans. That
>> of course makes you happier since cake doesn’t embed itself further
>> into conntrack.
>
> Yeah, I definitely don't think CAKE has any business writing DSCP values
> into the mark. However, as I said before, there may be a case for adding
I’m going forward on the basis that *cake* ’storing DSCP within fwmark’ idea has died and will be trying to cobble something together within act_connmark as that has a wider potential client base than just cake.
> an option to write the tin selection back to conntrack. Something like
> the patch below would do it (with an option to control it, of course),
> but it does incur a dependency on another conntrack header, so I'm not
> sure if it will be acceptable to upstream. Also, we would need to figure
> out how the option should work.
I think before expending any further mental effort on that, the question should be asked of the kernel people. I don’t see the point in working out the semantics of something that ultimately won’t be accepted.
> The alternative would be to use another mechanism; the iptables rules
> replication is one example. Another could be adding a conntrack helper
> to eBPF…
>
> -Toke
>
>
> diff --git a/sch_cake.c b/sch_cake.c
> index a8fa224..c6b7dd9 100644
> --- a/sch_cake.c
> +++ b/sch_cake.c
> @@ -78,6 +78,7 @@
>
> #if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> #include <net/netfilter/nf_conntrack_core.h>
> +#include <net/netfilter/nf_conntrack_ecache.h>
> #include <net/netfilter/nf_conntrack_zones.h>
> #include <net/netfilter/nf_conntrack.h>
> #endif
> @@ -1646,6 +1647,27 @@ static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
> }
> }
>
> +static void cake_set_tin_connmark(struct cake_sched_data *q,
> + struct sk_buff *skb, u32 tin)
> +{
> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
> + enum ip_conntrack_info ctinfo;
> + struct nf_conn *ct;
> + u32 newmark;
> +
> + ct = nf_ct_get(skb, &ctinfo);
> + if (ct) {
> + newmark = (ct->mark & ~q->fwmark_mask);
> + newmark ^= (tin << q->fwmark_shft) & q->fwmark_mask;
> +
> + if (ct->mark != newmark) {
> + ct->mark = newmark;
> + nf_conntrack_event_cache(IPCT_MARK, ct);
> + }
> + }
> +#endif
> +}
> +
> static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
> struct sk_buff *skb)
> {
> @@ -1678,6 +1700,8 @@ static struct cake_tin_data *cake_select_tin(struct Qdisc *sch,
> tin = 0;
> }
>
> + cake_set_tin_connmark(q, skb, tin);
> +
> return &q->tins[tin];
> }
Cheers,
Kevin D-B
012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A
next prev parent reply other threads:[~2019-03-11 10:51 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-05 14:35 Kevin Darbyshire-Bryant
2019-03-06 15:21 ` Toke Høiland-Jørgensen
2019-03-06 16:47 ` John Sager
2019-03-07 9:50 ` Toke Høiland-Jørgensen
2019-03-06 18:40 ` Kevin Darbyshire-Bryant
2019-03-07 10:10 ` Toke Høiland-Jørgensen
2019-03-07 15:56 ` Kevin Darbyshire-Bryant
2019-03-07 17:40 ` Toke Høiland-Jørgensen
2019-03-08 11:13 ` Kevin Darbyshire-Bryant
2019-03-08 11:28 ` Toke Høiland-Jørgensen
2019-03-08 14:03 ` Kevin Darbyshire-Bryant
2019-03-09 14:08 ` Toke Høiland-Jørgensen
2019-03-10 15:21 ` Kevin Darbyshire-Bryant
2019-03-10 23:56 ` Toke Høiland-Jørgensen
2019-03-11 10:51 ` Kevin Darbyshire-Bryant [this message]
2019-03-11 13:00 ` Toke Høiland-Jørgensen
2019-03-11 14:11 ` Kevin Darbyshire-Bryant
2019-03-11 14:32 ` Toke Høiland-Jørgensen
2019-03-09 20:21 ` John Sager
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://lists.bufferbloat.net/postorius/lists/cake.lists.bufferbloat.net/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1D3F7894-4A8F-4D57-BE27-18113FB14131@darbyshire-bryant.me.uk \
--to=kevin@darbyshire-bryant.me.uk \
--cc=cake@lists.bufferbloat.net \
--cc=toke@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox