Cake - FQ_codel the next generation
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
Cc: Pete Heist <pete@heistp.net>,
	"cake\@lists.bufferbloat.net" <cake@lists.bufferbloat.net>
Subject: Re: [Cake] Using firewall connmarks as tin selectors
Date: Mon, 04 Mar 2019 13:44:33 +0100	[thread overview]
Message-ID: <871s3mlsge.fsf@toke.dk> (raw)
In-Reply-To: <52F1A35A-9F57-40EC-AF9B-EC6D8BD376BE@darbyshire-bryant.me.uk>

Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:

>> On 4 Mar 2019, at 11:17, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> 
>> Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> writes:
>> 
>>>> On 4 Mar 2019, at 08:39, Pete Heist <pete@heistp.net> wrote:
>>>> 
>>>> 
>>>>> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
>>>>> 
>>>>> The very bad idea:
>>>>> 
>>>>> And it’s bad ‘cos it’s sort of incompatible with the existing fwmark
>>>>> implementation as described above. So an awful lot of our
>>>>> shenanigans above is due to DSCP not traversing the internet
>>>>> particularly well. The solution above abstracts DSCP into ’tins’
>>>>> which we put into fwmarks. Another approach would be to put the DSCP
>>>>> *into* the fwmark. CAKE could (optionally) copy the FWMARK contained
>>>>> DSCP into the diffserv field onto the actual packets. Voila DSCP
>>>>> traversal across ’tinternet with tin/bandwidth allocation in our
>>>>> local domain preserved.
>>>> 
>>>> If I understand it right, another use case for this “very bad idea”
>>>> is preserving DSCP locally while traversing upstream WiFi links as
>>>> besteffort, which avoids airtime efficiency problems that can occur
>>>> with 802.11e (WMM). In cases where the router config can’t be changed
>>>> (802.11e is mandatory after all) I’ve used IPIP tunnels for this, as
>>>> it hides DSCP from the WiFi stack while preserving the values through
>>>> the tunnel, but this would be easier. Neat… :)
>>> 
>>> Everyone has understood the intent & maybe the implementation
>>> correctly. 2 patches attached, one for cake, one for tc.
>>> 
>>> They are naively coded and some of it undoes Toke’s recent tidying up
>>> (sorry!)
>> 
>> Heh. First comment: Don't do that ;)
>
> I did say naively coded.
>
>> 
>> A few more below.
>> 
>>> 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
>>> diff --git a/pkt_sched.h b/pkt_sched.h
>>> index a2f570c..d1f288d 100644
>>> --- a/pkt_sched.h
>>> +++ b/pkt_sched.h
>>> @@ -879,6 +879,7 @@ enum {
>>> 	TCA_CAKE_ACK_FILTER,
>>> 	TCA_CAKE_SPLIT_GSO,
>>> 	TCA_CAKE_FWMARK,
>>> +	TCA_CAKE_ICING,
>>> 	__TCA_CAKE_MAX
>>> };
>>> #define TCA_CAKE_MAX	(__TCA_CAKE_MAX - 1)
>>> diff --git a/sch_cake.c b/sch_cake.c
>>> index 733b897..5aca0f3 100644
>>> --- a/sch_cake.c
>>> +++ b/sch_cake.c
>>> @@ -270,7 +270,8 @@ enum {
>>> 	CAKE_FLAG_INGRESS	   = BIT(2),
>>> 	CAKE_FLAG_WASH		   = BIT(3),
>>> 	CAKE_FLAG_SPLIT_GSO	   = BIT(4),
>>> -	CAKE_FLAG_FWMARK	   = BIT(5)
>>> +	CAKE_FLAG_FWMARK	   = BIT(5),
>>> +	CAKE_FLAG_ICING		   = BIT(6)
>> 
>> This implies that icing and fwmark can be enabled completely
>> independently of each other. Are you sure about the semantics for that?
>
> No, I’m not.  I sent the patches so others could see my lunacy in action and hopefully improve it.
>
> The actual operation links FWMARK, INGRESS & ICING in a variety of
> combinations.

Right, so obviously this needs to be thought through...

>>> };
>>> 
>>> /* COBALT operates the Codel and BLUE algorithms in parallel, in order to
>>> @@ -333,7 +334,7 @@ static const u8 diffserv8[] = {
>>> };
>>> 
>>> static const u8 diffserv4[] = {
>>> -	0, 2, 0, 0, 2, 0, 0, 0,
>>> +	0, 1, 0, 0, 2, 0, 0, 0,
>>> 	1, 0, 0, 0, 0, 0, 0, 0,
>>> 	2, 0, 2, 0, 2, 0, 2, 0,
>>> 	2, 0, 2, 0, 2, 0, 2, 0,
>>> @@ -344,7 +345,7 @@ static const u8 diffserv4[] = {
>>> };
>>> 
>>> static const u8 diffserv3[] = {
>>> -	0, 0, 0, 0, 2, 0, 0, 0,
>>> +	0, 1, 0, 0, 2, 0, 0, 0,
>> 
>> Why are you messing with the diffserv mappings in this patch?
>
> This is a combination patch of Dave’s new LE coding and the
> fwmark/dscp mangling.

Ah. Well let's keep that separate from this patch/discussion...

>> 
>>> 	1, 0, 0, 0, 0, 0, 0, 0,
>>> 	0, 0, 0, 0, 0, 0, 0, 0,
>>> 	0, 0, 0, 0, 0, 0, 0, 0,
>>> @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
>>> 	return idx + (tin << 16);
>>> }
>>> 
>>> -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
>>> +void cake_update_diffserv(struct sk_buff *skb, u8 dscp)
>>> +{
>>> +	switch (skb->protocol) {
>>> +	case htons(ETH_P_IP):
>>> +		if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) != dscp)
>>> +			ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp);
>>> +		break;
>>> +	case htons(ETH_P_IPV6):
>>> +		if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) != dscp)
>>> +			ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp);
>>> +		break;
>>> +	default:
>>> +		break;
>>> +	}
>>> +
>>> +}
>> 
>> So washing is just a special case of this (wash is
>> cake_update_diffserv(skb,0)). So you shouldn't need to add another
>> function, just augment the existing handling code.
>
> Erm, that’s exactly what I’ve done.

Ah, right; I guess it's just the reverting of the cleanup patch that is
the issue, then :)

>>> +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash)
>>> {
>>> 	u8 dscp;
>>> 
>>> @@ -1644,37 +1662,70 @@ static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
>>> 	}
>>> }
>>> 
>>> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK)
>> 
>> Save an ifdef below by moving the ifdef inside the function definition.
>> 
>>> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp)
>>> +{
>>> +	enum ip_conntrack_info ctinfo;
>>> +	struct nf_conn *ct;
>>> +
>>> +	ct = nf_ct_get(skb, &ctinfo);
>>> +	if (!ct)
>>> +		return;
>>> +
>>> +	ct->mark &= 0x80ffffff;
>>> +	ct->mark |= (0x40 | dscp) << 24;
>> 
>> Right, so we *might* have an argument that putting the *tin* into the
>> fwmark is CAKE's business, but copying over the dscp mark is not
>> something a qdisc should be doing…
>
> Why ever not?  It’s not the DSCP, it’s a lookup value into the cake
> priority table, it just happens to look like the DSCP ;-)

If it quacks like a duck...

>
>>> +	nf_conntrack_event_cache(IPCT_MARK, ct);
>>> +}
>>> +#endif
>> 
>> Also, are you sure this will work in all permutations of conntrack being
>> a module vs not etc? (we had to jump through quite some hoops to get the
>> conntrack hooks to work last time; this is probably my biggest worry here).
>
> No, I have absolutely no clue here at all.

Well, another issue that needs fixing, then...

-Toke

  reply	other threads:[~2019-03-04 12:44 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-27 21:12 Felix Resch
2019-02-28  3:24 ` gamanakis
2019-03-03 11:52   ` Kevin Darbyshire-Bryant
2019-03-03 12:22     ` John Sager
2019-03-03 16:25       ` Kevin Darbyshire-Bryant
2019-03-04 11:04         ` Toke Høiland-Jørgensen
2019-03-04 11:39           ` John Sager
2019-03-04  5:37     ` Ryan Mounce
2019-03-04  6:31       ` Jonathan Morton
2019-03-04  6:37         ` Ryan Mounce
2019-03-04  7:15           ` Dave Taht
2019-03-04  8:39     ` Pete Heist
2019-03-04 11:01       ` Kevin Darbyshire-Bryant
2019-03-04 11:17         ` Toke Høiland-Jørgensen
2019-03-04 11:55           ` Kevin Darbyshire-Bryant
2019-03-04 12:44             ` Toke Høiland-Jørgensen [this message]
2019-03-04 15:50               ` Kevin Darbyshire-Bryant
2019-03-04 16:39                 ` Toke Høiland-Jørgensen
2019-03-04 17:19                   ` Kevin Darbyshire-Bryant
2019-03-04 17:36                     ` Toke Høiland-Jørgensen
2019-03-04 20:58                       ` Kevin Darbyshire-Bryant
2019-03-04 21:33                         ` Toke Høiland-Jørgensen
2019-03-04 21:42                           ` Toke Høiland-Jørgensen
2019-03-05 14:06                           ` Kevin Darbyshire-Bryant
  -- strict thread matches above, loose matches on Subject: below --
2019-02-27 14:52 Kevin Darbyshire-Bryant
2019-02-27 15:14 ` Toke Høiland-Jørgensen
2019-02-28  8:32   ` Kevin Darbyshire-Bryant
2019-02-28  9:54     ` Toke Høiland-Jørgensen
2019-02-28 11:00       ` Kevin Darbyshire-Bryant
2019-02-28 11:13         ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.bufferbloat.net/postorius/lists/cake.lists.bufferbloat.net/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871s3mlsge.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=cake@lists.bufferbloat.net \
    --cc=kevin@darbyshire-bryant.me.uk \
    --cc=pete@heistp.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox