From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-f65.google.com (mail-ed1-f65.google.com [209.85.208.65]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 00AC83B29E for ; Mon, 4 Mar 2019 07:44:35 -0500 (EST) Received: by mail-ed1-f65.google.com with SMTP id m35so4110418ede.10 for ; Mon, 04 Mar 2019 04:44:35 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version:content-transfer-encoding; bh=jKcW2NNPNKdIuGcJfDFrfDJYE/SMrnlKhnCWSZfioDc=; b=E4OfW4Gs42lEmw5Aa6gHSX41YH+/W1ZM/axPtTRjvz38HnHyKYFMjEvBoDqk9OSm7d zwuOdwxuKWwiwk6VmdvMoompThT4MUE29MQSVs26P0894aye/jCVeGxDATNA5it6F3yw x3T3nu5QcNZNGq5WlBYAgdKYyxPH6JQhXAGep4EqJEZkjME6vQ2/Ws3f8QSegzFJBG3e ebSTlhYBCie8P/SJrBCLFJKtkbukVMYXftZRMFO5L+D+nH5gMCluwZvM0cPhddg7nS+Y td9xPijRbQRDPEf5bZr9t27qGXcsAU0pZnAVeKeCYtmjfqAUV6/uQv+0bInrsGBKwB9F ApDw== X-Gm-Message-State: APjAAAUzfTewVngbXeaHYV3OjtSuDd5s87gnfmXu2lKs8YBnvPEGsIG+ d8Lof+2h190RwulicXCs4z+QEGCE1G8= X-Google-Smtp-Source: APXvYqx0r8G5InToUI7VwZpP+HuUSIM0WTkJH0bGHSEd39IUToh6YdbdCUVpA9PKC/tW9iVa3dwmbQ== X-Received: by 2002:a17:906:70d6:: with SMTP id g22mr12699397ejk.215.1551703474993; Mon, 04 Mar 2019 04:44:34 -0800 (PST) Received: from alrua-x1.borgediget.toke.dk (borgediget.toke.dk. [85.204.121.218]) by smtp.gmail.com with ESMTPSA id y24sm2019692eds.35.2019.03.04.04.44.33 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 04 Mar 2019 04:44:34 -0800 (PST) Received: by alrua-x1.borgediget.toke.dk (Postfix, from userid 1000) id 7DAC5183BB9; Mon, 4 Mar 2019 13:44:33 +0100 (CET) From: Toke =?utf-8?Q?H=C3=B8iland-J=C3=B8rgensen?= To: Kevin Darbyshire-Bryant Cc: Pete Heist , "cake\@lists.bufferbloat.net" In-Reply-To: <52F1A35A-9F57-40EC-AF9B-EC6D8BD376BE@darbyshire-bryant.me.uk> References: <1443315187.406067.1551301968084@webmail.strato.de> <000901d4cf15$27748ec0$765dac40$@gmail.com> <67E78212-F68E-4BD1-946D-F1830EEA2968@darbyshire-bryant.me.uk> <4D01FA8A-E732-49C8-A4E6-8EA453CFA80C@heistp.net> <87d0n6lwgw.fsf@toke.dk> <52F1A35A-9F57-40EC-AF9B-EC6D8BD376BE@darbyshire-bryant.me.uk> X-Clacks-Overhead: GNU Terry Pratchett Date: Mon, 04 Mar 2019 13:44:33 +0100 Message-ID: <871s3mlsge.fsf@toke.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Cake] Using firewall connmarks as tin selectors X-BeenThere: cake@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: Cake - FQ_codel the next generation List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Mar 2019 12:44:36 -0000 Kevin Darbyshire-Bryant writes: >> On 4 Mar 2019, at 11:17, Toke H=C3=B8iland-J=C3=B8rgensen wrote: >>=20 >> Kevin Darbyshire-Bryant writes: >>=20 >>>> On 4 Mar 2019, at 08:39, Pete Heist wrote: >>>>=20 >>>>=20 >>>>> On Mar 3, 2019, at 12:52 PM, Kevin Darbyshire-Bryant wrote: >>>>>=20 >>>>> The very bad idea: >>>>>=20 >>>>> And it=E2=80=99s bad =E2=80=98cos it=E2=80=99s sort of incompatible w= ith the existing fwmark >>>>> implementation as described above. So an awful lot of our >>>>> shenanigans above is due to DSCP not traversing the internet >>>>> particularly well. The solution above abstracts DSCP into =E2=80=99ti= ns=E2=80=99 >>>>> which we put into fwmarks. Another approach would be to put the DSCP >>>>> *into* the fwmark. CAKE could (optionally) copy the FWMARK contained >>>>> DSCP into the diffserv field onto the actual packets. Voila DSCP >>>>> traversal across =E2=80=99tinternet with tin/bandwidth allocation in = our >>>>> local domain preserved. >>>>=20 >>>> If I understand it right, another use case for this =E2=80=9Cvery bad = idea=E2=80=9D >>>> is preserving DSCP locally while traversing upstream WiFi links as >>>> besteffort, which avoids airtime efficiency problems that can occur >>>> with 802.11e (WMM). In cases where the router config can=E2=80=99t be = changed >>>> (802.11e is mandatory after all) I=E2=80=99ve used IPIP tunnels for th= is, as >>>> it hides DSCP from the WiFi stack while preserving the values through >>>> the tunnel, but this would be easier. Neat=E2=80=A6 :) >>>=20 >>> Everyone has understood the intent & maybe the implementation >>> correctly. 2 patches attached, one for cake, one for tc. >>>=20 >>> They are naively coded and some of it undoes Toke=E2=80=99s recent tidy= ing up >>> (sorry!) >>=20 >> Heh. First comment: Don't do that ;) > > I did say naively coded. > >>=20 >> A few more below. >>=20 >>> 012C ACB2 28C6 C53E 9775 9123 B3A2 389B 9DE2 334A >>> diff --git a/pkt_sched.h b/pkt_sched.h >>> index a2f570c..d1f288d 100644 >>> --- a/pkt_sched.h >>> +++ b/pkt_sched.h >>> @@ -879,6 +879,7 @@ enum { >>> TCA_CAKE_ACK_FILTER, >>> TCA_CAKE_SPLIT_GSO, >>> TCA_CAKE_FWMARK, >>> + TCA_CAKE_ICING, >>> __TCA_CAKE_MAX >>> }; >>> #define TCA_CAKE_MAX (__TCA_CAKE_MAX - 1) >>> diff --git a/sch_cake.c b/sch_cake.c >>> index 733b897..5aca0f3 100644 >>> --- a/sch_cake.c >>> +++ b/sch_cake.c >>> @@ -270,7 +270,8 @@ enum { >>> CAKE_FLAG_INGRESS =3D BIT(2), >>> CAKE_FLAG_WASH =3D BIT(3), >>> CAKE_FLAG_SPLIT_GSO =3D BIT(4), >>> - CAKE_FLAG_FWMARK =3D BIT(5) >>> + CAKE_FLAG_FWMARK =3D BIT(5), >>> + CAKE_FLAG_ICING =3D BIT(6) >>=20 >> This implies that icing and fwmark can be enabled completely >> independently of each other. Are you sure about the semantics for that? > > No, I=E2=80=99m not. I sent the patches so others could see my lunacy in= action and hopefully improve it. > > The actual operation links FWMARK, INGRESS & ICING in a variety of > combinations. Right, so obviously this needs to be thought through... >>> }; >>>=20 >>> /* COBALT operates the Codel and BLUE algorithms in parallel, in order = to >>> @@ -333,7 +334,7 @@ static const u8 diffserv8[] =3D { >>> }; >>>=20 >>> static const u8 diffserv4[] =3D { >>> - 0, 2, 0, 0, 2, 0, 0, 0, >>> + 0, 1, 0, 0, 2, 0, 0, 0, >>> 1, 0, 0, 0, 0, 0, 0, 0, >>> 2, 0, 2, 0, 2, 0, 2, 0, >>> 2, 0, 2, 0, 2, 0, 2, 0, >>> @@ -344,7 +345,7 @@ static const u8 diffserv4[] =3D { >>> }; >>>=20 >>> static const u8 diffserv3[] =3D { >>> - 0, 0, 0, 0, 2, 0, 0, 0, >>> + 0, 1, 0, 0, 2, 0, 0, 0, >>=20 >> Why are you messing with the diffserv mappings in this patch? > > This is a combination patch of Dave=E2=80=99s new LE coding and the > fwmark/dscp mangling. Ah. Well let's keep that separate from this patch/discussion... >>=20 >>> 1, 0, 0, 0, 0, 0, 0, 0, >>> 0, 0, 0, 0, 0, 0, 0, 0, >>> 0, 0, 0, 0, 0, 0, 0, 0, >>> @@ -1618,7 +1619,24 @@ static unsigned int cake_drop(struct Qdisc *sch,= struct sk_buff **to_free) >>> return idx + (tin << 16); >>> } >>>=20 >>> -static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash) >>> +void cake_update_diffserv(struct sk_buff *skb, u8 dscp) >>> +{ >>> + switch (skb->protocol) { >>> + case htons(ETH_P_IP): >>> + if ((ipv4_get_dsfield(ip_hdr(skb)) & ~INET_ECN_MASK) !=3D dscp) >>> + ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, dscp); >>> + break; >>> + case htons(ETH_P_IPV6): >>> + if ((ipv6_get_dsfield(ipv6_hdr(skb)) & ~INET_ECN_MASK) !=3D dscp) >>> + ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, dscp); >>> + break; >>> + default: >>> + break; >>> + } >>> + >>> +} >>=20 >> So washing is just a special case of this (wash is >> cake_update_diffserv(skb,0)). So you shouldn't need to add another >> function, just augment the existing handling code. > > Erm, that=E2=80=99s exactly what I=E2=80=99ve done. Ah, right; I guess it's just the reverting of the cleanup patch that is the issue, then :) >>> +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash) >>> { >>> u8 dscp; >>>=20 >>> @@ -1644,37 +1662,70 @@ static u8 cake_handle_diffserv(struct sk_buff *= skb, u16 wash) >>> } >>> } >>>=20 >>> +#if IS_REACHABLE(CONFIG_NF_CONNTRACK) >>=20 >> Save an ifdef below by moving the ifdef inside the function definition. >>=20 >>> +void cake_update_ct_mark(struct sk_buff *skb, u8 dscp) >>> +{ >>> + enum ip_conntrack_info ctinfo; >>> + struct nf_conn *ct; >>> + >>> + ct =3D nf_ct_get(skb, &ctinfo); >>> + if (!ct) >>> + return; >>> + >>> + ct->mark &=3D 0x80ffffff; >>> + ct->mark |=3D (0x40 | dscp) << 24; >>=20 >> Right, so we *might* have an argument that putting the *tin* into the >> fwmark is CAKE's business, but copying over the dscp mark is not >> something a qdisc should be doing=E2=80=A6 > > Why ever not? It=E2=80=99s not the DSCP, it=E2=80=99s a lookup value int= o the cake > priority table, it just happens to look like the DSCP ;-) If it quacks like a duck... > >>> + nf_conntrack_event_cache(IPCT_MARK, ct); >>> +} >>> +#endif >>=20 >> Also, are you sure this will work in all permutations of conntrack being >> a module vs not etc? (we had to jump through quite some hoops to get the >> conntrack hooks to work last time; this is probably my biggest worry her= e). > > No, I have absolutely no clue here at all. Well, another issue that needs fixing, then... -Toke