From: Toke Høiland-Jørgensen
To: Kevin Darbyshire-Bryant
Cc: "cake@lists.bufferbloat.net"
Subject: Re: [Cake] act_connmark + dscp
Date: Thu, 07 Mar 2019 18:40:05 +0100
Message-ID: <878sxq1t3e.fsf@toke.dk>
In-Reply-To: <4505E3A0-6AE2-4C0B-960D-B1EDB616F0CA@darbyshire-bryant.me.uk>
References: <875zsw110r.fsf@toke.dk>
 <6B530473-971A-4265-B94B-3595D39D57AF@darbyshire-bryant.me.uk>
 <87r2bjyoyn.fsf@toke.dk>
 <4505E3A0-6AE2-4C0B-960D-B1EDB616F0CA@darbyshire-bryant.me.uk>
List-Id: Cake - FQ_codel the next generation

Kevin Darbyshire-Bryant writes:

>> On 7 Mar 2019, at 10:10, Toke Høiland-Jørgensen wrote:
>
>>>>>
>>>>> The valid bit is set when the ‘getdscp’ function has written a DSCP
>>>>> value into the conntrack (& hence skb) mark. This allows us & other
>>>>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that
>>>>> a DSCP value has been placed in the field. We cannot simply use a
>>>>> non-zero DSCP because zero is a valid DSCP.
>>>>
>>>> If someone installs the action, the field is supposedly always copied;
>>>> so why do we need another flag?
>>>
>>> I’m trying to limit the number of times expensive iptables mangle
>>> rules have to run.
>>
>> Right, I see your point, but I'm worried that this can risk becoming a
>> source of hard-to-debug bugs if this bit happens to get set by accident
>> in other places. So, I would suggest to at least make it optional (and
>> configurable). So how about the following configuration options:
>
> Phew, I explained myself clearly enough (just) that time :-). There’s
> a compromise here to be had between setting/using a DSCP for
> connection(fwmark) vs setting/using a DSCP per packet. Mostly it’s a
> (improved) performance vs DSCP accuracy compromise and assumes DSCP
> isn’t going to change after the connection has been established - some
> people have rules that mangle DSCP based on amount of data
> transferred. Personal view: anything that permits some sort of
> classification restoration on ingress has to be an improvement on what
> we have now.

So I don't think we in general can assume that all packets of the same
flow have the same DSCP mark. If we want to enforce this, that is another
matter entirely.

>> - fwmark shift (valid values 0-32)
>>   specifies how many bits to left-shift the DSCP values before putting
>>   them into the fwmark (and how many bits to right-shift the value read
>>   from fwmark before writing it to the DSCP field); this could also be
>>   inferred from the mask rather than be a separate option (shift =
>>   lowest set bit of mask)
>
> I think the inferred choice is best. I can see all manner of confusing
> behaviour with mismatches between mask & shift. Also it’s one less
> parameter to pass … and in my dream world have to add some of these
> parameters into cake as well so it can interpret the DSCP containing
> fwmark as well… the fewer the better :-)

Yeah, I tend to lean towards that as well. Which could also work for
cake: simply interpret the FWMARK parameter as a mask, and shift by the
lowest set bit.
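
To make the "shift = lowest set bit of mask" idea concrete, here is a
minimal Python sketch. It is only an illustration, not the act_connmark
or cake code: the function names and the example mask 0xFC000000 are
made up for this note, and the only facts relied on are that the fwmark
is 32 bits wide and a DSCP value is 6 bits.

```python
FWMARK_BITS = 0xFFFFFFFF  # fwmark is a 32-bit value
DSCP_BITS = 0x3F          # a DSCP value is 6 bits wide


def shift_from_mask(mask: int) -> int:
    """Return the position of the lowest set bit of a non-zero 32-bit mask."""
    if mask == 0 or mask & ~FWMARK_BITS:
        raise ValueError("mask must be a non-zero 32-bit value")
    # mask & -mask isolates the lowest set bit; bit_length gives its index + 1.
    return (mask & -mask).bit_length() - 1


def dscp_to_fwmark(fwmark: int, dscp: int, mask: int) -> int:
    """Store dscp in the masked part of fwmark, leaving the other bits alone."""
    shift = shift_from_mask(mask)
    cleared = fwmark & ~mask & FWMARK_BITS
    return cleared | (((dscp & DSCP_BITS) << shift) & mask)


def fwmark_to_dscp(fwmark: int, mask: int) -> int:
    """Read the DSCP value back out of the masked part of fwmark."""
    return ((fwmark & mask) >> shift_from_mask(mask)) & DSCP_BITS


# Hypothetical mask using the top six bits of the fwmark.
mark = dscp_to_fwmark(0x1, 46, 0xFC000000)
print(hex(mark), fwmark_to_dscp(mark, 0xFC000000))
```

With a single mask parameter there is no way for mask and shift to
disagree, which is the mismatch Kevin is worried about.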
>> - get_dscp (boolean; cannot be set along with set_dscp)
>>   if set, the DSCP field will be copied to the fwmark field, subject to
>>   mask and shift
>
> Yes
>
>>
>> - set_dscp (boolean; cannot be set along with get_dscp)
>>   if set, the value in fwmark will be copied to the DSCP field, subject
>>   to mask and shift
>
> Yes
>
>>
>> - state mask (32-bit value; probably needs better name)
>>   if set: the get_dscp action will OR the resulting fwmark before
>>   storing it. the set_dscp value will AND the fwmark with this value
>>   before doing anything, and abort if the result is false.
>
> Nearly: If set; the get_dscp action will AND the fwmark with this
> value and abort if true. If false it will OR the resulting fwmark
> before storing it. the set_dscp action will AND the fwmark with this
> value before doing anything and will abort if result is false.
>
> The ‘get_dscp action AND with fwmark and abort if true’ is the
> function that allows DSCP values to ‘stick’ into a connection, thus
> obviating the need for iptables rules to mangle the DSCP on every
> egress packet.
>
> I can see a conflict between people who want the DSCP copied into
> fwmark no matter what and people such as myself who wish it to only
> happen if fwmark doesn’t have it set already.
>
> Is the solution to have a ‘get_check_state’ option that enables the
> ‘get_dscp action and with fwmark ‘state mask’ abort if true’ check?
> Maybe even simpler to have a ‘get_state mask’ and a ‘set_state mask’ -
> AND the fwmark with the get_state mask, if false copy the dscp, else
> abort.

Right, if you want to be able to do all those combinations of things we
quickly run into combinatorial explosion, or we risk hard-coding a
policy that some users won't like.
However, if we do want to express these kinds of things, maybe this is
better implemented as additional options to the existing iptables
matches? I.e., we could add --set-mark-from-dscp MASK to the MARK
target, and --set-dscp-from-fwmark MASK to the DSCP target. Then you can
implement whatever policy you want using iptables rules... Any reason
that wouldn't work?

-Toke
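
For reference, the state-mask behaviour as Kevin restates it in the
quoted text (get_dscp aborts if the state bit is already set, otherwise
stores the DSCP and ORs in the state bit; set_dscp aborts if the state
bit is clear) could be sketched roughly as below. This is an assumption-
laden illustration in Python, not the proposed kernel code: the function
signatures, the constants DSCP_MASK and STATE_MASK, and the choice to
return the unchanged value on "abort" are all invented for this sketch.

```python
FWMARK_BITS = 0xFFFFFFFF  # fwmark is a 32-bit value
DSCP_BITS = 0x3F          # a DSCP value is 6 bits wide


def lowest_set_bit(mask: int) -> int:
    """Shift inferred from the mask: index of its lowest set bit."""
    return (mask & -mask).bit_length() - 1


def get_dscp(fwmark: int, dscp: int, dscp_mask: int, state_mask: int = 0) -> int:
    """Copy the packet's DSCP into fwmark; return the resulting fwmark."""
    if state_mask and (fwmark & state_mask) != 0:
        return fwmark  # abort: a DSCP is already stored, so it "sticks"
    shift = lowest_set_bit(dscp_mask)
    stored = ((dscp & DSCP_BITS) << shift) & dscp_mask
    return (fwmark & ~dscp_mask & FWMARK_BITS) | stored | state_mask


def set_dscp(fwmark: int, packet_dscp: int, dscp_mask: int, state_mask: int = 0) -> int:
    """Return the DSCP the packet should carry after restoring from fwmark."""
    if state_mask and (fwmark & state_mask) == 0:
        return packet_dscp  # abort: nothing valid stored, leave the packet alone
    return ((fwmark & dscp_mask) >> lowest_set_bit(dscp_mask)) & DSCP_BITS


# Hypothetical layout: top six bits hold the DSCP, the next bit is the state flag.
DSCP_MASK = 0xFC000000
STATE_MASK = 0x02000000

mark = get_dscp(0, 46, DSCP_MASK, STATE_MASK)     # first packet stores DSCP 46
mark = get_dscp(mark, 0, DSCP_MASK, STATE_MASK)   # later packets do not overwrite it
print(set_dscp(mark, 0, DSCP_MASK, STATE_MASK))   # restores the stored DSCP
```

The separate state bit is what lets a stored DSCP of zero be told apart
from "nothing stored yet". Passing state_mask=0 gives the unconditional
copy that the other camp wants, which is the policy split discussed
above.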