From: Toke Høiland-Jørgensen
To: Kevin Darbyshire-Bryant, "cake@lists.bufferbloat.net"
Subject: Re: [Cake] act_connmark + dscp
Date: Thu, 07 Mar 2019 11:10:40 +0100
Message-ID: <87r2bjyoyn.fsf@toke.dk>
In-Reply-To: <6B530473-971A-4265-B94B-3595D39D57AF@darbyshire-bryant.me.uk>
References: <875zsw110r.fsf@toke.dk> <6B530473-971A-4265-B94B-3595D39D57AF@darbyshire-bryant.me.uk>
List-Id: Cake - FQ_codel the next generation

Kevin Darbyshire-Bryant writes:

>> On 6 Mar 2019, at 15:21, Toke Høiland-Jørgensen wrote:
>>
>> Kevin Darbyshire-Bryant writes:
>>
>>> Before I go too far down this road (and to avoid the horror of
>>> actually trying to code it) here’s what I’m trying to achieve.
>>>
>>>
>>> act_connmark + dscp is designed to copy a DSCP code to/from the
>>> conntrack mark. It uses 8 bits of the mark field, currently the
>>> most significant byte.
>>>
>>> Bits 31-26: DSCP
>>> Bit 25: Spare/Future
>>> Bit 24: Valid DSCP set
>>>
>>> The valid bit is set when the ‘getdscp’ function has written a DSCP
>>> value into the conntrack (& hence skb) mark. This allows us & other
>>> skb->mark/ct->mark applications (eg iptables, cake qdisc) to know that
>>> a DSCP value has been placed in the field.
>>> We cannot simply use a non-zero DSCP because zero is a valid DSCP.
>>
>> If someone installs the action, the field is supposedly always copied;
>> so why do we need another flag?
>
> I’m trying to limit the number of times expensive iptables mangle
> rules have to run.

Right, I see your point, but I'm worried that this risks becoming a
source of hard-to-debug bugs if this bit happens to get set by accident
in other places. So I would suggest at least making it optional (and
configurable).

So how about the following configuration options:

- fwmark mask (32-bit value)
  specifies a mask to apply to the fwmark field before all operations

- fwmark shift (valid values 0-32)
  specifies how many bits to left-shift the DSCP values before putting
  them into the fwmark (and how many bits to right-shift the value read
  from fwmark before writing it to the DSCP field); this could also be
  inferred from the mask rather than be a separate option (shift =
  lowest set bit of mask)

- get_dscp (boolean; cannot be set along with set_dscp)
  if set, the DSCP field will be copied to the fwmark field, subject to
  mask and shift

- set_dscp (boolean; cannot be set along with get_dscp)
  if set, the value in fwmark will be copied to the DSCP field, subject
  to mask and shift

- state mask (32-bit value; probably needs better name)
  if set: the get_dscp action will OR this value into the resulting
  fwmark before storing it. The set_dscp action will AND the fwmark
  with this value before doing anything, and abort if the result is
  zero.

I think this would allow you to implement what you described, without
hard-coding any behaviour. Right?

Does anyone else have any opinions / objections to the above API?

> The reality is that I enjoyed doing this in the cake codebase. I
> cannot say the same for act_connmark; in fact I hate it so much I’m
> stopping. The mental effort for a non-programmer and more importantly
> a non-kernel programmer is exhausting & I’m completely disillusioned.
> I really need to concentrate on the job that means I can pay the
> mortgage, which isn’t bashing my head against the kernel.

Fair enough; no reason to do this if you're not enjoying it! We can
iterate on the API, and I guess I can write the code at some point in
the future if no one else beats me to it. No promises on when, though,
so if someone else feels like tackling it, please go ahead :)

-Toke
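
For anyone iterating on the API, here is a rough userspace sketch of how
the proposed mask / shift / state mask options might combine. It is only
an illustration in Python, not kernel code: the function names are
hypothetical, and it assumes that the mask selects the fwmark bits that
hold the DSCP, that bits outside the mask are preserved on write, and
that "abort" means leaving the packet's DSCP untouched. The example
values reproduce the layout quoted above (DSCP in bits 31-26, valid flag
in bit 24).

```python
# Sketch only: hypothetical helpers, not an existing kernel or tc API.
U32 = 0xFFFFFFFF


def shift_from_mask(mask):
    """Infer the shift as the index of the lowest set bit of the mask."""
    if mask == 0:
        raise ValueError("mask must have at least one bit set")
    return (mask & -mask).bit_length() - 1


def get_dscp(dscp, fwmark, mask, shift, state_mask=0):
    """Proposed get_dscp: copy the packet's DSCP into the fwmark."""
    field = ((dscp & 0x3F) << shift) & mask
    # Assumption: bits outside the mask are left as they were.
    # The state mask, if given, is ORed in to flag "DSCP stored".
    return ((fwmark & ~mask) | field | state_mask) & U32


def set_dscp(fwmark, mask, shift, state_mask=0):
    """Proposed set_dscp: return the DSCP to write, or None to abort."""
    if state_mask and not (fwmark & state_mask):
        return None  # flag not set, so no DSCP was ever stored
    return ((fwmark & mask) >> shift) & 0x3F


# Layout from the quoted mail: DSCP in bits 31-26, valid flag in bit 24.
MASK = 0xFC000000
STATE = 0x01000000
SHIFT = shift_from_mask(MASK)  # 26

mark = get_dscp(46, 0x00000012, MASK, SHIFT, STATE)  # DSCP 46 (EF)
print(hex(mark))                            # 0xb9000012
print(set_dscp(mark, MASK, SHIFT, STATE))   # 46
print(set_dscp(0x12, MASK, SHIFT, STATE))   # None (flag bit not set)
```

Note that a stored DSCP of zero still round-trips as 0 rather than None,
because the state bit, not the DSCP value, signals that the field is
valid.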