From: Toke Høiland-Jørgensen
To: Kevin Darbyshire-Bryant
Cc: cake@lists.bufferbloat.net
Subject: Re: [Cake] Ingress classification
Date: Sun, 10 Feb 2019 23:18:47 +0100
Message-ID: <871s4fp9rs.fsf@toke.dk>
In-Reply-To: <0C36DB91-9CD0-4621-B038-61F6817C196E@darbyshire-bryant.me.uk>

Kevin Darbyshire-Bryant writes:

>> On 6 Feb 2019, at 13:54, Toke Høiland-Jørgensen wrote:
>>
>> Kevin Darbyshire-Bryant writes:
>>
>>>
>>> Thank you John, that has confirmed my understanding that in essence
>>> it’s not possible in linux to mangle/mark the first packet on ingress
>>> and you ideally need the DSCP to be correct.
>>
>> Not with iptables, but you can do it with tc filters. Either by writing
>> a BPF filter, or by using the pedit action (which actually changes bytes
>> in the packet unlike skbedit).
>>
>> -Toke
>
> It’s not so much about tweaking DSCP values but more about persuading
> packets to go into different cake tins for bandwidth
> allocation/latency target purposes. I’m assuming there’s a
> performance advantage in not tweaking the packet if at all necessary.

I very much doubt you would be able to measure any difference between
the two approaches. And actually remarking the packets would keep the
effect when they traverse the network (say, for WiFi links).

> The previously mentioned attempt at getting egress tc filters to work
> *did* actually succeed. Toke may ‘appreciate’ the following hacked
> extract from an sqm-scripts layer_cake.qos
>
>
> egress() {
>     SILENT=1 $TC qdisc del dev $IFACE root
>     $TC qdisc add dev $IFACE root $( get_stab_string ) cake \
>         bandwidth ${UPLINK}kbit $( get_cake_lla_string ) ${EGRESS_CAKE_OPTS} ${EQDISC_OPTS}
>
>     MAJOR=$( tc qdisc show dev $IFACE | head -1 | awk '{print $3}' )
>     $TC filter add dev $IFACE parent $MAJOR protocol ip handle 0x01 fw action skbedit priority ${MAJOR}1
>     $TC filter add dev $IFACE parent $MAJOR protocol ip handle 0x03 fw action skbedit priority ${MAJOR}3
>     $TC filter add dev $IFACE parent $MAJOR protocol ip handle 0x04 fw action skbedit priority ${MAJOR}4
> }
>
> The ingress side being:
>
>     $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
>         match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev $DEV
>
>     MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
>     $TC filter add dev $DEV parent $MAJOR protocol all handle 0x01 fw action skbedit priority ${MAJOR}1
>     $TC filter add dev $DEV parent $MAJOR protocol all handle 0x03 fw action skbedit priority ${MAJOR}3
>     $TC filter add dev $DEV parent $MAJOR protocol all handle 0x04 fw action skbedit priority ${MAJOR}4
>
>     # Configure iptables chain to mark packets
>     ipt -t mangle -N QOS_MARK_${IFACE}
>
> A variety of rules along the lines (to set the packet mark)
>     iptables -t mangle -A QOS_MARK_${IFACE} -p tcp -s 192.168.218.5/255.255.255.255 -m comment \
>         --comment "Skybox DSCP CS1 Bulk" -j MARK --set-mark 0x01/0xff
>
>     # save the packet mark to connmark
>     ipt -t mangle -A QOS_MARK_${IFACE} -j CONNMARK --save-mark
>
>     # Send unmarked connections to the marking chain
>     ipt -t mangle -A PREROUTING  -i $IFACE -m mark --mark 0x00/0xff -g QOS_MARK_${IFACE}
>     ipt -t mangle -A POSTROUTING -o $IFACE -m mark --mark 0x00/0xff -g QOS_MARK_${IFACE}
>
>
> The vast majority of the egress stuff above being shamelessly stolen
> from a github entry I saw ;-)
>
>
> I do wonder if there’s a more efficient way of doing it though.
> Setting CONNMARK directly instead of setting a packet mark and then
> copying that across to a connmark would appear sensible?

Depending on how many rules you have, my guess would be that the most
inefficient thing is traversing all of them. You could use ipset to
alleviate this, I guess. Or reimplement the whole thing as a single BPF
filter... Or maybe just re-evaluate whether you really need that
convoluted a ruleset? ;)

-Toke
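
A minimal sketch of the ipset suggestion above, not taken from
sqm-scripts: instead of one MARK rule per host, the addresses for a
class go into a hash set and a single rule matches the whole set. The
set name qos_bulk, the run() helper and the DRY_RUN switch are
hypothetical names made up for illustration; the chain name, mark value
and example address are reused from the quoted script. By default it
only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: "qos_bulk", run() and DRY_RUN are illustrative assumptions.
IFACE=${IFACE:-eth0}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands, 0 = execute them (needs root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Put every "bulk" source address into one set, then mark with one rule.
setup_bulk_class() {
    run ipset create qos_bulk hash:ip
    for addr in "$@"; do
        run ipset add qos_bulk "$addr"
    done
    # One set lookup replaces a linear walk over per-host rules.
    run iptables -t mangle -A "QOS_MARK_${IFACE}" -p tcp \
        -m set --match-set qos_bulk src \
        -j MARK --set-mark 0x01/0xff
}

setup_bulk_class 192.168.218.5
```

Adding a host to the class then becomes a single "ipset add" rather
than another rule in the chain, so the number of rules traversed per
unmarked packet stays constant as the host list grows.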