From: Pete Heist
Date: Sun, 30 Dec 2018 22:52:44 +0100
To: Dave Taht
Cc: Cake List
Subject: Re: [Cake] clogging qdisc

There's at least one reason why hfsc is still in use: good rate limiting performance. I was never able to get its service curve guarantees working as well as I'd like, though. I prefer htb's simpler design and predictable behavior, and I'd speculate that it's hfsc that's causing the clogging described here.
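To make the comparison concrete, here's a minimal sketch of one shaped customer class under htb+fq_codel. The device, classid and rates are placeholders, not anyone's actual config:

# root htb with a generous parent class, one shaped child, fq_codel leaf
/sbin/tc qdisc add dev eth0 root handle 1: htb default 1
/sbin/tc class add dev eth0 parent 1: classid 1:1 htb rate 2048mbit
# one shaped customer: committed rate plus a ceiling, roughly like hfsc's m2/ul
/sbin/tc class add dev eth0 parent 1:1 classid 1:0113 htb rate 12288kbit ceil 14336kbit
# fq_codel leaf keeps per-flow latency down inside the shaped class
/sbin/tc qdisc add dev eth0 parent 1:0113 fq_codel

Classifier filters can keep pointing their flowid at the same classid; only the shaper and the leaf qdisc change.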
The timing of this thread is interesting, as I'm currently re-writing FreeNet's qos script, "due" Jan. 8. It's personal now, because after an upgrade to Ubiquiti's AC gear I've got some problems at home with high RTT. One of the two causes of this is the backhaul qos scripts, which are making a 100mbit full-duplex link act like a half-duplex link with high TCP RTT.

I can reproduce it in the lab, and rrul_be tests look much better with a simpler queueing strategy and cake. :) Either we'll be convinced enough that cake is stable on kernel 3.16, or else it may still have to be htb/hfsc+fq_codel; we'll see...
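For reference, the cake side of such a test can be as simple as one line per interface. This is only a sketch, not the actual FreeNet script: the bandwidth figure is a placeholder set just under the 100mbit link rate, and on a 3.16 kernel cake has to be built from the out-of-tree module:

/sbin/tc qdisc replace dev eth0 root cake bandwidth 95mbit
/sbin/tc qdisc replace dev eth1 root cake bandwidth 95mbit

The rrul_be results then come from flent, e.g. "flent rrul_be -H <netserver host>".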
> On Dec 30, 2018, at 5:51 PM, Dave Taht wrote:
>
> A real example of an ISP configuration:
>
> ---------- Forwarded message ---------
> From: Grzegorz Gwóźdź
> Date: Sat, Dec 29, 2018 at 4:25 PM
> Subject: Re: clogging qdisc
>
> sch_cake looks promising but is too simple. I've got thousands of
> customers with different tariffs.
>
> My setup (eth0 is FROM customers, eth1 is TO Internet):
>
> /sbin/tc qdisc add dev eth0 root handle 1: hfsc default 1
> /sbin/tc qdisc add dev eth1 root handle 1: hfsc default 1
>
> #Base class
> /sbin/tc class add dev eth0 parent 1: classid 1:1 hfsc sc m1 2048000kbit d 10000000 m2 2048000kbit ul m1 2048000kbit d 10000000 m2 2048000kbit
> /sbin/tc class add dev eth1 parent 1: classid 1:1 hfsc sc m1 2048000kbit d 10000000 m2 2048000kbit ul m1 2048000kbit d 10000000 m2 2048000kbit
>
> #Hash filters, level 1
> /sbin/tc filter add dev eth0 parent 1:0 prio 1 handle 255: protocol ip u32 divisor 256
> /sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 800:: match ip dst 192.168.0.0/16 hashkey mask 0x0000ff00 at 16 link 255:
> /sbin/tc filter add dev eth1 parent 1:0 prio 1 handle 255: protocol ip u32 divisor 256
> /sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 800:: match ip src 192.168.0.0/16 hashkey mask 0x0000ff00 at 12 link 255:
>
> #Hash filters, level 2
> for i in `seq 1 254`; do
>     Hi=`printf "%.2x" $i`
>     /sbin/tc filter add dev eth0 parent 1:0 prio 1 handle $Hi: protocol ip u32 divisor 256
>     /sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 255:$Hi: match ip dst 192.168.$i.0/24 hashkey mask 0x000000ff at 16 link $Hi:
> done
>
> for i in `seq 1 254`; do
>     Hi=`printf "%.2x" $i`
>     /sbin/tc filter add dev eth1 parent 1:0 prio 1 handle $Hi: protocol ip u32 divisor 256
>     /sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 255:$Hi: match ip src 192.168.$i.0/24 hashkey mask 0x000000ff at 12 link $Hi:
> done
>
> #And for every customer (about 3000):
> ######################
> let dwnrate=12288
> let dwnceil=14336
> /sbin/tc class add dev eth0 parent 1: classid 1:0113 hfsc sc m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit" ul m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit"
> /sbin/tc qdisc add dev eth0 parent 1:0113 handle 0113: sfq perturb 10
>
> let uplrate=3072
> let uplceil=3584
> /sbin/tc class add dev eth1 parent 1: classid 1:0113 hfsc sc m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit" ul m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit"
> /sbin/tc qdisc add dev eth1 parent 1:0113 handle 0113: sfq perturb 10
>
> /sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 01:13: match ip dst 192.168.1.19/32 flowid 1:0113
> /sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 01:13: match ip src 192.168.1.19/32 flowid 1:0113
> ######################
>
> let dwnrate=8192
> let dwnceil=10240
> /sbin/tc class add dev eth0 parent 1: classid 1:0219 hfsc sc m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit" ul m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit"
> /sbin/tc qdisc add dev eth0 parent 1:0219 handle 0219: sfq perturb 10
>
> let uplrate=2048
> let uplceil=2560
> /sbin/tc class add dev eth1 parent 1: classid 1:0219 hfsc sc m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit" ul m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit"
> /sbin/tc qdisc add dev eth1 parent 1:0219 handle 0219: sfq perturb 10
>
> /sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 02:19: match ip dst 192.168.2.25/32 flowid 1:0219
> /sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 02:19: match ip src 192.168.2.25/32 flowid 1:0219
> ######################
>
> I use static routing, and the next container (linked by a bridge common to both containers) does NAT.
>
> I would like to delete classes and filters one by one, to find out whether a specific customer is causing the trouble...
>
> I can do:
>
> /sbin/tc qdisc del dev eth0 parent 1:0219 handle 0219: sfq perturb 10
>
> but I can't do
>
> /sbin/tc class del dev eth0 parent 1: classid 1:0219
>
> or
>
> /sbin/tc class del dev eth0 parent 1: classid 1:0219 hfsc sc m1 10240kbit d 30000000 m2 8192kbit ul m1 10240kbit d 30000000 m2 8192kbit
>
> because:
>
> RTNETLINK answers: Device or resource busy
>
> Why?
>
> Deleting filters also does not work as expected:
>
> /sbin/tc filter del dev eth0 protocol ip parent 1:0 prio 1 u32 ht 02:19: match ip dst 192.168.2.25/32 flowid 1:0219
>
> deletes all filters. After that,
>
> tc -s filter ls dev eth0
>
> returns nothing. Why?
>
> GG
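On the two "Why?"s above, as far as I understand the tc semantics: the class delete fails with EBUSY because a u32 filter still binds to the class through its flowid, and a u32 "filter del" given only match expressions doesn't select a single filter, so everything at that prio is removed. A hedged sketch of an order that should work, where the filter handle is a placeholder to be read from "tc filter show dev eth0":

# 1) delete the specific filter bound to 1:0219, by handle, not by match
/sbin/tc filter del dev eth0 parent 1:0 protocol ip prio 1 handle 2:19:800 u32
# 2) delete the leaf qdisc (this already works)
/sbin/tc qdisc del dev eth0 parent 1:0219 handle 0219:
# 3) with nothing bound to it, the class should no longer be busy
/sbin/tc class del dev eth0 parent 1: classid 1:0219

Untested on this exact setup, so treat it as a sketch.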
> On 28.12.2018 12:57, Dave Taht wrote:
>> I am, of course, always interested in more folk dumping hfsc and
>> complicated designs, and trying sch_cake....
>>
>> On Fri, Dec 28, 2018 at 3:54 AM Alan Goodman wrote:
>>> Perhaps you should post an example of your tc setup?
>>>
>>> I had a bug a few months back where traffic in important queues would
>>> seemingly randomly get a 100% drop rate (as in your example below). Upon
>>> penning an email with the tc setup, I realised that I had a leaf class on
>>> the wrong branch and was trying to guarantee 99.9+% of traffic for that
>>> leaf if it had significant traffic... Number 1:2 was swapped for number
>>> 1:1 and everything went back to normal.
>>>
>>> Alan
>>>
>>> On 27/12/2018 22:26, Grzegorz Gwóźdź wrote:
>>>>> Are there any "hacks" in TC that allow looking into its guts?
>>>>>
>>>>> It looks like it's changing state to "clogged", but
>>>>>
>>>>> tc -s class ls dev eth0
>>>>>
>>>>> looks completely normal (only the number of sfq queues grows, since one is created dynamically for every connection and more and more connections are created but not closed)
>>>>
>>>> In fact I've noticed something interesting during the "clogged" state...
>>>>
>>>> A few runs of:
>>>>
>>>> tc -s class ls dev eth0
>>>>
>>>> show that the filters sort packets correctly, but the packets that go into the appropriate classes are dropped:
>>>>
>>>> class hfsc 1:1012 parent 1: leaf 1012: sc m1 6144Kbit d 10.0s m2 4096Kbit ul m1 6144Kbit d 10.0s m2 4096Kbit
>>>> Sent 103306048 bytes 75008 pkt (dropped 12, overlimits 0 requeues 0)
>>>> backlog 39Kb 127p requeues 0
>>>> period 13718 work 103306048 bytes rtwork 103306048 bytes level 0
>>>>
>>>> and after a while:
>>>>
>>>> class hfsc 1:1012 parent 1: leaf 1012: sc m1 6144Kbit d 10.0s m2 4096Kbit ul m1 6144Kbit d 10.0s m2 4096Kbit
>>>> Sent 103306048 bytes 75008 pkt (dropped 116, overlimits 0 requeues 0)
>>>> backlog 39160b 127p requeues 0
>>>> period 13718 work 103306048 bytes rtwork 103306048 bytes level 0
>>>>
>>>> "Sent" stands still while all new packets are "dropped".
>>>>
>>>> Some classes still pass packets, but as time goes by more and more classes stop passing and start dropping.
>>>>
>>>> GG
>
> --
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake