From: Dave Taht
Date: Sun, 30 Dec 2018 08:51:21 -0800
To: Cake List
Subject: [Cake] Fwd: clogging qdisc

real example of an isp configuration

---------- Forwarded message ---------
From: Grzegorz Gwóźdź
Date: Sat, Dec 29, 2018 at 4:25 PM
Subject: Re: clogging qdisc
To:

sch_cake looks promising but is too simple. I've got thousands of
customers with different tariffs.

My setup (eth0 is FROM customers, eth1 is TO Internet):

/sbin/tc qdisc add dev eth0 root handle 1: hfsc default 1
/sbin/tc qdisc add dev eth1 root handle 1: hfsc default 1

#Base class
/sbin/tc class add dev eth0 parent 1: classid 1:1 hfsc sc m1 2048000kbit d 10000000 m2 2048000kbit ul m1 2048000kbit d 10000000 m2 2048000kbit
/sbin/tc class add dev eth1 parent 1: classid 1:1 hfsc sc m1 2048000kbit d 10000000 m2 2048000kbit ul m1 2048000kbit d 10000000 m2 2048000kbit

#Hash filters 1 lvl
/sbin/tc filter add dev eth0 parent 1:0 prio 1 handle 255: protocol ip u32 divisor 256
/sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 800:: match ip dst 192.168.0.0/16 hashkey mask 0x0000ff00 at 16 link 255:
/sbin/tc filter add dev eth1 parent 1:0 prio 1 handle 255: protocol ip u32 divisor 256
/sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 800:: match ip src 192.168.0.0/16 hashkey mask 0x0000ff00 at 12 link 255:

#Hash filters 2 lvl
for i in `seq 1 254`; do
  Hi=`printf "%.2x" $i`
  /sbin/tc filter add dev eth0 parent 1:0 prio 1 handle $Hi: protocol ip u32 divisor 256
  /sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 255:$Hi: match ip dst 192.168.$i.0/24 hashkey mask 0x000000ff at 16 link $Hi:
done

for i in `seq 1 254`; do
  Hi=`printf "%.2x" $i`
  /sbin/tc filter add dev eth1 parent 1:0 prio 1 handle $Hi: protocol ip u32 divisor 256
  /sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 255:$Hi: match ip src 192.168.$i.0/24 hashkey mask 0x000000ff at 12 link $Hi:
done

#And for every customer (about 3000):

######################
let dwnrate=12288
let dwnceil=14336
/sbin/tc class add dev eth0 parent 1: classid 1:0113 hfsc sc m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit" ul m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit"
/sbin/tc qdisc add dev eth0 parent 1:0113 handle 0113: sfq perturb 10
let uplrate=3072
let uplceil=3584
/sbin/tc class add dev eth1 parent 1: classid 1:0113 hfsc sc m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit" ul m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit"
/sbin/tc qdisc add dev eth1 parent 1:0113 handle 0113: sfq perturb 10
/sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 01:13: match ip dst 192.168.1.19/32 flowid 1:0113
/sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 01:13: match ip src 192.168.1.19/32 flowid 1:0113

######################
let dwnrate=8192
let dwnceil=10240
/sbin/tc class add dev eth0 parent 1: classid 1:0219 hfsc sc m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit" ul m1 $dwnceil"kbit" d 30000000 m2 $dwnrate"kbit"
/sbin/tc qdisc add dev eth0 parent 1:0219 handle 0219: sfq perturb 10
let uplrate=2048
let uplceil=2560
/sbin/tc class add dev eth1 parent 1: classid 1:0219 hfsc sc m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit" ul m1 $uplceil"kbit" d 30000000 m2 $uplrate"kbit"
/sbin/tc qdisc add dev eth1 parent 1:0219 handle 0219: sfq perturb 10
/sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht 02:19: match ip dst 192.168.2.25/32 flowid 1:0219
/sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht 02:19: match ip src 192.168.2.25/32 flowid 1:0219
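
The two customer blocks above appear to follow one pattern: the third
and fourth octets of the customer address, written as two hex digits
each, give both the class id and the hash bucket (192.168.1.19 becomes
classid 1:0113 and ht 01:13:, 192.168.2.25 becomes 1:0219 and ht 02:19:).
A minimal sketch of that derivation follows. The function name
customer_cmds and its argument order are hypothetical, not part of the
original setup, and it only prints the commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): print the tc
# commands for one customer so they can be reviewed before being run.
# Usage: customer_cmds IP DOWN_RATE DOWN_CEIL UP_RATE UP_CEIL   (rates in kbit)
customer_cmds() {
    ip=$1; dwnrate=$2; dwnceil=$3; uplrate=$4; uplceil=$5

    # Only the third and fourth octets of 192.168.x.y are used.
    o3=$(echo "$ip" | cut -d. -f3)
    o4=$(echo "$ip" | cut -d. -f4)

    # Two hex digits each, same printf format as the hash-table loops.
    h3=$(printf '%.2x' "$o3")
    h4=$(printf '%.2x' "$o4")
    id="${h3}${h4}"

    # Download side (eth0, match on destination address).
    echo "/sbin/tc class add dev eth0 parent 1: classid 1:${id} hfsc sc m1 ${dwnceil}kbit d 30000000 m2 ${dwnrate}kbit ul m1 ${dwnceil}kbit d 30000000 m2 ${dwnrate}kbit"
    echo "/sbin/tc qdisc add dev eth0 parent 1:${id} handle ${id}: sfq perturb 10"
    # Upload side (eth1, match on source address).
    echo "/sbin/tc class add dev eth1 parent 1: classid 1:${id} hfsc sc m1 ${uplceil}kbit d 30000000 m2 ${uplrate}kbit ul m1 ${uplceil}kbit d 30000000 m2 ${uplrate}kbit"
    echo "/sbin/tc qdisc add dev eth1 parent 1:${id} handle ${id}: sfq perturb 10"
    # Filters go into bucket h4 of the second-level table h3.
    echo "/sbin/tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 ht ${h3}:${h4}: match ip dst ${ip}/32 flowid 1:${id}"
    echo "/sbin/tc filter add dev eth1 protocol ip parent 1:0 prio 1 u32 ht ${h3}:${h4}: match ip src ${ip}/32 flowid 1:${id}"
}

# Example: the first customer from the message.
customer_cmds 192.168.1.19 12288 14336 3072 3584
```

Printing instead of executing keeps the sketch safe to try on a machine
without the shaper; piping the output to sh would apply it.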
######################

I use static routing, and the next container (linked by a bridge common
to both containers) is doing NAT.

I would like to delete classes and filters one by one to find out
whether a specific customer is causing the trouble...

I can do:

/sbin/tc qdisc del dev eth0 parent 1:0219 handle 0219: sfq perturb 10

but I can't do

/sbin/tc class del dev eth0 parent 1: classid 1:0219

or

/sbin/tc class del dev eth0 parent 1: classid 1:0219 hfsc sc m1 10240kbit d 30000000 m2 8192kbit ul m1 10240kbit d 30000000 m2 8192kbit

because:

RTNETLINK answers: Device or resource busy

Why?

Deleting filters also does not work as expected:

/sbin/tc filter del dev eth0 protocol ip parent 1:0 prio 1 u32 ht 02:19: match ip dst 192.168.2.25/32 flowid 1:0219

deletes all filters. After that

tc -s filter ls dev eth0

returns nothing. Why?

GG

On 28.12.2018 12:57, Dave Taht wrote:
> I am of course, always interested in more folk dumping hfsc and
> complicated designs, and trying sch_cake....
>
> On Fri, Dec 28, 2018 at 3:54 AM Alan Goodman
> wrote:
>> Perhaps you should post an example of your tc setup?
>>
>> I had a bug a few months back where traffic in important queues would
>> seemingly randomly get 100% drop rate (as in your example below). Upon
>> penning an email with the tc setup I realised that I had a leaf class on
>> the wrong branch and was trying to guarantee 99.9+% of traffic for that
>> leaf if it had significant traffic... Number 1:2 was swapped for number
>> 1:1 and everything went back to normal.
>>
>> Alan
>>
>> On 27/12/2018 22:26, Grzegorz Gwóźdź wrote:
>>>> Are there any "hacks" in TC allowing to look in the guts?
>>>>
>>>> It looks like it's changing state to "clogged" but
>>>>
>>>> tc -s class ls dev eth0
>>>>
>>>> looks completely normal (only the number of sfq queues grows: they
>>>> are created dynamically for every connection, and more and more
>>>> connections are created but not closed)
>>>
>>> In fact I've noticed something interesting during the "clogged" state...
>>>
>>> a few runs of:
>>>
>>> tc -s class ls dev eth0
>>>
>>> show that filters sort packets well, but packets that go into
>>> suitable classes are dropped:
>>>
>>> class hfsc 1:1012 parent 1: leaf 1012: sc m1 6144Kbit d 10.0s m2
>>> 4096Kbit ul m1 6144Kbit d 10.0s m2 4096Kbit
>>>  Sent 103306048 bytes 75008 pkt (dropped 12, overlimits 0 requeues 0)
>>>  backlog 39Kb 127p requeues 0
>>>  period 13718 work 103306048 bytes rtwork 103306048 bytes level 0
>>>
>>> and after a while:
>>>
>>> class hfsc 1:1012 parent 1: leaf 1012: sc m1 6144Kbit d 10.0s m2
>>> 4096Kbit ul m1 6144Kbit d 10.0s m2 4096Kbit
>>>  Sent 103306048 bytes 75008 pkt (dropped 116, overlimits 0 requeues 0)
>>>  backlog 39160b 127p requeues 0
>>>  period 13718 work 103306048 bytes rtwork 103306048 bytes level 0
>>>
>>> "Sent" stands still and all packets are "dropped"
>>>
>>> Some classes pass packets, but as time goes by more and more classes
>>> stop passing and start dropping.
>>>
>>>
>>> GG
>>>
>
>
-- 
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
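
The symptom quoted above ("Sent" stands still while "dropped" keeps
growing) can be checked across all classes by comparing two saved
snapshots of tc -s class ls output. The sketch below is only an
illustration: the function name stalled is hypothetical, and the parsing
assumes the field layout shown in the quoted output, with the class line
unwrapped as tc prints it.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list classes whose "Sent"
# byte counter did not move between two snapshots while "dropped" grew.
# Usage: stalled OLDER_SNAPSHOT NEWER_SNAPSHOT
stalled() {
    awk '
        # "class hfsc 1:1012 parent 1: ..." -> remember the class id
        $1 == "class" { id = $3 }
        # "Sent 103306048 bytes 75008 pkt (dropped 12, overlimits ..."
        $1 == "Sent" {
            d = $7; sub(/,$/, "", d)          # strip the trailing comma
            if (FNR == NR) {                  # still reading the older file
                sent[id] = $2; drop[id] = d
            } else if ((id in sent) && $2 == sent[id] && d + 0 > drop[id] + 0) {
                print id, "sent", $2, "dropped", drop[id], "->", d
            }
        }
    ' "$1" "$2"
}

# Example with the two readings from the quoted message.
dir=$(mktemp -d)
cat > "$dir/before" <<'EOF'
class hfsc 1:1012 parent 1: leaf 1012: sc m1 6144Kbit d 10.0s m2 4096Kbit ul m1 6144Kbit d 10.0s m2 4096Kbit
 Sent 103306048 bytes 75008 pkt (dropped 12, overlimits 0 requeues 0)
EOF
cat > "$dir/after" <<'EOF'
class hfsc 1:1012 parent 1: leaf 1012: sc m1 6144Kbit d 10.0s m2 4096Kbit ul m1 6144Kbit d 10.0s m2 4096Kbit
 Sent 103306048 bytes 75008 pkt (dropped 116, overlimits 0 requeues 0)
EOF
stalled "$dir/before" "$dir/after"
rm -rf "$dir"
```

On a live system the snapshots would come from something like
tc -s class ls dev eth0 > before, a pause, then the same command into
after.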