Date: Fri, 6 May 2016 12:43:03 -0700
From: Dave Taht
To: Roman Yeryomin
Cc: Jesper Dangaard Brouer, Felix Fietkau, Jonathan Morton, "codel@lists.bufferbloat.net", ath10k, make-wifi-fast@lists.bufferbloat.net, Rafał Miłecki, "netdev@vger.kernel.org", OpenWrt Development List
Subject: Re: [Make-wifi-fast] OpenWRT wrong adjustment of fq_codel defaults (Was: [Codel] fq_codel_drop vs a udp flood)

On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin wrote:
> On 6 May 2016 at 21:43, Roman Yeryomin wrote:
>> On 6 May 2016 at 15:47, Jesper Dangaard Brouer wrote:
>>>
>>> I've created a OpenWRT ticket[1] on this issue, as it seems that someone[2]
>>> closed Felix'es OpenWRT email account (bad choice! emails bouncing).
>>> Sounds like OpenWRT and the LEDE https://www.lede-project.org/ project
>>> is in some kind of conflict.
>>>
>>> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>>>
>>> [2] http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>>
>> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
>> bit with fq_codel limits I was able to get 420Mbps UDP like this:
>> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256
>
> Forgot to mention, I've reduced drop_batch_size down to 32

0) Not clear to me if that's the right line, there are 4 wifi queues,
and the third one is the BE queue. That is too low a limit, also, for
normal use. And: for the purpose of this particular UDP test, flows 16
is ok, but not ideal.

1) What's the tcp number (with a simultaneous ping) with this latest patchset?
(I care about tcp performance a lot more than udp floods - surviving a
udp flood yes, performance, no)

before/after?

tc -s qdisc show dev wlan0 during/after results?

IF you are doing builds for the archer c7v2, I can join in on this... (?)

I did do a test of the ath10k "before", fq_codel *never engaged*, and
tcp induced latencies under load, e at 100mbit, cracked 600ms, while
staying flat (20ms) at 100mbit. (not the same patches you are testing)
on x86. I have got tcp 300Mbit out of an osx box, similar latency,
have yet to get anything more on anything I currently have
before/after patchsets.

I'll go add flooding to the tests, I just finished a series comparing
two different speed stations and life was good on that.

"before" - fq_codel never engages, we see seconds of latency under load.

```
root@apu2:~# tc -s qdisc show dev wlp4s0
qdisc mq 0: root
 Sent 8570563893 bytes 6326983 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 2262 bytes 17 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 220486569 bytes 152058 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 18168 drop_overlimit 0 new_flow_count 1 ecn_mark 0
  new_flows_len 0 old_flows_len 1
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 8340546509 bytes 6163431 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 68130 drop_overlimit 0 new_flow_count 120050 ecn_mark 0
  new_flows_len 1 old_flows_len 3
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 9528553 bytes 11477 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 66 drop_overlimit 0 new_flow_count 1 ecn_mark 0
  new_flows_len 1 old_flows_len 0
```

>> This is certainly better than 30Mbps but still more than two times
>> less than before (900).

The number that I still am not sure we got is that you were sending
900mbit udp and receiving 900mbit on the prior tests?

>> TCP also improved a little (550 to ~590).

The limit is probably a bit low, also. You might want to try target
20ms as well.

>>
>> Felix, others, do you want to see the ported patch, maybe I did something wrong?
>> Doesn't look like it will save ath10k from performance regression.

what was tcp "before"?
(I'm sorry, such a long thread)

>>
>>>
>>> On Fri, 6 May 2016 11:42:43 +0200
>>> Jesper Dangaard Brouer wrote:
>>>
>>>> Hi Felix,
>>>>
>>>> This is an important fix for OpenWRT, please read!
>>>>
>>>> OpenWRT changed the default fq_codel sch->limit from 10240 to 1024,
>>>> without also adjusting q->flows_cnt. Eric explains below that you must
>>>> also adjust the buckets (q->flows_cnt) for this not to break. (Just
>>>> adjust it to 128)
>>>>
>>>> Problematic OpenWRT commit in question:
>>>> http://git.openwrt.org/?p=openwrt.git;a=patch;h=12cd6578084e
>>>> 12cd6578084e ("kernel: revert fq_codel quantum override to prevent it from causing too much cpu load with higher speed (#21326)")
>>>>
>>>>
>>>> I also highly recommend you cherry-pick this very recent commit:
>>>> net-next: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
>>>> https://git.kernel.org/davem/net-next/c/9d18562a227
>>>>
>>>> This should fix very high CPU usage in case fq_codel goes into drop mode.
>>>> The problem is that drop mode was considered rare, and implementation
>>>> wise it was chosen to be more expensive (to save cycles on normal mode).
>>>> Unfortunately it is easy to trigger with a UDP flood. Drop mode is
>>>> especially expensive for smaller devices, as it scans a 4K big array,
>>>> thus 64 cache misses for small devices!
>>>>
>>>> The fix is to allow drop-mode to bulk-drop more packets when entering
>>>> drop-mode (default 64 bulk drop). That way we don't suddenly
>>>> experience a significantly higher processing cost per packet, but
>>>> instead can amortize this.
>>>>
>>>> To Eric, should we recommend OpenWRT to adjust default (max) 64 bulk
>>>> drop, given we also recommend bucket size to be 128 ? (thus the amount
>>>> of memory to scan is less, but their CPU is also much smaller).
>>>>
>>>> --Jesper
>>>>
>>>>
>>>> On Thu, 05 May 2016 12:23:27 -0700 Eric Dumazet wrote:
>>>>
>>>> > On Thu, 2016-05-05 at 19:25 +0300, Roman Yeryomin wrote:
>>>> > > On 5 May 2016 at 19:12, Eric Dumazet wrote:
>>>> > > > On Thu, 2016-05-05 at 17:53 +0300, Roman Yeryomin wrote:
>>>> > > >
>>>> > > >>
>>>> > > >> qdisc fq_codel 0: dev eth0 root refcnt 2 limit 1024p flows 1024
>>>> > > >> quantum 1514 target 5.0ms interval 100.0ms ecn
>>>> > > >> Sent 12306 bytes 128 pkt (dropped 0, overlimits 0 requeues 0)
>>>> > > >> backlog 0b 0p requeues 0
>>>> > > >> maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>> > > >> new_flows_len 0 old_flows_len 0
>>>> > > >
>>>> > > >
>>>> > > > Limit of 1024 packets and 1024 flows is not wise I think.
>>>> > > >
>>>> > > > (If all buckets are in use, each bucket has a virtual queue of 1 packet,
>>>> > > > which is almost the same than having no queue at all)
>>>> > > >
>>>> > > > I suggest to have at least 8 packets per bucket, to let Codel have a
>>>> > > > chance to trigger.
>>>> > > >
>>>> > > > So you could either reduce number of buckets to 128 (if memory is
>>>> > > > tight), or increase limit to 8192.
>>>> > >
>>>> > > Will try, but what I've posted is default, I didn't change/configure that.
>>>> >
>>>> > fq_codel has a default of 10240 packets and 1024 buckets.
>>>> >
>>>> > http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c#L413
>>>> >
>>>> > If someone changed that in the linux variant you use, he probably should
>>>> > explain the rationale.
>>>
>>> --
>>> Best regards,
>>> Jesper Dangaard Brouer
>>> MSc.CS, Principal Kernel Engineer at Red Hat
>>> Author of http://www.iptv-analyzer.org
>>> LinkedIn: http://www.linkedin.com/in/brouer

-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
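
The sizing arithmetic discussed in this thread (Eric's "at least 8 packets per bucket" rule and Jesper's "4K array, thus 64 cache misses" figure) can be checked with a short sketch. The helper names are hypothetical and not part of the kernel or tc. The 4-byte per-bucket entry and the 64-byte cache line are assumptions, chosen because they reproduce the 4K and 64 figures quoted above for 1024 buckets.

```python
# Rough sizing check for fq_codel limit/flows combinations from this thread.
# All names here are illustrative only; nothing is taken from kernel code.

CACHE_LINE_BYTES = 64     # assumption: typical cache line size
BACKLOG_ENTRY_BYTES = 4   # assumption: one 32-bit backlog counter per bucket
MIN_PKTS_PER_BUCKET = 8   # Eric's suggested minimum so CoDel can trigger


def packets_per_bucket(limit: int, flows: int) -> float:
    """Average queue depth per bucket if every bucket is in use."""
    return limit / flows


def drop_scan_cache_lines(flows: int) -> int:
    """Cache lines touched when drop mode scans the per-bucket array."""
    array_bytes = flows * BACKLOG_ENTRY_BYTES
    return -(-array_bytes // CACHE_LINE_BYTES)  # ceiling division


# (label, limit in packets, number of flows/buckets), all quoted in the thread
configs = [
    ("kernel default", 10240, 1024),
    ("OpenWRT default", 1024, 1024),
    ("fewer buckets", 1024, 128),
    ("larger limit", 8192, 1024),
    ("Roman's UDP test", 256, 16),
]

for label, limit, flows in configs:
    depth = packets_per_bucket(limit, flows)
    verdict = "ok" if depth >= MIN_PKTS_PER_BUCKET else "too shallow"
    print(f"{label:18s} limit={limit:5d} flows={flows:4d} "
          f"pkts/bucket={depth:5.1f} ({verdict}) "
          f"scan={drop_scan_cache_lines(flows)} cache lines")
```

Under these assumptions, only the OpenWRT default of limit 1024 with 1024 flows falls below the 8-packet guideline. Cutting to 128 buckets also shrinks the scanned array from 64 cache lines to 8.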