[Codel] OpenWRT wrong adjustment of fq_codel defaults (Was: fq_codel_drop vs a udp flood)

Dave Taht dave.taht at gmail.com
Mon May 16 12:04:01 EDT 2016


On Mon, May 16, 2016 at 1:14 AM, Roman Yeryomin <leroi.lists at gmail.com> wrote:
> On 16 May 2016 at 01:34, Roman Yeryomin <leroi.lists at gmail.com> wrote:
>> On 6 May 2016 at 22:43, Dave Taht <dave.taht at gmail.com> wrote:
>>> On Fri, May 6, 2016 at 11:56 AM, Roman Yeryomin <leroi.lists at gmail.com> wrote:
>>>> On 6 May 2016 at 21:43, Roman Yeryomin <leroi.lists at gmail.com> wrote:
>>>>> On 6 May 2016 at 15:47, Jesper Dangaard Brouer <brouer at redhat.com> wrote:
>>>>>>
>>>>>> I've created an OpenWRT ticket[1] on this issue, as it seems that someone[2]
>>>>>> closed Felix's OpenWRT email account (bad choice! emails bouncing).
>>>>>> Sounds like OpenWRT and the LEDE https://www.lede-project.org/ project
>>>>>> are in some kind of conflict.
>>>>>>
>>>>>> OpenWRT ticket [1] https://dev.openwrt.org/ticket/22349
>>>>>>
>>>>>> [2] http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/40298/focus=40335
>>>>>
>>>>> OK, so, after porting the patch to 4.1 openwrt kernel and playing a
>>>>> bit with fq_codel limits I was able to get 420Mbps UDP like this:
>>>>> tc qdisc replace dev wlan0 parent :1 fq_codel flows 16 limit 256
>>>>
>>>> Forgot to mention, I've reduced drop_batch_size down to 32
>>>
>>> 0) Not clear to me if that's the right line, there are 4 wifi queues,
>>> and the third one
>>> is the BE queue.
>>
>> That was an example, sorry, should have stated that. I've applied same
>> settings to all 4 queues.
>>
>>> That is too low a limit, also, for normal use. And:
>>> for the purpose of this particular UDP test, flows 16 is ok, but not
>>> ideal.
>>
>> I played with different combinations, it doesn't make any
>> (significant) difference: 20-30Mbps, not more.
>> What numbers would you propose?
>>
>>> 1) What's the tcp number (with a simultaneous ping) with this latest patchset?
>>> (I care about tcp performance a lot more than udp floods - surviving a
>>> udp flood yes, performance, no)
>>
>> During the test (both TCP and UDP) it's roughly 5ms on average; when not
>> running tests, ~2ms. Actually I'm now wondering if target is working at
>> all, because I had the same result with target 80ms.
>> So, yes, latency is good, but performance is poor.
>>
>>> before/after?
>>>
>>> tc -s qdisc show dev wlan0 during/after results?
>>
>> during the test:
>>
>> qdisc mq 0: root
>>  Sent 1600496000 bytes 1057194 pkt (dropped 1421568, overlimits 0 requeues 17)
>>  backlog 1545794b 1021p requeues 17
>> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 1601271168 bytes 1057706 pkt (dropped 1422304, overlimits 0 requeues 17)
>>  backlog 1541252b 1018p requeues 17
>>   maxpacket 1514 drop_overlimit 1422304 new_flow_count 35 ecn_mark 0
>>   new_flows_len 0 old_flows_len 1
>> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>>
>>
>> after the test (60sec):
>>
>> qdisc mq 0: root
>>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 28)
>>  backlog 0b 0p requeues 28
>> qdisc fq_codel 8001: parent :1 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8002: parent :2 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc fq_codel 8003: parent :3 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 3084996052 bytes 2037744 pkt (dropped 2770176, overlimits 0 requeues 28)
>>  backlog 0b 0p requeues 28
>>   maxpacket 1514 drop_overlimit 2770176 new_flow_count 64 ecn_mark 0
>>   new_flows_len 0 old_flows_len 1
>> qdisc fq_codel 8004: parent :4 limit 1024p flows 16 quantum 1514
>> target 80.0ms ce_threshold 32us interval 100.0ms ecn
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>>
>>
>>> IF you are doing builds for the archer c7v2, I can join in on this... (?)
>>
>> I'm not but I have c7 somewhere, so I can do a build for it and also
>> test, so we are on the same page.
>>
>>> I did do a test of the ath10k "before", fq_codel *never engaged*, and
>>> tcp induced latencies under load, e at 100mbit, cracked 600ms, while
>>> staying flat (20ms) at 100mbit. (not the same patches you are testing)
>>> on x86. I have got tcp 300Mbit out of an osx box, similar latency,
>>> have yet to get anything more on anything I currently have
>>> before/after patchsets.
>>>
>>> I'll go add flooding to the tests, I just finished a series comparing
>>> two different speed stations and life was good on that.
>>>
>>> "before" - fq_codel never engages, we see seconds of latency under load.
>>>
>>> root@apu2:~# tc -s qdisc show dev wlp4s0
>>> qdisc mq 0: root
>>>  Sent 8570563893 bytes 6326983 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514
>>> target 5.0ms interval 100.0ms ecn
>>>  Sent 2262 bytes 17 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 0
>>> qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514
>>> target 5.0ms interval 100.0ms ecn
>>>  Sent 220486569 bytes 152058 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 18168 drop_overlimit 0 new_flow_count 1 ecn_mark 0
>>>   new_flows_len 0 old_flows_len 1
>>> qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514
>>> target 5.0ms interval 100.0ms ecn
>>>  Sent 8340546509 bytes 6163431 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 68130 drop_overlimit 0 new_flow_count 120050 ecn_mark 0
>>>   new_flows_len 1 old_flows_len 3
>>> qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514
>>> target 5.0ms interval 100.0ms ecn
>>>  Sent 9528553 bytes 11477 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>>   maxpacket 66 drop_overlimit 0 new_flow_count 1 ecn_mark 0
>>>   new_flows_len 1 old_flows_len 0
>>>
>>>
>>>>> This is certainly better than 30Mbps but still less than half of
>>>>> what it was before (900).
>>>
>>> The number that I am still not sure we got: were you sending
>>> 900mbit udp and receiving 900mbit on the prior tests?
>>
>> 900 was sending, AP POV (wifi client is downloading)
>>
>>>>> TCP also improved a little (550 to ~590).
>>>
>>> The limit is probably a bit low, also.  You might want to try target
>>> 20ms as well.
>>
>> I've tried limit up to 1024 and target up to 80ms
>>
>>>>>
>>>>> Felix, others, do you want to see the ported patch, maybe I did something wrong?
>>>>> Doesn't look like it will save ath10k from performance regression.
>>>
>>> what was tcp "before"? (I'm sorry, such a long thread)
>>
>> 750Mbps
>
> Michal, after retesting with your patch (sorry, it was late yesterday,
> confused compat-wireless archives) I saw the difference.
> So the progress looks like this (all with fq_codel flows 16 limit 1024
> target 20ms):
> no patches: 380Mbps UDP, 550 TCP
> Eric's (fq_codel drop) patch: 420Mbps UDP, 590 TCP (+40Mbps), latency
> 5-6ms during test
> Michal's (improve tx scheduling) patch: 580Mbps UDP, 660 TCP, latency
> up to 30-40ms during test
> after Rajkumar's proposal to "try without registering wake_tx_queue
> callback": 820Mbps UDP, 690 TCP.

And the simultaneous ping on the last test was?

> So, very close to "as before": 900Mbps UDP, 750 TCP.
> But still, I was expecting performance improvements from latest ath10k
> code, not regressions.
> I know that hw is capable of 800Mbps TCP, which I'm targeting.
>
> Regards,
> Roman
>
> p.s. sorry for confusion
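
To make the progression quoted above easier to compare, here is a minimal
sketch that expresses each step as a percentage of the "as before" numbers
(900Mbps UDP, 750 TCP). The throughput figures are copied from the quoted
message; the labels, function names, and the choice to round to one decimal
are my own assumptions for illustration, not part of anyone's test setup.

```python
# Throughput figures (Mbps) as quoted in the thread; labels are shorthand.
BASELINE = {"udp": 900, "tcp": 750}  # the "as before" numbers

STEPS = [
    # (label, UDP Mbps, TCP Mbps)
    ("no patches", 380, 550),
    ("Eric's fq_codel drop patch", 420, 590),
    ("Michal's tx scheduling patch", 580, 660),
    ("without wake_tx_queue callback", 820, 690),
]


def percent_of_baseline(value, baseline):
    """Return value as a percentage of baseline, rounded to one decimal."""
    return round(100.0 * value / baseline, 1)


def summarize(steps=STEPS, baseline=BASELINE):
    """Return (label, udp_percent, tcp_percent) for every step."""
    return [
        (
            label,
            percent_of_baseline(udp, baseline["udp"]),
            percent_of_baseline(tcp, baseline["tcp"]),
        )
        for label, udp, tcp in steps
    ]


if __name__ == "__main__":
    for label, udp_pct, tcp_pct in summarize():
        print(f"{label}: UDP {udp_pct}% of baseline, TCP {tcp_pct}% of baseline")
```

This only restates the reported throughput; it says nothing about latency,
which is why the simultaneous ping result still matters for each step.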



-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

