[Cerowrt-devel] Fwd: perf and BQL on the linksys 1200ac.

Dave Taht dave.taht at gmail.com
Sun Nov 20 16:32:18 EST 2016


This was the last patch set I saw. I seem to recall that it, or something
like it, was reviewed later on netdev.

---------- Forwarded message ----------
From: Marcin Wojtas <mw at semihalf.com>
Date: Sun, Jun 19, 2016 at 5:30 AM
Subject: Re: perf and BQL on the linksys 1200ac.
To: Dave Taht <dave.taht at gmail.com>
Cc: Imre Kaloz <kaloz at openwrt.org>, Toke Høiland-Jørgensen
<toke at toke.dk>, Stephen Walker <stephendwalker at gmail.com>, Gregory
Clément <gregory.clement at free-electrons.com>, Thomas Petazzoni
<thomas.petazzoni at free-electrons.com>


Hi Dave,

I have some very good news! BQL in mvneta is working fine. The reason
for the lock-up was trivial: a coalescing threshold of '1' meant one
interrupt per 2 packets. After setting it to '0' to get a real
interrupt per packet on egress, the BQL mechanism can adjust properly.
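
The interaction described above can be sketched with a toy model. This is
not the mvneta code or the kernel's BQL implementation; the function name,
the fixed byte limit, and the rule "the TX-done interrupt fires only when
the number of finished packets exceeds the threshold" are assumptions chosen
to match the description (threshold '1' giving one interrupt per 2 packets).

```python
def simulate_tx(coal_threshold, n_packets=10, pkt_bytes=1500, bql_limit=1500):
    """Toy TX path: returns (packets_completed, stalled).

    All names and the fixed limit are hypothetical; real BQL adapts its
    limit dynamically and real hardware has more ways to signal completion.
    """
    sent = 0             # packets handed to the ring so far
    completed = 0        # packets reported complete to the stack
    ring = 0             # packets in the ring, not yet sent on the wire
    done_unreported = 0  # packets sent on the wire, no interrupt raised yet
    in_flight = 0        # bytes queued but not yet reported complete

    while completed < n_packets:
        progress = False

        # Stack side: queue packets only while in-flight bytes are below the limit.
        while sent < n_packets and in_flight < bql_limit:
            ring += 1
            sent += 1
            in_flight += pkt_bytes
            progress = True

        # Hardware side: everything in the ring goes out on the wire.
        if ring:
            done_unreported += ring
            ring = 0
            progress = True

        # Assumed coalescing rule: interrupt only when count exceeds threshold.
        if done_unreported > coal_threshold:
            in_flight -= done_unreported * pkt_bytes
            completed += done_unreported
            done_unreported = 0
            progress = True

        if not progress:
            return completed, True  # nothing can move any more: stall

    return completed, False


if __name__ == "__main__":
    print("threshold 0:", simulate_tx(0))
    print("threshold 1:", simulate_tx(1))
```

In this model, a threshold of 0 completes all packets, while a threshold of 1
stalls as soon as the byte limit admits only a single packet: the one finished
packet never raises an interrupt, so the in-flight count is never released.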

An easy test:
1. ping without iperf: ~0.5 - 0.7 ms
2. ping with 1G TCP iperf from armada38x to host (in both tests @linerate)
   WITHOUT BQL: ~1.9 - 2.6 ms
   WITH BQL: ~0.65 - 0.85 ms
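
As a rough way to read these figures, the sketch below converts the extra
ping latency under load into an equivalent backlog at line rate. It assumes
that the added latency is entirely queueing delay in front of a 1 Gbit/s
link and uses 1500-byte packets and range midpoints; none of these
assumptions come from the measurements themselves.

```python
LINK_BITS_PER_S = 1_000_000_000  # "1G" link from the test
PKT_BYTES = 1500                 # assumed full-size packet, headers ignored

# Ping ranges in milliseconds, taken from the test results above.
IDLE = (0.5, 0.7)
NO_BQL = (1.9, 2.6)
WITH_BQL = (0.65, 0.85)


def midpoint(rng):
    """Midpoint of a (low, high) range."""
    return (rng[0] + rng[1]) / 2


def added_delay_ms(loaded, idle):
    """Extra latency under load, using range midpoints."""
    return midpoint(loaded) - midpoint(idle)


def backlog_bytes(delay_ms):
    """Bytes a link at LINK_BITS_PER_S drains in delay_ms milliseconds."""
    return delay_ms / 1000 * LINK_BITS_PER_S / 8


if __name__ == "__main__":
    for name, rng in (("without BQL", NO_BQL), ("with BQL", WITH_BQL)):
        delay = added_delay_ms(rng, IDLE)
        nbytes = backlog_bytes(delay)
        print(f"{name}: +{delay:.2f} ms, about {nbytes / 1000:.0f} kB "
              f"or {nbytes / PKT_BYTES:.1f} packets")
```

Under these assumptions the added delay is about 1.65 ms (roughly 206 kB of
backlog) without BQL and about 0.15 ms (roughly 19 kB) with BQL.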

I attach 3 patches (they apply on top of v4.7-rc3, but they should be
ok also for older releases)
- change to int-per-packet
- add bql
- add xmit_more
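
For readers unfamiliar with the last item: xmit_more lets the stack tell the
driver that more packets follow immediately, so the driver can postpone
notifying the hardware until the end of a burst. The sketch below is a toy
count of those notifications ("doorbells"), not the attached patch; the
function name and the burst sizes are made up for illustration, and it
ignores cases such as a stopped queue where a real driver has to notify
the hardware anyway.

```python
def count_doorbells(burst_sizes, use_xmit_more):
    """Count hardware notifications needed to send the given bursts.

    burst_sizes: packets per burst handed down by the stack (hypothetical).
    use_xmit_more: if True, notify only on the last packet of each burst.
    """
    doorbells = 0
    for burst in burst_sizes:
        for i in range(burst):
            # The stack would set the "more" hint on all but the last packet.
            more_coming = use_xmit_more and i < burst - 1
            # (Descriptor for packet i would be written to the ring here.)
            if not more_coming:
                doorbells += 1  # tell the hardware about the new descriptors
    return doorbells


if __name__ == "__main__":
    bursts = [4, 1, 3]
    print("per packet:", count_doorbells(bursts, use_xmit_more=False))
    print("xmit_more: ", count_doorbells(bursts, use_xmit_more=True))
```

With these example bursts, the per-packet variant notifies the hardware 8
times and the xmit_more variant 3 times, once per burst.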

Can you please test it in your setup? I'm sorry it took so long, but I
hope it will eventually help with your devices. If you don't mind, I
have a small request: could you run a comparison test with and without
my patches and share the results? It would help me with a local
presentation I'm preparing now (of course, all credit for the tests
will be yours).

Best regards,
Marcin

2016-06-09 19:00 GMT+02:00 Marcin Wojtas <mw at semihalf.com>:
> Ok, I'll dig it out of my repo and describe the problems with TX
> interrupts that never fire.
>
> Best regards,
> Marcin
>
> 2016-06-09 18:57 GMT+02:00 Dave Taht <dave.taht at gmail.com>:
>> If you can make your latest patch attempt available I will try to
>> find someone to pick up the slack.
>>
>> On Thu, Jun 9, 2016 at 6:27 AM, Marcin Wojtas <mw at semihalf.com> wrote:
>>> Hi Dave,
>>>
>>> Unfortunately yes, the stall seems to be permanent. I've been
>>> debugging Marvell's new ARMv8 SoCs for the past 5 months (mvneta as
>>> well), and I have no slot for BQL debugging or my other mainline
>>> patches in the near future.
>>>
>>> Best regards,
>>> Marcin
>>>
>>> 2016-06-07 19:27 GMT+02:00 Dave Taht <dave.taht at gmail.com>:
>>>> I imagine that your work on bql for the mvneta stalled out?
>>>>
>>>> https://lists.bufferbloat.net/pipermail/cake/2016-June/002031.html
>>>>
>>>> On Thu, Dec 17, 2015 at 7:34 AM, Marcin Wojtas <mw at semihalf.com> wrote:
>>>>> Hi Dave,
>>>>>
>>>>> Thanks for so many details.
>>>>>
>>>>>
>>>>>>
>>>>>> (that said, BQL is much better than what we had before - and what the
>>>>>> mvneta has now... so I am very enthusiastic about it getting in there
>>>>>> and promise to help.... after the new year. )
>>>>>>
>>>>>
>>>>> I hope it will be fixed and submitted by then :)
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> given the much higher level of indirection and layering in the mvneta
>>>>>>>> driver, finding that looks hard, and I think making napi work right
>>>>>>>> first and simplifying things might be a way forward..
>>>>>>>
>>>>>>> Thanks for this point, I will take it into consideration. IMO both
>>>>>>> ingress and egress paths may seem a bit complicated at first, but in
>>>>>>> fact they're not - or I know them too well and I'm not objective :)
>>>>>>
>>>>>> Well. Good. :) I don't know 'em at all, I only started playing with
>>>>>> this hardware last month in light of the turris omnia thing. I am
>>>>>> LOVING the speed of it, it's a huge jump from the mips based stuff I
>>>>>> was working with before.
>>>>>>
>>>>>> A naive question - do the rx and tx paths have to be cleaned up at the
>>>>>> same time? Does the softirq bounce or simultaneously exist for both
>>>>>> cpus?
>>>>>>
>>>>>>
>>>>>
>>>>> Of course, timer-based processing for TX is done in an independent
>>>>> context. With the new percpu irq and XPS + RSS commits (check out
>>>>> net-next; with still only one RX queue it's a preparation), the
>>>>> mapping is strict, and with per-cpu napi everything is done
>>>>> simultaneously on all cpus (2 on armada 38x and up to 4 on armada xp).
>>>>>
>>>>> As there is a common line for the RXTX interrupt, rx receiving and tx
>>>>> cleaning have to be mixed in mvneta_poll, and it all depends on the
>>>>> detected interrupt cause.
>>>>>
>>>>> Best regards,
>>>>> Marcin
>>>>
>>>>
>>>>
>>>> --
>>>> Dave Täht
>>>> Let's go make home routers and wifi faster! With better software!
>>>> http://blog.cerowrt.org
>>
>>
>>
>> --
>> Dave Täht
>> Let's go make home routers and wifi faster! With better software!
>> http://blog.cerowrt.org


-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-net-mvneta-add-BQL-support.patch
Type: application/octet-stream
Size: 2987 bytes
Desc: not available
URL: <https://lists.bufferbloat.net/pipermail/cerowrt-devel/attachments/20161120/737d725b/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-net-mvneta-TX_DONE-interrupt-per-packet.patch
Type: application/octet-stream
Size: 1074 bytes
Desc: not available
URL: <https://lists.bufferbloat.net/pipermail/cerowrt-devel/attachments/20161120/737d725b/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-net-mvneta-add-xmit_more-support.patch
Type: application/octet-stream
Size: 1784 bytes
Desc: not available
URL: <https://lists.bufferbloat.net/pipermail/cerowrt-devel/attachments/20161120/737d725b/attachment-0002.obj>

