[Cake] some comprehensive arm64 w/cake results

dave seddon dave.seddon.ca at gmail.com
Mon Sep 18 15:50:10 EDT 2023


G'day Mr David Reed,

Thanks for the comments.

I definitely agree with your sentiments, and the tests definitely do NOT
simply represent Intel versus ARM.

Perhaps I should have been more clear about the objectives of the testing:

I'm curious to understand the performance of these lower end SoC devices,
because these are the types of devices that act as home gateway routers, as
access points, and such.  There are many many millions of these devices out
there and I don't know how well understood their performance is:
e.g. How bad is my Spectrum Internet cable modem?
e.g. I have a Unifi security gateway and its "smart queue" performance is
pretty poor ( <200 Mb/s ).  Why is it so poor?

Obviously, with real servers ( and even virtual AWS ones ) which have real
NICs, you get things like multi-queues with RSS, and a lot more tuning
knobs, and so they can go a lot faster.

In the tests so far, the Asus CN60 device with the r8169 performs pretty
well, where the NIC is likely to be contributing positively.  The default
configuration has a bunch of off-loading enabled:

root@asus-cn60-2:/home/das# ethtool --show-features enp1s0 | grep ": on"
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ipv6: on
generic-receive-offload: on
rx-vlan-offload: on
tx-vlan-offload: on
highdma: on [fixed]

However, based on these initial tests, which are not complete, it's
certainly curious that the Pi4 is doing ~923 Mbit/s with pfifo_fast and then
significantly less ( ~621 Mbit/s ) with cake.  I'm interested to
understand this in more detail, and DaveT has recommended adding 20ms or
40ms of latency.  The cake tests so far used rtt 1ms and rtt 3ms, which might
be too low.  ( If that is too low, then maybe it would make sense to remove
the "rtt lan = rtt 1ms" option, as it's a misleading configuration option? )
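
For anyone wanting to try the larger rtt values, here is a minimal sketch of
re-attaching cake with a different rtt.  The helper name set_cake_rtt and the
interface name are placeholders, and it assumes cake is the root qdisc with no
bandwidth shaper, which may not match the setup used in these tests.

```shell
# Hypothetical helper: (re)attach cake as the root qdisc with a given rtt.
# $1 = interface, $2 = rtt (e.g. 1ms, 20ms, 40ms)
set_cake_rtt() {
    tc qdisc replace dev "$1" root cake rtt "$2"
}

# Usage (as root), then inspect the result:
#   set_cake_rtt eth0 20ms
#   tc -s qdisc show dev eth0
```

As far as I know, tc-cake(8) also has the keywords "metro" ( rtt 10ms ),
"regional" ( rtt 30ms ) and "internet" ( rtt 100ms, the default ), which sit
close to the values being discussed.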

Definitely, during the testing these little devices have the NIC IRQs all
going through core 0, so I want to explore tuning options.

root@rpi4b:/home/das# cat /proc/interrupts | grep -E '(CPU0|eth0)'
           CPU0       CPU1       CPU2       CPU3
 30:   38651749          0          0          0     GICv2 189 Level     eth0   <--- IRQs only going to CPU0
 31:   20418643          0          0          0     GICv2 190 Level     eth0

Some ideas include:
- Moving most processes off core0. e.g. Configure all the systemd slices NOT
to use core0, so core0 is essentially freed to only service the IRQs
- RPS (
https://www.kernel.org/doc/html/latest/networking/scaling.html#rps-receive-packet-steering
). e.g. Can the other cores get more involved?
- Tuning ideas from here:
https://github.com/leandromoreira/linux-network-performance-parameters.
Specifically, I was wondering about increasing netdev_budget sysctls.

The defaults are shown here:

root@rpi4b:/home/das# sysctl -a | grep netdev_budget
net.core.netdev_budget = 300
net.core.netdev_budget_usecs = 8000
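
Before raising netdev_budget, it may be worth checking whether the budget is
actually being exhausted.  A sketch, assuming the third column of
/proc/net/softnet_stat is the per-CPU time_squeeze counter ( in hex ) and that
the rows are in CPU order; softnet_squeeze is a made-up helper name:

```shell
# Print the time_squeeze counter for each row of softnet_stat.
# $1 = stats file (defaults to /proc/net/softnet_stat)
softnet_squeeze() {
    cpu=0
    while read -r processed dropped squeezed rest; do
        # Counters are printed in hex; convert to decimal.
        echo "cpu$cpu time_squeeze=$(( 0x$squeezed ))"
        cpu=$(( cpu + 1 ))
    done < "${1:-/proc/net/softnet_stat}"
}

# Usage: run softnet_squeeze before and after a test and compare.
# If the counter grows, try larger budgets (values below are only examples):
#   sysctl -w net.core.netdev_budget=600 net.core.netdev_budget_usecs=16000
```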

"Armbian's kernel isn't a particularly high performance kernel build."

Happy to discuss any recommended tuning.  Armbian is very easy to install
on the microSD card.  ( Actually, I have the LicheePi 4A RISC-V, but can't
find an easy image to just load on a microSD card. )


Over the weekend, I reconfigured the testing setup using a lot more VLANs.
Now each device has ALL the different qdiscs configured on different VLANs
and IPs, allowing the iperf/flent tests to be run one after the other with
no need to change the qdiscs between tests.  I'm currently repeating every
combination of test, before adding the netem 20/40ms latency as DaveT
suggested.  ( Tests take a while: 8 devices * 6 qdiscs = 48 tests, at 10
minutes per test = 480 minutes = 8 hours. )
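
For reference, the one-qdisc-per-VLAN layout can be sketched roughly as below.
This is an illustration rather than the exact configuration on the test
devices: the parent interface, VLAN IDs, addresses and the qdisc list are all
placeholders ( only pfifo_fast and cake are named in this message ).

```shell
# Placeholder values; adjust to the real interface, VLAN range and qdiscs.
PARENT=eth0
BASE_VID=100
QDISCS="pfifo_fast fq_codel cake"

# Create one VLAN sub-interface per qdisc and attach that qdisc as root.
setup_qdisc_vlans() {
    vid=$BASE_VID
    for q in $QDISCS; do
        dev="$PARENT.$vid"
        ip link add link "$PARENT" name "$dev" type vlan id "$vid"
        ip addr add "10.0.$vid.1/24" dev "$dev"
        ip link set dev "$dev" up
        tc qdisc replace dev "$dev" root "$q"
        vid=$(( vid + 1 ))
    done
}

# Usage (as root):  setup_qdisc_vlans
```

A test client then selects the qdisc simply by targeting the address on the
matching VLAN.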

Roughly the plan is:
1. Retest all combinations.  This is to confirm the starting position.  <--- running now
2. Add netem latency of 20 and 40ms, and retest all combinations.  I'm hoping
Pi4 cake performance will then be closer to 900 Mb/s
3. Apply some tuning options, and retest all combinations
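
For step 2, a sketch of adding the delay with netem.  It assumes the delay is
applied on a different interface or host from the qdisc under test ( netem as
root would otherwise replace it ) and that the 20/40ms figure is added in one
direction; add_delay, netem_limit and the interface name are placeholders.
netem's default queue limit is 1000 packets, which is fewer than the ~3333
full-size packets in flight at 1 Gbit/s with 40ms of delay, so the sketch
sizes the limit from the bandwidth-delay product.

```shell
# Packets in flight for a given rate and delay.
# $1 = rate in Mbit/s, $2 = delay in ms, $3 = packet size in bytes
netem_limit() {
    # Mbit/s * ms gives kilobits: *1000 for bits, /8 for bytes.
    echo $(( $1 * 1000 * $2 / 8 / $3 ))
}

# Attach netem with a delay and a limit of twice the bandwidth-delay product.
# $1 = interface, $2 = delay in ms, $3 = rate in Mbit/s
add_delay() {
    bdp_pkts=$(netem_limit "$3" "$2" 1500)
    tc qdisc replace dev "$1" root netem delay "${2}ms" limit $(( bdp_pkts * 2 ))
}

# Usage (as root):  add_delay eth1 40 1000
```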

Kind regards,
Dave Seddon

On Sun, Sep 17, 2023 at 6:05 PM Dave Taht <dave.taht at gmail.com> wrote:

>
> A huge thanks to dave seddon for buckling down and doing some
> comprehensive testing of a variety of arm64 gear!
>
>
> https://docs.google.com/document/d/1HxIU_TEBI6xG9jRHlr8rzyyxFEN43zMcJXUFlRuhiUI/edit#heading=h.bpvv3vr500nw
>
> --
> Oct 30:
> https://netdevconf.info/0x17/news/the-maestro-and-the-music-bof.html
> Dave Täht CSO, LibreQos
>


-- 
Regards,
Dave Seddon
+1 415 857 5102