[Cake] Fwd: r8169: Performance regression and latency instability

Dave Taht dave.taht at gmail.com
Fri Aug 16 08:45:04 EDT 2019


---------- Forwarded message ---------
From: Juliana Rodrigueiro <juliana.rodrigueiro at intra2net.com>
Date: Fri, Aug 16, 2019 at 5:20 AM
Subject: r8169: Performance regression and latency instability
To: <netdev at vger.kernel.org>
Cc: <edumazet at google.com>, <hkallweit1 at gmail.com>


Greetings!

During migration from kernel 3.14 to 4.19, we noticed a regression in
network performance. Under the exact same circumstances, the standard
deviation of the latency is more than double what it was before on the
Realtek RTL8111/8168B (10ec:8168) using the r8169 driver.

Kernel 3.14:
     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
     313.37

Kernel 4.19:
     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
     632.96
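
As a quick check of the "more than double" figure, the ratio of the two
STDDEV_LATENCY values quoted above can be computed directly. This is only
an illustrative sketch; the variable names are not from the original
report, and the numbers are the two netperf results shown above.

```python
# STDDEV_LATENCY values reported by netperf in the two runs above.
stddev_kernel_3_14 = 313.37
stddev_kernel_4_19 = 632.96

# Ratio of the latency standard deviation on 4.19 relative to 3.14.
ratio = stddev_kernel_4_19 / stddev_kernel_3_14
print(f"stddev ratio (4.19 / 3.14): {ratio:.2f}")
```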

In contrast, we noticed small improvements in performance with other,
non-Realtek network cards (igb, tg3), which suggested a possible
driver-related bug.

However, after bisecting the code, I ended up with the following patch,
which was introduced in kernel 4.17 and modifies net/ipv4:

     commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
     Author: Eric Dumazet <edumazet at google.com>
     Date:   Mon Feb 19 11:56:47 2018 -0800

         tcp: switch to GSO being always on

Could you please help me clarify: should GSO always be on for my
device? Or does it just affect TCP? According to ethtool it is always
off, and "ethtool -K eth0 gso on" has no effect unless I switch SG on.

     # ethtool -k eth0
     Offload parameters for eth0:
     Cannot get device udp large send offload settings: Operation not supported
     rx-checksumming: on
     tx-checksumming: off
     scatter-gather: off
     tcp-segmentation-offload: off
     udp-fragmentation-offload: off
     generic-segmentation-offload: off
     generic-receive-offload: on
     large-receive-offload: off
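
When comparing offload settings between kernels or NICs, it can help to
turn this output into a dictionary instead of reading it by eye. The
following is a minimal sketch, assuming output in the "feature: on/off"
form shown above; parse_offloads and SAMPLE are hypothetical names, and
SAMPLE simply repeats the output quoted in this message.

```python
def parse_offloads(text):
    """Parse `ethtool -k` style text into a {feature: "on"/"off"} dict."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        words = value.split()
        # Keep only "feature: on|off" lines; headers and error
        # messages (e.g. "Operation not supported") are skipped.
        if sep and words and words[0] in ("on", "off"):
            features[name] = words[0]
    return features


# Output copied from the report above.
SAMPLE = """\
Offload parameters for eth0:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off
"""

feats = parse_offloads(SAMPLE)
for name in ("scatter-gather", "tcp-segmentation-offload",
             "generic-segmentation-offload"):
    print(name, "=", feats.get(name))
```

On a live system the same function could be fed the captured output of
"ethtool -k eth0" from each kernel, so the two dictionaries can be
compared directly.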

I validated that reverting "tcp: switch to GSO being always on"
successfully brings back the better performance for the r8169 driver.

I'm sure that reverting that commit is not the optimal solution, so I
would like to kindly ask for help to shed some light on this issue.

Best regards,
Juliana.


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

