[Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Thomas Rosenstein
thomas.rosenstein at creamfinance.com
Mon Nov 9 11:39:48 EST 2020
On 9 Nov 2020, at 12:40, Jesper Dangaard Brouer wrote:
> On Mon, 09 Nov 2020 11:09:33 +0100
> "Thomas Rosenstein" <thomas.rosenstein at creamfinance.com> wrote:
>
> Could you also provide ethtool_stats for the TX interface?
>
> Notice that the tool[1] ethtool_stats.pl support monitoring several
> interfaces at the same time, e.g. run:
>
> ethtool_stats.pl --sec 3 --dev eth4 --dev ethTX
>
> And provide output as pastebin.
I have now also repeated the same test with 3.10. Here are the ethtool
outputs:
https://drive.google.com/file/d/1c98MVV0JYl6Su6xZTpqwS7m-6OlbmAFp/view?usp=sharing
and the ping times:
https://drive.google.com/file/d/1xhbGJHb5jUbPsee4frbx-c-uqh-7orXY/view?usp=sharing
Sadly, the parameters we were looking at are not supported below 4.14,
but I immediately saw one thing that is very different:
ethtool --statistics eth4 | grep discards
rx_discards_phy: 0
tx_discards_phy: 0
If we check the ethtool output from 5.9.4, we have:
rx_discards_phy: 151793
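
To make this comparison repeatable, something like the following could be used. It is only a rough sketch: it assumes the usual "name: value" layout of the `ethtool --statistics` output, the helper names (parse_ethtool_stats, compare_counters) are made up for illustration, and the sample strings contain only the values quoted in this mail.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool --statistics <dev>` text into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


def compare_counters(a, b, substring="discards"):
    """Return {name: (value_in_a, value_in_b)} for counters matching substring.

    A counter missing from one side is reported as None.
    """
    names = sorted(n for n in set(a) | set(b) if substring in n)
    return {n: (a.get(n), b.get(n)) for n in names}


# Values quoted above; only rx_discards_phy was quoted for 5.9.4.
out_310 = """NIC statistics:
     rx_discards_phy: 0
     tx_discards_phy: 0
"""
out_594 = """NIC statistics:
     rx_discards_phy: 151793
"""

if __name__ == "__main__":
    diff = compare_counters(parse_ethtool_stats(out_310),
                            parse_ethtool_stats(out_594))
    for name, (old, new) in diff.items():
        print(f"{name}: 3.10={old} 5.9.4={new}")
```

In practice the two strings would be replaced by the saved output of `ethtool --statistics eth4` from each kernel.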
Also, the outbound_pci_stalled_wr_events become more frequent as the
total bandwidth drops and the ping rises.
Logically, there must be something blocking the buffers: either they
are not getting freed, or not rotated correctly, or processing is too
slow.
I would rule out slow processing, based simply on the 0% CPU load and on
the fact that it doesn't happen in 3.10.
It is also suspicious that the issue only appears after a certain time
of activity (maybe total traffic?!)
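
To pin down when the problem starts, one option would be to sample a counter such as rx_discards_phy periodically and look for the first interval in which it increases. The sketch below assumes samples are collected as (seconds, counter value) pairs; the function names and the sample numbers are purely hypothetical and are not measurements from this router.

```python
def counter_rates(samples):
    """Turn [(seconds, counter_value), ...] into [(t_end, per_second_rate), ...].

    Samples are assumed to be sorted by time; zero-length intervals are skipped.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 > t0:
            rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates


def first_increase(samples):
    """Return the end time of the first interval where the counter grew, or None."""
    for t, rate in counter_rates(samples):
        if rate > 0:
            return t
    return None


# Hypothetical samples taken every 3 seconds (illustration only).
samples = [(0, 0), (3, 0), (6, 0), (9, 300), (12, 900)]

if __name__ == "__main__":
    print(counter_rates(samples))
    print("first increase at t =", first_increase(samples))
```

Logging the total traffic alongside each sample would show whether the onset depends on elapsed time or on the amount of data forwarded.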