From: "Thomas Rosenstein" <thomas.rosenstein@creamfinance.com>
To: "Jesper Dangaard Brouer" <brouer@redhat.com>
Cc: "Thomas Rosenstein via Bloat" <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Date: Mon, 09 Nov 2020 15:33:46 +0100
Message-ID: <27110D8E-77DF-4D10-A5EA-6430DBD55BC7@creamfinance.com>
In-Reply-To: <20201109124030.71216677@carbon>
On 9 Nov 2020, at 12:40, Jesper Dangaard Brouer wrote:
> On Mon, 09 Nov 2020 11:09:33 +0100
> "Thomas Rosenstein" <thomas.rosenstein@creamfinance.com> wrote:
>
>> On 9 Nov 2020, at 9:24, Jesper Dangaard Brouer wrote:
>>
>>> On Sat, 07 Nov 2020 14:00:04 +0100
>>> Thomas Rosenstein via Bloat <bloat@lists.bufferbloat.net> wrote:
>>>
>>>> Here's an extract from the ethtool https://pastebin.com/cabpWGFz just in
>>>> case there's something hidden.
>>>
>>> Yes, there is something hiding in the data from ethtool_stats.pl[1]:
>>> (10G Mellanox Connect-X cards via 10G SPF+ DAC)
>>>
>>> stat: 1 ( 1) <= outbound_pci_stalled_wr_events /sec
>>> stat: 339731557 (339,731,557) <= rx_buffer_passed_thres_phy /sec
>>>
>>> I've not seen this counter 'rx_buffer_passed_thres_phy' before, looking
>>> in the kernel driver code it is related to "rx_buffer_almost_full".
>>> The numbers per second is excessive (but it be related to a driver bug
>>> as it ends up reading "high" -> rx_buffer_almost_full_high in the
>>> extended counters).
I have now tested with a new 5.9.4 kernel, built from the 3.10 config with
make oldconfig, and I noticed an interesting effect.
For roughly the first 2 minutes the router behaves completely normally, as
with 3.10; after that the ping times go crazy.
I have recorded this with ethtool, and also the ping times.
Ethtool: (13 MB)
https://drive.google.com/file/d/1Ojp64UUw0zKwrgF_CisZb3BCdidAJYZo/view?usp=sharing
The transfer was initially doing around 50-70 MB/s; once the ping times
got worse, it dropped to ~12 MB/s.
At around line 74324 the transfer speed drops to 12 MB/s.
It seems you are right about rx_buffer_passed_thres_phy: if you check just
those lines, they appear more often once the speed has dropped.
Not sure if that's the cause or an effect of the underlying problem!
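For anyone who wants to check this against the log, here is a minimal
sketch of how the comparison could be done. It assumes the log consists of
ethtool_stats.pl lines in the format quoted above ("stat: N (N) <= name
/sec"); the function name counter_hits is made up for illustration, and the
split point would be the line where the speed drops (around 74324 here).

```python
import re

# Matches lines in the format quoted earlier in this mail, e.g.
#   stat: 339731557 (339,731,557) <= rx_buffer_passed_thres_phy /sec
# (assumption: the whole log uses this one-line layout)
STAT_RE = re.compile(r"stat:\s+(\d+)\s+\(\s*[\d,]+\)\s+<=\s+(\S+)\s+/sec")


def counter_hits(lines, counter, split_line):
    """Count lines reporting `counter` before and from `split_line` on.

    `split_line` is 1-based; the line with that number is counted in the
    second value. Returns a (before, after) tuple of line counts.
    """
    before = after = 0
    for lineno, line in enumerate(lines, start=1):
        match = STAT_RE.search(line)
        if match and match.group(2) == counter:
            if lineno < split_line:
                before += 1
            else:
                after += 1
    return before, after
```

Passing an open file object as `lines` works, since it is iterated line by
line. The raw counts only compare fairly if they are divided by the number
of lines on each side of the split, which this sketch leaves to the caller.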
Pings:
https://drive.google.com/file/d/16phOxM5IFU6RAl4Ua4pRqMNuLYBc4RK7/view?usp=sharing
Pause frames were activated again after the restart.
(Here is a link for reference on the ethtool variables:
https://community.mellanox.com/s/article/understanding-mlx5-ethtool-counters)