From: Toke Høiland-Jørgensen
To: Thomas Rosenstein
Cc: bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Date: Thu, 05 Nov 2020 01:10:58 +0100
Message-ID: <871rh8vf1p.fsf@toke.dk>
References: <87imalumps.fsf@toke.dk>

"Thomas Rosenstein" writes:

> On 4 Nov 2020, at 17:10, Toke Høiland-Jørgensen wrote:
>
>> Thomas Rosenstein via Bloat writes:
>>
>>> Hi all,
>>>
>>> I'm coming from the lartc mailing list, here's the original text:
>>>
>>> =====
>>>
>>> I have multiple routers which connect to multiple upstream providers, I
>>> have noticed a high latency shift in icmp (and generally all connections)
>>> if I run b2 upload-file --threads 40 (and I can reproduce this)
>>>
>>> What options do I have to analyze why this happens?
>>>
>>> General Info:
>>>
>>> Routers are connected between each other with 10G Mellanox Connect-X
>>> cards via 10G SFP+ DAC cables via a 10G Switch from fs.com
>>> Latency generally is around 0.18 ms between all routers (4).
>>> Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
>>> 2 of the 4 routers are connected upstream with a 1G connection (separate
>>> port, same network card)
>>> All routers have the full internet routing tables, i.e. 80k entries for
>>> IPv6 and 830k entries for IPv4
>>> Conntrack is disabled (-j NOTRACK)
>>> Kernel 5.4.60 (custom)
>>> 2x Xeon X5670 @ 2.93 GHz
>>> 96 GB RAM
>>> No Swap
>>> CentOS 7
>>>
>>> During high latency:
>>>
>>> Latency on routers which have the traffic flow increases to 12 - 20 ms,
>>> for all interfaces, moving of the stream (via bgp disable session) moves
>>> also the high latency
>>> iperf3 performance plummets to 300 - 400 MBits
>>> CPU load (user / system) are around 0.1%
>>> Ram Usage is around 3 - 4 GB
>>> if_packets count is stable (around 8000 pkt/s more)
>>
>> I'm not sure I get your topology. Packets are going from where to where,
>> and what link is the bottleneck for the transfer you're doing? Are you
>> measuring the latency along the same path?
>>
>> Have you tried running 'mtr' to figure out which hop the latency is at?
>
> I tried to draw the topology, I hope this is okay and explains better
> what's happening:
>
> https://drive.google.com/file/d/15oAsxiNfsbjB9a855Q_dh6YvFZBDdY5I/view?usp=sharing

Ohh, right, you're pinging between two of the routers across a 10 Gbps
link with plenty of capacity to spare, and *that* goes up by two orders
of magnitude when you start the transfer, even though the transfer
itself is <1Gbps? Am I understanding you correctly now?

If so, this sounds more like a driver issue, or maybe something to do
with scheduling. Does it only happen with ICMP? You could try this tool
for a userspace UDP measurement: https://github.com/heistp/irtt/

Also, what happens if you ping a host on the internet (*through* the
router instead of *to* it)?

And which version of the Connect-X cards are you using (or rather, which
driver? mlx4?)

> So it must be something in the kernel tacking on a delay, I could try to
> do a bisect and build like 10 kernels :)

That may ultimately end up being necessary. However, when you say 'stock
kernel' you mean what CentOS ships, right? If so, that's not really a
3.10 kernel - the RHEL kernels (that CentOS is based on) are... somewhat
creative... about their versioning. So if you've switched to a vanilla
upstream kernel you may find bisecting difficult :/

How did you configure the new kernel? Did you start from scratch, or is
it based on the old CentOS config?

-Toke
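
For anyone wanting to see what a userspace UDP round-trip measurement
looks like before reaching for irtt, here is a minimal sketch in Python.
It is not irtt and does not speak its protocol; the function names
(run_echo_server, measure_udp_rtt), the 4-byte sequence-number payload,
and the default intervals are all assumptions made for illustration. One
side echoes datagrams back, the other timestamps each probe and reports
the round-trip time in milliseconds.

```python
import socket
import threading
import time


def run_echo_server(sock, stop_event):
    """Echo every datagram back to its sender until stop_event is set.

    `sock` is an already-bound UDP socket (hypothetical helper, not irtt).
    """
    sock.settimeout(0.2)  # wake up regularly to check stop_event
    while not stop_event.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure_udp_rtt(host, port, count=10, interval=0.01, timeout=1.0):
    """Send `count` sequence-numbered probes and return RTTs in ms.

    A lost probe (no matching reply within `timeout`) is recorded as None.
    """
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            start = time.monotonic()
            s.sendto(payload, (host, port))
            rtt = None
            try:
                while True:
                    data, _ = s.recvfrom(2048)
                    if data == payload:  # skip late replies to older probes
                        rtt = (time.monotonic() - start) * 1000.0
                        break
            except socket.timeout:
                pass
            rtts.append(rtt)
            time.sleep(interval)
    return rtts


if __name__ == "__main__":
    # Loopback demo: echo server and prober in the same process.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    stop = threading.Event()
    thread = threading.Thread(target=run_echo_server, args=(server, stop))
    thread.start()
    results = measure_udp_rtt("127.0.0.1", server.getsockname()[1], count=5)
    stop.set()
    thread.join()
    server.close()
    print(["lost" if r is None else f"{r:.3f} ms" for r in results])
```

Running the echo side on one router and the prober on another, with and
without the upload active, and comparing the numbers against plain ping
would be one way to check whether the added delay is specific to ICMP.
Python adds its own scheduling jitter, so irtt remains the better tool
for precise numbers.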