From: Toke Høiland-Jørgensen
To: Thomas Rosenstein
Cc: bloat@lists.bufferbloat.net
Date: Thu, 05 Nov 2020 13:47:11 +0100
Message-ID: <875z6kt1gw.fsf@toke.dk>
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

"Thomas Rosenstein" writes:

> On 5 Nov 2020, at 13:38, Toke Høiland-Jørgensen wrote:
>
>> "Thomas Rosenstein" writes:
>>
>>> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
>>>
>>>> "Thomas Rosenstein" writes:
>>>>
>>>>>> If so, this sounds more like a driver issue, or maybe something
>>>>>> to do with scheduling. Does it only happen with ICMP? You could
>>>>>> try this tool for a userspace UDP measurement:
>>>>>
>>>>> It happens with all packets, therefore the transfer to backblaze
>>>>> with 40 threads goes down to ~8 MB/s instead of >60 MB/s
>>>>
>>>> Huh, right, definitely sounds like a kernel bug; or maybe the new
>>>> kernel is getting the hardware into a state where it bugs out when
>>>> there are lots of flows or something.
>>>>
>>>> You could try looking at the ethtool stats (ethtool -S) while
>>>> running the test and see if any error counters go up. Here's a
>>>> handy script to monitor changes in the counters:
>>>>
>>>> https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>>>>
>>>>> I'll try what that reports!
>>>>>
>>>>>> Also, what happens if you ping a host on the internet (*through*
>>>>>> the router instead of *to* it)?
>>>>>
>>>>> Same issue, but twice as pronounced, as it seems all interfaces
>>>>> are affected. So, ping on one interface and the second has the
>>>>> issue. Also, all traffic across the host has the issue, but on
>>>>> both sides, so ping to the internet increased by 2x.
>>>>
>>>> Right, so even an unloaded interface suffers? But this is the same
>>>> NIC, right? So it could still be a hardware issue...
>>>>
>>>>> Yep, the default that CentOS ships; I just tested 4.12.5 and the
>>>>> issue also does not happen there.
>>>>> So I guess I can bisect it then... (really don't want to 😃)
>>>>
>>>> Well, that at least narrows it down :)
>>>
>>> I just tested 5.9.4; it seems to also fix it partly. I have long
>>> stretches where it looks good, and then some increases again.
>>> (Stock 3.10 has them too, but not as high, rather 1-3 ms.)
>>>
>>> For example:
>>>
>>> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
>>> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
>>> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
>>> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
>>> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>>>
>>> and then again:
>>>
>>> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
>>> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
>>> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
>>> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
>>> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
>>> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
>>> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
>>> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
>>> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
>>> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
>>> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
>>> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
>>> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
>>> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
>>> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
>>> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
>>> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
>>> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
>>> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
>>> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
>>>
>>> For me it looks like there was some fix between 5.4.60 and 5.9.4...
>>> can anyone pinpoint it?
>>
>> $ git log --no-merges --oneline v5.4.60..v5.9.4 | wc -l
>> 72932
>>
>> Only 73k commits; should be easy, right? :)
>>
>> (In other words, no, I have no idea; I'd suggest either (a) asking
>> on netdev, (b) bisecting, or (c) using 5.9+ and just making peace
>> with not knowing.)
>
> Guess I'll go the easy route and let it be ...
>
> I'll update all routers to 5.9.4 and see if it fixes the traffic
> flow - will report back once more after that.

Sounds like a plan :)

>>>>>> How did you configure the new kernel? Did you start from
>>>>>> scratch, or is it based on the old CentOS config?
>>>>>
>>>>> First oldconfig, and from there added additional options for IB,
>>>>> NVMe, etc. (which I don't really need on the routers)
>>>>
>>>> OK, so you're probably building with roughly the same options in
>>>> terms of scheduling granularity etc. That's good. Did you enable
>>>> spectre mitigations etc. on the new kernel? What's the output of
>>>> `tail /sys/devices/system/cpu/vulnerabilities/*` ?
>>>
>>> mitigations are off
>>
>> Right, I just figured maybe you were hitting some threshold that
>> involved a lot of indirect calls which slowed things down due to
>> mitigations. Guess not, then...
>
> Thanks for the support :)

You're welcome!

-Toke
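
For anyone wanting to try the counter-monitoring approach suggested
above without fetching ethtool_stats.pl, a minimal shell stand-in
could look like the sketch below. It is not the linked script, just a
rough equivalent, and "eth0" is a placeholder interface name:

  #!/bin/sh
  # Rough stand-in for ethtool_stats.pl: once per second, print any
  # "ethtool -S" counters whose value changed since the last sample.
  # "eth0" is a placeholder; pass the real interface as $1.
  IFACE=${1:-eth0}
  prev=$(mktemp)
  ethtool -S "$IFACE" | sort > "$prev"
  while sleep 1; do
      cur=$(mktemp)
      ethtool -S "$IFACE" | sort > "$cur"
      # Lines beginning with ">" are the new values of changed counters.
      diff "$prev" "$cur" | grep '^>'
      mv "$cur" "$prev"
  done

Run it while the 40-thread transfer is active; if driver error
counters (names vary by NIC, e.g. rx_dropped or tx_errors) climb in
step with the latency spikes, that points at the driver or hardware.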
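
And should anyone end up bisecting rather than making peace with not
knowing, the workflow would follow the standard git bisect pattern,
sketched here under the assumption that the mainline tags v4.12 and
v5.4 bracket the regression (the 4.12.5 and 5.4.60 kernels actually
tested above are stable point releases on side branches):

  # Sketch of the bisect workflow, run in a mainline kernel git tree.
  git bisect start
  git bisect bad v5.4      # mainline tag nearest the known-bad 5.4.60
  git bisect good v4.12    # mainline tag nearest the known-good 4.12.5
  # git checks out a commit roughly halfway in between; build and boot
  # that kernel on a test router, run the ping test, then report:
  git bisect good          # if latency stays flat (~0.2 ms)
  git bisect bad           # if the multi-ms spikes reappear
  # Repeat until git prints "first bad commit"; finish with:
  git bisect reset

Each round means building and booting a kernel on a test box, and a
range this wide needs on the order of 18 rounds - which is presumably
why nobody in the thread was eager to volunteer.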