From: Toke Høiland-Jørgensen
To: Thomas Rosenstein, bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Date: Wed, 04 Nov 2020 17:10:39 +0100
Message-ID: <87imalumps.fsf@toke.dk>
List-Id: General list for discussing Bufferbloat

Thomas Rosenstein via Bloat writes:

> Hi all,
>
> I'm coming from the lartc mailing list, here's the original text:
>
> =====
>
> I have multiple routers which connect to multiple upstream providers, I
> have noticed a high latency shift in icmp (and generally all connection)
> if I run b2 upload-file --threads 40 (and I can reproduce this)
>
> What options do I have to analyze why this happens?
>
> General Info:
>
> Routers are connected between each other with 10G Mellanox Connect-X
> cards via 10G SPF+ DAC cables via a 10G Switch from fs.com
> Latency generally is around 0.18 ms between all routers (4).
> Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
> 2 of the 4 routers are connected upstream with a 1G connection (separate
> port, same network card)
> All routers have the full internet routing tables, i.e. 80k entries for
> IPv6 and 830k entries for IPv4
> Conntrack is disabled (-j NOTRACK)
> Kernel 5.4.60 (custom)
> 2x Xeon X5670 @ 2.93 Ghz
> 96 GB RAM
> No Swap
> CentOs 7
>
> During high latency:
>
> Latency on routers which have the traffic flow increases to 12 - 20 ms,
> for all interfaces, moving of the stream (via bgp disable session) moves
> also the high latency
> iperf3 performance plumets to 300 - 400 MBits
> CPU load (user / system) are around 0.1%
> Ram Usage is around 3 - 4 GB
> if_packets count is stable (around 8000 pkt/s more)

I'm not sure I get your topology. Packets are going from where to where,
and what link is the bottleneck for the transfer you're doing? Are you
measuring the latency along the same path?

Have you tried running 'mtr' to figure out which hop the latency is at?

> Here is the tc -s qdisc output:

This indicates ("dropped 0" and "ecn_mark 0") that there's no
backpressure on the qdisc, so something else is going on.

Also, you said the issue goes away if you downgrade the kernel? That
does sound odd...

-Toke
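
For anyone wanting to repeat the "dropped 0" / "ecn_mark 0" check across
several interfaces, here is a minimal sketch of how it could be scripted.
It assumes the usual text layout of `tc -s qdisc` (a line starting with
"qdisc" per queue, followed by indented statistics lines). The sample
text, the interface names eth0/eth1 and the helper names
`qdisc_counters` and `has_backpressure` are made up for illustration;
the sample is not the output from the original report.

```python
import re

# Illustrative sample only: hypothetical interfaces, made-up counters.
SAMPLE = """\
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
 Sent 1000 bytes 10 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
 Sent 2000 bytes 20 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
"""


def qdisc_counters(tc_output):
    """Collect the 'dropped' and 'ecn_mark' counters for each qdisc block."""
    rows = []
    current = None
    for line in tc_output.splitlines():
        if line.startswith("qdisc "):
            # Header line: keep "kind handle dev name" as a short label.
            label = " ".join(line.split()[1:5])
            current = {"qdisc": label, "dropped": 0, "ecn_mark": 0}
            rows.append(current)
            continue
        if current is None:
            continue  # statistics line before any qdisc header; ignore it
        for key in ("dropped", "ecn_mark"):
            match = re.search(r"\b" + key + r" (\d+)", line)
            if match:
                current[key] = int(match.group(1))
    return rows


def has_backpressure(tc_output):
    """True if any qdisc has dropped or ECN-marked at least one packet."""
    return any(r["dropped"] > 0 or r["ecn_mark"] > 0
               for r in qdisc_counters(tc_output))


if __name__ == "__main__":
    for row in qdisc_counters(SAMPLE):
        print(row["qdisc"], "dropped", row["dropped"], "ecn_mark", row["ecn_mark"])
    print("backpressure seen:", has_backpressure(SAMPLE))
```

To use it on a live router, the output of `tc -s qdisc` can be passed in
place of SAMPLE. Both counters staying at zero while latency rises would
point away from the qdisc, as described above.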