From: "Thomas Rosenstein"
To: "Toke Høiland-Jørgensen"
Cc: bloat@lists.bufferbloat.net
Date: Wed, 04 Nov 2020 17:24:55 +0100
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

On 4 Nov 2020, at 17:10, Toke Høiland-Jørgensen wrote:

> Thomas Rosenstein via Bloat writes:
>
>> Hi all,
>>
>> I'm coming from the lartc mailing list, here's the original text:
>>
>> =====
>>
>> I have multiple routers which connect to multiple upstream
>> providers. I have noticed a high latency shift in ICMP (and
>> generally all connections) if I run b2 upload-file --threads 40
>> (and I can reproduce this).
>>
>> What options do I have to analyze why this happens?
>>
>> General Info:
>>
>> Routers are connected to each other with 10G Mellanox ConnectX
>> cards via 10G SFP+ DAC cables through a 10G switch from fs.com.
>> Latency generally is around 0.18 ms between all routers (4).
>> Throughput is 9.4 Gbit/s with 0 retransmissions when tested with
>> iperf3.
>> 2 of the 4 routers are connected upstream with a 1G connection
>> (separate port, same network card).
>> All routers have the full internet routing tables, i.e. 80k entries
>> for IPv6 and 830k entries for IPv4.
>> Conntrack is disabled (-j NOTRACK).
>> Kernel 5.4.60 (custom)
>> 2x Xeon X5670 @ 2.93 GHz
>> 96 GB RAM
>> No swap
>> CentOS 7
>>
>> During high latency:
>>
>> Latency on the routers carrying the traffic flow increases to
>> 12 - 20 ms on all interfaces; moving the stream (by disabling the
>> BGP session) also moves the high latency.
>> iperf3 performance plummets to 300 - 400 Mbit/s.
>> CPU load (user / system) is around 0.1%.
>> RAM usage is around 3 - 4 GB.
>> if_packets count is stable (around 8000 pkt/s more).
>
> I'm not sure I get your topology. Packets are going from where to
> where, and what link is the bottleneck for the transfer you're
> doing? Are you measuring the latency along the same path?
>
> Have you tried running 'mtr' to figure out which hop the latency is
> at?

I tried to draw the topology, I hope this is okay and explains better
what's happening:

https://drive.google.com/file/d/15oAsxiNfsbjB9a855Q_dh6YvFZBDdY5I/view?usp=sharing

There is definitely no bottleneck in any of the links; the maximum on
any link is 16k packets/sec and around 300 Mbit/s. In the iperf3 tests
I can easily get up to 9.4 Gbit/s.

So it must be something in the kernel tacking on a delay. I could try
to do a bisect and build like 10 kernels :)
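A rough sketch of how I'd drive that bisect over the stable tree
(taking v5.3 as the last-known-good tag is just an assumption -- I'd
substitute whatever kernel actually behaved well here):

    # clone the stable tree and bisect between assumed-good and bad tags
    git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
    cd linux
    git bisect start
    git bisect bad v5.4.60       # latency problem reproduces here
    git bisect good v5.3         # assumed last-known-good kernel
    # git now checks out a candidate; build and boot it, then repeat:
    make olddefconfig && make -j"$(nproc)" bzImage modules
    # reproduce with "b2 upload-file --threads 40" and report back:
    git bisect good              # or: git bisect bad

With log2 of the commit count between those tags, that really is on
the order of 10-15 builds.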
>> Here is the tc -s qdisc output:
>
> This indicates ("dropped 0" and "ecn_mark 0") that there's no
> backpressure on the qdisc, so something else is going on.
>
> Also, you said the issue goes away if you downgrade the kernel? That
> does sound odd...

Yes, indeed. I have only recently upgraded the kernel to 5.4.60 and
hadn't had the issue before.

> -Toke
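PS: Re the mtr suggestion -- I'll run it from one of the affected
routers while the b2 upload is going, along these lines (the target IP
is just a placeholder):

    # 500 cycles, numeric output, TCP probes to resemble the affected
    # traffic; replace 192.0.2.1 with the real upstream destination
    mtr --report --report-cycles 500 --no-dns --tcp --port 443 192.0.2.1

That should show at which hop the 12 - 20 ms jump first appears.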