From: "Thomas Rosenstein"
To: "Jesper Dangaard Brouer"
Cc: Bloat, "Toke Høiland-Jørgensen"
Date: Fri, 06 Nov 2020 10:18:10 +0100
Message-ID: <11812D44-BD46-4CA4-BA39-6080BD88F163@creamfinance.com>
In-Reply-To: <20201105143317.78276bbc@carbon>
References: <87imalumps.fsf@toke.dk> <871rh8vf1p.fsf@toke.dk>
 <81ED2A33-D366-42FC-9344-985FEE8F11BA@creamfinance.com> <87sg9ot5f1.fsf@toke.dk>
 <20201105143317.78276bbc@carbon>
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

On 5 Nov 2020, at 14:33, Jesper Dangaard Brouer wrote:

> On Thu, 05 Nov 2020 13:22:10 +0100
> Thomas Rosenstein via Bloat wrote:
>
>> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
>>
>>> "Thomas Rosenstein" writes:
>>>
>>>>> If so, this sounds more like a driver issue, or maybe something to
>>>>> do with scheduling. Does it only happen with ICMP?
>>>>> You could try this
>>>>> tool for a userspace UDP measurement:
>>>>
>>>> It happens with all packets, therefore the transfer to backblaze with
>>>> 40 threads goes down to ~8 MB/s instead of >60 MB/s
>>>
>>> Huh, right, definitely sounds like a kernel bug; or maybe the new
>>> kernel is getting the hardware into a state where it bugs out when
>>> there are lots of flows or something.
>>>
>>> You could try looking at the ethtool stats (ethtool -S) while
>>> running the test and see if any error counters go up. Here's a
>>> handy script to monitor changes in the counters:
>>>
>>> https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>>>
>>>> I'll try what that reports!
>>>>
>>>>> Also, what happens if you ping a host on the internet (*through*
>>>>> the router instead of *to* it)?
>>>>
>>>> Same issue, but twice as pronounced, as it seems all interfaces are
>>>> affected.
>>>> So, ping on one interface and the second has the issue.
>>>> Also all traffic across the host has the issue, but on both sides,
>>>> so ping to the internet increased by 2x.
>>>
>>> Right, so even an unloaded interface suffers? But this is the same
>>> NIC, right? So it could still be a hardware issue...
>>>
>>>> Yep, the default that CentOS ships. I just tested 4.12.5, there the
>>>> issue also does not happen. So I guess I can bisect it
>>>> then... (really don't want to 😃)
>>>
>>> Well that at least narrows it down :)
>>
>> I just tested 5.9.4, it seems to also fix it partly: I have long
>> stretches where it looks good, and then some increases again. (3.10
>> stock has them too, but not so high, rather 1-3 ms)
>>
>> for example:
>>
>> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
>> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
>> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
>> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
>> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>>
>> and then again:
>>
>> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
>> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
>> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
>> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
>> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
>> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
>> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
>> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
>> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
>> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
>> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
>> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
>> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
>> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
>> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
>> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
>> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
>> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
>> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
>> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
>>
>> For me it looks now like there was some fix between 5.4.60 and 5.9.4
>> ... can anyone pinpoint it?

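(Side note on the ethtool counters: I haven't run the ethtool_stats.pl script yet; as a quick first check I'll probably just watch the raw counters during a transfer. A minimal sketch, where eth0 is only a placeholder for the actual interface and the counter names depend on the driver:

watch -d -n 1 'ethtool -S eth0 | grep -iE "err|drop|miss|fifo"'

and then see whether anything increments while the latency is bad.)
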
I have now retried with 3.10, here are the results:

I'm pinging between the hosts with the traffic flow, upload is running at
55 MB/s (compared to ~20 MB/s on 5.9.4, and 8 MB/s on 5.4.60):

64 bytes from x.x.x.x: icmp_seq=159 ttl=64 time=0.050 ms
64 bytes from x.x.x.x: icmp_seq=160 ttl=64 time=0.056 ms
64 bytes from x.x.x.x: icmp_seq=161 ttl=64 time=0.061 ms
64 bytes from x.x.x.x: icmp_seq=162 ttl=64 time=0.072 ms
64 bytes from x.x.x.x: icmp_seq=163 ttl=64 time=0.052 ms
64 bytes from x.x.x.x: icmp_seq=164 ttl=64 time=0.053 ms
64 bytes from x.x.x.x: icmp_seq=165 ttl=64 time=0.068 ms
64 bytes from x.x.x.x: icmp_seq=166 ttl=64 time=0.050 ms
64 bytes from x.x.x.x: icmp_seq=167 ttl=64 time=0.057 ms
64 bytes from x.x.x.x: icmp_seq=168 ttl=64 time=0.051 ms
64 bytes from x.x.x.x: icmp_seq=169 ttl=64 time=0.045 ms
64 bytes from x.x.x.x: icmp_seq=170 ttl=64 time=0.138 ms
64 bytes from x.x.x.x: icmp_seq=171 ttl=64 time=0.052 ms
64 bytes from x.x.x.x: icmp_seq=172 ttl=64 time=0.049 ms
64 bytes from x.x.x.x: icmp_seq=173 ttl=64 time=0.094 ms
64 bytes from x.x.x.x: icmp_seq=174 ttl=64 time=0.050 ms
64 bytes from x.x.x.x: icmp_seq=175 ttl=64 time=0.810 ms
64 bytes from x.x.x.x: icmp_seq=176 ttl=64 time=0.077 ms
64 bytes from x.x.x.x: icmp_seq=177 ttl=64 time=0.055 ms
64 bytes from x.x.x.x: icmp_seq=178 ttl=64 time=0.049 ms
64 bytes from x.x.x.x: icmp_seq=179 ttl=64 time=0.050 ms
64 bytes from x.x.x.x: icmp_seq=180 ttl=64 time=0.073 ms
64 bytes from x.x.x.x: icmp_seq=181 ttl=64 time=0.065 ms
64 bytes from x.x.x.x: icmp_seq=182 ttl=64 time=0.123 ms
64 bytes from x.x.x.x: icmp_seq=183 ttl=64 time=0.045 ms
64 bytes from x.x.x.x: icmp_seq=184 ttl=64 time=0.054 ms
64 bytes from x.x.x.x: icmp_seq=185 ttl=64 time=0.063 ms
64 bytes from x.x.x.x: icmp_seq=186 ttl=64 time=0.044 ms
64 bytes from x.x.x.x: icmp_seq=187 ttl=64 time=0.075 ms
64 bytes from x.x.x.x: icmp_seq=188 ttl=64 time=0.076 ms
64 bytes from x.x.x.x: icmp_seq=189 ttl=64 time=0.065 ms
64 bytes from x.x.x.x: icmp_seq=190 ttl=64 time=0.049 ms

The response times with the traffic flow are around 3 times better than on
the newer kernels without any traffic flow.

I have a feeling that maybe not all config options were correctly moved
over to the newer kernel.

Or there's a big bug somewhere ... (which would seem rather weird, for me
to be the first one to discover it)

I'll rebuild the 5.9 kernel on one of the 3.10 hosts and see if it makes a
difference (rough plan below, after the quoted part) ...

>
> I have some bpftrace tools to measure these kinds of latency spikes
> here:
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/
>
> The tool you want is: softirq_net_latency.bt
> [2] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/softirq_net_latency.bt
>
> Example output, see [3]:
> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1795049#c8
>
> Based on the kernel versions, I don't expect this to be the same latency
> issue as described in the bugzilla[3] case (as IIRC it was fixed in
> 4.19). It can still be a similar issue, where some userspace process is
> reading information from the kernel (/sys/fs/cgroup/memory/memory.stat
> in the BZ case) that blocks softirq from running, and results in these
> latency spikes.
>
> Install guide to bpftrace[4]:
> [4] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/INSTALL.org
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
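
Thanks for the pointers! I'll run softirq_net_latency.bt on the router while a
transfer is going and see whether the spikes show up as blocked softirqs.
Roughly what I plan to do (just a sketch; assuming bpftrace is installed per [4],
and I haven't checked whether the script takes extra arguments, so plain
invocation only):

git clone https://github.com/xdp-project/xdp-project
cd xdp-project/areas/latency
bpftrace ./softirq_net_latency.bt

For the 5.9 rebuild on the 3.10 host, the plan is to start from the old config
inside the 5.9 source tree, so the options can't silently diverge (again only a
sketch; /boot/config-* is what CentOS normally ships):

cp /boot/config-$(uname -r) .config    # running 3.10 config as the baseline
make olddefconfig                      # new 5.9 options get their default values
make -j$(nproc) && make modules_install install

If the rebuilt 5.9 with the carried-over config still shows the spikes, it's
probably not a config option and I'll have to bisect after all.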