[Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

Thu Nov 5 08:33:17 EST 2020

On Thu, 05 Nov 2020 13:22:10 +0100
Thomas Rosenstein via Bloat <bloat at lists.bufferbloat.net> wrote:

> On 5 Nov 2020, at 12:21, Toke Høiland-Jørgensen wrote:
> 
> > "Thomas Rosenstein" <thomas.rosenstein at creamfinance.com> writes:
> >  
> >>> If so, this sounds more like a driver issue, or maybe something to 
> >>> do with scheduling. Does it only happen with ICMP? You could try this
> >>> tool for a userspace UDP measurement:  
> >>
> >> It happens with all packets, therefore the transfer to backblaze with 
> >> 40 threads goes down to ~8MB/s instead of >60MB/s  
> >
> > Huh, right, definitely sounds like a kernel bug; or maybe the new 
> > kernel is getting the hardware into a state where it bugs out when
> > there are lots of flows or something.
> >
> > You could try looking at the ethtool stats (ethtool -S) while
> > running the test and see if any error counters go up. Here's a
> > handy script to monitor changes in the counters:
> >
> > https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
> >  
> >> I'll try what that reports!
> >>  
> >>> Also, what happens if you ping a host on the internet (*through*
> >>> the router instead of *to* it)?  
> >>
> >> Same issue, but twice pronounced, as it seems all interfaces are
> >> affected.
> >> So, ping on one interface and the second has the issue.
> >> Also all traffic across the host has the issue, but on both sides,
> >> so ping to the internet increased by 2x  
> >
> > Right, so even an unloaded interface suffers? But this is the same 
> > NIC, right? So it could still be a hardware issue...
> >  
> >> Yep default that CentOS ships, I just tested 4.12.5 there the
> >> issue also does not happen. So I guess I can bisect it
> >> then...(really don't want to 😃)  
> >
> > Well that at least narrows it down :)  
> 
> I just tested 5.9.4 seems to also fix it partly, I have long
> stretches where it looks good, and then some increases again. (3.10
> Stock has them too, but not so high, rather 1-3 ms)
> 
> for example:
> 
> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
> 
> and then again:
> 
> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
> 64 bytes from x.x.x.x: icmp_seq=20 ttl=64 time=0.125 ms
> 64 bytes from x.x.x.x: icmp_seq=21 ttl=64 time=0.188 ms
> 64 bytes from x.x.x.x: icmp_seq=22 ttl=64 time=0.202 ms
> 64 bytes from x.x.x.x: icmp_seq=23 ttl=64 time=0.195 ms
> 64 bytes from x.x.x.x: icmp_seq=24 ttl=64 time=0.177 ms
> 64 bytes from x.x.x.x: icmp_seq=25 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=26 ttl=64 time=0.339 ms
> 64 bytes from x.x.x.x: icmp_seq=27 ttl=64 time=0.183 ms
> 64 bytes from x.x.x.x: icmp_seq=28 ttl=64 time=0.221 ms
> 64 bytes from x.x.x.x: icmp_seq=29 ttl=64 time=0.317 ms
> 64 bytes from x.x.x.x: icmp_seq=30 ttl=64 time=0.210 ms
> 64 bytes from x.x.x.x: icmp_seq=31 ttl=64 time=0.242 ms
> 64 bytes from x.x.x.x: icmp_seq=32 ttl=64 time=0.127 ms
> 64 bytes from x.x.x.x: icmp_seq=33 ttl=64 time=0.217 ms
> 64 bytes from x.x.x.x: icmp_seq=34 ttl=64 time=0.184 ms
> 
> 
> For me it looks now that there was some fix between 5.4.60 and 5.9.4
> ... anyone can pinpoint it?

I have some bpftrace tools to measure these kind of latency spikes here:
 [1] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/

The tool you want is: softirq_net_latency.bt
 [2] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/softirq_net_latency.bt

Example output see[3]:
 [3] https://bugzilla.redhat.com/show_bug.cgi?id=1795049#c8

Based on the kernel versions, I don't expect this to be same latency
issue as described in the bugzilla[3] case (as IIRC it was fixed in
4.19).  It can still be similar issue, where some userspace process is
reading information from the kernel (/sys/fs/cgroup/memory/memory.stat
in BZ case) that blocks softirq from running, and result in these
latency spikes.

Install guide to bpftrace[4]:
 [4] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/INSTALL.org
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer