[Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

Jesper Dangaard Brouer brouer at redhat.com
Mon Nov 16 06:56:40 EST 2020


On Fri, 13 Nov 2020 07:31:26 +0100
"Thomas Rosenstein" <thomas.rosenstein at creamfinance.com> wrote:

> On 12 Nov 2020, at 16:42, Jesper Dangaard Brouer wrote:
> 
> > On Thu, 12 Nov 2020 14:42:59 +0100
> > "Thomas Rosenstein" <thomas.rosenstein at creamfinance.com> wrote:
> >  
> >>> Notice "Adaptive" setting is on.  My long-shot theory(2) is that 
> >>> this
> >>> adaptive algorithm in the driver code can guess wrong (due to not
> >>> taking TSO into account) and cause issues for
> >>>
> >>> Try to turn this adaptive algorithm off:
> >>>
> >>>   ethtool -C eth4 adaptive-rx off adaptive-tx off
> >>>  
> > [...]  
> >>>>
> >>>> rx-usecs: 32  
> >>>
> >>> When you turn off "adaptive-rx" you will get up to 31250
> >>> interrupts/sec (calc 1/(32/10^6) = 31250).
> >>>  
> >>>> rx-frames: 64  
> > [...]  
> >>>> tx-usecs-irq: 0
> >>>> tx-frames-irq: 0
> >>>>  
> >>> [...]  
> >>
> >> I have now updated the settings to:
> >>
> >> ethtool -c eth4
> >> Coalesce parameters for eth4:
> >> Adaptive RX: off  TX: off
> >> stats-block-usecs: 0
> >> sample-interval: 0
> >> pkt-rate-low: 0
> >> pkt-rate-high: 0
> >>
> >> rx-usecs: 0  
> >
> > Please put a value in rx-usecs, like 20 or 10.
> > The value 0 is often used to signal the driver to do adaptive
> > coalescing.
> 
> Ok, put it now to 10.

Setting it to 10 is a little aggressive, as you are asking the NIC to
generate 100,000 interrupts per second.  (Watch the "in" column of
'vmstat 1' to see the actual rate.)

 1/(10/10^6) = 100000 interrupts/sec
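
If you want fewer interrupts, a slightly larger value is easy to try,
e.g. (eth4 assumed, as in the earlier commands):

 ethtool -C eth4 rx-usecs 20    # 1/(20/10^6) = 50000 interrupts/sec max
 vmstat 1                       # "in" column shows interrupts/sec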

> Goes a bit quicker (transfers up to 26 MB/s), but discards and PCI
> stalls are still there.

Why are you measuring in MBytes/sec?  26 MBytes/sec is only 208 Mbit/s
(calc 26 * 8 = 208).

If ethtool still shows PHY discards, then you still have a problem.
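
You can check the discard counters like this (counter names vary by
driver, so the grep pattern is only a guess):

 ethtool -S eth4 | grep -i discard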

> Ping times are noticeably improved:

Okay, so these changes did have a positive effect.  This suggests the
problem is related to the OS not being activated fast enough by NIC
interrupts.
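
To see how the NIC interrupts fire and which CPUs they land on, you
can watch /proc/interrupts (this assumes the driver labels its IRQ
lines with the interface name):

 watch -d -n1 'grep eth4 /proc/interrupts'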

 
> 64 bytes from x.x.x.x: icmp_seq=39 ttl=64 time=0.172 ms
> 64 bytes from x.x.x.x: icmp_seq=40 ttl=64 time=0.414 ms
> 64 bytes from x.x.x.x: icmp_seq=41 ttl=64 time=0.183 ms
> 64 bytes from x.x.x.x: icmp_seq=42 ttl=64 time=1.41 ms
> 64 bytes from x.x.x.x: icmp_seq=43 ttl=64 time=0.172 ms
> 64 bytes from x.x.x.x: icmp_seq=44 ttl=64 time=0.228 ms
> 64 bytes from x.x.x.x: icmp_seq=46 ttl=64 time=0.120 ms
> 64 bytes from x.x.x.x: icmp_seq=47 ttl=64 time=1.47 ms
> 64 bytes from x.x.x.x: icmp_seq=48 ttl=64 time=0.162 ms
> 64 bytes from x.x.x.x: icmp_seq=49 ttl=64 time=0.160 ms
> 64 bytes from x.x.x.x: icmp_seq=50 ttl=64 time=0.158 ms
> 64 bytes from x.x.x.x: icmp_seq=51 ttl=64 time=0.113 ms

Can you test whether disabling TSO, GRO and GSO makes a difference?

 ethtool -K eth4 gso off gro off tso off
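
To confirm the offloads are really off, and to revert afterwards (eth4
assumed as before):

 ethtool -k eth4 | egrep 'tcp-segmentation|generic-(segmentation|receive)'
 ethtool -K eth4 gso on gro on tso on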


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


