[Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Jesper Dangaard Brouer
brouer at redhat.com
Mon Nov 16 06:56:40 EST 2020
On Fri, 13 Nov 2020 07:31:26 +0100
"Thomas Rosenstein" <thomas.rosenstein at creamfinance.com> wrote:
> On 12 Nov 2020, at 16:42, Jesper Dangaard Brouer wrote:
>
> > On Thu, 12 Nov 2020 14:42:59 +0100
> > "Thomas Rosenstein" <thomas.rosenstein at creamfinance.com> wrote:
> >
> >>> Notice "Adaptive" setting is on. My long-shot theory(2) is that
> >>> this adaptive algorithm in the driver code can guess wrong (due to
> >>> not taking TSO into account) and cause issues.
> >>>
> >>> Try to turn this adaptive algorithm off:
> >>>
> >>> ethtool -C eth4 adaptive-rx off adaptive-tx off
> >>>
> > [...]
> >>>>
> >>>> rx-usecs: 32
> >>>
> >>> When you turn off "adaptive-rx" you will get 31250 interrupts/sec
> >>> (calc 1/(32/10^6) = 31250).
> >>>
> >>>> rx-frames: 64
> > [...]
> >>>> tx-usecs-irq: 0
> >>>> tx-frames-irq: 0
> >>>>
> >>> [...]
> >>
> >> I have now updated the settings to:
> >>
> >> ethtool -c eth4
> >> Coalesce parameters for eth4:
> >> Adaptive RX: off TX: off
> >> stats-block-usecs: 0
> >> sample-interval: 0
> >> pkt-rate-low: 0
> >> pkt-rate-high: 0
> >>
> >> rx-usecs: 0
> >
> > Please put a value in rx-usecs, like 20 or 10.
> > The value 0 is often used to signal driver to do adaptive.
>
> Ok, put it now to 10.
Setting it to 10 is a little aggressive, as you are asking the NIC to
generate up to 100,000 interrupts per second: 1/(10/10^6) = 100,000.
(Watch the "in" column of 'vmstat 1' to see it.)
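For example, a slightly gentler value halves the interrupt load (a
sketch, assuming eth4 is still the interface under test):

# 20 usecs between IRQs -> at most 1/(20/10^6) = 50,000 interrupts/sec
ethtool -C eth4 rx-usecs 20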
> Goes a bit quicker (transfers up to 26 MB/s), but discards and PCI
> stalls are still there.
Why are you measuring in MBytes/sec? (26 MB/s equals 208 Mbit/s.)
If you still have ethtool PHY-discards, then you still have a problem.
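You can watch those counters directly; the exact counter names vary
per driver, so treat this as a sketch:

# dump NIC hardware stats and filter for discard/drop counters
ethtool -S eth4 | grep -iE 'discard|drop'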
> Ping times are noticeably improved:
Okay, so this means these changes did have a positive effect. This
could be related to the OS not being activated fast enough by the NIC
interrupts.
> 64 bytes from x.x.x.x: icmp_seq=39 ttl=64 time=0.172 ms
> 64 bytes from x.x.x.x: icmp_seq=40 ttl=64 time=0.414 ms
> 64 bytes from x.x.x.x: icmp_seq=41 ttl=64 time=0.183 ms
> 64 bytes from x.x.x.x: icmp_seq=42 ttl=64 time=1.41 ms
> 64 bytes from x.x.x.x: icmp_seq=43 ttl=64 time=0.172 ms
> 64 bytes from x.x.x.x: icmp_seq=44 ttl=64 time=0.228 ms
> 64 bytes from x.x.x.x: icmp_seq=46 ttl=64 time=0.120 ms
> 64 bytes from x.x.x.x: icmp_seq=47 ttl=64 time=1.47 ms
> 64 bytes from x.x.x.x: icmp_seq=48 ttl=64 time=0.162 ms
> 64 bytes from x.x.x.x: icmp_seq=49 ttl=64 time=0.160 ms
> 64 bytes from x.x.x.x: icmp_seq=50 ttl=64 time=0.158 ms
> 64 bytes from x.x.x.x: icmp_seq=51 ttl=64 time=0.113 ms
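The occasional ~1.4 ms outliers are still worth chasing. One way to
see which CPUs service the NIC interrupts, and whether the counts move
at the expected rate (a sketch; the IRQ names in /proc/interrupts
depend on the driver):

# per-CPU interrupt counts for the NIC's RX/TX queues, refreshed live
watch -d 'grep eth4 /proc/interrupts'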
Can you try to test if disabling TSO, GRO and GSO makes a difference?
ethtool -K eth4 gso off gro off tso off
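After toggling those, double-check what the driver actually accepted,
since some offloads are reported as [fixed] and cannot be changed (a
quick sketch):

# lowercase -k lists the current offload settings
ethtool -k eth4 | grep -E 'segmentation|receive-offload'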
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer