[Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Thomas Rosenstein
thomas.rosenstein at creamfinance.com
Wed Nov 4 10:23:12 EST 2020
Hi all,
I'm coming from the lartc mailing list, here's the original text:
=====
I have multiple routers which connect to multiple upstream providers. I
have noticed a high latency shift in ICMP (and generally all connections)
if I run b2 upload-file --threads 40, and I can reproduce this.
What options do I have to analyze why this happens?
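For reproduction I do roughly the following; the neighbour address, bucket
and file names here are placeholders, not the real values:

# terminal 1: watch latency to a neighbouring router while the upload runs
ping <neighbour-router>

# terminal 2: start the upload that triggers the latency shift
b2 upload-file --threads 40 <bucket> <local-file> <remote-name>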
General Info:
Routers are connected to each other with 10G Mellanox Connect-X cards via
10G SFP+ DAC cables through a 10G switch from fs.com
Latency is generally around 0.18 ms between all four routers.
Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
2 of the 4 routers are connected upstream with a 1G connection (separate
port, same network card)
All routers have the full internet routing tables, i.e. 80k entries for
IPv6 and 830k entries for IPv4
Conntrack is disabled (-j NOTRACK)
Kernel 5.4.60 (custom)
2x Xeon X5670 @ 2.93 GHz
96 GB RAM
No Swap
CentOS 7
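The baseline numbers above come from straightforward tests along these
lines (the neighbour address is a placeholder):

# latency between directly connected routers (~0.18 ms)
ping -c 100 <neighbour-router>

# throughput between routers (~9.4 Gbit/s, 0 retransmissions)
iperf3 -c <neighbour-router> -t 30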
During high latency:
Latency on the routers carrying the traffic flow increases to 12 - 20 ms
on all interfaces; moving the stream (by disabling the BGP session) also
moves the high latency.
iperf3 performance plummets to 300 - 400 Mbit/s.
CPU load (user / system) is around 0.1%.
RAM usage is around 3 - 4 GB.
The if_packets count is stable (around 8000 pkt/s more).
For b2 upload-file with 10 threads I can achieve 60 MB/s consistently;
with 40 threads the performance drops to 8 MB/s.
I do not believe that 40 TCP streams should be any problem for a machine
of that size.
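The CPU figure above is an aggregate; to rule out a single core being
saturated by softirq work, I can sample per-CPU utilisation while the
latency is high, e.g. (mpstat comes from the sysstat package):

# per-CPU utilisation incl. %soft (softirq time), one-second samples
mpstat -P ALL 1

# softirq counts per CPU, highlighting changes
watch -d cat /proc/softirqs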
Thanks for any ideas, help, pointers, or additional things I can verify /
check / provide!
=======
So far I have tested:
1) Using the stock kernel 3.10.0-541 -> the issue does not happen
2) Set up fq_codel on the interfaces (see the sketch of the tc invocation
after the output below):
Here is the tc -s qdisc output:
qdisc fq_codel 8005: dev eth4 root refcnt 193 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 8374229144 bytes 10936167 pkt (dropped 0, overlimits 0 requeues 6127)
 backlog 0b 0p requeues 6127
 maxpacket 25398 drop_overlimit 0 new_flow_count 15441 ecn_mark 0 new_flows_len 0 old_flows_len 0
qdisc fq_codel 8008: dev eth5 root refcnt 193 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1072480080 bytes 1012973 pkt (dropped 0, overlimits 0 requeues 735)
 backlog 0b 0p requeues 735
 maxpacket 19682 drop_overlimit 0 new_flow_count 15963 ecn_mark 0 new_flows_len 0 old_flows_len 0
qdisc fq_codel 8004: dev eth4.2300 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 8441021899 bytes 11021070 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 maxpacket 68130 drop_overlimit 0 new_flow_count 257055 ecn_mark 0 new_flows_len 0 old_flows_len 0
qdisc fq_codel 8006: dev eth5.2501 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 571984459 bytes 2148377 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 maxpacket 7570 drop_overlimit 0 new_flow_count 11300 ecn_mark 0 new_flows_len 0 old_flows_len 0
qdisc fq_codel 8007: dev eth5.2502 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1401322222 bytes 1966724 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 maxpacket 19682 drop_overlimit 0 new_flow_count 76653 ecn_mark 0 new_flows_len 0 old_flows_len 0
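The qdiscs above use the fq_codel defaults and were attached along these
lines (a sketch of the invocation, repeated per physical and VLAN
interface; the exact commands may have differed):

# replace the root qdisc with fq_codel using default parameters
tc qdisc replace dev eth4 root fq_codel
tc qdisc replace dev eth5 root fq_codel
tc qdisc replace dev eth4.2300 root fq_codel
tc qdisc replace dev eth5.2501 root fq_codel
tc qdisc replace dev eth5.2502 root fq_codel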
I have no statistics / metrics that would point to a slowdown on the
server; CPU / load / network / packets / memory all show normal, very low
load.
Are there other (hidden) metrics I can collect to analyze this issue
further? A few candidates I can already think of are sketched below.
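Lower-level counters that might show something while the latency is high
(the exact NIC statistic names depend on the driver):

# per-queue / driver statistics of the Mellanox NICs
ethtool -S eth4

# drops and time squeezes in the softirq receive path, one line per CPU
cat /proc/net/softnet_stat

# where the kernel is actually spending cycles
perf top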
Thanks
Thomas