General list for discussing Bufferbloat
* [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
@ 2020-11-04 15:23 Thomas Rosenstein
  2020-11-04 16:10 ` Toke Høiland-Jørgensen
  2020-11-16 12:34 ` Jesper Dangaard Brouer
  0 siblings, 2 replies; 47+ messages in thread
From: Thomas Rosenstein @ 2020-11-04 15:23 UTC
  To: bloat

Hi all,

I'm coming from the LARTC mailing list; here's the original text:

=====

I have multiple routers which connect to multiple upstream providers. I 
have noticed a large latency increase for ICMP (and generally all 
connections) when I run b2 upload-file --threads 40, and I can 
reproduce this.

What options do I have to analyze why this happens?

General Info:

Routers are connected to each other with 10G Mellanox ConnectX 
cards via 10G SFP+ DAC cables via a 10G switch from fs.com
Latency generally is around 0.18 ms between all routers (4).
Throughput is 9.4 Gbit/s with 0 retransmissions when tested with iperf3.
2 of the 4 routers are connected upstream with a 1G connection (separate 
port, same network card)
All routers have the full internet routing tables, i.e. 80k entries for 
IPv6 and 830k entries for IPv4
Conntrack is disabled (-j NOTRACK)
Kernel 5.4.60 (custom)
2x Xeon X5670 @ 2.93 GHz
96 GB RAM
No Swap
CentOS 7

During high latency:

Latency on the routers carrying the traffic flow increases to 12 - 20 ms 
on all interfaces; moving the stream (by disabling the BGP session) also 
moves the high latency
iperf3 performance plummets to 300 - 400 Mbit/s
CPU load (user / system) is around 0.1%
RAM usage is around 3 - 4 GB
if_packets count is stable (around 8000 pkt/s more)


For b2 upload-file with 10 threads I can achieve 60 MB/s consistently; 
with 40 threads the performance drops to 8 MB/s.
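
To put these two rates in perspective, the sketch below converts them to Mbit/s and to a share of the uplink. It assumes that "MB/s" means 10^6 bytes per second and that the upload leaves through one of the 1G upstream ports; neither is stated explicitly above, and the function names are only illustrative.

```python
# Rough conversion of the reported b2 upload rates.
# Assumptions (not stated in the report): MB = 10**6 bytes, uplink = 1 Gbit/s.
UPLINK_BIT_PER_S = 1_000_000_000


def mbyte_to_mbit(mbyte_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mbyte_per_s * 8


def uplink_share(mbyte_per_s: float, link_bit_per_s: int = UPLINK_BIT_PER_S) -> float:
    """Fraction of the link that the given payload rate would occupy."""
    return mbyte_per_s * 8 * 1_000_000 / link_bit_per_s


for threads, rate in ((10, 60), (40, 8)):
    print(f"{threads} threads: {mbyte_to_mbit(rate):.0f} Mbit/s, "
          f"{uplink_share(rate):.1%} of a 1G uplink")
```

Under those assumptions the 10-thread run uses roughly half of a 1G uplink, while the 40-thread run uses well under a tenth of it, so the slowdown is not explained by the link simply being full.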

I do not believe that 40 TCP streams should be any problem for a machine 
of that size.

Thanks for any ideas, help, pointers, or things I can verify / check / 
provide in addition!

=======


So far I have tested:

1) Use Stock Kernel 3.10.0-541 -> issue does not happen
2) Set up fq_codel on the interfaces:

Here is the tc -s qdisc output:

qdisc fq_codel 8005: dev eth4 root refcnt 193 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 8374229144 bytes 10936167 pkt (dropped 0, overlimits 0 requeues 6127)
  backlog 0b 0p requeues 6127
   maxpacket 25398 drop_overlimit 0 new_flow_count 15441 ecn_mark 0
   new_flows_len 0 old_flows_len 0
qdisc fq_codel 8008: dev eth5 root refcnt 193 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 1072480080 bytes 1012973 pkt (dropped 0, overlimits 0 requeues 735)
  backlog 0b 0p requeues 735
   maxpacket 19682 drop_overlimit 0 new_flow_count 15963 ecn_mark 0
   new_flows_len 0 old_flows_len 0
qdisc fq_codel 8004: dev eth4.2300 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 8441021899 bytes 11021070 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
   maxpacket 68130 drop_overlimit 0 new_flow_count 257055 ecn_mark 0
   new_flows_len 0 old_flows_len 0
qdisc fq_codel 8006: dev eth5.2501 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 571984459 bytes 2148377 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
   maxpacket 7570 drop_overlimit 0 new_flow_count 11300 ecn_mark 0
   new_flows_len 0 old_flows_len 0
qdisc fq_codel 8007: dev eth5.2502 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 1401322222 bytes 1966724 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
   maxpacket 19682 drop_overlimit 0 new_flow_count 76653 ecn_mark 0
   new_flows_len 0 old_flows_len 0
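
To compare counters like these across interfaces, or between two runs, the text output can be reduced to one small dictionary per device. The following is a minimal sketch, not part of the original report: the function name `parse_tc` and the choice of fields are illustrative, and the sample is an excerpt of the output above.

```python
import re

# Excerpt of the `tc -s qdisc` output quoted above (two of the five qdiscs).
SAMPLE = """\
qdisc fq_codel 8005: dev eth4 root refcnt 193 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 8374229144 bytes 10936167 pkt (dropped 0, overlimits 0 requeues 6127)
  backlog 0b 0p requeues 6127
   maxpacket 25398 drop_overlimit 0 new_flow_count 15441 ecn_mark 0
   new_flows_len 0 old_flows_len 0
qdisc fq_codel 8004: dev eth4.2300 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
  Sent 8441021899 bytes 11021070 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
   maxpacket 68130 drop_overlimit 0 new_flow_count 257055 ecn_mark 0
   new_flows_len 0 old_flows_len 0
"""

# Counters to extract; each appears in the text as "<name> <integer>".
FIELDS = ("dropped", "overlimits", "requeues",
          "drop_overlimit", "ecn_mark", "maxpacket")


def parse_tc(text):
    """Return {device: {counter: int}} from `tc -s qdisc` text output."""
    stats = {}
    # Start a new section at every line that begins with "qdisc ".
    for section in re.split(r"(?m)^(?=qdisc )", text):
        dev = re.search(r"\bdev (\S+)", section)
        if dev is None:
            continue
        counters = {}
        for name in FIELDS:
            # \s+ also spans a line break, so mail-wrapped output still parses.
            match = re.search(rf"\b{name}\s+(\d+)", section)
            if match:
                counters[name] = int(match.group(1))
        stats[dev.group(1)] = counters
    return stats


for dev, counters in parse_tc(SAMPLE).items():
    print(dev, counters)
```

In the output above, every qdisc shows zero drops and zero ECN marks, so fq_codel itself does not appear to be discarding or marking packets. The maxpacket values well above the 1514-byte quantum are probably offloaded (GSO/GRO) aggregates rather than single frames, though that is an inference and not something the output states.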


I have no statistics / metrics that would point to a slowdown on the 
server; CPU / load / network / packets / memory all show normal, very 
low load.
Are there other (hidden) metrics I can collect to analyze this issue 
further?
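
One candidate for such a metric, offered here only as a suggestion and not as something the thread has confirmed to be relevant, is /proc/net/softnet_stat. It has one row per CPU with hexadecimal columns, of which the first three are packets processed, packets dropped because the backlog queue was full, and "time squeeze" events where the softirq ran out of budget. The sketch below parses two snapshots and prints the difference; the sample values are invented for illustration and the function names are hypothetical.

```python
# Hypothetical snapshots of /proc/net/softnet_stat (values invented for
# illustration). Each row is one CPU; all columns are hexadecimal.
BEFORE = """\
0000a1b2 00000000 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000f00 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
"""
AFTER = """\
0000b1b2 00000000 00000013 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000f80 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
"""

KEYS = ("processed", "dropped", "time_squeeze")


def parse_softnet(text):
    """Return one dict per row with the first three counters as integers."""
    rows = []
    for line in text.splitlines():
        cols = [int(col, 16) for col in line.split()]
        if len(cols) < 3:
            continue  # skip blank or truncated lines
        rows.append(dict(zip(KEYS, cols[:3])))
    return rows


def delta(before, after):
    """Per-row counter increase between two parsed snapshots."""
    return [{k: a[k] - b[k] for k in KEYS} for b, a in zip(before, after)]


for row, change in enumerate(delta(parse_softnet(BEFORE), parse_softnet(AFTER))):
    print(f"row {row}: {change}")
```

On a live router the two snapshots would come from reading /proc/net/softnet_stat before and during the b2 upload. Growing dropped or time_squeeze values on the CPUs that handle the NIC interrupts would point at packet processing in softirq context, which the user / system CPU figures alone may not reveal.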

Thanks
Thomas






Thread overview: 47+ messages
2020-11-04 15:23 [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60 Thomas Rosenstein
2020-11-04 16:10 ` Toke Høiland-Jørgensen
2020-11-04 16:24   ` Thomas Rosenstein
2020-11-05  0:10     ` Toke Høiland-Jørgensen
2020-11-05  8:48       ` Thomas Rosenstein
2020-11-05 11:21         ` Toke Høiland-Jørgensen
2020-11-05 12:22           ` Thomas Rosenstein
2020-11-05 12:38             ` Toke Høiland-Jørgensen
2020-11-05 12:41               ` Thomas Rosenstein
2020-11-05 12:47                 ` Toke Høiland-Jørgensen
2020-11-05 13:33             ` Jesper Dangaard Brouer
2020-11-06  8:48               ` Thomas Rosenstein
2020-11-06 10:53                 ` Jesper Dangaard Brouer
2020-11-06  9:18               ` Thomas Rosenstein
2020-11-06 11:18                 ` Jesper Dangaard Brouer
2020-11-06 11:37                   ` Thomas Rosenstein
2020-11-06 11:45                     ` Toke Høiland-Jørgensen
2020-11-06 12:01                       ` Thomas Rosenstein
2020-11-06 12:53                       ` Jesper Dangaard Brouer
2020-11-06 14:13                         ` Jesper Dangaard Brouer
2020-11-06 17:04                           ` Thomas Rosenstein
2020-11-06 20:19                             ` Jesper Dangaard Brouer
2020-11-07 12:37                               ` Thomas Rosenstein
2020-11-07 12:40                                 ` Jan Ceuleers
2020-11-07 12:43                                   ` Thomas Rosenstein
2020-11-07 13:00                                   ` Thomas Rosenstein
2020-11-09  8:24                                     ` Jesper Dangaard Brouer
2020-11-09 10:09                                       ` Thomas Rosenstein
2020-11-09 11:40                                         ` Jesper Dangaard Brouer
2020-11-09 11:51                                           ` Toke Høiland-Jørgensen
2020-11-09 12:25                                           ` Thomas Rosenstein
2020-11-09 14:33                                           ` Thomas Rosenstein
2020-11-12 10:05                                             ` Jesper Dangaard Brouer
2020-11-12 11:26                                               ` Thomas Rosenstein
2020-11-12 13:31                                                 ` Jesper Dangaard Brouer
2020-11-12 13:42                                                   ` Thomas Rosenstein
2020-11-12 15:42                                                     ` Jesper Dangaard Brouer
2020-11-13  6:31                                                       ` Thomas Rosenstein
2020-11-16 11:56                                                         ` Jesper Dangaard Brouer
2020-11-16 12:05                                                           ` Thomas Rosenstein
2020-11-09 16:39                                           ` Thomas Rosenstein
2020-11-07 13:33                                 ` Thomas Rosenstein
2020-11-07 16:46                                 ` Jesper Dangaard Brouer
2020-11-07 17:01                                   ` Thomas Rosenstein
2020-11-07 17:26                                     ` Sebastian Moeller
2020-11-16 12:34 ` Jesper Dangaard Brouer
2020-11-16 12:49   ` Thomas Rosenstein
