From: "Thomas Rosenstein"
To: "Jesper Dangaard Brouer"
Cc: Bloat, "Toke Høiland-Jørgensen"
Date: Fri, 06 Nov 2020 12:37:49 +0100
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

On 6 Nov 2020, at 12:18, Jesper Dangaard Brouer wrote:

> On Fri, 06 Nov 2020 10:18:10 +0100
> "Thomas Rosenstein" wrote:
>
>>>> I just tested 5.9.4; it seems to also fix it partly. I have long
>>>> stretches where it looks good, and then some increases again. (3.10
>>>> stock has them too, but not as high, rather 1-3 ms.)
>
> That you have long stretches where latency looks good is interesting
> information. My theory is that your system has a periodic userspace
> process that does a kernel syscall that takes too long, blocking the
> network card from processing packets. (Note it can also be a kernel
> thread.)
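One rough way to test that theory (just a sketch, not your actual
IRQ-to-softirq latency tool; it assumes bpftrace runs on these kernels,
and uses the fact that NET_RX_SOFTIRQ is vector 3) would be to
histogram the raise-to-entry latency of the NET_RX softirq:

```
# Histogram of NET_RX softirq raise-to-entry latency, per CPU timestamp.
# A long tail here would mean something is blocking softirq processing.
bpftrace -e '
tracepoint:irq:softirq_raise /args->vec == 3/ { @raise[cpu] = nsecs; }
tracepoint:irq:softirq_entry /args->vec == 3 && @raise[cpu]/ {
    @lat_us = hist((nsecs - @raise[cpu]) / 1000);
    delete(@raise[cpu]);
}'
```

Running it while the ping spikes happen and hitting Ctrl-C prints the
histogram; buckets above ~1000 us would line up with the 1-10 ms pings.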
The weird part is: I first updated only router-02 and pinged router-04
(out of the traffic flow), and there I noticed these long stretches of
OK pings. When I then also updated router-03 and router-04, the old
behaviour was more or less back, which confused me.

Could this be related to netlink? I have gobgpd running on these
routers, which injects routes via netlink. But the churn rate during
the tests is very minimal, maybe 30-40 routes every second.

Otherwise we run: salt-minion, collectd, node_exporter, sshd

> Another theory is that the NIC HW does strange things, but it is not
> very likely, e.g. delaying the packets before generating the IRQ
> interrupt, which hides it from my IRQ-to-softirq latency tool.
>
> A question: What traffic control qdisc are you using on your system?

Kernel 4+ uses pfifo, but there are no dropped packets. I have also
tested with fq_codel: same behaviour, and also no weirdness in the
packet queue itself. Kernel 3.10 uses mq, and noqueue for the VLAN
interfaces.

Here's the mail archive link for the question on lartc:
https://www.spinics.net/lists/lartc/msg23774.html

> Have you looked at the obvious case: do any of your qdiscs report a
> large backlog? (during the incidents)

As said above, nothing in the qdiscs and nothing reported (see the
sampling sketch at the end of this mail).

>>>> for example:
>>>>
>>>> 64 bytes from x.x.x.x: icmp_seq=10 ttl=64 time=0.169 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=11 ttl=64 time=5.53 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=12 ttl=64 time=9.44 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=13 ttl=64 time=0.167 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=14 ttl=64 time=3.88 ms
>>>>
>>>> and then again:
>>>>
>>>> 64 bytes from x.x.x.x: icmp_seq=15 ttl=64 time=0.569 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=16 ttl=64 time=0.148 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=17 ttl=64 time=0.286 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=18 ttl=64 time=0.257 ms
>>>> 64 bytes from x.x.x.x: icmp_seq=19 ttl=64 time=0.220 ms
>
> These very low ping times tell me that you are measuring very close to
> the target machine, which is good. Here on the bufferbloat list, we
> are always suspicious of network equipment being used in these kinds
> of setups, as experience tells us that this can be the cause of
> bufferbloat latency.

Yes, I'm just testing across two machines connected directly to the
same switch; basically that's the best-case scenario apart from a
direct connection. I do also use a VLAN on this interface, so the
pings go through the VLAN stack!

> You mention some fs.com switches (your desc below signature), can you
> tell us more?

It's an fs.com N5850-48S6Q: 48 ports of 10 Gbit + 6 ports of 40 Gbit.
Only 6 of the 10 G ports are in use, plus 2 at 1 G; basically no
traffic.

> [...]
>> I have a feeling that maybe not all config options were correctly
>> moved to the newer kernel.
>>
>> Or there's a big bug somewhere ... (which would seem rather weird, for
>> me to be the first one to discover it)
>
> I really appreciate that you report this. This is an intermittent
> issue, which often results in people not reporting it at all.
>
> Even if we find this to be caused by some process running on your
> system, or a bad config, it is really important that we find the
> root cause.
>
>> I'll rebuild the 5.9 kernel on one of the 3.10 machines and see if it
>> makes a difference ...
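For the config-move theory, an easy check (a sketch only; the paths
below are placeholders for wherever the old and new configs actually
live) is the diffconfig helper that ships in the kernel source tree:

```
# Show only the options that differ between the 3.10 and 5.9 configs;
# scripts/diffconfig is part of the kernel source tree.
cd linux-5.9
scripts/diffconfig /boot/config-3.10.0 .config | less
```

Anything NIC-, NAPI-, or timer-related that silently changed defaults
between those trees would be a prime suspect.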
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
>
> On Wed, 04 Nov 2020 16:23:12 +0100
> Thomas Rosenstein via Bloat wrote:
>
>> General info:
>>
>> Routers are connected to each other with 10G Mellanox ConnectX cards
>> via 10G SFP+ DAC cables through a 10G switch from fs.com.
>> Latency is generally around 0.18 ms between all routers (4).
>> Throughput is 9.4 Gbit/s with 0 retransmissions when tested with
>> iperf3.
>> 2 of the 4 routers are connected upstream with a 1G connection
>> (separate port, same network card).
>> All routers have the full internet routing tables, i.e. 80k entries
>> for IPv6 and 830k entries for IPv4.
>> Conntrack is disabled (-j NOTRACK).
>> Kernel 5.4.60 (custom)
>> 2x Xeon X5670 @ 2.93 GHz
>> 96 GB RAM
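PS: the qdisc sampling mentioned above was roughly this (a sketch;
eth0 stands in for the actual uplink/VLAN interface):

```
# Sample qdisc stats every 100 ms while pinging; a backlog or drops
# counter that climbs during a latency spike would implicate the qdisc.
while sleep 0.1; do
    tc -s qdisc show dev eth0 | grep -E 'qdisc|backlog'
done
```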