From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 6 Nov 2020 13:53:58 +0100
From: Jesper Dangaard Brouer
To: Toke Høiland-Jørgensen
Cc: Thomas Rosenstein, Bloat, brouer@redhat.com
Message-ID: <20201106135358.09f6c281@carbon>
In-Reply-To: <87blgaso84.fsf@toke.dk>
References: <87imalumps.fsf@toke.dk> <871rh8vf1p.fsf@toke.dk>
 <81ED2A33-D366-42FC-9344-985FEE8F11BA@creamfinance.com>
 <87sg9ot5f1.fsf@toke.dk> <20201105143317.78276bbc@carbon>
 <11812D44-BD46-4CA4-BA39-6080BD88F163@creamfinance.com>
 <20201106121840.7959ae4b@carbon> <87blgaso84.fsf@toke.dk>
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
List-Id: General list for discussing Bufferbloat
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

On Fri, 06 Nov 2020 12:45:31 +0100
Toke Høiland-Jørgensen wrote:

> "Thomas Rosenstein" writes:
>
> > On 6 Nov 2020, at 12:18, Jesper Dangaard Brouer wrote:
> >
> >> On Fri, 06 Nov 2020 10:18:10 +0100
> >> "Thomas Rosenstein" wrote:
> >>
> >>>>> I just tested 5.9.4 seems to also fix it partly, I have long
> >>>>> stretches where it looks good, and then some increases again. (3.10
> >>>>> Stock has them too, but not so high, rather 1-3 ms)
> >>>>>
> >>
> >> That you have long stretches where latency looks good is interesting
> >> information. My theory is that your system have a periodic userspace
> >> process that does a kernel syscall that takes too long, blocking
> >> network card from processing packets. (Note it can also be a kernel
> >> thread).
> > [...]
> >
> > Could this be related to netlink? I have gobgpd running on these
> > routers, which injects routes via netlink.
> > But the churn rate during the tests is very minimal, maybe 30 - 40
> > routes every second.

Yes, this could be related. The internal data structure for FIB lookups
is a fibtrie, which is a compressed Patricia tree, related to the radix
tree idea.
Thus, I can imagine that the kernel has to rebuild/rebalance the tree
with all these updates.

> >
> > Otherwise we got: salt-minion, collectd, node_exporter, sshd
>
> collectd may be polling the interface stats; try turning that off?

It should be fairly easy for you to test whether any of these services
(except sshd) is causing this, by turning them off one at a time.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
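
PS: For readers unfamiliar with the FIB remark above, here is a minimal
sketch of longest-prefix matching on a trie. It is a plain, uncompressed
binary trie written in Python, not the kernel's compressed structure;
the class name PrefixTrie, the next-hop labels and the example prefixes
are made up purely for illustration.

```python
import ipaddress


class PrefixTrie:
    """Plain (uncompressed) binary trie for IPv4 longest-prefix match.

    Illustration only: the kernel's FIB uses a compressed trie, which
    this sketch does not attempt to reproduce.
    """

    def __init__(self):
        # Each node is [child_for_bit_0, child_for_bit_1, next_hop_or_None].
        self.root = [None, None, None]

    def insert(self, prefix, next_hop):
        """Add a route, walking one node per prefix bit."""
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = next_hop

    def lookup(self, address):
        """Return the next hop of the longest matching prefix, or None."""
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node[2]
        for i in range(32):
            node = node[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]
        return best


# Hypothetical routes, standing in for what a BGP daemon might inject.
fib = PrefixTrie()
fib.insert("0.0.0.0/0", "default-gw")
fib.insert("10.0.0.0/8", "core")
fib.insert("10.1.2.0/24", "edge")
```

Every route injected over netlink ends up as an insert or delete on a
structure of this kind. In a compressed variant, such updates may also
force nodes to be rebuilt or rebalanced, which is the cost I am
speculating about above.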