From: Toke Høiland-Jørgensen
To: Thomas Rosenstein, Jesper Dangaard Brouer
Cc: Bloat
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Date: Fri, 06 Nov 2020 12:45:31 +0100
Message-ID: <87blgaso84.fsf@toke.dk>

"Thomas Rosenstein" writes:

> On 6 Nov 2020, at 12:18, Jesper Dangaard Brouer wrote:
>
>> On Fri, 06 Nov 2020 10:18:10 +0100
>> "Thomas Rosenstein" wrote:
>>
>>>>> I just tested 5.9.4 seems to also fix it partly, I have long
>>>>> stretches where it looks good, and then some increases again. (3.10
>>>>> Stock has them too, but not so high, rather 1-3 ms)
>>>>>
>>
>> That you have long stretches where latency looks good is interesting
>> information. My theory is that your system have a periodic userspace
>> process that does a kernel syscall that takes too long, blocking
>> network card from processing packets. (Note it can also be a kernel
>> thread).
>
> The weird part is, I first only updated router-02 and pinged to
> router-04 (out of traffic flow), there I noticed these long stretches of
> ok ping.
>
> When I updated also router-03 and router-04, the old behaviour kind of
> was back, this confused me.
>
> Could this be related to netlink? I have gobgpd running on these
> routers, which injects routes via netlink.
> But the churn rate during the tests is very minimal, maybe 30 - 40
> routes every second.
>
> Otherwise we got: salt-minion, collectd, node_exporter, sshd

collectd may be polling the interface stats; try turning that off?

>>
>> Another theory is the NIC HW does strange things, but it is not very
>> likely. E.g. delaying the packets before generating the IRQ interrupt,
>> which hide it from my IRQ-to-softirq latency tool.
>>
>> A question: What traffic control qdisc are you using on your system?
>
> kernel 4+ uses pfifo, but there's no dropped packets
> I have also tested with fq_codel, same behaviour and also no weirdness
> in the packets queue itself
>
> kernel 3.10 uses mq, and for the vlan interfaces noqueue

Do you mean that you only have a single pfifo qdisc on kernel 4+? Why is
it not using mq?

Was there anything in the ethtool stats?

-Toke
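
For the qdisc question, one way to check whether an interface has a single root qdisc or an mq root with per-queue children is to parse the text printed by `tc qdisc show dev <iface>`. The following is only a rough sketch: the helper names (`parse_qdiscs`, `root_qdisc`, `show_qdiscs`) are made up for illustration, and it assumes the usual "qdisc <kind> <handle> root|parent ..." line layout rather than anything captured from the routers in this thread.

```python
import re
import subprocess

# One qdisc per line, e.g. "qdisc mq 0: root" or
# "qdisc fq_codel 0: parent :1 limit 10240p ..." (assumed layout).
QDISC_LINE = re.compile(r"^qdisc\s+(?P<kind>\S+)\s+(?P<handle>\S+)\s+(?P<rest>.*)$")


def parse_qdiscs(tc_output):
    """Return (kind, handle, is_root) tuples parsed from `tc qdisc show` text."""
    qdiscs = []
    for line in tc_output.splitlines():
        match = QDISC_LINE.match(line.strip())
        if match is None:
            continue  # skip statistics lines and anything else
        # The first word after the handle is "root" for the root qdisc.
        is_root = match.group("rest").split()[:1] == ["root"]
        qdiscs.append((match.group("kind"), match.group("handle"), is_root))
    return qdiscs


def root_qdisc(tc_output):
    """Return the kind of the root qdisc, or None if no root line was found."""
    for kind, _handle, is_root in parse_qdiscs(tc_output):
        if is_root:
            return kind
    return None


def show_qdiscs(dev):
    """Run `tc qdisc show dev <dev>` and return its stdout (needs iproute2)."""
    result = subprocess.run(
        ["tc", "qdisc", "show", "dev", dev],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Hypothetical usage on a router (the interface name is a placeholder):
#   print(root_qdisc(show_qdiscs("eth0")))
```

If `root_qdisc()` returns something other than `mq` on a multi-queue NIC, all TX queues share that one qdisc, which is the situation being asked about.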