From: "Thomas Rosenstein"
To: "Jesper Dangaard Brouer"
Cc: "Thomas Rosenstein via Bloat"
Date: Mon, 09 Nov 2020 17:39:48 +0100
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Message-ID: <81F2B741-3F76-4AE1-84EB-DC65AD2BB798@creamfinance.com>
In-Reply-To: <20201109124030.71216677@carbon>
List-Id: General list for discussing Bufferbloat

On 9 Nov 2020, at 12:40, Jesper Dangaard Brouer wrote:

> On Mon, 09 Nov 2020 11:09:33 +0100
> "Thomas Rosenstein" <thomas.rosenstein@creamfinance.com> wrote:
>
> Could you also provide ethtool_stats for the TX interface?
>
> Notice that the tool[1] ethtool_stats.pl supports monitoring several
> interfaces at the same time, e.g. run:
>
> ethtool_stats.pl --sec 3 --dev eth4 --dev ethTX
>
> And provide output as pastebin.

I have now also repeated the same test with kernel 3.10; here are the ethtool outputs:

https://drive.google.com/file/d/1c98MVV0JYl6Su6xZTpqwS7m-6OlbmAFp/view?usp=sharing

and the ping times:

https://drive.google.com/file/d/1xhbGJHb5jUbPsee4frbx-c-uqh-7orXY/view?usp=sharing

Sadly, the parameters we were looking at are not supported below 4.14, but I immediately saw one thing that is very different:

ethtool --statistics eth4 | grep discards
rx_discards_phy: 0
tx_discards_phy: 0

If we check the ethtool output from 5.9.4, we have:

rx_discards_phy: 151793
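The comparison above could also be scripted instead of eyeballed with grep. The following is only a minimal sketch: it assumes the usual `name: value` layout of `ethtool --statistics` output, and the helper names `parse_ethtool_stats` and `changed_counters` are made up for illustration. The sample text contains only the counters quoted in this mail.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool --statistics <dev>` style text into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so counter names containing ':' survive.
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue  # no "name: value" pair on this line
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            continue  # non-numeric value, e.g. the "NIC statistics:" header
    return stats


def changed_counters(old, new):
    """Return {name: (old_value, new_value)} for counters in both runs that differ."""
    common = sorted(old.keys() & new.keys())
    return {name: (old[name], new[name]) for name in common if old[name] != new[name]}


# Counters quoted in this mail; only rx_discards_phy was quoted for 5.9.4.
KERNEL_3_10 = """\
rx_discards_phy: 0
tx_discards_phy: 0
"""
KERNEL_5_9_4 = """\
rx_discards_phy: 151793
"""

diff = changed_counters(parse_ethtool_stats(KERNEL_3_10),
                        parse_ethtool_stats(KERNEL_5_9_4))
print(diff)  # {'rx_discards_phy': (0, 151793)}
```

Counters that appear in only one of the two outputs are skipped on purpose, since several of the statistics are not exposed by the older kernel.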

Also, the outbound_pci_stalled_wr_events become more frequent the lower the total bandwidth and the higher the ping is.
Logically, something must be blocking the buffers: either they are not getting freed, or they are not rotated correctly, or processing is too slow.
I would rule out processing, simply based on the 0% CPU load and the fact that it doesn't happen in 3.10.
It is also suspicious that the issue only appears after a certain time of activity (maybe total traffic?!)
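To pin down when the issue starts, periodic counter snapshots (for example of rx_discards_phy or outbound_pci_stalled_wr_events) could be turned into per-interval rates and lined up with the ping log. This is only a rough sketch: the function names `counter_rates` and `first_activity` are hypothetical, and the sample values are made-up numbers for illustration, not measurements from this router.

```python
def counter_rates(samples):
    """Convert (timestamp_seconds, counter_value) snapshots into per-interval rates.

    Returns a list of (interval_end_timestamp, events_per_second) tuples.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((t1, (v1 - v0) / dt))
    return rates


def first_activity(samples):
    """Return the end timestamp of the first interval where the counter grew, else None."""
    for timestamp, rate in counter_rates(samples):
        if rate > 0:
            return timestamp
    return None


# Made-up snapshots taken every 3 seconds (illustration only, not real data).
example = [(0, 0), (3, 0), (6, 30), (9, 90)]
print(counter_rates(example))   # [(3, 0.0), (6, 10.0), (9, 20.0)]
print(first_activity(example))  # 6
```

If the first non-zero interval consistently lines up with the moment the ping times start to climb, that would support the idea that the buffers only get stuck after a certain amount of traffic.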
