From: "Thomas Rosenstein"
To: "Jesper Dangaard Brouer"
Cc: Bufferbloat, "Saeed Mahameed", "Tariq Toukan"
Date: Mon, 16 Nov 2020 13:05:23 +0100
In-Reply-To: <20201116125640.0840429b@carbon>
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

On 16 Nov 2020, at 12:56, Jesper Dangaard Brouer wrote:

> On Fri, 13 Nov 2020 07:31:26 +0100
> "Thomas Rosenstein" wrote:
>
>> On 12 Nov 2020, at 16:42, Jesper Dangaard Brouer wrote:
>>
>>> On Thu, 12 Nov 2020 14:42:59 +0100
>>> "Thomas Rosenstein" wrote:
>>>
>>>>> Notice "Adaptive" setting is on.
>>>>> My long-shot theory(2) is that this
>>>>> adaptive algorithm in the driver code can guess wrong (due to not
>>>>> taking TSO into account) and cause issues for
>>>>>
>>>>> Try to turn this adaptive algorithm off:
>>>>>
>>>>>   ethtool -C eth4 adaptive-rx off adaptive-tx off
>>>>>
>>> [...]
>>>>>>
>>>>>> rx-usecs: 32
>>>>>
>>>>> When you run off "adaptive-rx" you will get 31250 interrupts/sec
>>>>> (calc 1/(32/10^6) = 31250).
>>>>>
>>>>>> rx-frames: 64
>>> [...]
>>>>>> tx-usecs-irq: 0
>>>>>> tx-frames-irq: 0
>>>>>>
>>>>> [...]
>>>>
>>>> I have now updated the settings to:
>>>>
>>>> ethtool -c eth4
>>>> Coalesce parameters for eth4:
>>>> Adaptive RX: off TX: off
>>>> stats-block-usecs: 0
>>>> sample-interval: 0
>>>> pkt-rate-low: 0
>>>> pkt-rate-high: 0
>>>>
>>>> rx-usecs: 0
>>>
>>> Please put a value in rx-usecs, like 20 or 10.
>>> The value 0 is often used to signal driver to do adaptive.
>>
>> Ok, put it now to 10.
>
> Setting it to 10 is a little aggressive, as you ask it to generate
> 100,000 interrupts per sec. (Watch with 'vmstat 1' to see it.)
>
> 1/(10/10^6) = 100000 interrupts/sec
>
>> Goes a bit quicker (transfer up to 26 MB/s), but discards and pci
>> stalls are still there.
>
> Why are you measuring in (26) MBytes/sec ? (equal 208 Mbit/s)

yep 208 MBits

>
> If you still have ethtool PHY-discards, then you still have a problem.
>
>> Ping times are noticable improved:
>
> Okay so this means these changes did have a positive effect. So, this
> can be related to OS is not getting activated fast-enough by NIC
> interrupts.
>
>
>> 64 bytes from x.x.x.x: icmp_seq=39 ttl=64 time=0.172 ms
>> 64 bytes from x.x.x.x: icmp_seq=40 ttl=64 time=0.414 ms
>> 64 bytes from x.x.x.x: icmp_seq=41 ttl=64 time=0.183 ms
>> 64 bytes from x.x.x.x: icmp_seq=42 ttl=64 time=1.41 ms
>> 64 bytes from x.x.x.x: icmp_seq=43 ttl=64 time=0.172 ms
>> 64 bytes from x.x.x.x: icmp_seq=44 ttl=64 time=0.228 ms
>> 64 bytes from x.x.x.x: icmp_seq=46 ttl=64 time=0.120 ms
>> 64 bytes from x.x.x.x: icmp_seq=47 ttl=64 time=1.47 ms
>> 64 bytes from x.x.x.x: icmp_seq=48 ttl=64 time=0.162 ms
>> 64 bytes from x.x.x.x: icmp_seq=49 ttl=64 time=0.160 ms
>> 64 bytes from x.x.x.x: icmp_seq=50 ttl=64 time=0.158 ms
>> 64 bytes from x.x.x.x: icmp_seq=51 ttl=64 time=0.113 ms
>
> Can you try to test if disabling TSO, GRO and GSO makes a difference?
>
>   ethtool -K eth4 gso off gro off tso off
>

I had a call yesterday with Mellanox and we added the following boot
options:

intel_idle.max_cstate=0 processor.max_cstate=1 idle=poll

This completely solved the problem, but the machine now runs like a
heater and draws nearly 2x the watts at the outlet.

I had no discards, super pings during transfer (< 0.100 ms), no outliers,
and good transfer rates > 50 MB/s.

So it seems to be related to C-state management in newer kernel versions
being too aggressive.
I would like to try to tune here a bit; maybe we can get some input on
which knobs to turn?

I will read here:
https://www.kernel.org/doc/html/latest/admin-guide/pm/cpuidle.html#idle-states-representation

and related docs. I think there will be a few helpful hints.

>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
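
For reference, the interrupt-rate and throughput figures quoted in this thread can be re-derived with a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration and do not come from ethtool or any driver. It assumes the NIC raises at most one RX interrupt per `rx-usecs` interval, which gives an upper bound and not a measured rate.

```python
def interrupts_per_second(rx_usecs: float) -> float:
    """Upper bound on the interrupt rate for a given rx-usecs value.

    Hypothetical helper: assumes at most one interrupt per rx_usecs
    microseconds, i.e. 1 / (rx_usecs / 10^6).
    """
    if rx_usecs <= 0:
        # As noted in the thread, 0 is often used to tell the driver
        # to pick the value itself, so no fixed rate can be computed.
        raise ValueError("rx-usecs must be positive")
    return 1_000_000 / rx_usecs


def mbytes_to_mbits(mbytes_per_sec: float) -> float:
    """Convert a transfer rate from MBytes/sec to Mbit/s (8 bits per byte)."""
    return mbytes_per_sec * 8


if __name__ == "__main__":
    for usecs in (32, 10):
        rate = interrupts_per_second(usecs)
        print(f"rx-usecs={usecs}: up to {rate:.0f} interrupts/sec")
    print(f"26 MB/s = {mbytes_to_mbits(26):.0f} Mbit/s")
```

With rx-usecs at 32 and 10 this reproduces the 31250 and 100000 interrupts/sec mentioned above, and 26 MB/s comes out as 208 Mbit/s. The actual rate is what 'vmstat 1' reports on the running system.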