From: "Thomas Rosenstein"
To: "Jesper Dangaard Brouer"
Cc: "Thomas Rosenstein via Bloat"
Date: Mon, 09 Nov 2020 11:09:33 +0100
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60

On 9 Nov 2020, at 9:24, Jesper Dangaard Brouer wrote:

> On Sat, 07 Nov 2020 14:00:04 +0100
> Thomas Rosenstein via Bloat wrote:
>
>> Here's an extract from the ethtool https://pastebin.com/cabpWGFz just in
>> case there's something hidden.
>
> Yes, there is something hiding in the data from ethtool_stats.pl[1]:
> (10G Mellanox Connect-X cards via 10G SFP+ DAC)
>
>  stat:         1 (          1) <= outbound_pci_stalled_wr_events /sec
>  stat: 339731557 (339,731,557) <= rx_buffer_passed_thres_phy /sec
>
> I've not seen this counter 'rx_buffer_passed_thres_phy' before; looking
> in the kernel driver code, it is related to "rx_buffer_almost_full".
> The number per second is excessive (but it may be related to a driver bug,
> as it ends up reading "high" -> rx_buffer_almost_full_high in the
> extended counters).
>
>  stat: 29583661 ( 29,583,661) <= rx_bytes /sec
>  stat: 30343677 ( 30,343,677) <= rx_bytes_phy /sec
>
> You are receiving at 236 Mbit/s on a 10 Gbit/s link. There is a
> difference between what the OS sees (rx_bytes) and what the NIC
> hardware sees (rx_bytes_phy) (diff approx. 6 Mbit/s).
>
>  stat: 19552 ( 19,552) <= rx_packets /sec
>  stat: 19950 ( 19,950) <= rx_packets_phy /sec

Could these packets be from VLAN interfaces that are not used in the OS?

>
> The RX packet counters above also indicate that the HW is seeing more
> packets than the OS is receiving.
>
> The next counters are likely your problem:
>
>  stat:    718 (    718) <= tx_global_pause /sec
>  stat: 954035 (954,035) <= tx_global_pause_duration /sec
>  stat:    714 (    714) <= tx_pause_ctrl_phy /sec

As far as I can see, those are TX counters only, and we are only doing RX
on this interface, so maybe that's irrelevant?

>
> It looks like you have enabled Ethernet Flow-Control, and something is
> causing pause frames to be generated. It seems strange that this happens
> on a 10 Gbit/s link with only 236 Mbit/s.
>
> The TX byte counters are also very strange:
>
>  stat: 26063 ( 26,063) <= tx_bytes /sec
>  stat: 71950 ( 71,950) <= tx_bytes_phy /sec

Again, this is TX, and we are only doing RX on this interface. As I
mentioned earlier in the thread, the routing is asymmetric, so the return
(TX) traffic comes back via another router.

>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>
> [1] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>
> Strange size distribution:
>  stat: 19922 ( 19,922) <= rx_1519_to_2047_bytes_phy /sec
>  stat:    14 (     14) <= rx_65_to_127_bytes_phy /sec
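
For anyone following along, the rates quoted above can be re-derived from the per-second counters. This is only a minimal sketch: the counter values are copied from the quoted ethtool_stats.pl output, while the `STATS` dict and the `to_mbit` / `summarize` helpers are hypothetical names made up for illustration and are not part of ethtool_stats.pl.

```python
# Per-second counter values copied from the quoted ethtool_stats.pl output.
# The dict layout and helper names are illustrative assumptions.
STATS = {
    "rx_bytes": 29_583_661,       # bytes/sec as seen by the OS
    "rx_bytes_phy": 30_343_677,   # bytes/sec as seen by the NIC hardware
    "rx_packets": 19_552,         # packets/sec as seen by the OS
    "rx_packets_phy": 19_950,     # packets/sec as seen by the NIC hardware
}


def to_mbit(bytes_per_sec: float) -> float:
    """Convert bytes/sec to Mbit/s (decimal megabits)."""
    return bytes_per_sec * 8 / 1_000_000


def summarize(stats: dict) -> dict:
    """Derive receive rates and the OS-vs-hardware gaps from the counters."""
    byte_gap = stats["rx_bytes_phy"] - stats["rx_bytes"]
    return {
        "os_mbit": to_mbit(stats["rx_bytes"]),
        "phy_mbit": to_mbit(stats["rx_bytes_phy"]),
        "gap_mbit": to_mbit(byte_gap),
        "packet_gap": stats["rx_packets_phy"] - stats["rx_packets"],
        # Mean size of a frame as counted by the hardware.
        "mean_phy_frame_bytes": stats["rx_bytes_phy"] / stats["rx_packets_phy"],
    }


if __name__ == "__main__":
    for name, value in summarize(STATS).items():
        print(f"{name:>22}: {value:,.2f}")
```

With these inputs the OS-level rate works out to about 236.7 Mbit/s, the gap to the hardware counters to about 6.1 Mbit/s and 398 packets/sec, and the mean hardware-counted frame to roughly 1521 bytes, which lands in the rx_1519_to_2047_bytes_phy bucket from the size distribution.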