From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 9 Nov 2020 09:24:28 +0100
From: Jesper Dangaard Brouer
To: Thomas Rosenstein via Bloat
Cc: brouer@redhat.com, Thomas Rosenstein, "Jan Ceuleers", Saeed Mahameed, Tariq Toukan, kheib@redhat.com
Message-ID: <20201109092428.293104ea@carbon>
In-Reply-To: <12D28386-7C00-4A31-91E4-37083C1674F9@creamfinance.com>
References: <87imalumps.fsf@toke.dk> <871rh8vf1p.fsf@toke.dk>
 <81ED2A33-D366-42FC-9344-985FEE8F11BA@creamfinance.com>
 <87sg9ot5f1.fsf@toke.dk>
 <20201105143317.78276bbc@carbon>
 <11812D44-BD46-4CA4-BA39-6080BD88F163@creamfinance.com>
 <20201106121840.7959ae4b@carbon>
 <87blgaso84.fsf@toke.dk>
 <20201106135358.09f6c281@carbon>
 <20201106151324.5f506574@carbon>
 <1E70B6D2-1212-43FA-989A-03B657EEE2F2@creamfinance.com>
 <20201106211940.4c30ccc9@carbon>
 <6963be0e-3eb5-5875-b53c-66033f50dc2d@gmail.com>
 <12D28386-7C00-4A31-91E4-37083C1674F9@creamfinance.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
List-Id: General list for discussing Bufferbloat

On Sat, 07 Nov 2020 14:00:04 +0100
Thomas Rosenstein via Bloat wrote:

> Here's an extract from the ethtool https://pastebin.com/cabpWGFz just in
> case there's something hidden.

Yes, there is something hiding in the data from ethtool_stats.pl[1]:
(10G Mellanox ConnectX cards via 10G SFP+ DAC)

 stat:            1 (          1) <= outbound_pci_stalled_wr_events /sec
 stat:    339731557 (339,731,557) <= rx_buffer_passed_thres_phy /sec

I've not seen this counter 'rx_buffer_passed_thres_phy' before; looking
in the kernel driver code, it is related to "rx_buffer_almost_full".
The number per second is excessive (but it may be related to a driver
bug, as it ends up reading "high" -> rx_buffer_almost_full_high in the
extended counters).

 stat:     29583661 ( 29,583,661) <= rx_bytes /sec
 stat:     30343677 ( 30,343,677) <= rx_bytes_phy /sec

You are receiving at 236 Mbit/s on a 10 Gbit/s link.
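
As a rough sketch of where the 236 Mbit/s figure comes from, the rx_bytes
rate can be converted to Mbit/s and compared with the link speed. The helper
name and the constant below are made up for illustration; only the counter
value and the 10 Gbit/s link speed come from this mail.

```python
# Sketch: convert a bytes/sec counter into Mbit/s and link utilisation.
# LINK_MBIT and bytes_per_sec_to_mbit() are illustrative names, not part
# of ethtool_stats.pl.
LINK_MBIT = 10_000  # 10 Gbit/s link, as stated in the mail


def bytes_per_sec_to_mbit(bytes_per_sec: int) -> float:
    """Return the rate in Mbit/s (decimal megabits, 8 bits per byte)."""
    return bytes_per_sec * 8 / 1_000_000


rx_bytes = 29_583_661  # rx_bytes /sec from the stats above
rx_mbit = bytes_per_sec_to_mbit(rx_bytes)
utilisation = rx_mbit / LINK_MBIT * 100

print(f"{rx_mbit:.1f} Mbit/s, {utilisation:.2f}% of the link")
```

The result is a little under 237 Mbit/s, i.e. only a few percent of the
link capacity.
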
There is a difference between what the OS sees (rx_bytes) and what the
NIC hardware sees (rx_bytes_phy) (diff approx. 6 Mbit/s).

 stat:        19552 (     19,552) <= rx_packets /sec
 stat:        19950 (     19,950) <= rx_packets_phy /sec

The RX packet counters above also indicate that the HW is seeing more
packets than the OS is receiving.

The next counters are likely your problem:

 stat:          718 (        718) <= tx_global_pause /sec
 stat:       954035 (    954,035) <= tx_global_pause_duration /sec
 stat:          714 (        714) <= tx_pause_ctrl_phy /sec

It looks like you have enabled Ethernet Flow-Control, and something is
causing pause frames to be generated. It seems strange that this happens
on a 10 Gbit/s link carrying only 236 Mbit/s.

The TX byte counters are also very strange:

 stat:        26063 (     26,063) <= tx_bytes /sec
 stat:        71950 (     71,950) <= tx_bytes_phy /sec

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[1] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

Strange size distribution:

 stat:        19922 (     19,922) <= rx_1519_to_2047_bytes_phy /sec
 stat:           14 (         14) <= rx_65_to_127_bytes_phy /sec
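
The OS-versus-PHY gap and the size distribution above can be cross-checked
with simple arithmetic. This is only a sketch: the dictionary and variable
names are invented for illustration, and the values are the per-second
rates quoted in this mail.

```python
# Sketch: compare OS-level and PHY-level RX counters from the mail.
# The dict layout and names are illustrative, not ethtool_stats.pl output.
counters = {
    "rx_bytes": 29_583_661,
    "rx_bytes_phy": 30_343_677,
    "rx_packets": 19_552,
    "rx_packets_phy": 19_950,
}

# Bytes and packets per second seen by the NIC but not by the OS.
byte_gap = counters["rx_bytes_phy"] - counters["rx_bytes"]
packet_gap = counters["rx_packets_phy"] - counters["rx_packets"]
gap_mbit = byte_gap * 8 / 1_000_000

# Average size of a packet as counted at the PHY level.
avg_phy_size = counters["rx_bytes_phy"] / counters["rx_packets_phy"]

print(f"byte gap:   {byte_gap} B/s ({gap_mbit:.2f} Mbit/s)")
print(f"packet gap: {packet_gap} pkt/s")
print(f"average PHY packet size: {avg_phy_size:.0f} bytes")
```

The byte gap works out to roughly 6 Mbit/s, matching the estimate above, and
the average PHY packet size lands inside the 1519-to-2047 byte bucket, which
agrees with nearly all packets being counted in rx_1519_to_2047_bytes_phy.
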