From: "Thomas Rosenstein"
To: "Jesper Dangaard Brouer"
Cc: "Thomas Rosenstein via Bloat"
Date: Mon, 09 Nov 2020 13:25:22 +0100
Subject: Re: [Bloat] Router congestion, slow ping/ack times with kernel 5.4.60
Message-ID: <4EC64BB9-985A-40FA-B454-28B88A0A1E8E@creamfinance.com>
In-Reply-To: <20201109124030.71216677@carbon>
List-Id: General list for discussing Bufferbloat

On 9 Nov 2020, at 12:40, Jesper Dangaard Brouer wrote:

> On Mon, 09 Nov 2020 11:09:33 +0100
> "Thomas Rosenstein" wrote:
>
>> On 9 Nov 2020, at 9:24, Jesper Dangaard Brouer wrote:
>>
>>> On Sat, 07 Nov 2020 14:00:04 +0100
>>> Thomas Rosenstein via Bloat wrote:
>>>
>>>> Here's an extract from the ethtool https://pastebin.com/cabpWGFz just in
>>>> case there's something hidden.
>>>
>>> Yes, there is something hiding in the data from ethtool_stats.pl[1]:
>>> (10G Mellanox Connect-X cards via 10G SPF+ DAC)
>>>
>>> stat:            1 (          1) <= outbound_pci_stalled_wr_events /sec
>>> stat:    339731557 (339,731,557) <= rx_buffer_passed_thres_phy /sec
>>>
>>> I've not seen this counter 'rx_buffer_passed_thres_phy' before, looking
>>> in the kernel driver code it is related to "rx_buffer_almost_full".
>>> The numbers per second is excessive (but it be related to a driver bug
>>> as it ends up reading "high" -> rx_buffer_almost_full_high in the
>>> extended counters).
>
> Notice this indication is a strong red-flag that something is wrong.
>
>
> Okay, but as this is a router you also need to transmit this
> (asymmetric) traffic out another interface right.

The asymmetric traffic comes back via another router: this is router-02,
and traffic from the internet comes back on router-01. I also added the
interface names. See the updated diagram:
https://drive.google.com/file/d/15oAsxiNfsbjB9a855Q_dh6YvFZBDdY5I/view?usp=sharing

>
> Could you also provide ethtool_stats for the TX interface?
>
> Notice that the tool[1] ethtool_stats.pl support monitoring several
> interfaces at the same time, e.g. run:
>
>  ethtool_stats.pl --sec 3 --dev eth4 --dev ethTX
>
> And provide output as pastebin.

I have disabled pause control, as Toke suggested, via:

    ethtool -A eth4 autoneg off rx off tx off
    ethtool -A eth5 autoneg off rx off tx off

Afterwards I captured an ethtool output, first "without" traffic for a few
seconds, then with the problematic flow.
Since the output is larger than 512 KB, I had to upload it to Google Drive:
https://drive.google.com/file/d/1EVKt1LseaBuD40QE-SqFvqYSeWUEcGA_/view?usp=sharing

>
>
>>> [1]
>>> https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>>>
>>> Strange size distribution:
>>> stat:        19922 (     19,922) <= rx_1519_to_2047_bytes_phy /sec
>>> stat:           14 (         14) <= rx_65_to_127_bytes_phy /sec

I assume that's because of the VLAN tagging, which gives 1522 bytes per
packet with an MTU of 1500?

>>
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
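
For reference, the frame-size arithmetic behind the VLAN assumption above can be sketched as follows. The header sizes are the standard Ethernet values (14-byte header, 4-byte 802.1Q tag, 4-byte FCS). The helper names and the bucket list are hypothetical, made up for illustration: only the 65-127 and 1519-2047 buckets are confirmed by counter names quoted in this thread, and whether the `_phy` counters count the FCS is an assumption here.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
VLAN_TAG = 4      # one 802.1Q tag
FCS = 4           # frame check sequence (assumed to be counted by the NIC)

# Assumed RMON-style histogram boundaries. Only the 65-127 and 1519-2047
# buckets appear in the quoted ethtool output; the rest are illustrative.
BUCKETS = [(65, 127), (128, 255), (256, 511),
           (512, 1023), (1024, 1518), (1519, 2047)]


def frame_size(mtu, vlan_tags=0):
    """On-wire frame size in bytes for a full-MTU packet."""
    return mtu + ETH_HEADER + vlan_tags * VLAN_TAG + FCS


def bucket_name(size):
    """Name of the (assumed) size bucket a frame of `size` bytes lands in."""
    for lo, hi in BUCKETS:
        if lo <= size <= hi:
            return f"rx_{lo}_to_{hi}_bytes_phy"
    return None  # outside the buckets listed above


if __name__ == "__main__":
    for tags in (0, 1):
        size = frame_size(1500, tags)
        print(f"vlan_tags={tags} size={size} bucket={bucket_name(size)}")
```

Under these assumptions, a full-size untagged frame (1518 bytes) stays in the 1024-1518 bucket, while a single VLAN tag pushes it to 1522 bytes and into the 1519-2047 bucket, which would be consistent with the distribution quoted above.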