From mboxrd@z Thu Jan 1 00:00:00 1970
From: Toke Høiland-Jørgensen
To: Jonathan Morton, dave.collier-brown@indexexchange.com
Cc: bloat
Subject: Re: [Bloat] What's a good non-intrusive way to look at bloat (and perhaps things like gout (:-))
Date: Thu, 04 Jun 2020 12:56:48 +0200
Message-ID: <87k10njdzj.fsf@toke.dk>

Jonathan Morton writes:

>> On 4 Jun, 2020, at 1:21 am, Dave Collier-Brown wrote:
>>
>> We've good tools to measure network performance under stress, by the
>> simple expedient of stressing it, but is there a good approach I
>> could recommend to my company to monitor a bunch of reasonably modern
>> links, without the measurement significantly affecting their state?
>>
>> I don't mind increasing bandwidth usage, but I'm downright grumpy
>> about adding to the service time: I have a transaction that times out
>> for gross slowness if it takes much more than a tenth of a second,
>> and it involves a scatter-gather interaction with at least 10
>> customers in that time.
>>
>> I'm topically interested in bloat, but really we should understand
>> "everything" about our links. If they can get the bloats like cattle,
>> they can probably get the gout, like King Henry the Eighth (;-))
>>
>> My platform is CentOS 8, and I have lots of Smarter Colleagues to
>> help.
>
> My first advice would be to browse pollere.net for tools - like pping
> (passive ping), which monitors the latency of flows in transit. That
> should give you some interesting information without adding any load
> at all. There is also connmon (https://github.com/pollere/connmon).

Ah, good idea, totally forgot about Kathy's tools! :)

I figure one could probably implement something like connmon in eBPF (as
an XDP or TC hook program) and have it run as an always-on monitor with
fairly low overhead. Dave, if you have development resources to throw at
this, I'll be happy to help with pointers on how to get the eBPF bits
working. I believe CentOS 8.2+ should have the needed kernel support...

Of course, you could also just use the connmon utility as-is if you have
CPU cycles to spare for the extra overhead (it looks like it's using
libpcap to capture the packets and process them in userspace).

-Toke
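
For anyone wondering how latency can be measured without injecting any
traffic, here is a rough sketch of the general idea behind passive RTT
measurement, assuming the flows carry the TCP timestamp option (TSval and
TSecr). To my understanding this is the approach pping takes, but the code
below is not taken from pping or connmon: the Pkt record and the
passive_rtts function are made-up names, and the packets are assumed to
have been captured and parsed already.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Tuple


class Pkt(NamedTuple):
    """Hypothetical, already-parsed packet record (not a pping structure)."""
    t: float    # capture time at the monitoring point, in seconds
    src: str    # sender, e.g. "ip:port"
    dst: str    # receiver, e.g. "ip:port"
    tsval: int  # TCP timestamp option: value set by the sender
    tsecr: int  # TCP timestamp option: value echoed back from the peer


def passive_rtts(packets: Iterable[Pkt]) -> Iterator[Tuple[str, str, float]]:
    """Yield (from, to, rtt) samples: capture point -> 'to' host -> capture point."""
    # (src, dst, tsval) -> time first seen, or None once it produced a sample.
    # A long-running monitor would also need to expire old entries.
    first_seen: Dict[Tuple[str, str, int], Optional[float]] = {}
    for p in packets:
        # Remember only the FIRST time this TSval passed in this direction.
        first_seen.setdefault((p.src, p.dst, p.tsval), p.t)
        # Does this packet echo a TSval we saw going the other way?
        key = (p.dst, p.src, p.tsecr)
        sent = first_seen.get(key)
        if sent is not None:
            first_seen[key] = None  # use each TSval for one sample only
            yield p.dst, p.src, p.t - sent


trace = [
    Pkt(0.000, "a:1", "b:2", tsval=100, tsecr=0),
    Pkt(0.030, "b:2", "a:1", tsval=500, tsecr=100),  # echoes 100
    Pkt(0.035, "a:1", "b:2", tsval=101, tsecr=500),  # echoes 500
    Pkt(0.040, "b:2", "a:1", tsval=500, tsecr=100),  # repeat echo, ignored
]
samples = list(passive_rtts(trace))
for frm, to, rtt in samples:
    print(f"{frm} -> {to}: {rtt * 1000:.1f} ms")
```

Each sample is only the round trip between the capture point and one end of
the flow, so a monitor sitting mid-path sees the two halves separately. An
eBPF version would do the same bookkeeping in a map in the kernel instead
of in a userspace dictionary.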