From: David Collier-Brown
Reply-To: davecb@spamcop.net
To: starlink@lists.bufferbloat.net
Subject: Re: [Starlink] Fwd: Microstate Accounting and the Nyquist problem
Date: Fri, 11 Jun 2021 18:14:20 -0400
Message-ID: <01a7bed2-6f49-3d7d-eb5a-209031ee8070@gmail.com>
In-Reply-To: <950B8EAF-90B9-41A6-951D-91821F591D41@teklibre.net>
List-Id: "Starlink has bufferbloat. Bad."

OK, Oh Smarter Colleagues, the challenge to you is to say if there is a "natural" place to capture state changes to get the data we want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave

On 2021-06-09 9:15 a.m., Dave Taht wrote:


Begin forwarded message:

From: David Collier-Brown <davecb.42@gmail.com>
Subject: Microstate Accounting and the Nyquist problem
Date: June 9, 2021 at 4:44:14 AM PDT
To: Dave Taht <davet@teklibre.net>
Cc: Dave Collier-Brown <dave.collier-brown@indexexchange.com>
Reply-To: davecb@spamcop.net

A million years ago (roughly around Solaris 9), Sun was suffering from the same problems in measuring their dispatcher as you are with "sloshing".

A CPU would be 100% busy in one microsecond, 10% busy in the next gazillion, and the average CPU utilization for our sample period would be maybe 10.1%, if the sampler happened to sample right when the spike was happening.

This was utterly useless for things like the fair-share scheduler, so it got fixed in Solaris 10, by having the dispatcher record the time a process (well, kernel thread) had spent in a state when the state changed.

Initially "microstate accounting" could be toggled on and off, but the branch-around cost more time than always doing the calculation (as discovered by my mad friend Fred) and the kernel folks left it on. It's on to this day.

In Simon Sundberg's talk, the opportunity to measure occurs every 1,000 packets, when a suitable timestamp is provided. While the eBPF program can look at every packet and do after-the-fact book-keeping in a map, that's only good if the phenomenon you're measuring is persistent enough that it's around for ~2,000 packets, i.e. at least two sample intervals, which is the Nyquist problem in the subject line.
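To put numbers on that, here is a small C sketch that counts how many sample points land inside a burst. It assumes, purely for illustration, that samples fall on packets 0, period, 2*period and so on; the function name and that model are mine, not anything from the talk.

```c
#include <stdint.h>

/* Number of sample points (packets 0, period, 2*period, ...) that
 * fall inside a burst covering packets [start, start + len). */
uint64_t samples_in_burst(uint64_t start, uint64_t len, uint64_t period)
{
    if (len == 0 || period == 0)
        return 0;

    uint64_t end = start + len;  /* one past the last packet of the burst */

    /* ceil(x / period) is the count of multiples of period in [0, x). */
    uint64_t before_end   = (end + period - 1) / period;
    uint64_t before_start = (start + period - 1) / period;

    return before_end - before_start;
}
```

With a period of 1,000, a burst starting at packet 1 and lasting 999 packets is never sampled at all, one lasting 1,999 packets can still be hit only once, and only at 2,000 packets are two samples guaranteed.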

I'm going to suggest that the right place to record the information you want is right where the event happens. Preferably in C code, as performance is easy to mess up, but perhaps with an eBPF mechanism to export it.

In previous Solaris work, I reliably found that exporting kstats was a darn sight harder than collecting them, and in Eric's blog post[1] he notes that converting time is expensive and best done long after collecting, when someone wants to read the data.
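A rough C sketch of the "collect raw, convert on read" split, under stated assumptions: the names are hypothetical, the tick source is whatever cheap counter the hot path already has, and the tick frequency is taken to be constant and nonzero (a real implementation would have to cope with frequency changes).

```c
#include <stdint.h>

struct raw_stat {
    uint64_t ticks;  /* accumulated raw clock ticks, never converted here */
};

/* Hot path: one subtraction and one addition, no unit conversion. */
void stat_add(struct raw_stat *s, uint64_t start_tick, uint64_t end_tick)
{
    s->ticks += end_tick - start_tick;
}

/* Read path: convert to nanoseconds only when someone asks.
 * Assumes tick_hz is nonzero and constant; the whole-second and
 * remainder parts are converted separately so the multiplication
 * does not overflow for tick rates up to roughly 18 GHz. */
uint64_t stat_read_ns(const struct raw_stat *s, uint64_t tick_hz)
{
    uint64_t secs = s->ticks / tick_hz;
    uint64_t rem  = s->ticks % tick_hz;

    return secs * 1000000000ULL + rem * 1000000000ULL / tick_hz;
}
```

The division lives entirely in the reader, so the cost is paid once per read rather than once per event.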

There was an effort to do kstats in Linux[2], but it supposedly had poor performance, and actual trouble when the clock frequency changed.

Is there, in your opinion, a "natural" place to capture state changes to get the data you want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave


References:

  1. Solaris: http://dtrace.org/blogs/eschrock/2004/10/13/microstate-accounting-in-solaris-10/
  2. A failing Linux effort: https://lwn.net/Articles/127296/, https://sourceforge.net/projects/microstate/
-- 
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@spamcop.net           |                      -- Mark Twain
