OK, /Oh Smarter Colleagues/, the challenge to you is to say if there is 
a "natural" place to capture state changes to get the data we want, and 
if so, is it common or similar enough between drivers to be worthy of 
attention?

--dave

On 2021-06-09 9:15 a.m., Dave Taht wrote:
>
>
>> Begin forwarded message:
>>
>> From: David Collier-Brown <davecb.42@gmail.com>
>> Subject: Microstate Accounting and the Nyquist problem
>> Date: June 9, 2021 at 4:44:14 AM PDT
>> To: Dave Taht <davet@teklibre.net>
>> Cc: Dave Collier-Brown <dave.collier-brown@indexexchange.com>
>> Reply-To: davecb@spamcop.net
>>
>> A million years ago (roughly around Solaris 9), Sun was suffering 
>> from the same problems in measuring their dispatcher as you are with 
>> "sloshing".
>>
>> A CPU would be 100% busy in one microsecond, 10% busy in the next 
>> gazillion, and the average CPU utilization for our sample period 
>> would be /maybe/ 10.1%, if the sampler happened to sample right when 
>> the spike was happening.
>>
>> This was utterly useless for things like the fair-share scheduler, so 
>> it got fixed in Solaris 10, by having the dispatcher record the time 
>> a process (well, kernel thread) had spent in a state when the state 
>> changed.
>>
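
For concreteness, a minimal sketch of that idea in C. This is not the
Solaris code: struct ms_acct, ms_init(), ms_transition() and the three
states are names made up for illustration, and the caller is assumed to
hand in a raw, monotonically increasing timestamp.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical states; a real dispatcher tracks more of them. */
enum ms_state { MS_USER, MS_SYSTEM, MS_WAIT, MS_NSTATES };

/* Hypothetical per-thread accounting record. */
struct ms_acct {
    enum ms_state state;             /* state the thread is in now */
    uint64_t      since;             /* raw timestamp of last transition */
    uint64_t      total[MS_NSTATES]; /* accumulated raw ticks per state */
};

static void ms_init(struct ms_acct *a, enum ms_state s, uint64_t now)
{
    for (int i = 0; i < MS_NSTATES; i++)
        a->total[i] = 0;
    a->state = s;
    a->since = now;
}

/*
 * Called at every state change: charge the elapsed ticks to the state
 * being left, then restart the clock for the new state. Nothing is
 * sampled, so a one-tick spike is counted exactly once.
 */
static void ms_transition(struct ms_acct *a, enum ms_state next, uint64_t now)
{
    assert(next < MS_NSTATES);
    a->total[a->state] += now - a->since;
    a->state = next;
    a->since = now;
}
```

The cost per transition is one subtraction, one addition and two
stores, with no branch on an "is accounting enabled" flag.
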
>> Initially "microstate accounting" could be toggled on and off, but 
>> the branch-around cost more time than always doing the calculation 
>> (as discovered by my mad friend Fred) and the kernel folks left it 
>> on. It's on to this day.
>>
>> In Simon Sundberg's talk, the opportunity to measure occurs every 
>> 1,000 packets, when a suitable timestamp is provided. While the eBPF 
>> program can look at every packet and do after-the-fact book-keeping 
>> in a map, that's only good if the phenomenon you're measuring is 
>> persistent enough that it's around for ~2,000 packets.
>>
>> I'm going to suggest that the right place to record the information 
>> you want is right where the event happens. Preferably in C code, as 
>> performance is easy to mess up, but perhaps with an eBPF mechanism to 
>> export it.
>>
>> In previous Solaris work, I reliably found that exporting kstats was 
>> a darn sight harder than collecting them, and in Eric's blog post[1] 
>> he notes that converting time is expensive and best done long after 
>> collecting, when someone wants to read the data.
>>
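
A rough sketch of the "convert late" point, again in C with made-up
names (struct raw_counter, counter_add(), counter_read_ns()). It
assumes the tick rate is known and constant, and that it is below
about 1.8e10 ticks per second so the multiplication cannot overflow.

```c
#include <stdint.h>

/* Hypothetical counter: the hot path only ever adds raw ticks. */
struct raw_counter {
    uint64_t ticks;
};

/* Hot path: one addition, no unit conversion. */
static inline void counter_add(struct raw_counter *c, uint64_t delta_ticks)
{
    c->ticks += delta_ticks;
}

/*
 * Read path: convert to nanoseconds only when somebody asks.
 * Whole seconds and the remainder are converted separately so the
 * intermediate product stays within 64 bits (see assumption above).
 */
static uint64_t counter_read_ns(const struct raw_counter *c,
                                uint64_t ticks_per_sec)
{
    uint64_t secs = c->ticks / ticks_per_sec;
    uint64_t rem  = c->ticks % ticks_per_sec;

    return secs * 1000000000ULL + rem * 1000000000ULL / ticks_per_sec;
}
```

The divisions happen once per read instead of once per event, which is
where the saving comes from.
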
>> There was an effort to do kstats in Linux[2], but it supposedly had 
>> poor performance, and actual trouble when the clock frequency changed.
>>
>> Is there, in your opinion, a "natural" place to capture state changes 
>> to get the data you want, and if so, is it common or similar enough 
>> between drivers to be worthy of attention?
>>
>> --dave
>>
>>
>> References:
>>
>>  1. Solaris:
>>     http://dtrace.org/blogs/eschrock/2004/10/13/microstate-accounting-in-solaris-10/
>>
>>  2. A failing Linux effort: https://lwn.net/Articles/127296/,
>>     https://sourceforge.net/projects/microstate/
>>
>> -- 
>> David Collier-Brown,         | Always do right. This will gratify
>> System Programmer and Author | some people and astonish the rest
>> davecb@spamcop.net            |                      -- Mark Twain
>