On Jun 12, 2021, 00:14 +0200, David Collier-Brown <davecb.42@gmail.com>, wrote:

OK, Oh Smarter Colleagues, the challenge to you is to say if there is a "natural" place to capture state changes to get the data we want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave

On 2021-06-09 9:15 a.m., Dave Taht wrote:
Begin forwarded message:

From: David Collier-Brown <davecb.42@gmail.com>
Subject: Microstate Accounting and the Nyquist problem
Date: June 9, 2021 at 4:44:14 AM PDT
To: Dave Taht <davet@teklibre.net>
Cc: Dave Collier-Brown <dave.collier-brown@indexexchange.com>
Reply-To: davecb@spamcop.net

A million years ago (roughly around Solaris 9), Sun was suffering from the same problems in measuring their dispatcher as you are with "sloshing".

A CPU would be 100% busy in one microsecond, 10% busy in the next gazillion, and the average CPU utilization for our sample period would be maybe 10.1%, and then only if the sampler happened to sample right when the spike was happening.
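
To put rough numbers on that, here is a small Python sketch of the arithmetic. The slot count, the sampling period and the function names are all my own assumptions, picked so that a sampler which happens to land on the spike reports 10.1%; nothing here is taken from Solaris.

```python
# Illustrative arithmetic only: one "slot" stands for one microsecond.
# All sizes below are assumptions chosen to reproduce the 10.1% figure.

def true_utilization(n_slots, base=10.0, spike=100.0):
    """Exact mean utilization (%) with one slot at `spike`, the rest at `base`."""
    return (spike + base * (n_slots - 1)) / n_slots


def sampled_utilization(n_slots, spike_slot, period, phase=0,
                        base=10.0, spike=100.0):
    """Mean of point samples taken every `period` slots, starting at `phase`."""
    samples = [spike if i == spike_slot else base
               for i in range(phase, n_slots, period)]
    return sum(samples) / len(samples)


N_SLOTS = 9_000_000      # length of the sample period, in slots
SPIKE_AT = 4_500_000     # the single slot where the CPU is 100% busy
PERIOD = 10_000          # sampler looks once every 10,000 slots (900 samples)

exact = true_utilization(N_SLOTS)
hit = sampled_utilization(N_SLOTS, SPIKE_AT, PERIOD, phase=0)   # lands on spike
miss = sampled_utilization(N_SLOTS, SPIKE_AT, PERIOD, phase=1)  # one slot late

print(f"exact={exact:.5f}%  sampler hit={hit:.5f}%  sampler miss={miss:.5f}%")
```

With these assumed numbers the sampler says 10.1% if it hits the spike and 10.0% if it is one slot late, while the exact figure is about 10.00001%. Neither sampled value tells you that a spike of that size and duration occurred.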

This was utterly useless for things like the fair-share scheduler, so it got fixed in Solaris 10, by having the dispatcher record the time a process (well, kernel thread) had spent in a state when the state changed.
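
A minimal sketch of that idea, in Python for readability rather than the kernel C it would really be. The class, the state names and the injectable clock are hypothetical; this shows the bookkeeping pattern, not the Solaris implementation.

```python
import time


class MicrostateAccount:
    """Charge elapsed time to the outgoing state at every state change."""

    def __init__(self, initial_state, clock=time.monotonic_ns):
        self._clock = clock          # injectable so the demo is deterministic
        self._state = initial_state
        self._since = clock()        # when the current state was entered
        self.elapsed = {}            # state name -> accumulated clock units

    def transition(self, new_state):
        """Called at the moment the state changes, not on a sampling timer."""
        now = self._clock()
        self.elapsed[self._state] = (
            self.elapsed.get(self._state, 0) + (now - self._since)
        )
        self._state = new_state
        self._since = now

    def snapshot(self):
        """Totals so far, including time spent in the still-open state."""
        now = self._clock()
        totals = dict(self.elapsed)
        totals[self._state] = totals.get(self._state, 0) + (now - self._since)
        return totals


# Demo with a fake clock: idle from 0, a one-unit burst at 500, read at 1000.
fake_times = iter([0, 500, 501, 1000])
acct = MicrostateAccount("idle", clock=lambda: next(fake_times))
acct.transition("run")
acct.transition("idle")
totals = acct.snapshot()
print(totals)
```

In this demo the one-unit burst is charged to "run" exactly, and it would be however rarely anyone reads the totals, because the measurement is taken at the event rather than by a sampler.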

Initially "microstate accounting" could be toggled on and off, but the branch-around cost more time than always doing the calculation (as discovered by my mad friend Fred) and the kernel folks left it on. It's on to this day.

In Simon Sundberg's talk, the opportunity to measure occurs every 1,000 packets, when a suitable timestamp is provided. While the eBPF program can look at every packet and do after-the-fact book-keeping in a map, that's only good if the phenomenon you're measuring is persistent enough that it's around for ~2,000 packets.

I'm going to suggest that the right place to record the information you want is right where the event happens. Preferably in C code, as performance is easy to mess up, but perhaps with an eBPF mechanism to export it.

In previous Solaris work, I reliably found that exporting kstats was a darn sight harder than collecting them, and in Eric's blog post[1] he notes that converting time is expensive and best done long after collecting, when someone wants to read the data.
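
The "convert late" point can be sketched the same way, again in Python and again with made-up names. The counter frequency is a hypothetical constant, and the sketch assumes it never changes.

```python
# Assumption: a fixed-rate tick counter. The frequency below is hypothetical.
TICKS_PER_SECOND = 2_500_000_000


class RawCounter:
    """Hot path keeps raw ticks only; no unit conversion while collecting."""

    def __init__(self):
        self.raw_ticks = 0

    def add(self, delta_ticks):
        # Collection side: a single integer add per event.
        self.raw_ticks += delta_ticks


def read_seconds(counter, ticks_per_second=TICKS_PER_SECOND):
    """Reader side: pay for the division only when someone asks."""
    return counter.raw_ticks / ticks_per_second


busy = RawCounter()
busy.add(2_500_000)   # two events, each 2.5 million ticks long
busy.add(2_500_000)
seconds = read_seconds(busy)
print(f"{seconds:.6f} s busy")
```

A single fixed conversion factor like this would not survive a change in clock frequency, so a real implementation would need to account for that.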

There was an effort to do kstats in Linux[2], but it reportedly performed poorly, and it had real trouble when the clock frequency changed.

Is there, in your opinion, a "natural" place to capture state changes to get the data you want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave

References:

[1] Solaris: http://dtrace.org/blogs/eschrock/2004/10/13/microstate-accounting-in-solaris-10/

[2] A failing Linux effort: https://lwn.net/Articles/127296/, https://sourceforge.net/projects/microstate/
--  
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@spamcop.net           |                      -- Mark Twain
_______________________________________________
Starlink mailing list
Starlink@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/starlink