From: Dave Taht
Date: Fri, 11 Jun 2021 15:39:06 -0700
To: Mike Puchol
Cc: starlink@lists.bufferbloat.net, davecb@spamcop.net
Subject: Re: [Starlink] Microstate Accounting and the Nyquist problem
Message-Id: <391A8897-5A1F-424A-9DD0-01B66824887B@teklibre.net>

On Jun 11, 2021, at 3:34 PM, Mike Puchol <mike@starlink.sx> wrote:

We know that Starlink recalculates topology every 15 seconds (this guy, who obviously has way too much spare time, came up with an indirect observation of this interval: https://blog.beerriot.com/2021/02/14/starlink-raster-scan/)

If we could align with this, we could at least know when potential changes in path delays happen, and try to observe other changes that happen at a similar cadence.
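A rough sketch of what aligning with the interval could look like: bucket RTT samples into 15-second slots and compare the per-slot medians. The function name, the `phase_s` parameter (the offset of the slot boundaries from the probe clock, which nobody in this thread has measured) and the sample data are all made up for illustration.

```python
from collections import defaultdict
from statistics import median

PERIOD_S = 15.0  # reconfiguration interval reported in the thread


def median_rtt_by_slot(samples, phase_s=0.0, period_s=PERIOD_S):
    """Group (timestamp_s, rtt_ms) samples into period-long slots.

    Returns {slot_index: median_rtt_ms}. phase_s is a guess at where the
    slot boundaries fall relative to the probe clock (an assumption).
    """
    slots = defaultdict(list)
    for t, rtt in samples:
        slots[int((t - phase_s) // period_s)].append(rtt)
    return {slot: median(rtts) for slot, rtts in sorted(slots.items())}


# Synthetic data: one probe per second, RTT steps from 30 ms to 45 ms at t = 15 s.
samples = [(t, 30.0 if t < 15 else 45.0) for t in range(30)]
print(median_rtt_by_slot(samples))  # {0: 30.0, 1: 45.0}
```

With real probes the phase would have to be swept until the steps between adjacent slots are sharpest.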

Other thoughts: try to pull more details out of the gRPC data, set up GPS-synced probes with a device at the exit PoP, measure differences between time-synced probes to an array of endpoints.


It’s ironic that the device has to have GPS in it, and thus should be able to provide perfect time to clients directly behind it, but doesn’t.

I haven’t captured a DHCP or DHCPv6 transaction yet myself; do they have an NTP option?
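For whoever does get a capture: DHCPv4 carries NTP servers in option 42 (RFC 2132), and DHCPv6 uses option 56 (RFC 5908). A minimal sketch for the DHCPv4 case, assuming you already have the raw options bytes from a reply (everything after the magic cookie). The function name and the sample bytes are hypothetical, not taken from a real Dishy capture.

```python
import ipaddress

DHCP_OPT_PAD = 0
DHCP_OPT_NTP_SERVERS = 42  # RFC 2132, section 8.3
DHCP_OPT_END = 255


def ntp_servers_from_dhcp_options(options):
    """Walk DHCPv4 code/length/value options and return option 42 addresses."""
    servers = []
    i = 0
    while i < len(options):
        code = options[i]
        if code == DHCP_OPT_PAD:       # single-byte padding, no length field
            i += 1
            continue
        if code == DHCP_OPT_END:
            break
        if i + 1 >= len(options):      # truncated option, stop parsing
            break
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if code == DHCP_OPT_NTP_SERVERS:
            # The value is a list of 4-byte IPv4 addresses.
            for j in range(0, len(value) - len(value) % 4, 4):
                servers.append(str(ipaddress.IPv4Address(value[j:j + 4])))
        i += 2 + length
    return servers


# Made-up options: subnet mask (1), NTP server 192.168.100.1 (42), end (255).
sample = bytes([1, 4, 255, 255, 255, 0, 42, 4, 192, 168, 100, 1, 255])
print(ntp_servers_from_dhcp_options(sample))  # ['192.168.100.1']
```

An empty list means the server offered no NTP option in that reply.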

What GPS software or driver might they have used? (esr’s gpsd is quite popular, but there are others.)

What’s the GPS chip?


Has nobody attacked the JTAG connector on a Dishy yet?

Best,

Mike
On Jun 12, 2021, 00:14 +0200, David Collier-Brown <davecb.42@gmail.com>, wrote:

OK, Oh Smarter Colleagues, the challenge to you is to say if there is a "natural" place to capture state changes to get the data we want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave

On 2021-06-09 9:15 a.m., Dave Taht wrote:


Begin forwarded message:

From: David Collier-Brown <davecb.42@gmail.com>
Subject: Microstate Accounting and the Nyquist problem
Date: June 9, 2021 at 4:44:14 AM PDT
To: Dave Taht <davet@teklibre.net>
Cc: Dave Collier-Brown <dave.collier-brown@indexexchange.com>
Reply-To: davecb@spamcop.net

A million years ago (roughly around Solaris 9), Sun was suffering from the same problems in measuring their dispatcher as you are with "sloshing".

A CPU would be 100% busy in one microsecond, 10% busy in the next gazillion, and the average CPU utilization for our sample period would be maybe 10.1%, and even that only if the sampler happened to sample right when the spike was happening.
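To put numbers on that, here is a toy trace (synthetic, not Solaris data): 1,000 ticks at 10% busy with a single tick at 100%. The helper name and the sampling interval of 100 ticks are assumptions chosen for illustration.

```python
def sampled_average(trace, every, offset=0):
    """Average of the values a periodic sampler would see."""
    picks = trace[offset::every]
    return sum(picks) / len(picks)


# 1,000 ticks at 10% busy, with one tick (index 500) at 100%.
trace = [10.0] * 1000
trace[500] = 100.0

true_avg = sum(trace) / len(trace)
print(round(true_avg, 2))                # 10.09
print(sampled_average(trace, 100, 0))    # 19.0, the sampler happens to hit the spike
print(sampled_average(trace, 100, 1))    # 10.0, the sampler misses it entirely
```

Neither sampled figure says how long the spike lasted, and most sampler alignments never see it at all.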

This was utterly useless for things like the fair-share scheduler, so it got fixed in Solaris 10, by having the dispatcher record the time a process (well, kernel thread) had spent in a state when the state changed.
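The idea, as opposed to the Solaris implementation, can be sketched in a few lines: do the bookkeeping at each state transition instead of polling. The class name, state names and injected clock below are all hypothetical.

```python
import time


class MicrostateAccounting:
    """Accumulate time spent per state, updated only at state transitions."""

    def __init__(self, initial_state, clock=time.monotonic_ns):
        self._clock = clock
        self._state = initial_state
        self._entered = clock()
        self.ns_in_state = {}  # raw nanoseconds per state

    def transition(self, new_state):
        """Charge the elapsed time to the state being left, then switch."""
        now = self._clock()
        elapsed = now - self._entered
        self.ns_in_state[self._state] = self.ns_in_state.get(self._state, 0) + elapsed
        self._state = new_state
        self._entered = now


# Deterministic fake clock so the example is repeatable: readings in ns.
ticks = iter([0, 1_000, 1_000_000])
acct = MicrostateAccounting("run", clock=lambda: next(ticks))
acct.transition("idle")  # ran for 1 microsecond
acct.transition("run")   # idled for 999 microseconds
print(acct.ns_in_state)  # {'run': 1000, 'idle': 999000}
```

The one-microsecond burst is recorded exactly, however short it is, and the counters stay in raw clock units until someone reads them.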

Initially "microstate accounting" could be toggled on and off, but the branch-around cost more time than always doing the calculation (as discovered by my mad friend Fred) and the kernel folks left it on. It's on to this day.

In Simon Sundberg's talk, the opportunity to measure occurs every 1,000 packets, when a suitable timestamp is provided. While the eBPF program can look at every packet and do after-the-fact book-keeping in a map, that's only good if the phenomenon you're measuring is persistent enough that it's around for ~2,000 packets.
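One reading of the ~2,000 figure is that you need two measurement points inside an event to say anything about it. A small check, assuming measurement points fall on packet indices 0, N, 2N, ... and counting how many land inside a burst over every possible alignment (the function name is made up):

```python
def samples_hitting_burst(burst_len, sample_every):
    """Return (min, max) count of sample points inside a burst, over all alignments.

    Sample points are packet indices 0, N, 2N, ...; the burst covers
    packets [start, start + burst_len - 1] for each start in 0..N-1.
    """
    counts = []
    for start in range(sample_every):
        last = start + burst_len - 1
        # Multiples of N in [start, last] = last // N - (start - 1) // N
        counts.append(last // sample_every - (start - 1) // sample_every)
    return min(counts), max(counts)


for burst in (500, 1000, 1500, 2000):
    print(burst, samples_hitting_burst(burst, 1000))
# 500 (0, 1)
# 1000 (1, 1)
# 1500 (1, 2)
# 2000 (2, 2)
```

Only at 2,000 packets is a burst guaranteed two measurement points regardless of alignment; anything shorter than 1,000 can be missed completely.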

I'm going to suggest that the right place to record the information you want is right where the event happens. Preferably in C code, as performance is easy to mess up, but perhaps with an eBPF mechanism to export it.

In previous Solaris work, I reliably found that exporting kstats was a darn sight harder than collecting them, and in Eric's blog post[1] he notes that converting time is expensive and best done long after collecting, when someone wants to read the data.

There was an effort to do kstats in Linux[2], but it supposedly performed poorly, and had actual trouble when the clock frequency changed.

Is there, in your opinion, a "natural" place to capture state changes to get the data you want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave


References:

  1. Solaris: http://dtrace.org/blogs/eschrock/2004/10/13/microstate-accounting-in-solaris-10/
  2. A failing Linux effort: https://lwn.net/Articles/127296/, https://sourceforge.net/projects/microstate/
-- 
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@spamcop.net           |                      -- Mark Twain

_______________________________________________
Starlink mailing list
Starlink@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/starlink