From: Dave Taht
Date: Fri, 11 Jun 2021 16:00:52 -0700
To: Mike Puchol
Cc: starlink@lists.bufferbloat.net
Subject: Re: [Starlink] Microstate Accounting and the Nyquist problem
In-Reply-To: <391A8897-5A1F-424A-9DD0-01B66824887B@teklibre.net>
List-Id: "Starlink has bufferbloat. Bad."

On Jun 11, 2021, at 3:39 PM, Dave Taht <davet@teklibre.net> wrote:



On Jun 11, 2021, at 3:34 PM, Mike Puchol <mike@starlink.sx> wrote:

We know that Starlink recalculates topology every 15 seconds (this guy, who obviously has way too much spare time, came up with an indirect observation of this interval: https://blog.beerriot.com/2021/02/14/starlink-raster-scan/)

If we could align with this, we could at least know when potential changes in path delays happen, and try to observe other changes that happen at a similar cadence.
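One way to look for changes at that cadence is to fold latency samples onto the 15-second period and compare the phase bins. This is a minimal sketch, assuming the samples are (timestamp, RTT) pairs from any probe; the function name and the 1-second bin width are illustrative, and the real offset of the slot boundary within the period is unknown.

```python
from collections import defaultdict
from statistics import median

PERIOD_S = 15.0  # topology recalculation interval reported above


def fold_by_phase(samples, period_s=PERIOD_S, bin_s=1.0):
    """Group (timestamp_s, rtt_ms) samples by their phase within the period.

    Returns {bin_start_s: median_rtt_ms}, ordered by phase.
    """
    bins = defaultdict(list)
    for ts, rtt in samples:
        phase = ts % period_s                      # position inside the 15 s cycle
        bins[int(phase // bin_s) * bin_s].append(rtt)
    return {start: median(rtts) for start, rtts in sorted(bins.items())}
```

If delay changes do line up with the recalculation, one or two phase bins should stand out from the rest once enough cycles have been folded together.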

Other thoughts: try to pull more details out of the gRPC data, set up GPS-synced probes with a device at the exit PoP, measure differences between time-sync probes to an array of endpoints.


It’s ironic that the device has to have GPS in it, and thus should be able to provide perfect time to clients directly behind it, isn’t.

I haven’t captured a DHCP or DHCPv6 transaction yet myself;
do they have an NTP option?
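For anyone who does capture a DHCPv4 exchange, the NTP servers option is option 42 (RFC 2132), a list of 4-byte IPv4 addresses; DHCPv6 carries NTP servers in a separate option (56, RFC 5908) that is not handled here. The sketch below walks the options of a raw DHCPv4 payload (the UDP payload, not the whole frame). The function name is illustrative, and it ignores option overload (option 52).

```python
import ipaddress

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"   # marks the start of the options field
OPTIONS_OFFSET = 236                      # length of the fixed BOOTP header
OPT_PAD, OPT_NTP_SERVERS, OPT_END = 0, 42, 255


def ntp_servers_from_dhcp(payload: bytes):
    """Return the IPv4 NTP servers advertised in a DHCPv4 message, if any."""
    if payload[OPTIONS_OFFSET:OPTIONS_OFFSET + 4] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP message (magic cookie missing)")
    i = OPTIONS_OFFSET + 4
    servers = []
    while i < len(payload):
        code = payload[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:               # pad is a single byte with no length
            i += 1
            continue
        if i + 1 >= len(payload):         # truncated option header
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == OPT_NTP_SERVERS:
            servers += [str(ipaddress.IPv4Address(value[j:j + 4]))
                        for j in range(0, len(value) - 3, 4)]
        i += 2 + length
    return servers
```

An empty list means the server offered no NTP option in that message.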

What GPS software or driver might they have used? (esr’s gpsd is quite popular, but there are others)

What’s the GPS chip?


It would be good to have solid time everywhere, as I am seeing clocks not synced even close to 40 ms accuracy of late.
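For reference, the standard NTP on-wire calculation (RFC 5905) estimates clock offset and round-trip delay from four timestamps. The sketch below is only that arithmetic, with an illustrative function name; it assumes the forward and return paths take equal time, so any path asymmetry shows up as an offset error of half the asymmetry.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay, all values in seconds.

    t1: client transmit (client clock)   t2: server receive (server clock)
    t3: server transmit (server clock)   t4: client receive (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)            # time on the wire, both directions
    return offset, delay
```

With a client clock 40 ms behind the server and 30 ms out versus 10 ms back, this reports an offset of 50 ms rather than 40 ms, which is why a link with uneven path delays makes a tight sync hard to verify from the client side alone.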

BTW, Eric Raymond (esr) is also one of the driving forces behind NTPsec, along with Gary and a few other people now on our list.

For more details, see: https://www.ntpsec.org/

Once upon a time, I sat in esr's basement hearing him rip much crud out of the old ntpd codebase over the course of a very few days. The shouts “What? WHAAAT?” and most of the other pithy comments he made never made the git log.

Over the following years the codebase got better and better, but adoption has been slow.

An intro to that woefully underfunded project:

NTPsec project - a secure, hardened, and improved implementation of Network Time Protocol derived from NTP Classic, Dave Mills’s original.
NTPsec, as its name implies, is a more secure NTP. Our goal is to deliver code that can be used with confidence in deployments with the most stringent security, availability, and assurance requirements.
Towards that end we apply best practices and state-of-the-art technology in code auditing, verification, and testing. We begin with the most important best practice: true open-source code review. The NTPsec code is available in a public git repository. One of our goals is to support broader community participation.


Has nobody attacked the JTAG connector on a Dishy yet?
I reached out to one of the teardown folk (Mike (mikeonsoftware?)) months ago to get the debris, but the rightest answer was to drill down into a still-alive one.


Best,

Mike
On Jun 12, 2021, 00:14 +0200, David Collier-Brown <davecb.42@gmail.com>, wrote:

OK, Oh Smarter Colleagues, the challenge to you is to say if there is a "natural" place to capture state changes to get the data we want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave

On 2021-06-09 9:15 a.m., Dave Taht wrote:


Begin forwarded message:

From: David Collier-Brown <davecb.42@gmail.com>
Subject: Microstate Accounting and the Nyquist problem
Date: June 9, 2021 at 4:44:14 AM PDT
To: Dave Taht <davet@teklibre.net>
Cc: Dave Collier-Brown <dave.collier-brown@indexexchange.com>

A million years ago (roughly around Solaris 9), Sun was suffering from the same problems in measuring their dispatcher as you are with "sloshing".

A CPU would be 100% busy in one microsecond, 10% busy in the next gazillion, and the average CPU utilization for our sample period would be maybe 10.1%, if the sampler happened to sample right when the spike was happening.
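The effect is easy to reproduce with toy numbers. The sketch below compares the exact busy fraction of a timeline against what a periodic sampler would report; the interval values and function names are made up for illustration, not taken from Solaris.

```python
def exact_utilization(busy_intervals, total_us):
    """Busy fraction computed from the true (start_us, end_us) busy intervals."""
    return sum(end - start for start, end in busy_intervals) / total_us


def sampled_utilization(busy_intervals, total_us, period_us, first_sample_us=0):
    """Busy fraction as seen by a sampler that looks once every period_us."""
    times = range(first_sample_us, total_us, period_us)
    hits = sum(any(s <= t < e for s, e in busy_intervals) for t in times)
    return hits / len(times)
```

With a single 100 µs burst at 500,000 µs in a one-second window, the exact figure is 0.01%. A 10 ms sampler that starts at 0 lands on the burst and reports 1%; the same sampler started 5 ms later misses it and reports 0%. Neither answer is close, and which one you get depends only on the phase of the sampler.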

This was utterly useless for things like the fair-share scheduler, so it got fixed in Solaris 10, by having the dispatcher record the time a process (well, kernel thread) had spent in a state when the state changed.
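The idea can be sketched in a few lines: instead of sampling, charge the elapsed time to the outgoing state at every transition. This is an illustration of the technique only, not the Solaris dispatcher code; the class and state names are hypothetical, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time


class MicrostateAccount:
    """Accumulate the time spent in each state, updated only on state changes."""

    def __init__(self, initial_state, clock=time.monotonic_ns):
        self._clock = clock
        self._state = initial_state
        self._since = clock()          # when the current state was entered
        self.totals_ns = {}            # state -> accumulated nanoseconds

    def transition(self, new_state):
        now = self._clock()
        # Charge the elapsed time to the state being left.
        self.totals_ns[self._state] = (
            self.totals_ns.get(self._state, 0) + (now - self._since)
        )
        self._state = new_state
        self._since = now
```

Because every transition is recorded, a 1 ns burst of "run" between two long idle periods is counted as exactly 1 ns, regardless of when anyone reads the totals.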

Initially "microstate accounting" could be toggled on and off, but the branch-around cost more time than always doing the calculation (as discovered by my mad friend Fred) and the kernel folks left it on. It's on to this day.

In Simon Sundberg's talk, the opportunity to measure occurs every 1,000 packets, when a suitable timestamp is provided. While the eBPF program can look at every packet and do after-the-fact book-keeping in a map, that's only good if the phenomenon you're measuring is persistent enough that it's around for ~2,000 packets.
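The ~2,000-packet figure follows from wanting at least two measurement opportunities inside the phenomenon, whatever its alignment with the sampling points. The brute-force check below assumes, for illustration only, that exactly every 1,000th packet carries a usable timestamp; the function name is hypothetical.

```python
def min_samples_in_burst(burst_len, interval=1000):
    """Worst case, over every alignment, of how many timestamped packets
    fall inside a burst of burst_len consecutive packets."""
    return min(
        sum(1 for pkt in range(start, start + burst_len) if pkt % interval == 0)
        for start in range(interval)
    )
```

Under that assumption a burst of 999 packets can be missed entirely, bursts of 1,000 to 1,999 packets are only guaranteed one sample, and two samples are guaranteed only from 2,000 packets upward.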

I'm going to suggest that the right place to record the information you want is right where the event happens. Preferably in C code, as performance is easy to mess up, but perhaps with an eBPF mechanism to export it.

In previous Solaris work, I reliably found that exporting kstats was a darn sight harder than collecting them, and in Eric's blog post[1] he notes that converting time is expensive and best done long after collecting, when someone wants to read the data.
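The "convert late" point can be illustrated with a counter that stores raw ticks on the hot path and only divides when read. This is a sketch with hypothetical names, and it assumes a constant tick rate; a clock whose frequency changes would need the rate recorded alongside the ticks.

```python
class RawTickCounter:
    """Keep raw tick totals when collecting; convert to seconds only on read."""

    def __init__(self, ticks_per_second):
        self.ticks_per_second = ticks_per_second
        self.raw_ticks = 0

    def add(self, start_tick, end_tick):
        # Hot path: one integer subtraction and one addition, no unit conversion.
        self.raw_ticks += end_tick - start_tick

    def read_seconds(self):
        # Cold path: the division happens only when someone asks for the value.
        return self.raw_ticks / self.ticks_per_second
```

The collection side stays cheap no matter how often events occur, and the cost of conversion is paid once per read rather than once per event.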

There was an effort to do kstats in Linux[2], but it supposedly had poor performance, and actual trouble when the clock frequency changed.

Is there, in your opinion, a "natural" place to capture state changes to get the data you want, and if so, is it common or similar enough between drivers to be worthy of attention?

--dave


References:

  1. Solaris: http://dtrace.org/blogs/eschrock/2004/10/13/microstate-accounting-in-solaris-10/
  2. A failing Linux effort: https://lwn.net/Articles/127296/, https://sourceforge.net/projects/microstate/
--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@spamcop.net           |                      -- Mark Twain

_______________________________________________
Starlink mailing list
Starlink@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/starlink