[Starlink] fiber IXPs in space

Ulrich Speidel u.speidel at auckland.ac.nz
Sun Apr 16 20:51:56 EDT 2023


On 17/04/2023 11:22 am, David Lang wrote:
> On Mon, 17 Apr 2023, Ulrich Speidel wrote:
>
>> On 17/04/2023 10:03 am, David Lang wrote:
>>> On Mon, 17 Apr 2023, Ulrich Speidel via Starlink wrote:
>>>
>>>> On 17/04/2023 5:54 am, David Fernández via Starlink wrote:
>>>>> In case you put a DNS server in the satellite, so that it replies
>>>>> instead of a DNS server on ground, the RTT is reduced by half.
>>>>>
>>>>> The idea would be that the satellite inspects IP packets and when it
>>>>> detects a DNS query, instead of forwarding the packet to ground
>>>>> station, it just answers back to the sender of the query.
>>>> Understood - it's just that the gain you have from this is quite 
>>>> small. DNS queries only happen the first time a host needs to 
>>>> resolve a name, and then again after cache expiry much later, so 
>>>> they account for only a tiny fraction of the traffic, and also for 
>>>> only a small amount of the total delay in page loads. RTT isn't 
>>>> really the big issue in Starlink - yes it's larger than it perhaps 
>>>> needs to be, and bufferbloat seems to be present, but compared to 
>>>> GEO, it's now in the range seen for terrestrial Internet.
>>>
>>> DNS time is more significant than you think: because so many 
>>> websites pull data from many different locations, you end up 
>>> with a lot of DNS queries when hitting a new site for the first time 
>>> (and many of these queries are serial, not parallel), so it adds quite 
>>> a bit to the first rendering time of a page.
>> But most people don't hit new sites most of the time, and a lot of 
>> cascading loads hit the same CDNs you've seen previously.
>
> the timeouts on DNS are short enough that they hit them every day when 
> they wake up

That's OK (and has to be that way or else DNS changes would never 
propagate).

But an end client typically hits the same site many times within a 
relatively short time window, and for this it only needs to do one 
lookup: the client then caches the entry. Whether it needs to look the 
name up again 15 or more minutes later doesn't really matter in terms of 
a LEO system - the client will be talking to a different satellite by then.

Also, the percentage of DNS queries relating to CDN servers is very high 
nowadays: CDN use is so pervasive that people will claim there is an 
"Internet outage" when a CDN goes down.

>
>>>>> CDNs or even datacenters (Cloud) in GEO or LEO is even more complex.
>>>>
>>>> Indeed. In so many ways.
>>>>
>>>> Mind though that CDNs are generally tied in with DNS nowadays, and 
>>>> there's another snag: Take two users, Alice in the UK and Bob in 
>>>> New Zealand - pretty much antipodean, using Starlink in bent-pipe 
>>>> configuration, i.e., their traffic goes through, say, the London 
>>>> gateway in the UK and the Clevedon gateway in NZ. Now imagine both 
>>>> trying to resolve the same CDN hostname some time apart, but via 
>>>> the same satellite DNS as the satellite has moved from the UK to NZ 
>>>> in the interim. Say Alice resolves first and gets the IP address of 
>>>> a CDN server in the UK. If the satellite DNS now caches this, and 
>>>> Bob queries the same hostname, he gets directed to a server in the 
>>>> UK literally a world away instead of the Auckland one closest to 
>>>> him. So unless each satellite carries a geolocated copy of the 
>>>> world's DNS entries with it and makes a decision based on user 
>>>> location, you have a problem.
>>>
>>> This is true when the DNS answer is dynamic, but such cases also 
>>> have short cache timeouts. Even with a 90 min orbit, a 15 min 
>>> timeout would significantly lessen the impact (and I would expect 
>>> that an orbital DNS would detect short timeouts and treat them as a 
>>> signal to shorten the timeout even more)
>>
>> Timeout where? At the end user client or at the satellite?
>
> at the DNS cache and at the client. If you are using DNS to redirect 
> people to the closest/least loaded site, you need to have your DNS 
> timeouts set short so that you can change where they go with minimal 
> downtime. Many clients refuse to honor extremely short timeouts (IIRC 
> about 15 min is the low end)
>
>> At the end user client, a short timeout makes no sense at all because 
>> their host-to-CDN-IP server mapping shouldn't really change in bent 
>> pipe - only the sat hop changes.
>>
>> If the timeout is meant to be on the satellite, it means that the 
>> satellite knows nothing about anything when it arrives to assist you, 
>> and needs to query some sort of (probably ground-based) DNS server 
>> anyway.
>>
>> Also, the assumption that a satellite will return to the same spot 
>> after a full orbital period (of say 90 minutes) only applies to 
>> satellites in equatorial orbits (or polar orbits, and then only to 
>> the poles). In all other cases, the Earth's rotation will assure that 
>> the satellite's return to the same location takes many orbital periods.
>
> when the satellite first comes into an area, it won't know what's 
> appropriate to cache for the area, but it will start caching when 
> people start using it; the first person suffers the full hit, but 
> everyone after that benefits.

Yes, fully understood. That's how it works with terrestrial DNS (and even 
on GEO this would be a really good argument). But on LEO, that benefit 
only materialises if the second client and any others in the same area 
get to query the same satellite that handled the first client's query, 
because that's where the information would be cached if we had a DNS 
server of sorts on each bird. For this to happen, the second client and 
any subsequent ones have to query within minutes if not seconds of the 
first one. What is the probability of this happening? It depends on 
the total number of active users hanging off that satellite and the 
popularity of the target host/site among them. The larger the number of 
users and the higher the site popularity, the more likely that cached 
entries will see a second or subsequent query. "Active" in this context 
means users navigating to new sites during the visibility window of that 
satellite.
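
As a rough illustration of that probability (the model and every number 
in it are my own assumptions, not measurements): treat each of the 
satellite's active users as independently querying a given site during 
the remaining visibility window with some small probability, and ask how 
likely it is that a just-cached entry gets re-used.

def p_cache_reuse(n_active_users, p_site, window_fraction=1.0):
    """Probability that a name just cached on the satellite is queried
    again by at least one of the remaining users before the pass ends.

    n_active_users:  users navigating to new sites during the pass
    p_site:          chance a given user visits this particular site
    window_fraction: share of the visibility window still remaining
    """
    p = p_site * window_fraction
    return 1.0 - (1.0 - p) ** (n_active_users - 1)

# Illustrative only: ~100 navigating users, entry cached mid-pass.
print(p_cache_reuse(100, 0.05, 0.5))    # popular site:   ~0.92
print(p_cache_reuse(100, 0.001, 0.5))   # long-tail site: ~0.05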

Practically speaking, we know from various sources that each Starlink 
satellite provides - ballpark - a couple of dozen Gb/s of capacity, and 
that active users on a "busy" satellite see a couple of dozen Mb/s of 
that. "Busy" here means carrying about as many active users as it can, 
so dividing one figure by the other puts the number of active users per 
satellite at around 1000 at most. The subset of those users navigating 
to new sites is probably in the low hundreds at best. If we exclude new 
sites that are dynamic, we're probably down to a couple of dozen new 
static sites being queried per satellite pass. How many of these queries 
will be duplicates? Not a lot. If we include the dynamic sites, we still 
don't get a large probability of cache entry re-use.
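
Spelled out, the back-of-envelope arithmetic looks roughly like this 
(only the capacity and per-user figures come from the discussion above; 
the two fractions at the end are my own guesses):

satellite_capacity_bps  = 20e9   # "a couple of dozen Gb/s" per satellite
per_user_throughput_bps = 20e6   # "a couple of dozen Mb/s" per active user

max_active_users = satellite_capacity_bps / per_user_throughput_bps
print(max_active_users)          # ~1000 active users sharing one satellite

# Assumed fractions, not measured: share of users navigating to new sites
# during the pass, and share of those new sites that are static.
navigating_fraction = 0.1
static_fraction     = 0.2

navigating_users = max_active_users * navigating_fraction   # ~100
new_static_sites = navigating_users * static_fraction       # ~20 per pass
print(navigating_users, new_static_sites)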

>
> DNS data is not that large; getting enough storage into the satellites 
> to serve 90% of the non-dynamic data should not be a big deal. The 
> dynamic data expires fast enough (and can be detected as being dynamic 
> and expired faster in the satellite) that I'm not worried about 
> serving data from one side of the world to the other.
Yes, but the only advantage we'd get here is faster resolution for a 
very small subset of DNS queries.

-- 
****************************************************************
Dr. Ulrich Speidel
School of Computer Science
Room 303S.594 (City Campus)
The University of Auckland
u.speidel at auckland.ac.nz
http://www.cs.auckland.ac.nz/~ulrich/
****************************************************************




