From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <48b00469-0dbb-54c4-bedb-3aecbf714a1a@auckland.ac.nz>
Date: Sun, 14 May 2023 18:06:42 +1200
To: David Lang
Cc: "starlink@lists.bufferbloat.net"
References: <0no84q43-s4n6-45n8-50or-12o3rq104n99@ynat.uz>
From: Ulrich Speidel
In-Reply-To: <0no84q43-s4n6-45n8-50or-12o3rq104n99@ynat.uz>
Subject: Re: [Starlink] Starlink hidden buffers
List-Id: "Starlink has bufferbloat. Bad."
X-List-Received-Date: Sun, 14 May 2023 06:06:51 -0000

On 14/05/2023 10:57 am, David Lang wrote:
> On Sat, 13 May 2023, Ulrich Speidel via Starlink wrote:
>
>> Here's a bit of a question to you all. See what you make of it.
>>
>> I've been thinking a bit about the latencies we see in the Starlink
>> network. This is why this list exists (right, Dave?). So what do we
>> know?
>>
>> 1) We know that RTTs can be in the 100's of ms even in what appear to
>> be bent-pipe scenarios where the physical one-way path should be well
>> under 3000 km, with physical RTT under 20 ms.
>> 2) We know from plenty of traceroutes that these RTTs accrue in the
>> Starlink network, not between the Starlink handover point (POP) and
>> the Internet.
>> 3) We know that they aren't an artifact of the Starlink WiFi router
>> (our traceroutes were done through their Ethernet adaptor, which
>> bypasses the router), so they must be delays on the satellites or the
>> teleports.
>
> the ethernet adapter bypasses the wifi, but not the router, you have
> to cut the cable and replace the plug to bypass the router

Good point - but you still don't get the WiFi buffering here. Or at
least we don't seem to, looking at the difference between running with
and without the adapter.

>
>> 4) We know that processing delay isn't a huge factor because we also
>> see RTTs well under 30 ms.
>> 5) That leaves queuing delays.
>>
>> This issue has been known for a while now. Starlink have been
>> innovating their heart out around pretty much everything here - and
>> yet, this bufferbloat issue hasn't changed, despite Dave proposing
>> what appears to be an easy fix compared to a lot of other things they
>> have done. So what are we possibly missing here?
>>
>> Going back to first principles: The purpose of a buffer on a network
>> device is to act as a shock absorber against sudden traffic bursts.
>> If I want to size that buffer correctly, I need to know at the very
>> least (paraphrasing queueing theory here) something about my packet
>> arrival process.
>
> The question is over what timeframe. If you have a huge buffer, you
> can buffer 10s of seconds of traffic and eventually send it. That will
> make benchmarks look good, but not the user experience. The rapid drop
> in RAM prices (beyond merely a free fall) and the benchmark scores
> that heavily penalized any dropped packets encouraged buffers to get
> larger than is sane.
>
> it's still a good question to define what is sane, the longer the
> buffer, the more of a chance of finding time to catch up, but having
> packets in the buffer that have timed out (e.g. DNS queries tend to
> time out after 3 seconds, TCP will give up and send replacement
> packets, making the initial packets meaningless) is counterproductive.
> What is the acceptable delay to your users?
>
> Here at the bufferbloat project, we tend to say that buffers past a
> few 10s of ms worth of traffic are probably bad and are aiming for
> single-digit ms in many cases.

Taken as read.

>
>> If I look at conventional routers, then that arrival process involves
>> traffic generated by a user population that changes relatively
>> slowly: WiFi users come and go. One at a time. Computers in a company
>> get turned on and off and rebooted, but there are no instantaneous
>> jumps in load - you don't suddenly have a hundred users in the middle
>> of watching Netflix turning up that weren't there a second ago. Most
>> of what we know about Internet traffic behaviour is based on this
>> sort of network, and this is what we've designed our queuing systems
>> around, right?
>
> not true, for businesses, every hour as meetings start and let out,
> and as people arrive in the morning, arrive back from lunch, you have
> very sharp changes in the traffic.

And herein lies the crunch: All of these things that you list happen
over much longer timeframes than a switch to a different satellite.
Also, folk coming back from lunch would start with something like
cwnd=10. Users whose TCP connections get switched over to a different
satellite by some underlying tunneling protocol could have much larger
cwnd.

>
> at home you have fewer changes in users, but you also may have less
> bandwidth (although many tech enthusiasts have more bandwidth than
> many companies, two of my last 3 jobs have had <400Mb at their main
> office with hundreds of employees while many people would consider
> that 'slow' for home use). As such, a parent arriving home with a
> couple of kids will make a drastic change to the network usage in a
> very short time.

I think you've missed my point - I'm talking about changes in the
network mid-flight, not people coming home and getting started over a
period of a few minutes. The change you see in a handover is sudden and
probably with sub-second ramp-up. And it's something that doesn't just
happen when people come home or return from lunch - it happens every
few minutes.

>
>
> but the active queueing systems that we are designing (cake, fq_codel)
> handle these conditions very well because they don't try to guess what
> the usage is going to be, they just look at the packets that they have
> to process and figure out how to dispatch them out in the best way.

Understood - I've followed your work.
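
A rough back-of-envelope sketch of the cwnd point above, assuming a
single TCP flow whose cwnd had grown to fill the old path and which then
lands on a path with less capacity. All numbers (100 and 20 Mb/s, 40 ms
base RTT, 1500-byte packets) and both function names are made up for
illustration; they are not Starlink measurements, and the model ignores
other flows, pacing and any AQM.

```python
def bdp_packets(rate_mbps, rtt_ms, mss_bytes=1500):
    """Bandwidth-delay product of a path, in packets of mss_bytes."""
    bits_in_flight = rate_mbps * 1e6 * (rtt_ms / 1e3)
    return bits_in_flight / (8 * mss_bytes)


def standing_queue(cwnd_packets, rate_mbps, rtt_ms, mss_bytes=1500):
    """Packets of a window that do not fit into the path itself and must
    sit in a buffer (or be dropped), plus the queueing delay in ms that
    they add at the bottleneck rate."""
    excess = max(0.0, cwnd_packets - bdp_packets(rate_mbps, rtt_ms, mss_bytes))
    delay_ms = excess * mss_bytes * 8 / (rate_mbps * 1e6) * 1e3
    return excess, delay_ms


# Hypothetical flow that had filled a 100 Mb/s, 40 ms path ...
old_cwnd = bdp_packets(100, 40)
# ... and is handed over mid-flight to a 20 Mb/s path with the same RTT.
print(standing_queue(old_cwnd, 20, 40))
# A freshly started flow (cwnd=10) on the same new path queues nothing.
print(standing_queue(10, 20, 40))
```

Under these assumptions the carried-over window leaves roughly 267
packets with nowhere to go but a buffer, which is about 160 ms of extra
delay if the buffer is deep enough to hold them, whereas the new flow
with cwnd=10 adds none.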
>
> because we have observed that latency tends to be more noticeable for
> short connections (DNS, checking if cached web pages are up to date,
> etc), our algorithms give a slight priority to new-low-traffic
> connections over long-running-high-traffic connections rather than
> just splitting the bandwidth evenly across all connections, and can
> even go further to split bandwidth between endpoints, not just
> connections (with endpoints being a configurable definition)
>
> without active queue management, the default is FIFO, which allows the
> high-user-impact, short connection packets to sit in a queue behind
> the low-user-impact, bulk data transfers. For benchmarks,
> a-packet-is-a-packet and they all count, so until you have enough
> buffering that you start having expired packets in flight, it doesn't
> matter, but for the user experience, there can be a huge difference.

All understood - you're preaching to the converted. It's just that I
think Starlink may be a different ballpark.

Put another way: If a protocol (TCP) that is designed to reasonably
expect that its current cwnd is OK to use for now is put into a
situation where there are relatively frequent, huge and lasting step
changes in available BDP within subsecond periods, are its underlying
assumptions still valid?

I suspect they're handing over whole cells, not individual users, at a
time.

>
> David Lang
>

-- 
****************************************************************
Dr. Ulrich Speidel
School of Computer Science
Room 303S.594 (City Campus)
The University of Auckland
u.speidel@auckland.ac.nz
http://www.cs.auckland.ac.nz/~ulrich/
****************************************************************