From: "Rodney W. Grimes"
To: Dave Taht
CC: starlink@lists.bufferbloat.net, Ankit Singla, Sam Kumar
Date: Sat, 10 Jul 2021 04:49:50 -0700 (PDT)
Subject: Re: [Starlink] SatNetLab: A call to arms for the next global Internet testbed
Message-Id: <202107101149.16ABnoRo045497@gndrsh.dnsmgr.net>

> While it is good to have a call to arms, like this:

... much information removed as I only want to reply to 1 very narrow,
but IMHO, very real problem in our networks today ...

> Here's another piece of pre-history - alohanet - the TTL field was the
> "time to live" field. The intent was that the packet would indicate
> how much time it would be valid before it was discarded. It didn't
> work out, and was replaced by hopcount, which of course switched
> networks ignore and is only semi-useful for detecting loops and the
> like.

TTL works perfectly fine where the original assumptions hold: that a
device along a network path only hangs on to a packet for a reasonably
short duration, and that there is not some "retry" mechanism in place
that is causing this time to explode. BSD, and as far as I can recall,
almost ALL original IP stacks had a Q depth limit of 50 packets on
egress interfaces. Everything pretty much worked well and the net was
happy.

Then these base assumptions got blasted in the name of "measurable
bandwidth" and the concept that packets are so precious we must not
lose them, at almost any cost. Linux crammed the per interface Q up to
1000, and wifi decided that it was reasonable to retry at the link layer
so many times that I have seen packets that are >60 seconds old.

Proposed FIX: Any device that transmits packets and does not already
have an inherent FIXED transmission time MUST consider the current TTL
of that packet and give up if > 10 ms * TTL elapses while it is trying
to transmit. AND change the default interface Q size in Linux to 50 for
fifo; the codel, etc. AQM stuff is fine at 1000, as it has delay targets
that prevent the issue that initially bumping this to 1000 caused.

... end of Rods Rant ...

-- 
Rod Grimes                                                 rgrimes@freebsd.org
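
For illustration only, the proposed fix above could be sketched roughly as
follows. This is a minimal sketch, not an implementation from any real
stack: the names tx_budget_seconds, transmit_with_deadline, and try_send
are hypothetical, and the only number taken from the proposal is the
10 ms per unit of TTL.

```python
import time

# From the proposal: 10 ms of transmit budget per unit of remaining TTL.
SECONDS_PER_TTL = 0.010


def tx_budget_seconds(ttl):
    """Longest time a device may spend trying to transmit a packet."""
    return max(ttl, 0) * SECONDS_PER_TTL


def transmit_with_deadline(ttl, try_send, now=time.monotonic):
    """Retry try_send() until it succeeds or the TTL budget is spent.

    try_send is a hypothetical callable standing in for one link-layer
    transmit attempt; it returns True on success. The clock is passed
    in as `now` so the loop can be exercised with a fake clock.
    Returns True if the packet was sent, False if it was dropped.
    """
    deadline = now() + tx_budget_seconds(ttl)
    while now() <= deadline:
        if try_send():
            return True
    # More than 10 ms * TTL has elapsed: give up and drop the packet.
    return False
```

With the common default TTL of 64 this gives a packet at most 0.64
seconds of link-layer retries at one device, rather than tens of seconds.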