From: David Lang
To: Karl Auerbach, "Network Neutrality is back! Let´s make the technical aspects heard this time!"
Date: Mon, 23 Oct 2023 16:39:24 -0700 (PDT)
Subject: Re: [NNagain] upgrading old routers to modern, secure FOSS

On Mon, 23 Oct 2023, Karl Auerbach via Nnagain wrote:

> It would be nice if we built our network devices so that they each had a
> little introspective daemon that frequently asked "am I healthy, am I
> still connected, are packets still moving through me?"
> (For consumer devices an answer of "no" could trigger a full device
> reboot or reset.)

I agree with a lot of what you say, but I want to throw in a word of
caution here.

I have seen systems go from 'slow but functioning' to 'completely down and
requiring a complete datacenter shutdown to recover' because automated
response systems decided to restart something when it didn't respond fast
enough, triggering a cascade of failures that prevented any service from
being able to start into a healthy state.

I've also implemented monitoring on APs to restart them if they don't have
a path to the Internet, resulting in continual reboots when there is a
transitory issue (now changed to only check their next hop and only shut
down wifi to avoid becoming a black hole for that SSID).

To err is human; to really mess things up requires a computer, and
automation removes the oversight from the computer, allowing it to do more
damage faster.

David Lang
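
As a rough illustration of the revised AP monitoring described above (check
only the next hop, and take wifi down instead of rebooting), here is a
minimal Python sketch. The class name, the thresholds, and the injected
`probe` and `set_wifi` callables are all hypothetical and not taken from the
actual setup. Requiring several consecutive failures before acting is an
added assumption, meant to keep a transitory issue from triggering anything.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class NextHopWatchdog:
    """Hypothetical sketch: gate the radio on next-hop reachability.

    It never reboots the device; the only action it takes is turning
    wifi off or back on.
    """
    probe: Callable[[], bool]         # assumed: True if the next hop answered
    set_wifi: Callable[[bool], None]  # assumed: True enables the radio, False disables it
    fail_threshold: int = 3           # consecutive failures before wifi goes down
    ok_threshold: int = 2             # consecutive successes before wifi comes back
    wifi_up: bool = True
    _fails: int = 0
    _oks: int = 0

    def tick(self) -> None:
        """Run one check; intended to be called from a periodic timer."""
        if self.probe():
            self._oks += 1
            self._fails = 0
            if not self.wifi_up and self._oks >= self.ok_threshold:
                self.set_wifi(True)
                self.wifi_up = True
        else:
            self._fails += 1
            self._oks = 0
            if self.wifi_up and self._fails >= self.fail_threshold:
                self.set_wifi(False)
                self.wifi_up = False


if __name__ == "__main__":
    # Simulated probe results: a single blip, then a real outage, then recovery.
    results = iter([True, False, True, False, False, False, False, True, True])
    actions = []
    wd = NextHopWatchdog(probe=lambda: next(results), set_wifi=actions.append)
    for _ in range(9):
        wd.tick()
    print(actions)  # [False, True]: one shutdown, one recovery, the blip ignored
```

The point of the design is that the worst the automation can do is stop
advertising the SSID, so clients roam elsewhere; a flapping upstream cannot
turn into a reboot loop.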