From: dpreed@reed.com
To: "David Lang"
Cc: "cerowrt-devel@lists.bufferbloat.net"
Date: Tue, 17 Mar 2015 16:38:47 -0400 (EDT)
Subject: Re: [Cerowrt-devel] Fwd: Dave's wishlist [was: Source-specific routing merged]
Message-ID: <1426624727.226722951@apps.rackspace.com>

I agree wholeheartedly with your point, David.

One other clarifying point (I'm not trying to be pedantic here, but it may sound that way):

Reliability is not the same as Availability. The two are quite different.

Bufferbloat is pretty much an "availability" issue, not a reliability issue. In other words, packets are not getting lost. The system is just preventing desired use.

Availability issues can be due to actual failures of components, but there are lots of availability issues that are caused (as you suggest) by attempts to focus narrowly on "loss of data" or "component failures".

When you build a system, there is a temptation to apply what is called the Fallacy of Composition (look it up on Wikipedia for a precise definition). The key thing in the Fallacy of Composition is the assumption that a property held by every component of a system must also hold for the system as a whole, or conversely (strictly speaking, the Fallacy of Division) that when a system has a property as a whole, every component of the system must have that property.

(The end-to-end argument is a specific rule that is based on a recognition of the Fallacy of Composition in one case.)

We all know that there is never a single moment when any moderately large part of the Internet does not contain failed components. Yet the Internet has *very* high availability - 24x7x365, and we don't need to know very much about what parts are failing. That's by design, of course.
And it is a design that does not derive its properties from a trivial notion of "proof of correctness", or even "bug freeness".

The relevance of a "failure" or even a "design flaw" to system availability is a matter of a much bigger perspective on what the system does, and on whether its users perceive that they can get work done.


On Tuesday, March 17, 2015 3:30pm, "David Lang" said:

> On Tue, 17 Mar 2015, Dave Taht wrote:
>
>> My quest is always for an extra "9" of reliability. Anyplace where you can
>> make something more robust (even if it is out at the .9999999999 level), I
>> tend to like to do so in order to have the highest MTBF possible in
>> combination with all the other moving parts on the spacecraft (spaceship
>> earth).
>
> There are different ways to add reliability.
>
> One is to try to make sure nothing ever fails.
>
> The second is to have a way of recovering when things go wrong.
>
>
> Bufferbloat came about because people got trapped into the first mode of
> thinking (packets should never get lost), when the right answer ended up being
> to realize that we have a recovery method and use it.
>
> Sometimes trying to make sure nothing ever fails adds a lot of complexity to the
> code to handle all the corner cases, and the overall reliability will improve by
> instead simplifying the normal flow, even if that adds a small number of failures,
> if it means that you can have a common set of recovery code that's well exercised
> and tested.
>
> As you are talking about losing packets with route changes, watch out that you
> don't fall into this trap.
>
> David Lang