From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <chromatix99@gmail.com>
Received: from mail-lf1-x129.google.com (mail-lf1-x129.google.com
 [IPv6:2a00:1450:4864:20::129])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 229F13BA8E
 for <bloat@lists.bufferbloat.net>; Tue, 27 Nov 2018 17:33:49 -0500 (EST)
Received: by mail-lf1-x129.google.com with SMTP id p17so17852031lfh.4
 for <bloat@lists.bufferbloat.net>; Tue, 27 Nov 2018 14:33:49 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:subject:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to;
 bh=X4FftR70fyQ2MB1f6NMnHkMo2jHwOHcbulY64+eAY3Q=;
 b=rk5J9hzBg8yzxYnLN6hUqdZYR7DPlsHi3Q2cvLpAkw2I9cPrLTKUzvFyovO582ICR2
 ZyJPDnAJnOdfK8OAwypn/wSkUOscFx1Zwy369Y2Bq+bp6OVUFpmSBxETci+c21gYzS5p
 cInO0rOs+8bp3TzZxCUk1JbREVRMVR682kePEpWguA3B1wfbCzk/0wmBDJUlvgcLaRak
 PDy0pYP3+4vXjG+WPMJEi15YLXfn6f3m7/Xbx5PonZnv/Np9ve+RD+U7IzGy8WXsZJee
 o+OM5VvTH62LMy/PMIt7EZLCCFKgEFhNzbNxY0A/re1SapPUFU+VuW6W7OQu/MR4A78F
 L+yQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to;
 bh=X4FftR70fyQ2MB1f6NMnHkMo2jHwOHcbulY64+eAY3Q=;
 b=Ue1eOV9wKFzWTIeOYhXt3fBItYJ16RQuDr2KTBg7OlogmMSo8vXdEmW7hZEJSKvoIt
 s5lnDcOmfimOh+ZL2FPJVRtc8mQ8qTtAnzR4vIXNm4SQlU82Yz4KqsQqvpKnMSXKsbTc
 dJsbwFeTeZ1UQToFUn0/kAWBxap1kJIEbHhWuv3BFSTBUfY2vdufQv4acX0LLZ5IKxZ2
 JhzX8BATw2CiqLBnd28KNwg7GHyfJI7Bw8ckz8Q/FVYcz8u/ikcIW5PvOkvqcwxJI43+
 0MhjiGI0RmoKts1SCuKlo+01gHmcUy4GgtmLz6864ao2jF/kEd5oy1rzfzveMsIxVRLz
 fSJg==
X-Gm-Message-State: AGRZ1gLsJ+sz1vbT/LZhkETJV226OVBpw3UmqE3Y9C35GHagT4o8fLAM
 5b6cCpCbhUl5c0SFGdcpf5o=
X-Google-Smtp-Source: AJdET5fllt9TzNr0G9//HPDhTAjpCd+8+j9UfzQ5WkhD8N9R+bkVjL8KzdBZLIvvwr33QlLCkyqHUg==
X-Received: by 2002:a19:9fcd:: with SMTP id i196mr19601005lfe.82.1543358027894; 
 Tue, 27 Nov 2018 14:33:47 -0800 (PST)
Received: from jonathartonsmbp.lan (83-245-236-220-nat-p.elisa-mobile.fi.
 [83.245.236.220])
 by smtp.gmail.com with ESMTPSA id j126sm887757lfe.10.2018.11.27.14.33.46
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 27 Nov 2018 14:33:47 -0800 (PST)
Content-Type: text/plain;
	charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 11.5 \(3445.9.1\))
From: Jonathan Morton <chromatix99@gmail.com>
In-Reply-To: <C1068E9D-CB97-48CC-BC97-3319E95A3815@heistp.net>
Date: Wed, 28 Nov 2018 00:33:45 +0200
Cc: Dave Taht <dave.taht@gmail.com>,
 bloat <bloat@lists.bufferbloat.net>
Content-Transfer-Encoding: quoted-printable
Message-Id: <2DAA554D-04AD-4C63-ADE8-338BA68C9F16@gmail.com>
References: <CAA93jw7bPub0G7kAyrWEy+YQE5SGkAoOk49_QA-C_+MXa4a2gA@mail.gmail.com>
 <6C1479A8-43E8-4F89-BCEA-1D28CA3E8589@heistp.net> <87r2fbzrng.fsf@taht.net>
 <4FB37CD5-0DAB-479E-8C8C-671D442D668E@akamai.com>
 <20181127103114.3f403d8a@xeon-e3>
 <CAA93jw5P=WTKjTeG8usPJs4DdRO2juQZ-Pk9sEcJtVv032vy4w@mail.gmail.com>
 <C1068E9D-CB97-48CC-BC97-3319E95A3815@heistp.net>
To: Pete Heist <pete@heistp.net>
X-Mailer: Apple Mail (2.3445.9.1)
Subject: Re: [Bloat] one benefit of turning off shaping + fq_codel
X-BeenThere: bloat@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General list for discussing Bufferbloat <bloat.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/bloat>
List-Post: <mailto:bloat@lists.bufferbloat.net>
List-Help: <mailto:bloat-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/bloat>,
 <mailto:bloat-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Tue, 27 Nov 2018 22:33:49 -0000

>> I wish I knew of a mailing list where I could get a definitive answer
>> on "modern problems with async circuits", or an update on the kind of
>> techniques the new AI chips were using to keep their power consumption
>> so low. I'll keep googling.
>
> I’d be interested in knowing this as well. This gives some examples of
> async circuits:
> https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/lect_12.pdf
>
> Page 43, “Bottom Line” mentions that asynchronous design has “some delay
> matching / overhead issues”. Apparently delay matching means getting the
> signal outputs on two separate paths to arrive at the same time(?)
> Presumably overhead refers to the 2x space on the die previously
> mentioned, for completion detection. Pages 23-25 on “data-bundling
> constraints” might also highlight some other challenges. Some more
> current material would be interesting though...

The area overhead is at least partly mitigated by the major advantage of
not having to distribute and gate a coherent clock signal across the
entire chip.  I half-remember seeing a quote that distributing the clock
represents about 30% of the area and/or power consumption of a modern
deep-sub-micron design.  This is area and power that is not directly
contributing to functionality.

Generally there are two major styles of asynchronous logic:

1: Standard combinational logic stages accompanied by self-timing
circuits with a matched delay, generally known as "bundled data".  This
style has little overhead (probably less than the clock distribution it
replaces) but requires local timing closure (the timing circuit must
have strictly *more* delay than the logic it accompanies) to ensure
correct functionality.  I suspect that achieving local timing closure is
easier than the global timing closure required by conventional
synchronous logic.

2: Dual-rail quasi-delay-insensitive (QDI) logic, in which completion is
explicitly signalled by the arrival of a result.  This almost completely
eliminates timing closure from the logic correctness equation, but the
area overhead can be substantial.  Achieving maximum performance in this
style can also be challenging, but suitable approaches do exist, e.g.:

	https://brej.org/papers/mapld.pdf

Both styles can inherently adapt timings to thermal and voltage
conditions within a design range without much explicit provisioning, and
typically have much cleaner power load and EMI characteristics than
synchronous logic.  But as you can see from the above, the downsides
typically associated with async logic tend to apply to one or the other
of the styles, not to both at once.
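
A toy model of the adaptation point for the bundled-data style, under
loudly stated assumptions: all delays and slowdown factors below are
invented for illustration, and the matched delay is assumed to slow down
by exactly the same factor as the logic it accompanies (in practice the
tracking is imperfect, which is why a margin is needed at all).  The
fixed clock period is likewise a hypothetical value for contrast.

```python
# Hypothetical nominal delays in picoseconds; not from any real design.
LOGIC_DELAY_PS = 400.0    # worst-case path through the logic stage
MATCHED_DELAY_PS = 460.0  # delay line in the self-timing circuit
CLOCK_PERIOD_PS = 600.0   # fixed period chosen up front (synchronous)

# Assumed multiplicative slowdown at each operating condition.
CORNERS = {
    "nominal": 1.0,
    "hot": 1.2,
    "low_voltage": 1.4,
    "hot_low_voltage": 1.6,
}


def bundled_margin(scale):
    """Slack between matched delay and logic delay at one condition.

    Both are built from gates on the same die, so this sketch assumes
    they slow down together by the same factor.
    """
    return (MATCHED_DELAY_PS - LOGIC_DELAY_PS) * scale


def sync_margin(scale):
    """Slack of a fixed clock period against the scaled logic delay."""
    return CLOCK_PERIOD_PS - LOGIC_DELAY_PS * scale


def report():
    """Return (condition, bundled slack, synchronous slack) rows."""
    rows = []
    for name, scale in CORNERS.items():
        rows.append((name, bundled_margin(scale), sync_margin(scale)))
    return rows


for name, bundled, sync in report():
    print(f"{name:16s} bundled {bundled:7.1f} ps   sync {sync:7.1f} ps")
```

In this model the bundled-data slack stays positive at every condition
because the timing circuit slows down along with the logic, whereas the
fixed period runs out of slack once the logic slows past what the clock
was chosen for.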

 - Jonathan Morton