From: Jonathan Morton
Date: Wed, 28 Nov 2018 00:33:45 +0200
To: Pete Heist
Cc: Dave Taht, bloat
Subject: Re: [Bloat] one benefit of turning off shaping + fq_codel
Message-Id: <2DAA554D-04AD-4C63-ADE8-338BA68C9F16@gmail.com>
References: <6C1479A8-43E8-4F89-BCEA-1D28CA3E8589@heistp.net> <87r2fbzrng.fsf@taht.net> <4FB37CD5-0DAB-479E-8C8C-671D442D668E@akamai.com> <20181127103114.3f403d8a@xeon-e3>
List-Id: General list for discussing Bufferbloat

>> I wish I knew of a mailing list where I could get a definitive answer
>> on "modern problems with async circuits", or an update on the kind of
>> techniques the new AI chips were using to keep their power consumption
>> so low. I'll keep googling.
>
> I’d be interested in knowing this as well. This gives some examples of
> async circuits:
> https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/lect_12.pdf
>
> Page 43, “Bottom Line” mentions that asynchronous design has “some
> delay matching / overhead issues”. Apparently delay matching means
> getting the signal outputs on two separate paths to arrive at the same
> time(?) Presumably overhead refers to the 2x space on the die
> previously mentioned, for completion detection.
> Pages 23-25 on “data-bundling constraints” might also highlight some
> other challenges. Some more current material would be interesting
> though...

The area overhead is at least partly mitigated by the major advantage of
not having to distribute and gate a coherent clock signal across the
entire chip. I half-remember seeing a quote that distributing the clock
represents about 30% of the area and/or power consumption of a modern
deep-sub-micron design. This is area and power that is not directly
contributing to functionality.

Generally there are two major styles of asynchronous logic:

1: Standard combinatorial logic stages accompanied by self-timing
circuits with a matched delay, generally known as "bundled data". This
style has little overhead (probably less than the clock distribution it
replaces) but requires local timing closure (the timing circuit must
have strictly *more* delay than the logic it accompanies) to assure
correct functionality. I suspect that achieving local timing closure is
easier than the global timing closure required by conventional
synchronous logic.

2: Dual-rail QDI logic, in which completion is explicitly signalled by
the arrival of a result. This almost completely eliminates timing
closure from the logic correctness equation, but the area overhead can
be substantial. Achieving maximum performance in this style can also be
challenging, but suitable approaches do exist, eg:

https://brej.org/papers/mapld.pdf

Both styles can inherently adapt timings to thermal and voltage
conditions within a design range without much explicit provisioning, and
typically have much cleaner power load and EMI characteristics than
synchronous logic. But as you can see from the above, the downsides
typically associated with async logic tend to apply to one or the other
of the styles, not to both at once.

 - Jonathan Morton
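
For anyone who wants to poke at the two styles described in points 1 and
2, here is a minimal Python sketch. It is illustrative only and not taken
from this thread or the linked papers: the function names, the tuple
encoding of a dual-rail bit and the picosecond figures are all
assumptions, and real QDI gates are built from C-elements with
hysteresis, which this purely combinational model ignores.

```python
from typing import List, Tuple

# Hypothetical model: a dual-rail bit is a pair (true_rail, false_rail).
# (0, 0) is the "null" spacer meaning no data yet; (1, 1) is illegal.
Rail = Tuple[int, int]
NULL: Rail = (0, 0)


def encode(bit: int) -> Rail:
    """Encode one Boolean value as a dual-rail pair."""
    return (1, 0) if bit else (0, 1)


def is_valid(pair: Rail) -> bool:
    """A pair carries data when exactly one of its rails is high."""
    return pair in ((1, 0), (0, 1))


def complete(word: List[Rail]) -> bool:
    """Completion detection: true once every bit of the word has arrived."""
    return all(is_valid(p) for p in word)


def and_gate(a: Rail, b: Rail) -> Rail:
    """Dual-rail AND that stays null until both inputs carry data."""
    if not (is_valid(a) and is_valid(b)):
        return NULL
    return encode(a[0] & b[0])


def bundled_data_ok(logic_delay_ps: float, matched_delay_ps: float) -> bool:
    """Bundled-data constraint: the timing path must be strictly slower."""
    return matched_delay_ps > logic_delay_ps


if __name__ == "__main__":
    word = [encode(1), NULL, encode(0)]
    print(complete(word))   # middle bit has not arrived yet
    word[1] = encode(1)
    print(complete(word))   # all bits now carry data
    print(and_gate(encode(1), encode(1)), and_gate(encode(1), NULL))
    # Made-up delays, purely to exercise the check.
    print(bundled_data_ok(logic_delay_ps=480.0, matched_delay_ps=520.0))
```

The point of the sketch is only the contrast: the dual-rail word signals
its own completion, at the cost of two wires per bit, while the
bundled-data check is a single timing inequality that has to hold
locally for each stage.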