* [Bloat] Fwd: Will Edwards to give Mill talk in Estonia on 12/10/2014
From: Dave Taht @ 2014-12-05 19:18 UTC (permalink / raw)
To: bloat, cerowrt-devel
The Mill computer is the coolest new CPU architecture I've ever heard
of, and if you are in Estonia, you should go to this talk.
If not, the various presentations on each aspect of it, given at
places like Stanford, Google and elsewhere, are all worthwhile.
http://millcomputing.com/docs/belt/
Really key to the architecture (and why it's sort of relevant to the
bloat list) is that it can do 30+ ops/cycle, AND context switch time
is *4* cycles, where things like the Haswell architecture have context
switch times in the 700-1200 cycle range. Context switch overhead is
one of the reasons why a ton of programmers and data centers running
lots of VMs are going crazy trying to find new ways to bulk up
operations (improving throughput, at the cost of latency). Getting rid
of the context switch cost (it's the same cost as a subroutine call!)
has implications throughout every other layer of the system, not just
for VMs in data centers, but for high speed embedded processors of all
kinds.
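To make the throughput-versus-latency trade concrete, here is a rough
back-of-the-envelope sketch. It assumes one context switch is paid per
batch of work items; the 200 cycles of useful work per item and the 10%
overhead budget are made-up illustration values, and the function names
are hypothetical. Only the 4-cycle and roughly 1000-cycle switch costs
come from the figures quoted above.

```python
def cycles_per_item(work_cycles, switch_cycles, batch_size):
    """Average cycles per item when one context switch is paid per batch."""
    return work_cycles + switch_cycles / batch_size


def batch_needed(work_cycles, switch_cycles, max_overhead_percent):
    """Smallest batch size that keeps switch overhead within the budget.

    Overhead is switch_cycles / (batch_size * work_cycles), expressed as a
    percentage of useful work. Integer ceiling division avoids float noise.
    """
    numerator = switch_cycles * 100
    denominator = work_cycles * max_overhead_percent
    return max(1, -(-numerator // denominator))


WORK = 200  # assumed cycles of useful work per item (illustrative only)

# With a ~1000-cycle switch, items must be batched to hide the cost,
# and every item in the batch waits for the batch to fill.
slow_batch = batch_needed(WORK, 1000, 10)

# With a 4-cycle switch, each item can be handled on its own.
fast_batch = batch_needed(WORK, 4, 10)

print(slow_batch, fast_batch)
print(cycles_per_item(WORK, 1000, 1), cycles_per_item(WORK, 4, 1))
```

Under these assumptions the expensive switch forces batching to stay
within the overhead budget, while the cheap switch does not, which is
the latency argument in a nutshell.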
I spent a couple of weekends going through the presentations and wiki,
scribbling down some pseudocode for what I thought would be "hard"
cases to actually implement (like a Java stack machine), and in every
case, the arch was a win. Furthermore, the presentations were so well
constructed that every time I had an objection, it was answered a
slide or two later.
If "the belt" can be built (and I don't see any reason why not, at
this point), wow, it's a game changer.
If I had money to invest, I would. I look forward to the ISA and
compilers being published soon.
---------- Forwarded message ----------
From: Mill Computing List Manager <listmanager@millcomputing.com>
Date: Fri, Dec 5, 2014 at 10:45 AM
Subject: Will Edwards to give Mill talk in Estonia on 12/10/2014
To: dave <dave.taht@gmail.com>
** Will Edwards from Mill Computing
will be giving a talk on a revolutionary computer architecture
------------------------------------------------------------
Mill Computing
------------------------------------------------------------
You have a 50-year-old IBM mainframe in your cell phone
Will Edwards from Mill Computing will be giving a talk on a
revolutionary computer architecture.
When: 14:00-15:00 with discussion afterwards, 10th December 2014
Where: ICT building room ICT-507AB, Tallinn Tech, Akadeemia tee 15a,
Tallinn, Estonia.
------------------------------------------------------------
Every architectural part of current CPUs was present in the System/360
Model 91 in 1967 – caches, out-of-order execution, large register
files, byte addressing, even hexadecimal. All the advances of the last
50 years have been in the fabrication process – how CPUs get made, not
how they work. Isn’t it about time to bring the architecture up to
date too?
This talk introduces the new Mill CPU architecture which brings
DSP-like efficiency and performance to general purpose computing.
Offering a 10x power/performance gain over conventional out-of-order
superscalar architectures, the Mill family of CPUs scales from phones
to supercomputers.
The Mill is an extremely wide-issue VLIW design, able to issue 30+
MIMD operations per cycle. The Mill is inherently a vector machine
and can vectorize and pipeline almost all loops in general purpose
code. The Mill is a belt machine (as distinct from a stack or register
machine) and has a fine grained security model that facilitates
microkernels without performance penalties.
This talk will give a high-level introduction to the Mill programming
model, with an opportunity for the audience to ask more detailed
questions in areas of interest.
Will is a technical member of the Mill CPU team.
Videos and other material about other aspects of the Mill can be found
at http://millcomputing.com/docs.
============================================================
Copyright © 2014 Mill Computing, All rights reserved.
--
Dave Täht
http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
* Re: [Bloat] [Cerowrt-devel] Fwd: Will Edwards to give Mill talk in Estonia on 12/10/2014
From: Dave Taht @ 2014-12-05 20:15 UTC (permalink / raw)
To: Valdis Kletnieks; +Cc: cerowrt-devel, bloat
On Fri, Dec 5, 2014 at 11:33 AM, <Valdis.Kletnieks@vt.edu> wrote:
> On Fri, 05 Dec 2014 11:18:57 -0800, Dave Taht said:
>> The Mill is an extremely wide-issue VLIW design, able to issue 30+
>> MIMD operations per cycle. The Mill is inherently a vector machine
>> and can vectorize and pipeline almost all loops in general purpose
>> code.
>
> The big question is whether we know more about writing compilers for VLIW
> machines than we did when the Itanium came out. That was hard enough to
> get just 3 instructions packed per word (of course, the fact that it wasn't
> 3 generic instructions, but 2 of one flavor and 1 of another, didn't help).
Well, in this case half the instructions are one flavor and the other
half another.
But it's the belt concept in the "mill" that is key. Basically, having
tons and tons of fixed addressable registers doesn't work well (as in
the Itanium, SPARC, and other arches) for a variety of reasons...
Taking a classic smaller register set, such as in the x86_64, and
trying to add all these superscalar and out-of-order features to it
has hit a brick wall...
... and the best we see in ARM and MIPS (with way more registers) is
typically two out-of-order ops, total.
Stack machines overly serialize operations and tend to bottleneck
on local cache (see the transputer T800 for the last decent example).
Aside from a bunch of genuinely weirder architectures (see for example
the Propeller, or Dave May's XCore stuff, or the Parallella),
the Mill's "belt" idea - temporal register addressing - is the first
new idea I've seen in CPU design for a very, very long time. (Perhaps
it was tried in some other architecture?)
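For readers who have not seen the talks, a toy model may help show what
"temporal register addressing" means. This is only a sketch of the
general idea as I understand it from the public presentations, not the
Mill ISA: the class name, the method names, and the belt length of 8 are
all assumptions chosen for illustration. Results are pushed onto the
front of a fixed-length queue, operands are named by how recently they
were produced, and the oldest value silently falls off the end.

```python
from collections import deque


class Belt:
    """Toy fixed-length belt (hypothetical model, not the real Mill).

    Position 0 is the most recently produced value, position 1 the one
    before it, and so on. There are no named destination registers.
    """

    def __init__(self, length=8):
        # A deque with maxlen drops the oldest entry when a new one is pushed.
        self._slots = deque(maxlen=length)

    def push(self, value):
        """Drop a new result onto the front of the belt."""
        self._slots.appendleft(value)

    def get(self, position):
        """Read an operand by its age on the belt (0 = newest)."""
        return self._slots[position]

    def add(self, a, b):
        """Add the values at belt positions a and b; push the result."""
        self.push(self.get(a) + self.get(b))

    def mul(self, a, b):
        """Multiply the values at belt positions a and b; push the result."""
        self.push(self.get(a) * self.get(b))


# Compute (2 + 3) * 4 using only belt positions as operand names.
belt = Belt()
belt.push(2)
belt.push(3)
belt.push(4)      # belt is now: 4, 3, 2
belt.add(1, 2)    # 3 + 2; belt is now: 5, 4, 3, 2
belt.mul(0, 1)    # 5 * 4; the 4 has moved from position 0 to position 1
result = belt.get(0)
print(result)
```

Note that the value 4 is addressed at position 1 in the multiply even
though it was at position 0 when it was produced; the name of a value
changes as newer results arrive, which is the difference from a fixed
register file.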
Even if the mill can't get to 32 ops/cycle generally (and some of those ops
are overhead in maintaining the belt, but not as much as you might
think), I do think it can get to quite a few, even in branchy code,
and the lower end versions of the arch are comparable in ops/cycle
to the best we can do today with computers running at much faster
basic clock rates.
And context switch/subroutine call overhead! 4 cycles. Wow. :)
I certainly have quibbles with the presos I've read so far: edge cases
like floating point ops, and other features that seem nice to have but
not critical to the core architecture...
but I long for an FPGA version, at least, to play with. I've spent a lot
of time trying to come up with a microarchitecture that could do
fq_codel at 10GigE+ speeds (prototyping in the Parallella's FPGA),
and kept dreaming of something like the "Propeller" at a really
high clock rate...
... then I stumbled over this. Sure, it's years out, but, like wow.
Well worth an initial hour to read/think/watch about.
--
Dave Täht
http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks
* Re: [Bloat] [Cerowrt-devel] Fwd: Will Edwards to give Mill talk in Estonia on 12/10/2014
From: Jonathan Morton @ 2014-12-10 6:13 UTC (permalink / raw)
To: Dave Taht; +Cc: Valdis Kletnieks, cerowrt-devel, bloat
Watching all the talks takes more than just an hour, but I've just spent a
couple of days doing so. This is certainly intriguing. At first glance it
all looks too good to be true, but the technical details actually look
plausible and elegant.
One way to look at it is: Mill is what Itanic could have been, if it
hadn't been designed by committee. It neatly sidesteps a lot of the
pitfalls that tend
to go with VLIW designs, and therefore has a chance of actually working as
advertised, unlike Itanic.
The only thing that I'm really not convinced about is the way they produce
a custom instruction encoding for each member of the architecture family.
However, that does make the architecture scalable, unlike Itanic.
It's possible that some of the more useful ideas it presents could have
been incorporated into a conventional RISC architecture. I might play with
that idea privately.
Estonia is even quite close to where I am, just the opposite side of the
Gulf of Finland, so I'm almost tempted to go and take a look. On the other
hand, it's probably just as educational to watch the video at home
afterwards, and doing so avoids a scheduling conflict.
- Jonathan Morton