Nice resource, thanks.

If anyone wonders why things look the way they do, it's all about 
on-die versus off-die memory. On-die memory is usually SRAM, which needs 
6 transistors per bit, so spending half a billion transistors gives you 
a ~10MB buffer on-die. If you go off-die (DRAM or similar), you get the 
gigabytes of memory seen in some equipment. There is basically nothing in 
between: as soon as you go off-die you might as well put at least 2-6 GB 
in there.
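
For anyone who wants to check the ~10MB figure, here is a rough back-of-the-envelope sketch. It assumes a plain 6-transistor SRAM cell and ignores decoders, sense amplifiers and other overhead, so real parts will come out somewhat smaller. The function name and the budget value are just for illustration.

```python
# Rough estimate only: assumes a classic 6T SRAM cell and no array overhead.
TRANSISTORS_PER_BIT = 6


def sram_bytes(transistor_budget: int,
               transistors_per_bit: int = TRANSISTORS_PER_BIT) -> int:
    """Return how many whole bytes of SRAM fit in the given transistor budget."""
    bits = transistor_budget // transistors_per_bit
    return bits // 8


budget = 500_000_000  # "half a billion", as in the post above
size = sram_bytes(budget)
print(f"{size / 1e6:.1f} MB")  # prints "10.4 MB"
```

Doubling the budget only doubles the buffer, which is why on-die buffers stay in the tens of megabytes while off-die DRAM jumps straight to gigabytes.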

There is some research on new memory devices with unexpected results...
https://ieeexplore.ieee.org/document/8533260

The HMC memory allows improvements in execution time and consumed energy. In some situations, this memory type permits removing the L2 cache from the memory hierarchy.

HMC parts start at 2GB