CTO Israel Beinglass over at Monolithic 3D has published a thought-provoking blog post on the company site titled “The most expensive SRAM in the world.” Beinglass deconstructs a 45nm Intel Penryn processor, saying that 50% of the die is SRAM and that, considering the price of the chip, people are therefore buying SRAM at a very dear price indeed. The proper question, the one that Beinglass is trying to address in his post, is whether this is the best and least expensive way to provide the on-chip processor with the SRAM it needs for its L1 and L2 caches. Considering that Beinglass is the CTO of Monolithic 3D, it’s pretty clear where he’s going with this question. He’s going vertical.
If there were a way to provide fast (really fast) SRAM to the processor in a layer above the processor, then the die would be smaller, and smaller die accrue all sorts of economic advantages. That’s the premise of Beinglass’ blog post. The key assumption here, and it’s a big assumption, is that the SRAM provided in the upper layer will be fast enough to meet the processor’s needs. That means that the SRAM cells must be implemented in a technology fast enough to meet the very fast access-speed requirements of L1 and L2 cache, and it also means that the associated drivers and interconnect must not add too much delay between the processor’s cache port and the SRAM. These are both huge requirements.
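To make the economic side of that premise concrete, here is a minimal sketch of how die area affects the number of good die per wafer. It uses a common textbook approximation for gross die per wafer and a simple Poisson yield model. The 300 mm wafer, the 100 mm² and 50 mm² die areas, and the defect density are illustrative assumptions, not figures from Beinglass’ post, and the sketch deliberately ignores the cost of fabricating the extra SRAM layer.

```python
import math


def gross_die_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Textbook approximation: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Simple Poisson yield model: fraction of die with zero killer defects."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


def good_die_per_wafer(
    wafer_diameter_mm: float, die_area_mm2: float, defects_per_mm2: float
) -> float:
    """Expected number of working die per wafer under the two models above."""
    gross = gross_die_per_wafer(wafer_diameter_mm, die_area_mm2)
    return gross * poisson_yield(die_area_mm2, defects_per_mm2)


# Assumed, illustrative inputs (not taken from the blog post).
WAFER_MM = 300.0
DEFECTS_PER_MM2 = 0.001

for area in (100.0, 50.0):  # planar die vs. a die with half its area moved upward
    good = good_die_per_wafer(WAFER_MM, area, DEFECTS_PER_MM2)
    print(f"{area:5.0f} mm^2 die: about {good:.0f} good die per wafer")
```

Under these assumptions, halving the die area more than doubles the count of good die, because both the edge loss and the defect exposure per die shrink. Whether that gain survives the cost of the second active layer is exactly the open question.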
Adding an active layer above the processor core requires some sort of 3D fabrication and there are two alternatives here. One alternative is a separate SRAM die designed to be bonded to the processor using micro-bumps and compression bonding. This 3D method has a lot of momentum in the industry but not necessarily for something as critically fast as SRAM destined to be used as L1 and L2 cache. There’s far more interest in using this technique for bulk DRAM and NAND Flash die. The other approach, the one advocated by Monolithic 3D, involves building additional active layers above those already laid down using techniques developed to create SOI (silicon on insulator) wafers.
Whichever 3D process is chosen, it’s possible that the interconnect could actually be faster than the planar approach of putting large SRAM on the same die as the processor. Interconnect that must stretch a quarter or halfway across a large chip to connect a cache port to an SRAM might travel a much shorter distance if the direction of the connection is straight up. The shorter connection can be faster, given the right circumstances.
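The length argument can be sketched with the standard first-order estimate for an unrepeated, distributed RC wire, whose 50% delay is roughly 0.38 times the wire's total resistance times its total capacitance, so it grows with the square of the length. The per-millimeter resistance and capacitance below, and the two wire lengths, are placeholder assumptions chosen only to show the scaling. The sketch does not model the vertical connection's own parasitics, drivers, or repeaters, which is where "the right circumstances" come in.

```python
def distributed_rc_delay_s(
    length_mm: float, r_ohm_per_mm: float, c_farad_per_mm: float
) -> float:
    """Approximate 50% delay of an unrepeated distributed RC wire.

    Total R and total C each scale with length, so delay scales with length^2.
    """
    total_r = r_ohm_per_mm * length_mm
    total_c = c_farad_per_mm * length_mm
    return 0.38 * total_r * total_c


# Assumed, illustrative wire parameters (not measured values).
R_OHM_PER_MM = 1000.0
C_FARAD_PER_MM = 0.2e-12

planar_mm = 5.0    # hypothetical route partway across a large die
vertical_mm = 0.5  # hypothetical route when the SRAM sits directly above

planar = distributed_rc_delay_s(planar_mm, R_OHM_PER_MM, C_FARAD_PER_MM)
vertical = distributed_rc_delay_s(vertical_mm, R_OHM_PER_MM, C_FARAD_PER_MM)
print(f"planar:   {planar * 1e12:.1f} ps")
print(f"vertical: {vertical * 1e12:.1f} ps")
print(f"ratio:    {planar / vertical:.0f}x")
```

In this simplified model a wire one-tenth as long has about one-hundredth the delay. Real designs insert repeaters on long wires, which makes delay closer to linear in length, so the actual advantage would be smaller than this sketch suggests.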
One thing is certain. The fabrication approach that meets performance requirements while enabling the lowest system cost is the approach that will eventually win.