3D Thursday: The most expensive RAM in the world

CTO Israel Beinglass over at Monolithic 3D has published a thought-provoking blog on the company site titled “The most expensive SRAM in the world.” Beinglass deconstructs a 45nm Intel Penryn processor, saying that 50% of the die is SRAM and that people are therefore buying SRAM at a very dear price indeed considering the price of the chip. The proper question, the one that Beinglass is trying to address in his blog, is whether this is the best and least expensive way to provide the on-chip processor with the SRAM it needs for its L1 and L2 caches. Considering that Beinglass is the CTO of Monolithic 3D, it’s pretty clear where he’s going with this question. He’s going vertical.

If there were a way to provide fast (really fast) SRAM to the processor in a layer above the processor, then the die would be smaller, and smaller die accrue all sorts of economic advantages. That’s the premise of Beinglass’ blog post. The key assumption here, and it’s a big assumption, is that the SRAM provided in the upper layer will be fast enough to meet the processor’s needs. That means the SRAM cells must be implemented in a technology fast enough to meet the very fast access-speed requirements of L1 and L2 cache, and it also means that the associated drivers and interconnect must not add too much delay between the processor’s cache port and the SRAM. These are both huge requirements.
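To make the "smaller die are cheaper" argument concrete, here is a minimal sketch using two textbook first-order models: a common dies-per-wafer approximation and a Poisson yield model. Every number in it (the 100 mm² die, the wafer cost, the defect density) is a placeholder chosen for illustration, not a figure from Beinglass' post or from Intel, and the function names are my own. It also ignores the cost of building the SRAM layer itself and of the 3D process, which any real comparison would have to add back in.

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """First-order estimate of gross dies per wafer, with an edge-loss term."""
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(gross - edge_loss)

def poisson_yield(die_area_mm2, defects_per_mm2):
    """Poisson yield model: fraction of dies with zero killer defects."""
    return math.exp(-die_area_mm2 * defects_per_mm2)

def cost_per_good_die(die_area_mm2, wafer_cost, defects_per_mm2,
                      wafer_diameter_mm=300.0):
    """Wafer cost spread over the dies expected to work."""
    good_dies = (dies_per_wafer(die_area_mm2, wafer_diameter_mm)
                 * poisson_yield(die_area_mm2, defects_per_mm2))
    return wafer_cost / good_dies

# Illustrative placeholders only (assumptions, not measured data).
WAFER_COST = 5000.0        # arbitrary currency units per wafer
DEFECT_DENSITY = 0.002     # defects per mm^2
PLANAR_AREA = 100.0        # hypothetical die, half logic and half SRAM
LOGIC_ONLY_AREA = 50.0     # same die with the SRAM moved to another layer

planar = cost_per_good_die(PLANAR_AREA, WAFER_COST, DEFECT_DENSITY)
logic_only = cost_per_good_die(LOGIC_ONLY_AREA, WAFER_COST, DEFECT_DENSITY)
print(f"planar die:     {planar:.2f} per good die")
print(f"logic-only die: {logic_only:.2f} per good die")
```

Under these assumptions, halving the area more than halves the cost per good die, because the smaller die both packs the wafer better and yields better. Whether that saving survives the added cost of the second layer is exactly the open question.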

Adding an active layer above the processor core requires some sort of 3D fabrication and there are two alternatives here. One alternative is a separate SRAM die designed to be bonded to the processor using micro-bumps and compression bonding. This 3D method has a lot of momentum in the industry but not necessarily for something as critically fast as SRAM destined to be used as L1 and L2 cache. There’s far more interest in using this technique for bulk DRAM and NAND Flash die. The other approach, the one advocated by Monolithic 3D, involves building additional active layers above those already laid down using techniques developed to create SOI (silicon on insulator) wafers.

Whichever 3D process is chosen, it’s possible that the interconnect could actually be faster than the planar approach of putting large SRAM on the same die as the processor. Interconnect that must stretch a quarter or half way across a large chip to connect a cache port to an SRAM might travel a much shorter distance if the direction of the connection is straight up. The shorter connection can be faster, given the right circumstances.
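A rough way to see why length matters so much: for an unrepeated wire modeled as a distributed RC line, the 50% delay is approximately 0.38 times the total resistance times the total capacitance, so delay grows with the square of the length. The sketch below applies that rule of thumb. The per-millimeter resistance and capacitance and both wire lengths are made-up placeholders, and the model leaves out the drivers, repeaters, and the parasitics of micro-bumps or vertical vias, which are among the "right circumstances" that would have to hold.

```python
def distributed_rc_delay_ps(length_mm, r_ohm_per_mm, c_ff_per_mm):
    """Approximate 50% delay of an unrepeated distributed RC wire.

    Uses the rule of thumb t ~= 0.38 * R_total * C_total.
    """
    r_total_ohm = r_ohm_per_mm * length_mm
    c_total_farad = c_ff_per_mm * length_mm * 1e-15
    return 0.38 * r_total_ohm * c_total_farad * 1e12  # seconds -> picoseconds

# Illustrative placeholders only (assumptions, not process data).
R_PER_MM = 1000.0   # ohms per mm of wire
C_PER_MM = 200.0    # femtofarads per mm of wire
PLANAR_MM = 5.0     # hypothetical run across part of a large die
VERTICAL_MM = 0.5   # hypothetical much shorter route when going up

planar_ps = distributed_rc_delay_ps(PLANAR_MM, R_PER_MM, C_PER_MM)
vertical_ps = distributed_rc_delay_ps(VERTICAL_MM, R_PER_MM, C_PER_MM)
print(f"planar route:   {planar_ps:.1f} ps")
print(f"vertical route: {vertical_ps:.1f} ps")
```

In this simplified model a tenfold shorter wire is a hundredfold faster. Real cache interconnect uses repeaters, which makes delay closer to linear in length, so treat the quadratic result as an upper bound on the benefit rather than a prediction.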

One thing is certain. The fabrication approach that meets performance requirements while enabling the lowest system cost is the approach that will eventually win.

About sleibson2

Principal Analyst Emeritus, Tirias Research

3 Responses to 3D Thursday: The most expensive RAM in the world

  1. David Chapman says:

    This is not nearly so much a technical issue as it is a human behavior issue. No program manager at Intel is going to be willing to trust the success of his project to an outside supplier, so while a die stack may be the better solution, you can be sure that nobody but Intel will be trusted with the SRAM die. The SRAM die will be manufactured by Intel, no matter what. It follows that the stacked solution will not be chosen until it is better from an internal Intel point of view. As long as it is cheaper and easier to put the SRAM on-chip, that is where it will stay…at least until Intel program managers develop the capacity to trust SRAM development teams outside their direct control. I find it interesting that we are at a point where a small investment in improving human interaction could produce more cost advantage in the processor suite than a big investment in hardware technology. Might Intel hire some psychologists to help them make their next breakthrough?

  2. sleibson2 says:

    David, I think this is a rather narrower interpretation and analysis of the blog post than I was expecting. The point of the post wasn’t supposed to be about Intel. It was supposed to be about all on-chip processors along with their L1 and L2 caches. Nevertheless, I recall that Intel’s first architectural improvement to the Pentium, the Pentium Pro, needed separate processor and SRAM die because the process technology could not put both on one piece of silicon at the time. Intel took a lot of flak for that approach. At the next process step, it did become possible to combine both and the Pentium Pro’s P6 microarchitecture became very successful. So Intel has in the past shown its willingness to use two die when needed. I’m not about to predict whether this will happen again or not.

  3. When I wrote the blog I showed an Intel uP just as an example, but it could be any uP rich with SRAM.
    On the other hand, I don’t see any issue, in the case of Intel for example, with them running the SRAM and doing the vertical integration themselves. It will be better that way since they know the design rules, the process, etc., and can do the integration better than by buying the SRAM from a third party.
