Discussion:
DRAM Chiplet for L3 cache?
Stephen Fuld
2025-01-27 16:55:35 UTC
One of the advantages of using chiplets instead of a large monolithic
chip is that you can use functionality made with different foundry
technologies.

This brings up the question of why, at least so far, no one is using a
DRAM chiplet (i.e. one made with a DRAM specialized technology), for the
L3 cache. ISTM that the advantage of being able to put a much higher
capacity cache in the same physical size chiplet is substantial.
--
- Stephen Fuld
(e-mail address disguised to prevent spam)
Scott Lurndal
2025-01-27 17:07:30 UTC
Post by Stephen Fuld
One of the advantages of using chiplets instead of a large monolithic
chip is that you can use functionality made with different foundry
technologies.
This brings up the question of why, at least so far, no one is using a
DRAM chiplet (i.e. one made with a DRAM specialized technology), for the
L3 cache. ISTM that the advantage of being able to put a much higher
capacity cache in the same physical size chiplet is substantial.
That comes at an additional development cost for the link training, a
power cost to support refresh, and a performance cost from the
additional latency.
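One rough way to weigh the latency cost against the capacity gain is the textbook average memory access time (AMAT) formula. The sketch below is only an illustration: every number in it (hit latencies, miss rates, main-memory penalty) is a made-up assumption rather than a measurement from any real part, the function names are hypothetical, and refresh power and link-training cost are not modeled at all.

```python
def amat_ns(hit_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit latency plus expected miss cost."""
    return hit_ns + miss_rate * miss_penalty_ns


def break_even_miss_rate(target_amat_ns, hit_ns, miss_penalty_ns):
    """Miss rate at which a cache with this hit latency matches target_amat_ns."""
    return (target_amat_ns - hit_ns) / miss_penalty_ns


# All values below are hypothetical, chosen only to show the shape of the trade-off.
MAIN_MEMORY_NS = 80.0

# Smaller, faster SRAM L3 versus larger, slower DRAM-chiplet L3.
sram_l3 = amat_ns(hit_ns=10.0, miss_rate=0.30, miss_penalty_ns=MAIN_MEMORY_NS)
dram_l3 = amat_ns(hit_ns=30.0, miss_rate=0.15, miss_penalty_ns=MAIN_MEMORY_NS)

# Miss rate the DRAM L3 would need in order to match the SRAM L3.
needed = break_even_miss_rate(sram_l3, hit_ns=30.0, miss_penalty_ns=MAIN_MEMORY_NS)

print(f"SRAM L3 AMAT: {sram_l3:.1f} ns")
print(f"DRAM L3 AMAT: {dram_l3:.1f} ns")
print(f"DRAM L3 break-even miss rate: {needed:.3f}")
```

With these assumed numbers the larger cache halves the miss rate and still comes out slower, because its higher hit latency is paid on every access that reaches it. Different assumptions can flip the result, which is why the real answer depends on measured latencies and workloads.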
Anton Ertl
2025-01-27 17:18:29 UTC
Post by Stephen Fuld
This brings up the question of why, at least so far, no one is using a
DRAM chiplet (i.e. one made with a DRAM specialized technology), for the
L3 cache. ISTM that the advantage of being able to put a much higher
capacity cache in the same physical size chiplet is substantial.
There used to be eDRAM used for an L4 cache ("Crystal Well") in some
Intel Broadwell and Skylake variants, as well as eDRAM used as L3
cache on Power8. There is an insightful article on Crystal Well (as
well as a little bit about Power8):
<https://old.chipsandcheese.com/2024/11/01/broadwells-edram-vcache-before-vcache-was-cool/>,
which also provides an explanation why this technology is no longer
used.

In a recent article
<https://old.chipsandcheese.com/2025/01/18/inside-the-amd-radeon-instinct-mi300as-giant-memory-subsystem/>
they look at how a memory-side SRAM cache performs in the MI300A. For
CPUs you really want to have the cache on the core side.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
Stephen Fuld
2025-01-27 19:14:46 UTC
Post by Anton Ertl
Post by Stephen Fuld
This brings up the question of why, at least so far, no one is using a
DRAM chiplet (i.e. one made with a DRAM specialized technology), for the
L3 cache. ISTM that the advantage of being able to put a much higher
capacity cache in the same physical size chiplet is substantial.
There used to be eDRAM used for an L4 cache ("Crystal Well") in some
Intel Broadwell and Skylake variants, as well as eDRAM used as L3
cache on Power8. There is an insightful article on Crystal Well (as
well as a little bit about Power8):
<https://old.chipsandcheese.com/2024/11/01/broadwells-edram-vcache-before-vcache-was-cool/>,
which also provides an explanation why this technology is no longer
used.
Thank you Anton. You're right, that article is excellent. I knew about
the eDRAM used in some Power systems, but not Intel's use. The article
explains the issues very well. It seems like one of those things that
sounds good at first, but as you get into the details, the problems
become more evident, and the article illustrates that very well.
--
- Stephen Fuld
(e-mail address disguised to prevent spam)
Michael S
2025-01-27 21:48:57 UTC
On Mon, 27 Jan 2025 17:18:29 GMT
Post by Anton Ertl
Post by Stephen Fuld
This brings up the question of why, at least so far, no one is using
a DRAM chiplet (i.e. one made with a DRAM specialized technology),
for the L3 cache. ISTM that the advantage of being able to put a
much higher capacity cache in the same physical size chiplet is
substantial.
There used to be eDRAM used for an L4 cache ("Crystal Well") in some
Intel Broadwell and Skylake variants, as well as eDRAM used as L3
cache on Power8.
In Power7/8/9 the eDRAM is part of the processor die, so it is not
quite the same as the OP's suggestion.

In some of IBM's mainframe processors eDRAM is present both as on-die
L2 and L3 caches and as an on-package L4 cache. I don't remember the
exact model number of that CPU.
Post by Anton Ertl
There is an insightful article on Crystal Well (as well as a little bit about Power8):
<https://old.chipsandcheese.com/2024/11/01/broadwells-edram-vcache-before-vcache-was-cool/>,
which also provides an explanation why this technology is no longer
used.
In a recent article
<https://old.chipsandcheese.com/2025/01/18/inside-the-amd-radeon-instinct-mi300as-giant-memory-subsystem/>
they look at how a memory-side SRAM cache performs in the MI300A. For
CPUs you really want to have the cache on the core side.
- anton
Stephen Fuld
2025-01-28 19:17:45 UTC
Post by Michael S
On Mon, 27 Jan 2025 17:18:29 GMT
Post by Anton Ertl
Post by Stephen Fuld
This brings up the question of why, at least so far, no one is using
a DRAM chiplet (i.e. one made with a DRAM specialized technology),
for the L3 cache. ISTM that the advantage of being able to put a
much higher capacity cache in the same physical size chiplet is
substantial.
There used to be eDRAM used for an L4 cache ("Crystal Well") in some
Intel Broadwell and Skylake variants, as well as eDRAM used as L3
cache on Power8.
In Power7/8/9 the eDRAM is part of the processor die, so it is not
quite the same as the OP's suggestion.
True, but as the OP, I'll give him some slack. I probably wasn't clear. I
had originally meant using a chiplet manufactured on a DRAM
process in order to reduce cost per bit, perhaps by using actual
existing, though possibly slightly modified, commodity DRAM dies, not
eDRAM. The article does a good job of explaining why this isn't a good
idea.
--
- Stephen Fuld
(e-mail address disguised to prevent spam)
MitchAlsup1
2025-01-29 17:08:26 UTC
There were at least 3 times when I wanted to use DRAM as cache.

Even refreshing a row every other cycle was insufficient for
any of the test engineers to sign off on being able to properly
test DRAM on a chip containing CPU cores.

I had even gone to the point of designing* and laying out the
DRAM cells, word line drivers, sense amplifiers, and bit-line
prechargers.

(*) SPICE.
