Post by Vikas Mishra Post by Patrick Schaaf
Not necessarily, I would think, except for direct mapped caches.
Way selection is done with an associative tag structure, and I
don't see any reason why that would be more efficient when done
with a power-of-two number of ways. So you could have a 256k*7-way
cache, for a total of 1.75 MB. Some googling finds this page,
Patrick, thanks. So is there a specific restriction that you are aware of for
a Direct Mapped Cache? Actually I should have been more specific in my
question - I was interested more in a direct mapped cache. My apologies for
1) Maybe someone has implemented a non-power-of-2 direct-mapped cache,
but I've never heard of one. I've seen designs where there was some
simple [low-gate-delay] hashing into a power-of-2 cache, but that was
for performance (avoidance of hot spots, especially in virtual caches),
not to allow a non-power-of-2 cache. Addressing bits cost, whether for
registers, cachelines, or memory addresses, so people try not to waste
Of course, I can't think of any real designer who would waste a second
of thought about putting a DIV/MOD in the address path. [Google
comp.arch: mashey powers 2 1994].
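
To make the cost difference concrete, here is a minimal sketch of the three
index computations mentioned above. The line size, the function names, and the
particular XOR fold are illustrative assumptions, not taken from any real
design; in hardware the power-of-2 case is just wire selection, while the
general case needs a divider in the address path.

```python
# Illustrative sketch only: LINE_BYTES and all function names are assumptions.
LINE_BYTES = 32  # assumed cache line size in bytes (a power of 2)
OFFSET_BITS = LINE_BYTES.bit_length() - 1


def index_pow2(addr: int, num_lines: int) -> int:
    """Direct-mapped index for a power-of-2 line count: shift and mask."""
    assert num_lines & (num_lines - 1) == 0, "num_lines must be a power of 2"
    return (addr >> OFFSET_BITS) & (num_lines - 1)


def index_any(addr: int, num_lines: int) -> int:
    """Direct-mapped index for an arbitrary line count: needs a modulo."""
    return (addr // LINE_BYTES) % num_lines


def index_hashed(addr: int, num_lines: int) -> int:
    """XOR-fold upper address bits into a power-of-2 index to spread hot spots.

    This is one possible low-gate-delay hash, chosen here for illustration.
    """
    assert num_lines & (num_lines - 1) == 0, "num_lines must be a power of 2"
    index_bits = num_lines.bit_length() - 1
    line = addr >> OFFSET_BITS
    return (line ^ (line >> index_bits)) & (num_lines - 1)
```

For a power-of-2 line count, index_pow2 and index_any agree; only index_any
works for, say, 1000 lines, and it pays for that with the modulo.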
2) Somebody said, it doesn't really cost much to have non-power-of-2
set-associative caches, and there is a very good reason to do this
In real-world chip design, things like floorplans, wire lengths, gate
delays, design costs, time-to-market, etc, etc, actually matter.
Suppose you are considering a design with the usual collection of
units, and the data-cache is 2-way. You may well have filled the die.
On the other hand, you might be pad/bump limited, i.e., in order to get
the number of power/grounds and signals you need, there is a minimum
size for the chip, and there may be extra space left. What do you do
with the space?
Suppose there is not enough space to double the cache size, but there
is enough space to do 3-way. That can be a very attractive
alternative, as it is much easier to design than, say, redesigning the
pipeline, adding functional units, etc. Alternatively, if you had
planned an 8-way cache, and misestimated space on other units, it might
just be easier to go to 7-way, assuming that the layout let you make
use of the freed-up die space [sometimes it does, sometimes it doesn't].
Such flexibility is often quite useful.
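
A rough sketch of why the way count is free to be 3 or 7: the set index is
still plain bit selection over a power-of-2 number of sets, and the ways are
only compared against, never addressed. The class name, the 32-byte line, and
the 8192 sets are assumptions picked so that each way holds 256 KB, matching
the 256k*7-way example quoted earlier.

```python
class SetAssocCache:
    """Toy tag store: power-of-2 number of sets, any number of ways."""

    def __init__(self, num_sets, ways, line_bytes=32):
        # Sets and line size are powers of 2, so indexing is bit selection.
        assert num_sets & (num_sets - 1) == 0
        assert line_bytes & (line_bytes - 1) == 0
        self.num_sets = num_sets
        self.ways = ways  # no power-of-2 requirement here
        self.line_bytes = line_bytes
        self.offset_bits = line_bytes.bit_length() - 1
        self.index_bits = num_sets.bit_length() - 1
        self.tags = [[None] * ways for _ in range(num_sets)]

    def size_bytes(self):
        return self.num_sets * self.ways * self.line_bytes

    def split(self, addr):
        """Return (set_index, tag) for an address."""
        line = addr >> self.offset_bits
        return line & (self.num_sets - 1), line >> self.index_bits

    def lookup(self, addr):
        # Hardware compares all ways in parallel; here we just scan them.
        set_index, tag = self.split(addr)
        return tag in self.tags[set_index]

    def fill(self, addr, way):
        # Replacement policy is left to the caller in this sketch.
        set_index, tag = self.split(addr)
        self.tags[set_index][way] = tag


cache = SetAssocCache(num_sets=8192, ways=7)
print(cache.size_bytes() / (1024 * 1024))  # capacity in MB
```

Changing ways from 8 to 7, or from 2 to 3, touches nothing in split(); only
the number of tag comparators and data arrays changes.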
In the MIPS R2000, we had a 64-element TLB, with a 6-bit index, but
(unlike the size of a direct-mapped cache), there was nothing magic
about 64. It could have been 63 or 60 with minimal bother, and that
flexibility was valuable, because for a while, it didn't look like 64
would fit. Had the design been one that required power-of-2, I would
have been very nervous that we'd have been forced to go to 32.
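
As a minimal sketch of that point, assuming a fully associative lookup in
which the index only names a slot for writing: the entry count is just the
number of slots to compare against, so 64, 63, or 60 all work unchanged. The
class and method names are hypothetical, and real TLB entries carry more
fields than the (vpn, pfn) pair used here.

```python
class TinyTLB:
    """Toy fully associative TLB; the entry count can be any positive number."""

    def __init__(self, entries):
        # Each slot holds a (vpn, pfn) pair, or None when empty.
        self.slots = [None] * entries

    def write(self, index, vpn, pfn):
        # The index is used only to pick which slot to overwrite.
        self.slots[index] = (vpn, pfn)

    def translate(self, vpn):
        # Lookup matches on the virtual page number, never on the index.
        for entry in self.slots:
            if entry is not None and entry[0] == vpn:
                return entry[1]
        return None  # miss


for size in (64, 63, 60):
    tlb = TinyTLB(size)
    tlb.write(size - 1, vpn=0x123, pfn=0x456)
    assert tlb.translate(0x123) == 0x456
```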