Hard cold facts of life:

These two facts lead to the concept of a memory hierarchy: we want to use several types of memory of varying speeds and sizes, and arrange them in a hierarchy.

The basic idea of the memory hierarchy is that we want to simultaneously increase the average speed (per access) and decrease the average cost (per byte) of memory.

The average cost per byte is the weighted average of the cost per byte at each level, weighted by the capacity of each level. The average speed per access is the weighted average of the access speed at each level, weighted by the fraction of accesses that each level satisfies. "Speed" here can mean either latency (the time to access the first piece of a transfer) or bandwidth (the rate at which data can be transferred).
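As a rough illustration of the two weighted averages, here is a minimal Python sketch. The level names, sizes, costs, access times, and access fractions are made-up numbers chosen only to show the arithmetic; they are not measurements of any real system, and speed is taken to mean latency.

```python
from dataclasses import dataclass

@dataclass
class Level:
    name: str
    size_bytes: int         # capacity of this level
    cost_per_byte: float    # arbitrary currency units per byte (hypothetical)
    access_time_ns: float   # latency of an access satisfied here (hypothetical)
    access_fraction: float  # share of all accesses satisfied at this level

def average_cost_per_byte(levels):
    """Capacity-weighted average of the cost per byte."""
    total_bytes = sum(lv.size_bytes for lv in levels)
    total_cost = sum(lv.size_bytes * lv.cost_per_byte for lv in levels)
    return total_cost / total_bytes

def average_access_time_ns(levels):
    """Access-weighted average latency; the fractions are assumed to sum to 1."""
    return sum(lv.access_fraction * lv.access_time_ns for lv in levels)

# Hypothetical three-level hierarchy.
levels = [
    Level("cache",  2**20, 1e-4,  1.0,   0.95),
    Level("memory", 2**30, 1e-8,  100.0, 0.0499),
    Level("disk",   2**40, 1e-10, 1e7,   0.0001),
]

print(average_cost_per_byte(levels))
print(average_access_time_ns(levels))
```

With these invented numbers the average cost per byte stays close to that of the largest, cheapest level, while even one access in ten thousand going to the slowest level dominates the average access time.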

The basic idea at all levels is that we take an address and apply a mapping function to it to get a new address at some level of the hierarchy. The implementation of the mapping function depends on the level of the hierarchy. There are two important boundaries at which the characteristics of the mapping function tend to change: between memory and disk, and between on-chip and off-chip.

The importance of the memory/disk distinction is that disk has very different access characteristics than memory.

These will end up implying that (1) we need to avoid having to go to disk for a memory access at nearly any cost, and (2) when we do have to go to disk, we want to bring in a big chunk. The answer will turn out to be a lookup table that is filled by software but read by hardware.
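One way to picture such a lookup table is a page-table-like structure. The sketch below is a simplified software model, not a description of any particular machine: the 4096-byte page size, the class name, and the method names are all assumptions made for illustration. The translate method plays the role of the hardware, which only reads the table; the fault handler plays the role of software, which fills it and brings in a whole page at a time.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (the "big chunk")

class PageFault(Exception):
    """Raised when the table has no entry for a virtual page."""

class PageTable:
    def __init__(self, free_frames):
        self.entries = {}                     # virtual page number -> physical frame number
        self.free_frames = list(free_frames)  # hypothetical pool of unused frames

    def translate(self, vaddr):
        """Hardware role: read the table, never modify it."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:
            raise PageFault(vpn)
        return self.entries[vpn] * PAGE_SIZE + offset

    def handle_fault(self, vpn):
        """Software role: fetch the whole page from disk and fill the table."""
        frame = self.free_frames.pop()
        # A real handler would copy PAGE_SIZE bytes from disk into the frame here.
        self.entries[vpn] = frame

    def access(self, vaddr):
        """Translate, filling the table on a miss and retrying once."""
        try:
            return self.translate(vaddr)
        except PageFault as fault:
            self.handle_fault(fault.args[0])
            return self.translate(vaddr)

table = PageTable(free_frames=[7, 3])
first = table.access(0x1234)   # miss: software fills the entry for page 1
second = table.access(0x1238)  # hit: same page, no second trip to disk
```

The second access lands in the same page as the first, so it is satisfied from the table without touching the free-frame pool again, which is the payoff of bringing in a large chunk per disk access.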

The importance of the on-chip/off-chip distinction is that the memory cost function is different between on-chip and off-chip.

The result of this is that on-chip and off-chip caches will use markedly similar hardware algorithms to produce the mapping, but with different parameterization.
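To make "same algorithm, different parameterization" concrete, here is a minimal Python sketch of a set-associative address mapping. The scheme shown (byte offset, set index, tag) is a common textbook one and is assumed here rather than taken from these notes; the two parameter sets are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CacheParams:
    line_bytes: int  # bytes per cache line
    num_sets: int    # number of sets
    ways: int        # lines per set (associativity)

    @property
    def capacity_bytes(self):
        return self.line_bytes * self.num_sets * self.ways

def map_address(addr, params):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = addr % params.line_bytes
    line_number = addr // params.line_bytes
    set_index = line_number % params.num_sets
    tag = line_number // params.num_sets
    return tag, set_index, offset

# Hypothetical parameters: a small on-chip cache and a larger off-chip cache.
on_chip = CacheParams(line_bytes=32, num_sets=128, ways=2)
off_chip = CacheParams(line_bytes=64, num_sets=4096, ways=4)

addr = 0x12345
print(map_address(addr, on_chip))
print(map_address(addr, off_chip))
```

The same map_address function serves both caches; only the CacheParams values differ, so one address lands in a different set with a different tag at each level.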