Program 4 -- CS 473.
Due April 19, 2004.
This program is to help you understand how a cache works on a
computer and how an operating system can "configure" the cache
at startup time.
Background: As we have seen in many cases, improving computational
performance is a primary goal in computer architecture. We have
seen this goal addressed in many different ways. We have seen the
clock speed increased, we have seen instructions pipelined, etc.
One hardware element used to improve performance is the CACHE. The cache
is a module that stores a subset of the items in memory so that memory references
(like "lw" and "sw") are faster. If we consider the memory
hierarchy, we see that registers are as fast as the clock, cache
is slower, and main memory is slower still.
The idea behind a cache is to load a set of words from memory into the cache
whenever a single memory item is loaded. This set of words is sometimes called
a "cache line". The idea is based on a concept called "locality of
reference", which means that if we reference memory
location X, we have a high probability of referencing memory locations next
to X (like X+1, X+2, etc.). If we only fetch values from memory when they
are needed, we will run at the speed of memory. However, if we take
advantage of the locality of reference and move a number of elements from
memory into cache AT THE SAME time, then the next reference (like X+1),
will be in CACHE, which makes the memory references faster.
A cache line can be loaded quickly because we can build datapaths from memory
to the cache that are wider than a word. This means that if we
reference a location, X, we can move X and its "friends" into cache
all in parallel (since we can use a bus wider than one word). The
next reference (X+1) is now in cache and is much faster, which in turn
makes the instruction run faster, which in turn makes the program run
faster.
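The benefit of locality of reference described above can be illustrated with a
small sketch. It assumes an unbounded cache (nothing is ever replaced) and a
hypothetical helper name, count_misses; it only counts how many references
force a load from memory for a given cache line size.

```python
def count_misses(addresses, line_words):
    """Count references that are not already covered by a loaded cache line."""
    loaded = []  # start addresses of lines brought in so far (unbounded cache)
    misses = 0
    for address in addresses:
        start = address - address % line_words  # first word of the line holding address
        if start not in loaded:
            loaded.append(start)
            misses += 1
    return misses

# 64 sequential references, X, X+1, X+2, ...
print(count_misses(range(64), 1))  # one word per load: 64 misses
print(count_misses(range(64), 8))  # eight words per load: 8 misses
```

With 8-word lines, only one reference in eight goes to memory; the other seven
are cache hits.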
The term "cache hit" means that the item referenced was in cache,
and "cache miss" means that the item was not in cache.
There are a number of cache strategies and even cache hierarchies.
Cache hierarchies use terms like L1, L2 and L3 cache. Each level is
slower (going from L1 to L3) and typically larger than the one before it.
The goal is to stage memory into the level that best fits the demand from
the program/processor.
In this project, we will work with only one level of cache.
In modern cache modules, the operating system can instruct the cache
on how many elements to bring into cache at a time. This number is
the size of a "cache line".
Cache lines are replaced based on a cache replacement algorithm.
These can include "FIFO", "LIFO", "Second Chance", "LRU", etc.
In our program, you will implement First In First Out (FIFO) cache replacement.
For example:
Assume you have infinite RAM.
Assume you have a 256K-word cache.
Assume you want a cache line of 512 words.
Given this information, we know that there are exactly 512 cache lines
(256 * 1024 / 512 = 512).
The cache lines are aligned on 512-word boundaries (0-511, 512-1023, etc.).
If location 12 is referenced, then the CACHE system will
load memory values 0-511 into a cache line. If location 13 is then
referenced, you will have a "cache hit".
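The arithmetic in this example can be checked with a short sketch. It assumes
the cache size is measured in K words with 1K = 1024 words (as config.dat
does); the names num_cache_lines and line_bounds are hypothetical and are not
part of the assignment.

```python
WORDS_PER_K = 1024  # assumption: "K" means 1024 words

def num_cache_lines(cache_k_words, line_words):
    """Number of cache lines that fit in a cache of cache_k_words K words."""
    return (cache_k_words * WORDS_PER_K) // line_words

def line_bounds(address, line_words):
    """First and last address of the aligned cache line that holds address."""
    start = (address // line_words) * line_words
    return start, start + line_words - 1

print(num_cache_lines(256, 512))  # 512
print(line_bounds(12, 512))       # (0, 511)
print(line_bounds(13, 512))       # (0, 511), same line as 12, so a hit
```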
Program Problem:
Write a program to simulate cache line loads and cache line replacements.
The program will have 2 input files, "config.dat" and "input.dat".
"config.dat" will specify the size of your simulated cache in K words
as well as the cache line in words. "input.dat" will be a list of
memory references. Output will be written to "output.dat". For each memory
reference in "input.dat" you will produce EXACTLY one output line. The
format will be one of the following (assume X was read from the input.dat
file):
X: HIT -- means that one of the cache lines contained X
X: MISS: LOAD Y - Z -- means that X was not in cache and a cache line
was loaded with the values from Y to Z (the range should
contain X, and Y should be a multiple of the cache line size)
X: MISS: LOAD Y - Z: REPLACE K - W -- same as above, but also reports
which line (K to W) was replaced
The cache line replacement algorithm will be FIRST IN/ FIRST OUT
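One possible way to simulate these rules is sketched below. It is only an
illustration under stated assumptions, not the required design: the class name
FifoCache and its fields are hypothetical, the cache must have at least one
line, and the exact spacing around the colons follows the first line of the
Example output and should be checked against the required format. The cache is
a fixed-size array searched linearly, and a wrapping index marks the oldest
line, so no hashing or linked list is involved.

```python
class FifoCache:
    """Fixed-size array of cache line start addresses, replaced first in, first out."""

    def __init__(self, num_lines, line_words):
        self.line_words = line_words
        self.starts = [None] * num_lines  # starts[i] = first address held in slot i
        self.used = 0                     # number of slots filled so far
        self.oldest = 0                   # slot holding the oldest line once the array is full

    def access(self, address):
        """Simulate one memory reference and return its output line."""
        start = (address // self.line_words) * self.line_words
        end = start + self.line_words - 1
        # Linear scan of the filled part of the array.
        for i in range(self.used):
            if self.starts[i] == start:
                return f"{address}: HIT"
        # Miss with a free slot: slots fill in order, so slot 0 is the oldest.
        if self.used < len(self.starts):
            self.starts[self.used] = start
            self.used += 1
            return f"{address}: MISS: LOAD {start} - {end}"
        # Miss with a full cache: overwrite the oldest slot, then advance the index.
        victim = self.starts[self.oldest]
        self.starts[self.oldest] = start
        self.oldest = (self.oldest + 1) % len(self.starts)
        return (f"{address}: MISS: LOAD {start} - {end}"
                f": REPLACE {victim} - {victim + self.line_words - 1}")

# A 1K-word cache with 512-word lines has two slots.
cache = FifoCache(num_lines=2, line_words=512)
for reference in (0, 2, 511, 512, 1024, 1023, 0):
    print(cache.access(reference))
```

Because slots are filled in order and then overwritten in the same order, the
wrapping index always points at the line that has been in the cache longest,
and a replacement costs a constant amount of work.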
Program Specs
The following are required specifications for your program:
1) Good/professional Documentation. You must handle file IO
errors. You must explain how you represent your cache lines
and how you search for a HIT/MISS.
2) Input
Source file: config.dat -- configuration file for sizing cache
Source Type: ASCII
Source Lines:
Line 1: -- size of cache in K words
Line 2: -- size of cache line, words.
Source file: input.dat -- the list of memory references
Line 1-N: -- memory references
3) Output
Output file: output.dat
Output Type: ASCII
SEE PROBLEM DEFINITION
Error file: STD OUT
4) Efficient coding of the FIFO is required.
5) Hashing is forbidden.
6) You may NOT use a linked list
7) You MUST use an array to hold the information about which cache line
is assigned to what (you may use structures for the elements in the
array).
Any other output to output.dat is not within SPEC
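The input requirements above, including the handling of file IO errors, could
be approached along the following lines. This is a hedged sketch: the function
names read_ints and load_config are hypothetical, blank lines are assumed to
be ignorable, and error messages are printed to standard output because the
spec lists STD OUT as the error file.

```python
import sys

def read_ints(path):
    """Return the integers stored one per line in path, skipping blank lines."""
    try:
        with open(path, "r", encoding="ascii") as handle:
            return [int(line) for line in handle if line.strip()]
    except OSError as err:
        print(f"ERROR: cannot read {path}: {err}")
    except ValueError as err:
        # Also covers non-ASCII bytes, since UnicodeDecodeError is a ValueError.
        print(f"ERROR: {path} contains a line that is not an integer: {err}")
    sys.exit(1)  # reached only after one of the errors above

def load_config(path="config.dat"):
    """Return (cache size in K words, cache line size in words)."""
    values = read_ints(path)
    if len(values) < 2 or values[0] <= 0 or values[1] <= 0:
        print(f"ERROR: {path} must hold two positive integers")
        sys.exit(1)
    return values[0], values[1]
```

A main routine might call load_config() once, then read_ints("input.dat") for
the list of memory references, and write one line per reference to output.dat.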
Example
config.dat
1
512
input.dat
0
2
511
512
1024
1023
0
output.dat
0: MISS: LOAD 0 - 511
2: HIT
511: HIT
512: MISS: LOAD 512 - 1023
1024: MISS: LOAD 1024 - 1535: REPLACE 0 - 511
1023: HIT
0: MISS: LOAD 0 - 511: REPLACE 512 - 1023