CS 473 - HW4
More MIPS
Due Wednesday, February 26, 2003
Consider the following block of C code, which looks up 100 items from
an array in a table and computes a checksum based on the items found
there (this is actually a very practical problem: I'm having to do it
for some barcode scanning software I'm writing for Science Fair).
    checksum = 0;
    for (i = 0; i < 100; i++)
        checksum = (checksum + table[string[i]]) % 256;
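If you want something to check your assembly against, the loop can be wrapped in a small C function. This is only a sketch: the function names, the explicit length parameter, and the use of unsigned char for the string are my assumptions here, not part of the assignment. The second function shows the same computation with a bit mask in place of %, which gives the same result only if every table entry is non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the loop from the handout, with the arrays and
   the item count passed in instead of being globals. */
int checksum_of(const unsigned char *string, size_t n, const int *table)
{
    int checksum = 0;
    assert(string != NULL && table != NULL);
    for (size_t i = 0; i < n; i++)
        checksum = (checksum + table[string[i]]) % 256;
    return checksum;
}

/* Same loop using a mask instead of %. ASSUMPTION: all table entries
   are non-negative; for a negative sum, % 256 and & 0xFF differ in C. */
int checksum_masked(const unsigned char *string, size_t n, const int *table)
{
    int checksum = 0;
    assert(string != NULL && table != NULL);
    for (size_t i = 0; i < n; i++)
        checksum = (checksum + table[string[i]]) & 0xFF;
    return checksum;
}
```

Calling either function with n = 100 corresponds to the loop as written above.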
Make the following assumptions:
- i and checksum are local variables declared as int, stored in
registers $t0 and $t1, respectively.
- string and table are both global variables. string is declared as
char[], while table is declared as int[]. Following standard C
semantics, you don't need to do bounds checking on the arrays.
Unfortunately, the addresses of both string and table are too large
to fit in 16 bits, so you'll have to construct their addresses by
hand. To simplify things a bit, assume you can say high(symbol) to
mean the high-order 16 bits of a symbol, and low(symbol) to mean the
low-order 16 bits of the symbol. You can do simple arithmetic
involving constants only, and use that where a symbol would go (so
you could say something like low(sym+1), but not low(sym+i), as i is
a variable).
- Assume the "standard" MIPS pipeline (as shown on page 499),
additionally assuming any extra data paths needed for instructions
that this pipeline can't handle. Also, assume no delayed loads
and branches. The main points here are that
- you have a five-stage pipeline
- a taken branch requires a 1-cycle stall
- a lw, in which the loaded value is used for arithmetic in the
immediately following instruction, also requires a 1-cycle stall.
- Use only "real" MIPS instructions, no pseudo-instructions: so, for
instance, you can't use the rem (remainder) pseudo-instruction.
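To make the high(symbol) and low(symbol) notation from the assumptions concrete, here is a small C sketch of how a 32-bit address splits into two 16-bit halves and how the halves recombine. The function names are hypothetical. The first rebuild mirrors a lui followed by an ori, which zero-extends its immediate. The second mirrors what happens if the low half is instead used as the immediate of an addiu or as the offset of a lw, both of which sign-extend it, so the result is off by 0x10000 whenever bit 15 of the low half is set.

```c
#include <assert.h>
#include <stdint.h>

/* high(symbol): the upper 16 bits of a 32-bit address. */
uint32_t high16(uint32_t addr) { return addr >> 16; }

/* low(symbol): the lower 16 bits of a 32-bit address. */
uint32_t low16(uint32_t addr) { return addr & 0xFFFFu; }

/* Rebuild with a zero-extended low half: hi goes into the upper 16
   bits (as lui does) and lo is OR-ed in (as ori does). */
uint32_t rebuild_zero_ext(uint32_t hi, uint32_t lo)
{
    assert(hi <= 0xFFFFu && lo <= 0xFFFFu);
    return (hi << 16) | lo;
}

/* Rebuild with a sign-extended low half, as addiu and the lw offset
   treat a 16-bit immediate. Arithmetic wraps modulo 2^32. */
uint32_t rebuild_sign_ext(uint32_t hi, uint32_t lo)
{
    assert(hi <= 0xFFFFu && lo <= 0xFFFFu);
    uint32_t ext = (lo & 0x8000u) ? (lo | 0xFFFF0000u) : lo;
    return (hi << 16) + ext;
}
```

For an address whose low half has bit 15 clear, both rebuilds return the original address; when bit 15 is set, only the zero-extended version does.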
On to the problems. Note that I want exact answers to all the
questions. When I ask for code size, that means only the
code; you don't need to count space for variables.
- (20 points) "Naively" compile the code sequence shown into
MIPS assembly code. By "naively" compile, I mean perform a
straightforward translation, without thinking about
optimizing the code. How large (measured in
bytes) is your resulting code? How many cycles does it take to
execute, given the assumptions?
- (20 points) Optimize your code from the previous question for
minimum execution time: do absolutely anything you can think of
to make the code run as quickly as possible; this typically
involves eliminating the loop. Now how large is it? How many
cycles does it take?
- (20 points) Now optimize your code for minimum size. This time,
do absolutely anything you can think of to make the code as
small as possible. How large is it? How fast is it? The
resulting optimized code is typically very similar to a naive
compilation.
- (30 points) Finally, optimize your code for a compromise between
speed and size on a dual-pipeline superscalar MIPS, as shown on
pages 511 through 514 (use the book version, not my lecture
version, since the book version is easier to look up details on).
Also as on page 513, you should assume a delayed branch this
time. How big is your code? How fast? Note that this last one
won't have a unique correct answer; as you found in the previous
parts of the problem, you can trade off space vs. time,
and I've deliberately not told you exactly how to trade (all the
same, a minimum-size answer will likely lose points for being
slow, and a minimum-time answer will likely lose points for being
big).
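For the size and cycle figures the problems ask for, the arithmetic on the single-issue pipeline can be sketched in C as below. The function names are hypothetical, and the counting convention is an assumption on my part: this version charges the 4 cycles needed to fill a five-stage pipeline in addition to one cycle per executed instruction, then adds one cycle per taken branch and one per load-use stall. If you count differently, say so in your answer. It does not model the dual-pipeline machine in the last problem.

```c
#include <assert.h>

enum { PIPELINE_STAGES = 5, BYTES_PER_INSTRUCTION = 4 };

/* Code size in bytes: every MIPS instruction is one 32-bit word. */
long code_size_bytes(long static_instructions)
{
    assert(static_instructions >= 0);
    return static_instructions * BYTES_PER_INSTRUCTION;
}

/* Cycle count under the ASSUMED convention:
   executed instructions + (stages - 1) fill cycles
   + 1 stall per taken branch + 1 stall per load-use hazard. */
long total_cycles(long dynamic_instructions, long taken_branches,
                  long load_use_stalls)
{
    assert(dynamic_instructions >= 0);
    assert(taken_branches >= 0 && load_use_stalls >= 0);
    return dynamic_instructions + (PIPELINE_STAGES - 1)
           + taken_branches + load_use_stalls;
}
```

Note that the first argument of total_cycles is the number of instructions actually executed, which for a loop is much larger than the number of instructions in the code.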
Last modified: Wed Feb 19 10:36:28 MST 2003