ECE/CS 752 Spring 2008
Homework 3

Due: Wednesday April 2, 2008, at the beginning of lecture


All work must be done individually. No late assignments will be accepted.

Problem 1

Solve problem 5.31 in Chapter 5 of the Shen/Lipasti textbook.  Simulate the given trace in reverse order: x200, x80, x200, xE0, xB0, x80, x200, xA0, x80

Problem 2

Solve problem 3.28 from Chapter 3 of the Shen/Lipasti textbook.

Problem 3

Answer problem 3.31 from Chapter 3 of the Shen/Lipasti textbook.

Problem 4

This problem uses the SimpleScalar simulator (documentation is available from http://www.simplescalar.com/). The SimpleScalar 3.0 simulator is available at CAE in "~ece752/simplesim-3.0". You will use a script called "RUNapsi" to run the apsi benchmark from the SPECfp95 benchmark suite. Copy the files in "~ece752/hw3" to a directory where you have write access. The binaries in ~ece752/simplesim-3.0 are compiled for Sun machines; if you would prefer to use the Linux machines (tux-*), you will need to copy the source and recompile it, following the directions on the SimpleScalar website, and download the little-endian binary from http://www.ece.wisc.edu/~mikko/apsi-little-endian.ss.

a. "sim-cheetah" is a cache simulator that allows you to simulate multiple cache configurations in a single pass.  Read the README.cheetah file to learn more about it.  Then use sim-cheetah to collect the cold, capacity, and conflict misses of data caches ranging in size from 8KB to 64KB, varying associativity from 1 to 2 to 4-way, assuming a block size of 64B and an LRU replacement policy.  Plot these results in a stacked bar graph similar to that shown in the lecture notes.
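As a sanity check before running the sweep in part (a), the cache geometries can be tabulated with a short script. This is a minimal sketch, not part of the assignment materials: the helper names `num_sets` and `sweep` are hypothetical, and the intermediate sizes (16KB and 32KB) assume the sweep doubles the capacity at each step.

```python
# Hypothetical helper: list the cache geometries for the part (a) sweep.
# Assumption: capacities step in powers of two (8, 16, 32, 64 KB).
BLOCK_BYTES = 64  # block size given in part (a)


def num_sets(size_bytes, assoc, block_bytes=BLOCK_BYTES):
    """Number of sets = capacity / (block size * associativity)."""
    sets, remainder = divmod(size_bytes, block_bytes * assoc)
    if remainder:
        raise ValueError("capacity must be a multiple of block size * associativity")
    return sets


def sweep(sizes_kb=(8, 16, 32, 64), assocs=(1, 2, 4)):
    """Return (size in KB, associativity, number of sets) for every configuration."""
    return [(kb, a, num_sets(kb * 1024, a)) for kb in sizes_kb for a in assocs]


if __name__ == "__main__":
    for kb, assoc, sets in sweep():
        print(f"{kb:>2}KB {assoc}-way: {sets} sets")
```

Under these assumptions the sweep covers twelve configurations, one bar per configuration in the stacked graph.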

b.  Use sim-cheetah to compare the number of capacity and conflict misses for a 4-way set-associative 8KB cache with 64B blocks, when operating under the LRU and OPT (optimal) replacement policies.
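To build intuition for the LRU versus OPT comparison in part (b), the two policies can be contrasted on a toy example. The following is a minimal sketch for a single fully-associative set, not a model of sim-cheetah: the function names and the reference trace are illustrative assumptions, and OPT is implemented as Belady's rule (evict the block whose next use lies farthest in the future).

```python
def lru_misses(trace, capacity):
    """Count misses for a fully-associative cache with LRU replacement."""
    cache = []  # least recently used block first, most recently used last
    misses = 0
    for block in trace:
        if block in cache:
            cache.remove(block)  # hit: re-appended below as most recent
        else:
            misses += 1
            if len(cache) == capacity:
                cache.pop(0)  # evict the least recently used block
        cache.append(block)
    return misses


def opt_misses(trace, capacity):
    """Count misses with OPT: evict the block reused farthest in the future."""
    cache = set()
    misses = 0
    for i, block in enumerate(trace):
        if block in cache:
            continue
        misses += 1
        if len(cache) == capacity:
            def next_use(b):
                # Position of the next reference to b, or infinity if none.
                try:
                    return trace.index(b, i + 1)
                except ValueError:
                    return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(block)
    return misses


if __name__ == "__main__":
    # Illustrative block-address trace (an assumption, not from the benchmark).
    trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    print("LRU misses:", lru_misses(trace, 3))
    print("OPT misses:", opt_misses(trace, 3))
```

OPT needs knowledge of future references, so it serves as a lower bound on misses for a given geometry rather than as an implementable policy.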

c. "sim-cache" is a multilevel cache simulator.  Run sim-cache with a first-level data cache that is 8KB, 4-way set-associative with LRU replacement, with 64B lines, connected to a second-level cache of 128KB, 2-way set-associative with LRU replacement, with 64B lines.  Compare the miss rate you obtain for the first-level data cache against the LRU result from (b), and explain any discrepancies.
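The cache parameters in part (c) have to be translated into the simulator's cache configuration strings. The sketch below assumes the `<name>:<nsets>:<bsize>:<assoc>:<repl>` layout, with `l` selecting LRU, and the labels `dl1` and `ul2`; these are recalled from the SimpleScalar option help and should be checked against the usage text that sim-cache prints before relying on them. The helper `cache_config` is hypothetical.

```python
def cache_config(name, size_bytes, block_bytes, assoc, repl="l"):
    """Build a cache config string, assuming <name>:<nsets>:<bsize>:<assoc>:<repl>."""
    nsets, remainder = divmod(size_bytes, block_bytes * assoc)
    if remainder:
        raise ValueError("capacity must be a multiple of block size * associativity")
    return f"{name}:{nsets}:{block_bytes}:{assoc}:{repl}"


if __name__ == "__main__":
    # 8KB, 4-way, 64B lines, LRU (first-level data cache from part (c)).
    print(cache_config("dl1", 8 * 1024, 64, 4))
    # 128KB, 2-way, 64B lines, LRU (second-level cache from part (c)).
    print(cache_config("ul2", 128 * 1024, 64, 2))
```

The point of the helper is only the arithmetic: the number of sets is the capacity divided by the product of line size and associativity.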

d. "sim-outorder" is an out-of-order processor simulator.  Run sim-outorder with its default machine configuration (i.e., no options), except configure the first-level data cache and L2 cache the same way you did in step (c).  Report the first-level data cache miss rate, and compare it to the results from (b) and (c).  Explain any discrepancies in the results.