March 26, 2019
For today:
1) Read Think OS Chapter 7 and do the reading quiz
2) Prepare for a quiz
Today:
1) Quiz
2) Think OS Chapter 7
For next time:
1) Read HFC Chapter 8
2) Start Homework 7
3) Come to class with some Project 2 ideas (see here)
Project 1 due this evening.
In the ExercisesInC repository:
cd examples/cache
make
./cache > data
python graph_data.py
While that's running (for a while), let's discuss some background.
The instruction cycle
Fetch: Send the program counter to memory to fetch the next instruction. When the instruction arrives, it is stored in the instruction register.
Decode: Part of the CPU, called the “control unit”, decodes the instruction and sends signals to the other parts of the CPU.
Execute: Signals from the control unit cause the appropriate computation to occur.
About half of the instructions are loads or stores, so they make additional accesses to memory to read or write data.
Typical main memory access time is 100 ns.
A 1 GHz CPU might initiate an instruction every 1 ns. If every fetch and every load takes 100 ns, the CPU would be idle 99% of the time.
That's why there are caches!
See Latency Numbers Every Programmer Should Know
Move the slider: What is changing? What is NOT?
And this Stack Overflow article
1) Assume that the time to access cache is 2 ns and the time to access memory is 102 ns. If the hit rate for a series of accesses is 75%, what is the average access time?
Two ways to think of it:
Average access time = Hit rate * cache access time + miss rate * memory access time
OR
Miss penalty = memory access time - cache access time
Average miss penalty = miss rate * miss penalty
Average access time = cache access time + average miss penalty
2) Suppose you have a large array of 4-byte integers on a machine with a single level of memory cache, which uses block length 32 bytes.
For each of the following scenarios, compute the cache hit rate and the average access time.
a) Traverse the array once, accessing each element.
b) Traverse the array once, accessing every other element.
c) Traverse the array once, accessing every fourth element.
d) Traverse the array once, accessing every eighth element.
e) Traverse the array once, accessing every sixteenth element.
f) Traverse the array 16 times, accessing every sixteenth element, assuming that the array is small enough to fit in cache.
g) Traverse the array many times, accessing every sixteenth element, assuming that the array does not fit in cache.
3) Suppose we traverse an array many times. Sketch a line that shows average access time versus array size, assuming that the cache is 1 MB.
Quoth me:
In summary, we expect good cache performance if the array is smaller than the cache size or if the stride is smaller than the block size. Performance only degrades if the array is bigger than the cache and the stride is large.
4) Let's look at the results. What can you infer about your cache hierarchy?