Cache Designs and Tricks
Craig C. Douglas
University of Kentucky Computer Science Department
Lexington, Kentucky, USA
Yale University Computer Science Department
New Haven, Connecticut, USA
douglas@ccs.uky.edu or douglas-craig@cs.yale.edu
http://www.ccs.uky.edu/~douglas
http://www.mgnet.org
Cache Methodology
Motivation:
1. Time to run code = clock cycles running code + clock cycles waiting for memory.
2. For many years, CPUs have sped up an average of 72% per year relative to memory chip speeds.
Hence, memory access is the bottleneck to fast computing.
Definition of a cache:
1. Dictionary: a safe place to hide or store things.
2. Computer: a level in a memory hierarchy.
Diagrams
Serial: a CPU (registers and logic), backed by a cache, backed by main memory.
Parallel: CPU 1, CPU 2, ..., CPU p, each with its own cache (Cache 1, ..., Cache p), connected through a network to a shared memory.
Tuning for Caches
1. Preserve locality.
2. Reduce cache thrashing.
3. Loop blocking when out of cache (see the sketch after this list).
4. Software pipelining.
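To illustrate item 3, here is a minimal C sketch of loop blocking applied to matrix multiplication, the classic example (not taken from these slides). The names N, BS, and matmul_blocked are illustrative, and BS should be tuned so that three BS-by-BS blocks fit in cache together.

    #include <stdio.h>

    #define N  1024  /* matrix dimension (illustrative) */
    #define BS   64  /* block size: pick so three BSxBS blocks fit in cache */

    static double a[N][N], b[N][N], c[N][N];

    /* Blocked matrix multiply, c += a*b.  The three outer loops walk over
     * BSxBS sub-blocks; the three inner loops reuse each block many times
     * while it is still resident in cache, instead of streaming whole rows
     * and columns through the cache on every pass. */
    static void matmul_blocked(void)
    {
        for (int ii = 0; ii < N; ii += BS)
            for (int jj = 0; jj < N; jj += BS)
                for (int kk = 0; kk < N; kk += BS)
                    for (int i = ii; i < ii + BS; i++)
                        for (int j = jj; j < jj + BS; j++) {
                            double s = c[i][j];
                            for (int k = kk; k < kk + BS; k++)
                                s += a[i][k] * b[k][j];
                            c[i][j] = s;
                        }
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                a[i][j] = 1.0;
                b[i][j] = 1.0;
            }
        matmul_blocked();
        printf("c[0][0] = %f\n", c[0][0]);  /* expect N = 1024 */
        return 0;
    }

Without blocking, once a row of the matrices no longer fits in cache, each element of b is reloaded from memory on every pass over a row of a; with blocking, each block is loaded once and then reused BS times before being evicted.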
Memory Banking
This started in the 1960s with both 2- and 4-way interleaved memory
banks. Each bank can produce one unit of memory per bank cycle, so
multiple reads and writes can proceed in parallel.
The bank cycle time is currently 4-8 times the CPU clock time and is getting
worse every year.
Very fast memory (e.g., SRAM) is unaffordable in large quantities.
Interleaving is not perfect. With 2-way interleaved memory, a stride-2
algorithm references only one of the two banks, so it performs no better
than a non-interleaved memory system, as the sketch below illustrates.
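A minimal C sketch of the stride-2 problem, assuming word-level interleaving in which word w lives in bank w % NUM_BANKS (a common convention, assumed here rather than stated in these slides):

    #include <stdio.h>

    #define NUM_BANKS 2   /* 2-way interleaved memory (assumed) */

    int main(void)
    {
        /* Stride-1 access: words 0,1,2,3,... alternate between
         * banks 0 and 1, so both banks work in parallel. */
        for (int w = 0; w < 8; w += 1)
            printf("stride 1: word %2d -> bank %d\n", w, w % NUM_BANKS);

        /* Stride-2 access: words 0,2,4,6,... all map to bank 0.
         * Every reference waits on the same bank's cycle time,
         * so the interleaving gives no speedup at all. */
        for (int w = 0; w < 16; w += 2)
            printf("stride 2: word %2d -> bank %d\n", w, w % NUM_BANKS);

        return 0;
    }

Running this shows the stride-1 references alternating between banks 0 and 1, while every stride-2 reference lands in bank 0 and serializes on that single bank's cycle time.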