cache memory - muneeb-mbytes/computerArchitectureCourse GitHub Wiki

What is cache memory?

Cache memory is a smaller, faster segment of memory whose access time is close to that of the registers. In the memory hierarchy, cache memory has a lower access time than primary memory. Because it is generally very small, it is used as a buffer between the CPU and primary memory.

Data in primary memory can be accessed faster than data in secondary memory, but access times for primary memory are still on the order of microseconds, whereas the CPU performs operations in nanoseconds. Because of this time lag between requesting data and acting on it, system performance drops: the CPU is not utilized properly and may remain idle for some time. To minimize this gap, a new segment of memory, known as cache memory, is introduced, as can be seen in the diagram below.

Why is cache faster than main memory?

Consider the two diagrams shown below:

  • Cache memory uses SRAM technology; an example SRAM cell diagram is shown below.
  • Primary memory uses DRAM technology; an example DRAM cell diagram is shown below.

Cache memory is faster because main memory (DRAM) stores each bit in a capacitor, which needs time to charge and discharge. SRAM cells have no capacitors, so SRAM is faster than DRAM, and that is why cache is faster than main memory.

Cache memory (SRAM)

Primary memory (DRAM)

Data transfer between levels:

  • The information is stored in different memory levels.

  • Initially, the processor attempts to retrieve data from the quickest memory; a successful retrieval from memory is termed a "hit."

  • Failure to retrieve from the fastest memory results in a "miss."

  • Effective organization of memory is important to maximize the hit rate, aiming for a hit rate of around 90%, 99%, or even 99.99%, depending on the memory hierarchy.


  • Ideally, the system operates smoothly with frequent hits, ensuring that data retrieval is efficient and well-organized. However, occasional misses prompt the system to move to the next memory level in search of the required information.

  • This approach involves attempting retrieval at each level successively; if unsuccessful, the system progresses to higher levels. Each transition between levels involves transferring a unit of data, known as a block. However, as the system progresses to higher levels due to misses, the unit of transfer may vary.
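The successive-lookup behaviour described above can be sketched as a toy simulation. The level names and the addresses held at each level are made-up assumptions for illustration:

```python
# Toy model of a memory hierarchy: try each level in order. Finding the
# address at a level is a "hit" there; otherwise we "miss" and fall
# through to the next, slower level. Contents are illustrative.

levels = [
    ("cache",       {0x10, 0x20}),               # fastest, smallest
    ("main memory", {0x10, 0x20, 0x30}),         # slower, larger
    ("disk",        {0x10, 0x20, 0x30, 0x40}),   # slowest, largest
]

def lookup(address):
    """Return the name of the first (fastest) level that holds the address."""
    for name, contents in levels:
        if address in contents:
            return name  # hit at this level
    return "not found"

print(lookup(0x20))  # hit in cache
print(lookup(0x30))  # miss in cache, hit in main memory
print(lookup(0x40))  # found only on disk
```

A real hierarchy would also copy the data into the faster levels on a miss; that detail is omitted here to keep the lookup order visible.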


  • For example, communication between the processor and the data memory is in terms of blocks of data. A block of data is like a burst in an AXI transfer and is limited in size.

  • Data transfer between main memory and the hard drive is in the form of pages. A page is larger than a block; a page can be thought of as a group of blocks.
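The two transfer units can be related with a quick calculation. The 64-byte block and 4 KiB page sizes below are common values, assumed here purely for illustration:

```python
# Relating the two units of transfer. Both sizes are assumed values.
BLOCK_SIZE = 64          # bytes per cache block (assumed)
PAGE_SIZE = 4 * 1024     # bytes per page (assumed)

# A page is a group of blocks: count how many blocks fit in one page.
blocks_per_page = PAGE_SIZE // BLOCK_SIZE
print(blocks_per_page)   # -> 64
```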

Principle of locality

The principle of locality means that references to memory are localized both in time (the temporal sense) and in space (the spatial sense).

Temporal Locality:

A memory location that is referenced once is likely to be referenced again multiple times in the near future.

References are repeated in time.

Spatial Locality:

If a memory location is referenced once, then the program is likely to reference a nearby memory location in the near future.

References are repeated in space.
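Both kinds of locality can be seen in a tiny direct-mapped-cache sketch: walking an array sequentially (spatial locality) hits once the containing block is loaded, and re-reading the same address soon afterwards (temporal locality) hits because the block is still cached. The 16-byte blocks and 8 lines are illustrative assumptions:

```python
# Minimal direct-mapped cache sketch: we track only which block number
# each cache line currently holds. Sizes are illustrative assumptions.

BLOCK_SIZE = 16   # bytes per block (assumed)
NUM_LINES = 8     # number of cache lines (assumed)

lines = [None] * NUM_LINES
hits = misses = 0

def access(address):
    global hits, misses
    block = address // BLOCK_SIZE   # which block the byte belongs to
    line = block % NUM_LINES        # direct mapping: block -> line
    if lines[line] == block:
        hits += 1
    else:
        misses += 1
        lines[line] = block         # fetch the whole block on a miss

# Spatial locality: reading 64 consecutive bytes misses once per 16-byte
# block and hits on the other 15 bytes of each block.
for addr in range(64):
    access(addr)
print(hits, misses)  # -> 60 4

# Temporal locality: re-reading address 0 shortly afterwards hits again,
# because its block is still resident in the cache.
access(0)
print(hits, misses)  # -> 61 4
```

Fetching a whole block on each miss is exactly what makes spatial locality pay off: one miss prepays for the neighbouring bytes.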