Because DRAM access is slow compared to the CPU's internal cycle time, caches are used to improve performance.
The illustration describes the Intel E3826 Atom processor.
Assume the following:
For maximum performance, we want all the code we are executing to reside in the L1 cache; the L2 cache is second best. The problem is that only fragments of the code fit in the caches.
Larger code causes higher cache miss rates, which are very expensive in terms of performance.
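The cost of a higher miss rate can be estimated with the standard average-access-time formula: hit time plus miss rate times miss penalty. The sketch below is only an illustration; the function name `average_access_time` and all latency and miss-rate values are hypothetical assumptions, not measured figures for the E3826.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time = hit time + miss rate * miss penalty.

    All times are in CPU cycles; miss_rate is a fraction between 0 and 1.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers chosen for illustration only (not E3826 data).
L1_HIT_CYCLES = 3          # assumed cost of a cache hit
MISS_PENALTY_CYCLES = 100  # assumed extra cost of going to DRAM

# Assumed miss rates: compact code misses rarely, bloated code misses often.
small_code = average_access_time(L1_HIT_CYCLES, 0.01, MISS_PENALTY_CYCLES)
large_code = average_access_time(L1_HIT_CYCLES, 0.10, MISS_PENALTY_CYCLES)

print(f"compact code: {small_code:.1f} cycles per fetch")
print(f"bloated code: {large_code:.1f} cycles per fetch")
```

Under these assumed values, raising the miss rate from 1% to 10% roughly triples the average cost of a fetch, even though 90% of the fetches still hit the cache.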
The CPU executes the following steps to prefetch the code fragment.
On a cache update, another code fragment must be evicted to make room for the new fragment. These operations take time.
Thus quality (minimum size) rather than quantity (bloatware) pays off substantially in performance, power consumption, and cost.
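The effect of eviction can be illustrated with a toy model. The sketch below assumes a simplified direct-mapped instruction cache and a loop whose body is fetched line by line; the function `count_misses` and the sizes used are hypothetical. Real caches are typically set-associative with their own replacement policy, so this shows only the trend, not the behavior of any particular processor.

```python
def count_misses(code_lines, cache_lines, passes):
    """Simulate a direct-mapped instruction cache running a loop.

    code_lines:  size of the loop body, in cache lines
    cache_lines: capacity of the cache, in cache lines
    passes:      how many times the loop body is executed
    Returns a tuple (misses, evictions).
    """
    slots = [None] * cache_lines      # which code line each slot holds
    misses = 0
    evictions = 0
    for _ in range(passes):
        for line in range(code_lines):
            slot = line % cache_lines  # direct mapping: one possible slot
            if slots[slot] != line:
                misses += 1
                if slots[slot] is not None:
                    evictions += 1     # another fragment had to make room
                slots[slot] = line
    return misses, evictions


# Assumed sizes for illustration: a 64-line cache, 10 passes over the loop.
fits = count_misses(code_lines=64, cache_lines=64, passes=10)
too_big = count_misses(code_lines=96, cache_lines=64, passes=10)

print("code fits in cache: misses=%d evictions=%d" % fits)
print("code exceeds cache: misses=%d evictions=%d" % too_big)
```

In this model, code that fits in the cache misses only on the first pass and never evicts anything, while code that is 50% larger than the cache keeps evicting and refetching the conflicting lines on every pass.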
PDOS
1 Computed from the Corsair Voyager GTX 128 GB benchmark.