Modern computer architectures use hierarchical memory designs in order to hide the latencies of accesses to main memory components, which are dramatically slow in contrast to the floating-point performance of the CPUs. Current memory designscommonly involve several levelsof cache memories, which can be accessed up to a hundred times faster than main memory. It is well-known that efflcient program execution can merely be achieved, if the codes respect the hierarchical memory design.
However, iterative methods are characterized by successive sweeps over data sets, which -- for representative problems in science and engineering -- are much too large to fit in cache.
In this paper we present techniques in order to enhance the cache efflciency of multigrid methods for variable-coefficient problems on regular mesh structures. A variety of performance results are provided, showing both profiling data and the speedups which can be achieved by applying our optimizations.
Key words. Computer architecture, high performance computing, cache, iterative algorithms, multigrid methods
AMS subject classifications. 65N55, 65F10, 65-04