University of Kentucky, USA and Yale University, USA

University of Kentucky, USA

Lehrstuhl für Systemsimulation (IMMD 10)

Universität Erlangen-Nürnberg, Germany

University of Kentucky, USA and UNICAMP, Brasil

High speed cache memory is commonly used to address the disparity between the speed of a computer's central processing unit and the speed of a computer's main memory. It is advantageous to maximize the amount of time that data spends in cache. Tiling is a software technique which is often used to do just this. Tiling is not able, however, to handle dynamically changing data structures, such as those encountered in adaptively chosen, unstructured grids. We develop a variant of the Gauss-Seidel method for second order elliptic partial differential equations with variable coefficients. This variant keeps data in cache memory for much longer than non-cache implementations. As a result, our method is significantly faster than non-cache implementations. Examples from the structured grid case demonstrate the benefits of such a variant and provide motivation for the more difficult unstructured grid case. For this case, a publicly available load balancing package is used to decompose the grid into blocks of nodes which fit into cache. An O(n) algorithm is introduced that provides a one-time reordering of the nodes in each block. This new ordering permits significantly more Gauss-Seidel updates and residual calculation to be done in cache than in standard implementations. The cache aware Gauss-Seidel variant is incorporated into a multigrid code and tested by solving two dimensional linear elastic problems with multiple degrees of freedom per node. These experiments demonstrate cache hit rates and cache line reuse possible with a multigrid V cycle that incorporates such a cache aware method.

**Key words.** multigrid, unstructured grids, cache, tiling, iterative
methods, domain decomposition, partial differential equations, software.

**AMS subject classifications.** 65M55, 65N55, 65F10, 65N30, 65F50, 65N50