Computers today rely heavily on effective use of their cache memory subsystems. Compilers, however, are optimized for business applications rather than scientific computing. No compiler provides automatic tiling of basic numerical algorithms involving sparse matrices and non-uniform grids. As a result, poor cache performance is the norm in scientific computing applications.
Multigrid algorithms combine several numerical algorithms into a single, more complicated algorithm. Algorithms are derived here that allow the data on each level to pass through cache exactly once per multigrid cycle before the level changes. This is optimal cache usage for large problems that do not fit entirely in cache.
The new algorithms are designed to produce exactly the same solution, bit for bit, as an equivalent standard multigrid algorithm using the same ordering in the iterative solver. Hence, existing multigrid theory applies unchanged, guaranteeing both convergence and the rate of convergence.
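The two ideas above, a single pass through the data per level and bitwise agreement with the standard algorithm, can be illustrated with a minimal sketch. This is not the algorithms of this work: the model problem (a one-dimensional Poisson stencil with Gauss-Seidel smoothing) and all function names are assumptions chosen for illustration. The fused version computes the residual for a point as soon as its neighbors have their final values, while that point is still in cache, instead of making a second sweep over the array.

```python
import numpy as np

def smooth_then_residual(u, f):
    # Standard approach: one Gauss-Seidel sweep, then a SECOND full
    # sweep over the data to compute the residual r = f - A u
    # (stencil A u = 2 u[i] - u[i-1] - u[i+1]).
    n = len(u)
    for i in range(1, n - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1] + f[i])
    r = np.zeros_like(u)
    for i in range(1, n - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1])
    return r

def fused_smooth_residual(u, f):
    # Fused approach (illustrative): once u[i] is updated, the residual
    # at i-1 is final, so it is computed while u[i-2..i] are still in
    # cache. The data passes through cache once instead of twice.
    n = len(u)
    r = np.zeros_like(u)
    for i in range(1, n - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1] + f[i])
        if i >= 2:
            r[i - 1] = f[i - 1] - (2.0 * u[i - 1] - u[i - 2] - u[i])
    # Last interior residual, completed after the sweep.
    r[n - 2] = f[n - 2] - (2.0 * u[n - 2] - u[n - 3] - u[n - 1])
    return r
```

Because both versions perform identical floating-point operations in identical order, the fused sweep reproduces the standard result bit for bit, which is the sense of bitwise compatibility used here.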
Non-uniform meshes and variable-coefficient partial differential equations make designing cache-based multigrid algorithms significantly different from designing them for similar problems on uniform or tensor-product meshes. Quasi-uniform and unstructured grids have related but distinct properties that must be addressed separately.
The new algorithms run faster than standard multigrid implementations. Finally, they are designed to run well on any cache-based computer and are not tailored to any particular computer system.