## Parallel U-Cycle Multigrid Method

Dexuan Xie

Courant Institute of Mathematical Sciences,
New York University,

251 Mercer Street,
New York, NY 10012

L. Ridgway Scott

Department of Mathematics,
University of Houston,
Houston, TX 77204

Abstract
A simple way to avoid idle processors in implementing the
multigrid method based on a sequence of nested grids, \Omega_j for
j = 1, 2,..., l with \Omega_1 being the coarsest grid,
on a parallel computer is to select a proper grid \Omega_j
with 1 < j < l as the new coarsest grid. The aim of this paper is to show
that such an approach is an efficient way to implement the multigrid
method on a large scale, multiprocessor computer. In particular,
the variant of the V-cycle generated by this approach, which is
called the U-cycle, is studied in this paper. It is shown that the
convergence rate \rho(j) of the U-cycle with grid \Omega_j being the
coarsest grid is a decreasing function of j, and the coarsest grid
equations of the U-cycle can be solved approximately without increasing
the total number of U-cycle iterations. Then, a parallel U-cycle is defined
by using domain partitioning techniques, which can be implemented on
a MIMD multiprocessor computer without any idle processors.
An analysis of the time complexity of the parallel U-cycle
shows that the parallel U-cycle is fully scalable, and can have
super-linear speedup in comparison to the original V-cycle. Further, the
performance of the parallel U-cycle in the memory-constrained case is
discussed. Numerical results are presented showing that the U-cycle can have
a better performance than the V-cycle on a sequential computer while the
parallel U-cycle has a high efficiency on a large scale, MIMD multiprocessor
computer. Experiments are presented for both the Intel Paragon and the IBM SP2.