Domain decomposition techniques constitute an excellent framework for describing and developing parallel algorithms for solving distributed linear systems. These systems are typically partitioned into blocks which are then mapped into processors, most often one for each processor. The simplest of these techniques, Additive Schwarz, though easy to implement and subject to little parallel overhead, may fail to converge for difficult linear systems and for large numbers of processors. On the other hand, when the Schur complement related to the original global system is considered, iterative methods can be develped whose convergence properties do not deteriorate. In general, the related Schur complement system is better conditioned than the original system, and it is often beneficial to obtain the solution to the original system by solving the Schur complement system. Good preconditioning strategies based on this approach can be developed. However, forming the Schur complement system and solving it exactly is fairly expensive, and this has been one of the main impediments in the use of Schur complement techniques. This is especially true in distributed environments, where the communication overhead is added to the overall cost.
Here, we present Schur complement techniques that do not require an explicit construction of the global Schur complement matrix, meaning that each processor performs operations only on its local portion of the global Schur complement matrix. This is accomplished by utilizing the distributed data structure for the original matrix. The cost of solving the Schur complement system is kept minimal by using approximations of the local Schur complements and by solving inexactly the global Schur complement system. We describe several ways of approximating local Schur complements. Specifically, these approximations are based on an incomplete LU factorization and on using sparse approximate-inverse techniques. Inexact solutions of the Schur complement system are obtained by a small number of GMRES steps. Our numerical experiments show that increasing the accuracy of the solution does not always lead to reduced overall execution times though the number of iterations needed to solve the original system may decrease. It has been observed that the number of steps required for convergence and the overall solution time increase at a much lower rate than standard Schwarz methods as the number of processors and problem size increase, suggesting much better scalability properties.