Accelerating the Complex Hessenberg QR Algorithm with the CSX600 Floating-Point Coprocessor

Y. Yamamoto, T. Miyata, and Y. Nakamura (Japan)


Eigenvalues, QR algorithm, Complex matrix, Floating point coprocessor, CSX600


In this paper, we show how to speed up the Hessenberg QR algorithm for computing the eigenvalues of a dense com plex nonsymmetric matrix using the CSX600 floating-point coprocessor. Our approach is based on the small-bulge multishift QR algorithm proposed by Braman et al. This al gorithm can perform a major part of the computation in the form of matrix multiplication and is therefore very suited to high performance architectures. However, to exploit the potential of the CSX600, the number of shifts must be in creased to more than two hundred. In that case, the parts other than matrix multiplication begin to occupy a consid erable percentage of the execution time, thereby limiting the speedup. To solve this problem, we focus on two most time-consuming parts in the algorithm except for matrix multiplication: accumulation of bulge-chasing transforma tions and solution of a decoupled small eigenproblem. We reconstruct the former as a recursive algorithm and apply a blocking technique to the latter. This enables these parts to be computed also using matrix multiplication. As a result of these optimizations, we obtained up to 3.8 times speedup with the CSX600 processor over a 3.2GHz Xeon proces sor when computing the eigenvalues of a 12,000 × 12,000 complex Hessenberg matrix.

Important Links:

Go Back