NEC

MathKeisan User's Guide


ScaLAPACK and BLACS

Introduction

ScaLAPACK (Scalable Linear Algebra PACKage) is a library of high-performance linear algebra routines for distributed-memory message passing computers. ScaLAPACK has routines for systems of linear equations, linear least squares problems, eigenvalue calculation, and singular value decomposition. ScaLAPACK can also handle many associated computations such as matrix factorization or estimating condition numbers. Dense and band matrices are supported, but not general sparse matrices. Similar functionality is provided for both real and complex matrices.

As in LAPACK, the ScaLAPACK routines are based on block-partitioned algorithms, in order to minimize data movement. The fundamental building block of the ScaLAPACK library is a distributed memory version of the Level 1, 2, and 3 BLAS, called PBLAS (Parallel BLAS). The PBLAS are, in turn, built on the BLAS for computation on a single node, and on BLACS for communication across nodes. PBLAS is an integral part of the ScaLAPACK library.
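To make the idea of a block-partitioned algorithm concrete, here is a minimal serial sketch in Python. It is an illustration only, not ScaLAPACK or PBLAS code: the function name `blocked_matmul` and the tile size parameter `nb` are assumptions made for this example. It shows how a matrix product can be organized so that work proceeds tile by tile, which is the property that lets a tuned library keep a tile in fast memory while it is reused.

```python
def blocked_matmul(a, b, nb):
    """Multiply a (m x k) by b (k x n), visiting the operands in nb x nb tiles.

    a and b are lists of rows. The result equals the ordinary matrix product;
    only the order in which the entries are visited changes.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, nb):              # tile row of C
        for j0 in range(0, n, nb):          # tile column of C
            for p0 in range(0, k, nb):      # tiles along the inner dimension
                for i in range(i0, min(i0 + nb, m)):
                    for j in range(j0, min(j0 + nb, n)):
                        s = c[i][j]
                        for p in range(p0, min(p0 + nb, k)):
                            s += a[i][p] * b[p][j]
                        c[i][j] = s
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(blocked_matmul(a, b, nb=2))
```

In a production library the innermost three loops would be a single Level 3 BLAS call on the tile; the sketch spells them out only to stay self-contained.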

The BLACS (Basic Linear Algebra Communication Subprograms) are a message-passing library designed for linear algebra. The computational model consists of a one- or two-dimensional process grid, where each process stores pieces of the matrices and vectors. The BLACS include synchronous send/receive routines to communicate a matrix or submatrix from one process to another, to broadcast submatrices to many processes, or to compute global data reductions (sums, maxima, and minima). There are also routines to construct, change, or query the process grid. Since several ScaLAPACK algorithms require broadcasts or reductions among different subsets of processes, the BLACS permit a process to be a member of several overlapping or disjoint process grids, each one labeled by a context. A BLACS context corresponds to what MPI calls a communicator. The BLACS provide facilities for safe interoperation of system contexts and BLACS contexts.
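The paragraph above says that each process stores pieces of the matrices but does not say how the pieces are assigned. ScaLAPACK's standard layout for dense matrices is the two-dimensional block-cyclic distribution, applied independently along rows and columns. The following Python sketch shows the one-dimensional rule. The function `numroc` mirrors the arithmetic of the ScaLAPACK tool routine NUMROC; `owner_and_local` is a hypothetical helper introduced for this example, and all indices are 0-based, unlike the Fortran interface.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of an n-long dimension owned by process iproc.

    nb is the block size, isrcproc the process holding the first block,
    nprocs the number of processes along this grid dimension.
    """
    mydist = (nprocs + iproc - isrcproc) % nprocs   # distance from the source
    nblocks = n // nb                               # number of full blocks
    count = (nblocks // nprocs) * nb                # full rounds of the cycle
    extrablks = nblocks % nprocs                    # leftover full blocks
    if mydist < extrablks:
        count += nb                                 # one extra full block
    elif mydist == extrablks:
        count += n % nb                             # the trailing partial block
    return count


def owner_and_local(g, nb, nprocs, isrcproc=0):
    """Map a global index g to (owning process, local index on that process)."""
    block = g // nb
    proc = (block + isrcproc) % nprocs
    local = (block // nprocs) * nb + g % nb
    return proc, local


if __name__ == "__main__":
    # 9 rows, block size 2, 2 process rows: blocks alternate between processes.
    for g in range(9):
        print(g, owner_and_local(g, nb=2, nprocs=2))
    print([numroc(9, 2, p, 0, 2) for p in range(2)])
```

For a two-dimensional grid the same rule is applied once with the row block size and the number of process rows, and once with the column block size and the number of process columns.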

User Interface

User interface information is available from several sources:

ScaLAPACK Routine List

Simple Driver and Divide and Conquer Driver Routines

? stands for a type prefix, which must be replaced by one of the following:
S = REAL(kind=4), D = REAL(kind=8), C = COMPLEX(kind=4), Z = COMPLEX(kind=8)
Name Prefixes Description
P?DBSV S D C Z Solves a general banded system of linear equations AX=B with no pivoting.
P?DTSV S D C Z Solves a general tridiagonal system of linear equations AX=B with no pivoting.
P?GBSV S D C Z Solves a general banded system of linear equations AX=B.
P?GELS S D C Z Solves over-determined or under-determined linear systems involving a matrix of full rank.
P?GESV S D C Z Solves a general system of linear equations AX=B.
P?GESVD S D Computes the singular value decomposition of a general matrix, optionally computing the left and/or right singular vectors.
P?PBSV S D C Z Solves a symmetric/Hermitian positive definite banded system of linear equations AX=B.
P?POSV S D C Z Solves a symmetric/Hermitian positive definite system of linear equations AX=B.
P?PTSV S D C Z Solves a symmetric/Hermitian positive definite tridiagonal system of linear equations AX=B.
P?SYEV S D Computes all eigenvalues and, optionally, eigenvectors of a symmetric matrix.
P?SYEVD S D Computes all eigenvalues, and optionally, eigenvectors of a real symmetric matrix. If eigenvectors are desired, it uses a divide and conquer algorithm.
P?HEEV C Z Computes all eigenvalues and, optionally, eigenvectors of a Hermitian matrix.
P?HEEVD C Z Computes all eigenvalues and, optionally, eigenvectors of a Hermitian matrix. If eigenvectors are desired, it uses a divide and conquer algorithm.
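The naming convention used in these tables can be applied mechanically: substituting each listed prefix for the ? gives the concrete routine names. The Python sketch below is an illustration only; `expand_names` is a hypothetical helper and not part of ScaLAPACK or MathKeisan.

```python
def expand_names(pattern, prefixes):
    """Replace the ? in a routine pattern with each prefix in turn.

    pattern  -- a name from the Name column, for example "P?GESV"
    prefixes -- the Prefixes column as a space-separated string
    """
    return [pattern.replace("?", prefix, 1) for prefix in prefixes.split()]


if __name__ == "__main__":
    print(expand_names("P?GESV", "S D C Z"))
    print(expand_names("P?GESVD", "S D"))
    print(expand_names("P?HEEV", "C Z"))
```

For example, the P?GESV row stands for the four routines PSGESV, PDGESV, PCGESV, and PZGESV, while P?GESVD stands only for PSGESVD and PDGESVD.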

Expert Driver Routines

? stands for a type prefix, which must be replaced by one of the following:
S = REAL(kind=4), D = REAL(kind=8), C = COMPLEX(kind=4), Z = COMPLEX(kind=8)
Name Prefixes Description
P?GESVX S D C Z Solves a general system of linear equations AX=B.
P?POSVX S D C Z Solves a symmetric/Hermitian positive definite system of linear equations AX=B.
P?SYEVX S D Computes selected eigenvalues and eigenvectors of a symmetric matrix.
P?SYGVX S D Computes selected eigenvalues and eigenvectors of a real generalized symmetric-definite eigenproblem.
P?HEEVX C Z Computes selected eigenvalues and eigenvectors of a Hermitian matrix.
P?HEGVX C Z Computes selected eigenvalues and eigenvectors of a generalized Hermitian-definite eigenproblem.

Computational Routines

? stands for a type prefix, which must be replaced by one of the following:
S = REAL(kind=4), D = REAL(kind=8), C = COMPLEX(kind=4), Z = COMPLEX(kind=8)
Name Prefixes Description
P?DBTRF S D C Z Computes an LU factorization of a general band matrix with no pivoting.
P?DBTRS S D C Z Solves a general banded system of linear equations AX=B, A^T X=B or A^H X=B, using the LU factorization computed by P?DBTRF.
P?DBTRSV S D C Z Solves a banded triangular system of linear equations AX=B, A^T X=B or A^H X=B, using the LU factorization computed by P?DBTRF.
P?DTTRF S D C Z Computes an LU factorization of a general tridiagonal matrix with no pivoting.
P?DTTRS S D C Z Solves a general tridiagonal system of linear equations AX=B, A^T X=B or A^H X=B, using the LU factorization computed by P?DTTRF.
P?DTTRSV S D C Z Solves a tridiagonal triangular system of linear equations AX=B, A^T X=B or A^H X=B, using the LU factorization computed by P?DTTRF.
P?GBTRF S D C Z Computes an LU factorization of a general band matrix, using partial pivoting with row interchanges.
P?GBTRS S D C Z Solves a general banded system of linear equations AX=B, A^T X=B or A^H X=B, using the LU factorization computed by P?GBTRF.
P?GEBRD S D C Z Reduces a general rectangular matrix to real bidiagonal form by an orthogonal/unitary transformation.
P?GECON S D C Z Estimates the reciprocal of the condition number of a general matrix.
P?GEEQU S D C Z Computes row and column scalings to equilibrate a general rectangular matrix and reduce its condition number.
P?GEHRD S D C Z Reduces a general matrix to upper Hessenberg form by an orthogonal/unitary similarity transformation.
P?GELQF S D C Z Computes an LQ factorization of a general rectangular matrix.
P?GEQLF S D C Z Computes a QL factorization of a general rectangular matrix.
P?GEQPF S D C Z Computes a QR factorization with column pivoting of a general rectangular matrix.
P?GEQRF S D C Z Computes a QR factorization of a general rectangular matrix.
P?GERFS S D C Z Improves the computed solution to a system of linear equations and provides error bounds and backward error estimates for the solutions.
P?GERQF S D C Z Computes an RQ factorization of a general rectangular matrix.
P?GETRF S D C Z Computes an LU factorization of a general matrix, using partial pivoting with row interchanges.
P?GETRI S D C Z Computes the inverse of a general matrix, using the LU factorization computed by P?GETRF.
P?GETRS S D C Z Solves a general system of linear equations AX=B, A^T X=B or A^H X=B, using the LU factorization computed by P?GETRF.
P?GGQRF S D C Z Computes a generalized QR factorization.
P?GGRQF S D C Z Computes a generalized RQ factorization.
P?LAHQR S D Computes the Schur decomposition and/or eigenvalues of a matrix already in Hessenberg form.
P?ORGLQ S D Generates all or part of the orthogonal matrix Q from an LQ factorization determined by P?GELQF.
P?ORGQL S D Generates all or part of the orthogonal matrix Q from a QL factorization determined by P?GEQLF.
P?ORGQR S D Generates all or part of the orthogonal matrix Q from a QR factorization determined by P?GEQRF.
P?ORGRQ S D Generates all or part of the orthogonal matrix Q from an RQ factorization determined by P?GERQF.
P?ORMBR S D Multiplies a general matrix by one of the orthogonal transformation matrices from a reduction to bidiagonal form determined by P?GEBRD.
P?ORMHR S D Multiplies a general matrix by the orthogonal transformation matrix from a reduction to Hessenberg form determined by P?GEHRD.
P?ORMLQ S D Multiplies a general matrix by the orthogonal matrix from an LQ factorization determined by P?GELQF.
P?ORMQL S D Multiplies a general matrix by the orthogonal matrix from a QL factorization determined by P?GEQLF.
P?ORMQR S D Multiplies a general matrix by the orthogonal matrix from a QR factorization determined by P?GEQRF.
P?ORMRQ S D Multiplies a general matrix by the orthogonal matrix from an RQ factorization determined by P?GERQF.
P?ORMRZ S D Multiplies a general matrix by the orthogonal transformation matrix from a reduction to upper triangular form determined by P?TZRZF.
P?ORMTR S D Multiplies a general matrix by the orthogonal transformation matrix from a reduction to tridiagonal form determined by P?SYTRD.
P?PBTRF S D C Z Computes the Cholesky factorization of a symmetric/Hermitian positive definite banded matrix.
P?PBTRS S D C Z Solves a symmetric/Hermitian positive definite banded system of linear equations AX=B, using the Cholesky factorization computed by P?PBTRF.
P?PBTRSV S D C Z Solves a banded triangular system of linear equations AX=B, using the Cholesky factorization computed by P?PBTRF.
P?POCON S D C Z Estimates the reciprocal of the condition number of a symmetric/Hermitian positive definite distributed matrix.
P?POEQU S D C Z Computes row and column scalings to equilibrate a symmetric/Hermitian positive definite matrix and reduce its condition number.
P?PORFS S D C Z Improves the computed solution to a symmetric/Hermitian positive definite system of linear equations AX=B, and provides forward and backward error bounds for the solution.
P?POTRF S D C Z Computes the Cholesky factorization of a symmetric/Hermitian positive definite matrix.
P?POTRI S D C Z Computes the inverse of a symmetric/Hermitian positive definite matrix, using the Cholesky factorization computed by P?POTRF.
P?POTRS S D C Z Solves a symmetric/Hermitian positive definite system of linear equations AX=B, using the Cholesky factorization computed by P?POTRF.
P?PTTRF S D C Z Computes the Cholesky factorization of a symmetric/Hermitian positive definite tridiagonal matrix.
P?PTTRS S D C Z Solves a symmetric/Hermitian positive definite tridiagonal system of linear equations AX=B, using the Cholesky factorization computed by P?PTTRF.
P?PTTRSV S D C Z Solves a tridiagonal triangular system of linear equations AX=B, using the Cholesky factorization computed by P?PTTRF.
P?STEBZ S D C Z Computes the eigenvalues of a symmetric/Hermitian tridiagonal matrix by bisection.
P?STEDC S D Computes all eigenvalues and, optionally, eigenvectors of a symmetric tridiagonal matrix using the divide and conquer algorithm.
P?STEIN S D C Z Computes the eigenvectors of a symmetric/Hermitian tridiagonal matrix using inverse iteration.
P?SYGST S D Reduces a symmetric-definite generalized eigenproblem to standard form.
P?SYTRD S D Reduces a symmetric matrix to real symmetric tridiagonal form by an orthogonal similarity transformation.
P?TRCON S D C Z Estimates the reciprocal of the condition number of a triangular matrix.
P?TRRFS S D C Z Provides error bounds and backward error estimates for the solution to a system of linear equations with a triangular coefficient matrix.
P?TRTRI S D C Z Computes the inverse of a triangular matrix.
P?TRTRS S D C Z Solves a triangular system of linear equations AX=B, A^T X=B or A^H X=B.
P?TZRZF S D C Z Reduces an upper trapezoidal matrix to upper triangular form by means of orthogonal transformations.
P?HEGST C Z Reduces a Hermitian-definite generalized eigenproblem to standard form.
P?HETRD C Z Reduces a Hermitian matrix to Hermitian tridiagonal form by a unitary similarity transformation.
P?UNGLQ C Z Generates all or part of the unitary matrix Q from an LQ factorization determined by P?GELQF.
P?UNGQL C Z Generates all or part of the unitary matrix Q from a QL factorization determined by P?GEQLF.
P?UNGQR C Z Generates all or part of the unitary matrix Q from a QR factorization determined by P?GEQRF.
P?UNGRQ C Z Generates all or part of the unitary matrix Q from an RQ factorization determined by P?GERQF.
P?UNMBR C Z Multiplies a general matrix by one of the unitary transformation matrices from a reduction to bidiagonal form determined by P?GEBRD.
P?UNMHR C Z Multiplies a general matrix by the unitary transformation matrix from a reduction to Hessenberg form determined by P?GEHRD.
P?UNMLQ C Z Multiplies a general matrix by the unitary matrix from an LQ factorization determined by P?GELQF.
P?UNMQL C Z Multiplies a general matrix by the unitary matrix from a QL factorization determined by P?GEQLF.
P?UNMQR C Z Multiplies a general matrix by the unitary matrix from a QR factorization determined by P?GEQRF.
P?UNMRQ C Z Multiplies a general matrix by the unitary matrix from an RQ factorization determined by P?GERQF.
P?UNMRZ C Z Multiplies a general matrix by the unitary transformation matrix from a reduction to upper triangular form determined by P?TZRZF.
P?UNMTR C Z Multiplies a general matrix by the unitary transformation matrix from a reduction to tridiagonal form determined by P?HETRD.

PBLAS Routine List

? stands for a type prefix, which must be replaced by one of the following or, where listed, by a two-letter combination of them:
S = REAL(kind=4), D = REAL(kind=8), C = COMPLEX(kind=4), Z = COMPLEX(kind=8)
  Name Prefixes Description
Level 1
P?SWAP S D C Z Swap vectors
P?SCAL S D C Z CS ZD Scale vector
P?COPY S D C Z Copy vector
P?AXPY S D C Z Vector scale and add
P?DOT S D Dot product, real
P?DOTU C Z Dot product, complex
P?DOTC C Z Dot product, complex, conjugate first vector
P?NRM2 S D SC DZ Euclidean norm
P?ASUM S D SC DZ Sum absolute values
P?AMAX S D C Z Index of maximum absolute value
Level 2
P?GEMV S D C Z General matrix-vector multiplication
P?HEMV C Z Hermitian matrix-vector multiplication
P?SYMV S D C Z Symmetric matrix-vector multiplication
P?TRMV S D C Z Triangular matrix-vector multiplication
P?TRSV S D C Z Triangular solve
P?GER S D General rank-1 update, real
P?GERU C Z General rank-1 update, complex
P?GERC C Z General rank-1 update, complex, second vector conjugate
P?HER C Z Hermitian rank-1 update
P?HER2 C Z Hermitian rank-2 update
P?SYR S D Symmetric rank-1 update
P?SYR2 S D Symmetric rank-2 update
Level 3
P?GEMM S D C Z General matrix-matrix multiplication
P?SYMM S D C Z Symmetric matrix-matrix multiplication
P?HEMM C Z Hermitian matrix-matrix multiplication
P?SYRK S D C Z Symmetric rank-k update
P?HERK C Z Hermitian rank-k update
P?SYR2K S D C Z Symmetric rank-2k update
P?HER2K C Z Hermitian rank-2k update
P?TRAN S D Matrix transpose, real
P?TRANU C Z Matrix transpose, complex
P?TRANC C Z Matrix transpose, complex, conjugate
P?TRMM S D C Z Triangular matrix-matrix multiply
P?TRSM S D C Z Triangular solve
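The one-line descriptions in the table above are terse, so the following Python sketch spells out the arithmetic behind one representative routine per level: P?AXPY (Level 1), P?GEMV (Level 2), and P?GEMM (Level 3). These are serial reference versions written for this guide's reader, not the PBLAS interface: the function names are hypothetical, and the sketch ignores data distribution, array descriptors, submatrix offsets, and the transpose options that the real routines accept.

```python
def axpy(alpha, x, y):
    """Level 1, vector scale and add: returns alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, a, x, beta, y):
    """Level 2, general matrix-vector multiplication: alpha*A*x + beta*y."""
    return [
        alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
        for row, yi in zip(a, y)
    ]


def gemm(alpha, a, b, beta, c):
    """Level 3, general matrix-matrix multiplication: alpha*A*B + beta*C."""
    n = len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(len(b))) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(len(a))
    ]


if __name__ == "__main__":
    print(axpy(2, [1, 2], [10, 20]))
    print(gemv(1, [[1, 2], [3, 4]], [1, 1], 0, [5, 5]))
    print(gemm(1, [[1, 2], [3, 4]], [[1, 0], [0, 1]], 1, [[1, 1], [1, 1]]))
```

The real routines update their output argument in place; the sketch returns new lists only to keep the example short.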

BLACS Routine List

? stands for a type prefix, which must be replaced by one of the following:
Fortran: i = integer, s = REAL(KIND=4), d = REAL(KIND=8), c = COMPLEX(KIND=4), z = COMPLEX(KIND=8)
C: i = int, s = float, d = double, c = complex, z = double complex
  C Name Fortran Name Prefixes Description
Initialization
Cblacs_pinfo blacs_pinfo   Get initial system information that is required before BLACS is set up
Cblacs_setup blacs_setup   Functionally equivalent to blacs_pinfo
Cblacs_get blacs_get   Returns values BLACS is using for internal defaults
Cblacs_set blacs_set   Sets BLACS internal defaults
Cblacs_gridinit blacs_gridinit   Assigns processors to BLACS process grid
Cblacs_gridmap blacs_gridmap   Assigns processors to BLACS process grid in arbitrary manner
Destruction
Cblacs_freebuff blacs_freebuff   Releases BLACS buffer
Cblacs_gridexit blacs_gridexit   Frees a BLACS context
Cblacs_abort blacs_abort   Aborts all BLACS processes
Cblacs_exit blacs_exit   Frees all BLACS contexts and allocated memory
Sending
C?gesd2d ?gesd2d s d c z i General send 2-d
C?gebs2d ?gebs2d s d c z i General broadcast send 2-d
C?trsd2d ?trsd2d s d c z i Trapezoidal send 2-d
C?trbs2d ?trbs2d s d c z i Trapezoidal broadcast send 2-d
Receiving
C?gerv2d ?gerv2d s d c z i General receive
C?gebr2d ?gebr2d s d c z i General broadcast receive
C?trrv2d ?trrv2d s d c z i Trapezoidal receive
C?trbr2d ?trbr2d s d c z i Trapezoidal broadcast receive
Combine
C?gamx2d ?gamx2d s d c z i General element-wise absolute value maximum
C?gamn2d ?gamn2d s d c z i General element-wise absolute value minimum
C?gsum2d ?gsum2d s d c z i General element-wise summation
Information and Miscellaneous
Cblacs_gridinfo blacs_gridinfo   Returns information on BLACS grid
Cblacs_pnum blacs_pnum   Returns system process number
Cblacs_pcoord blacs_pcoord   Returns row and col in BLACS process grid
Cblacs_barrier blacs_barrier   Holds up execution of all processes until all processes call this routine
Non-Standard
Csetpvmtids setpvmtids   PVM routine, not used
Cdcputime00 dcputime00   Returns CPU seconds since arbitrary starting point
Cdwalltime00 dwalltime00   Returns wall clock seconds since arbitrary starting point
Cksendid ksendid   Returns BLACS message ID
Ckrecvid krecvid   Returns BLACS message ID for receive
Ckbsid kbsid   Returns BLACS message ID for source
Ckbrid kbrid   Returns BLACS message ID for destination in broadcast
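The relationship between a process number and its position in the process grid, which blacs_pnum and blacs_pcoord report, can be sketched without calling the library. The Python functions below are hypothetical stand-ins written for illustration. They assume a grid whose processes are numbered 0 to nprow*npcol-1 and laid out in either row-major ("R") or column-major ("C") order, the two natural orderings that blacs_gridinit accepts. A grid built with blacs_gridmap can use an arbitrary assignment, which this sketch does not cover.

```python
def grid_coords(pnum, nprow, npcol, order="R"):
    """Return (prow, pcol) for process number pnum in an nprow x npcol grid."""
    if not 0 <= pnum < nprow * npcol:
        raise ValueError("process number is outside the grid")
    if order == "R":
        return divmod(pnum, npcol)          # fill each grid row in turn
    pcol, prow = divmod(pnum, nprow)        # fill each grid column in turn
    return prow, pcol


def grid_pnum(prow, pcol, nprow, npcol, order="R"):
    """Return the process number at grid position (prow, pcol)."""
    if not (0 <= prow < nprow and 0 <= pcol < npcol):
        raise ValueError("coordinates are outside the grid")
    return prow * npcol + pcol if order == "R" else pcol * nprow + prow


if __name__ == "__main__":
    # A 2 x 3 grid in row-major order.
    for p in range(6):
        print(p, grid_coords(p, 2, 3))
```

The two functions are inverses of each other for a given grid shape and ordering, which is a convenient sanity check when debugging a grid setup.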

Further Reading

  1. ScaLAPACK Users' Guide
  2. ScaLAPACK Quick Reference
  3. PBLAS Quick Reference
  4. BLACS Quick Reference
  5. A hardcopy of the ScaLAPACK Users' Guide may be ordered from SIAM:

    SIAM Customer Service
    P.O. Box 7260
    Philadelphia, PA 19104

    USA: 800-447-7426
    Worldwide: 215-382-9800
    FAX: 215-386-7999

    service@siam.org
    http://www.siam.org/