NEC

MathKeisan User's Guide


BLAS and PARBLAS

Introduction

The BLAS (Basic Linear Algebra Subprograms) are high-quality routines for performing basic vector and matrix operations. Level 1 BLAS are for vector-vector operations. Level 2 BLAS are for matrix-vector operations. Level 3 BLAS are for matrix-matrix operations.
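To make the three levels concrete, the following is a minimal sketch in C of one representative operation per level, written as plain loops. The function names (axpy_sketch, gemv_sketch, gemm_sketch) are hypothetical and are not part of MathKeisan; they only mirror simplified special cases of ?AXPY, ?GEMV and ?GEMM, and they assume column-major storage with no padding in the leading dimension.

```c
#include <stddef.h>

/* Level 1 (vector-vector): y := alpha*x + y, the operation behind ?AXPY. */
void axpy_sketch(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Level 2 (matrix-vector): y := A*x for an m-by-n matrix A stored
   column-major with leading dimension m; a special case of ?GEMV. */
void gemv_sketch(size_t m, size_t n, const double *a,
                 const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i)
        y[i] = 0.0;
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            y[i] += a[i + j * m] * x[j];
}

/* Level 3 (matrix-matrix): C := A*B with A m-by-k, B k-by-n, C m-by-n,
   all column-major; a special case of ?GEMM. */
void gemm_sketch(size_t m, size_t n, size_t k,
                 const double *a, const double *b, double *c)
{
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i)
            c[i + j * m] = 0.0;
        for (size_t p = 0; p < k; ++p)
            for (size_t i = 0; i < m; ++i)
                c[i + j * m] += a[i + p * m] * b[p + j * k];
    }
}
```

The loops show how the work grows from one level to the next: a single loop for level 1, a doubly nested loop for level 2, and a triply nested loop for level 3. The library routines take additional arguments (strides, transposition flags, scaling factors) that are omitted here.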

MathKeisan for SX contains the libraries BLAS and PARBLAS. MathKeisan for IPF contains a combined serial/parallel BLAS library. Some of the subroutines in BLAS for IPF are from the Intel® Math Kernel Library (MKL).

User Interface

User interface information is available from several sources:

PARBLAS

PARBLAS is an OpenMP-parallel version of BLAS available in MathKeisan for SX. PARBLAS has the same user interface as BLAS, so any code that links to BLAS can link to PARBLAS instead. If the environment variable OMP_NUM_THREADS is set to np, PARBLAS runs on np threads. If OMP_NUM_THREADS is not set, PARBLAS runs on mp threads, where mp is the maximum number of processes in the resource group.

In MathKeisan for IPF, the BLAS library is a combined serial/parallel library. It will run on one thread if the environment variable OMP_NUM_THREADS is unset, or if the environment variable MKL_SERIAL is set to YES. It will run on np threads if OMP_NUM_THREADS is set to np.
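The thread-count rules described above can be summarized in a short C sketch. The helper names (parse_thread_count, sx_parblas_threads, ipf_blas_threads) are hypothetical and only model the documented behavior; they are not library functions. Two points are assumptions not stated in the text: a malformed OMP_NUM_THREADS value is treated as unset, and MKL_SERIAL=YES takes precedence when both variables are set.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a thread-count string such as "4".
   Returns 0 when the value is missing or not a positive integer
   (assumption: malformed values are treated as unset). */
int parse_thread_count(const char *value)
{
    if (value == NULL || *value == '\0')
        return 0;
    char *end = NULL;
    long n = strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > INT_MAX)
        return 0;
    return (int)n;
}

/* SX rule: np threads if OMP_NUM_THREADS is set to np, otherwise the
   maximum number of processes in the resource group (max_procs). */
int sx_parblas_threads(const char *omp_num_threads, int max_procs)
{
    int np = parse_thread_count(omp_num_threads);
    return np > 0 ? np : max_procs;
}

/* IPF rule: one thread if OMP_NUM_THREADS is unset or MKL_SERIAL is YES,
   otherwise np threads (assumption: MKL_SERIAL wins if both are set). */
int ipf_blas_threads(const char *omp_num_threads, const char *mkl_serial)
{
    if (mkl_serial != NULL && strcmp(mkl_serial, "YES") == 0)
        return 1;
    int np = parse_thread_count(omp_num_threads);
    return np > 0 ? np : 1;
}
```

A caller would pass the values returned by getenv("OMP_NUM_THREADS") and getenv("MKL_SERIAL"); getenv returns NULL for an unset variable, which both functions handle as the unset case.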

Inlining Level 1 BLAS

The level 1 BLAS subprograms perform a small amount of work relative to their call overhead, so inlining them may be faster than calling them. Inlining also gives the compiler better opportunities to optimize them when they are called from loops.

By contrast, the level 2 and 3 BLAS usually have a large amount of work to perform, so the call overhead is negligible, and it is best to call them, because they are optimized in special ways.
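The effect of inlining a level 1 routine can be illustrated with a small C analogue. This is only a sketch: scal_inline is a hypothetical stand-in for an inlinable ?SCAL body, and scale_columns is a hypothetical caller. The MathKeisan inlining sources and the compiler options used to inline them are not shown here.

```c
#include <stddef.h>

/* Stand-in for an inlinable level 1 routine: x := alpha*x, the operation
   behind ?SCAL. Declared static inline so the compiler can place the loop
   body directly at each call site. */
static inline void scal_inline(size_t n, double alpha, double *x)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}

/* Scale column j of an m-by-n column-major matrix by d[j].
   Each call does only m multiplications, so for small m the call
   overhead is significant relative to the work, and inlining lets the
   compiler optimize the inner loop together with the outer loop. */
void scale_columns(size_t m, size_t n, const double *d, double *a)
{
    for (size_t j = 0; j < n; ++j)
        scal_inline(m, d[j], a + j * m);
}
```

Whether the call is actually inlined remains the compiler's decision; the keyword only makes the body available at the call site.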

Source code for the BLAS suitable for inlining is provided in the MathKeisan directory src4inline. The default locations for this directory are listed below.

Host machine        Default src4inline Directory
SX native           /usr/opt/mathkeisan/inst/src4inline
SX cross-compile    /SX/opt/mathkeisan/inst/src4inline
IPF Linux           /opt/MathKeisan/src4inline

BLAS Routine List

? indicates a prefix position, which must be filled with one of the entries listed under Prefixes. Each entry is built from the following type letters:
S = REAL(kind=4), D = REAL(kind=8), C = COMPLEX(kind=4), Z = COMPLEX(kind=8)
Name Prefixes Description
Level 1 BLAS
?ROTG S D C Z Generate plane rotation
?ROTMG S D Generate modified plane rotation
?ROT S D C Z Apply plane rotation
?ROTM S D Apply modified plane rotation
?SWAP S D C Z Swap vectors
?SCAL S D C Z CS ZD Scale vector
?COPY S D C Z Copy vector
?AXPY S D C Z Vector scale and add
?DOT S D SDS DS Dot product, real
?DOTU C Z Dot product, complex
?DOTC C Z Dot product, complex, conjugate first vector
?NRM2 S D SC DZ Euclidean norm
?ASUM S D SC DZ Sum absolute values
I?AMAX S D C Z Index of maximum absolute value
Level 2 BLAS
?GEMV S D C Z General matrix-vector multiplication
?GBMV S D C Z General banded matrix-vector multiplication
?HEMV C Z Hermitian matrix-vector multiplication
?HBMV C Z Hermitian banded matrix-vector multiplication
?HPMV C Z Hermitian packed matrix-vector multiplication
?SYMV S D Symmetric matrix-vector multiplication
?SBMV S D Symmetric banded matrix-vector multiplication
?SPMV S D Symmetric packed matrix-vector multiplication
?TRMV S D C Z Triangular matrix-vector multiplication
?TBMV S D C Z Triangular banded matrix-vector multiplication
?TPMV S D C Z Triangular packed matrix-vector multiplication
?TRSV S D C Z Triangular solve
?TBSV S D C Z Triangular banded solve
?TPSV S D C Z Triangular packed solve
?GER S D General rank-1 update, real
?GERU C Z General rank-1 update, complex
?GERC C Z General rank-1 update, complex, second vector conjugate
?HER C Z Hermitian rank-1 update
?HPR C Z Hermitian packed rank-1 update
?HER2 C Z Hermitian rank-2 update
?HPR2 C Z Hermitian packed rank-2 update
?SYR S D Symmetric rank-1 update
?SPR S D Symmetric packed rank-1 update
?SYR2 S D Symmetric rank-2 update
?SPR2 S D Symmetric packed rank-2 update
Level 3 BLAS
?GEMM S D C Z General matrix-matrix multiplication
?SYMM S D C Z Symmetric matrix-matrix multiplication
?HEMM C Z Hermitian matrix-matrix multiplication
?SYRK S D C Z Symmetric rank-k update
?HERK C Z Hermitian rank-k update
?SYR2K S D C Z Symmetric rank-2k update
?HER2K C Z Hermitian rank-2k update
?TRMM S D C Z Triangular matrix-matrix multiply
?TRSM S D C Z Triangular solve
Auxiliary Subprograms
XERBLA Print argument error messages
MKVERSION Print library version information
MK_GET_VERSION Return library version number as an integer
MK_GET_VERSION_DATE Return library version date as an integer
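
The ? placeholder in the routine list above is replaced by a prefix to obtain the actual routine name. As a minimal sketch, the hypothetical C helper below (expand_blas_name, not a library function) performs that substitution on a name pattern such as ?GEMM or I?AMAX.

```c
#include <stdio.h>
#include <string.h>

/* Replace the first '?' in pattern with prefix and write the result to out.
   Returns 0 on success, or -1 if pattern contains no '?' or the result
   does not fit in out_size bytes (including the terminating NUL). */
int expand_blas_name(const char *pattern, const char *prefix,
                     char *out, size_t out_size)
{
    const char *q = strchr(pattern, '?');
    if (q == NULL)
        return -1;
    /* Text before '?', then the prefix, then the text after '?'. */
    int written = snprintf(out, out_size, "%.*s%s%s",
                           (int)(q - pattern), pattern, prefix, q + 1);
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

For example, the pattern ?GEMM with prefix D gives DGEMM, I?AMAX with prefix Z gives IZAMAX, and ?NRM2 with prefix DZ gives DZNRM2. The helper does not check that the prefix is one of those listed for the routine.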

Further Reading

  1. A Quick Reference Guide for BLAS