Using cuBLAS (3)
迪丽瓦拉
2025-05-29 07:06:12

Contents

 cublasgbmv()

 cublasspr()

 cublasspr2()

 cublasspmv()

cublasgemvStridedBatched()

 cublasgemvBatched()

 cublashpr()

 cublasher2()

 cublashpmv()

 cublashbmv()

cublashemv()

cublastrsv()

cublastrmv()

cublastpsv()

 cublastpmv()

cublastbsv()

 cublastbmv()

 cublassyr2()

cublassyr()

 cublassymv()

 cublassbmv()

 cublasger()

cublasgemv()


In this chapter, we describe the Level-2 Basic Linear Algebra Subprograms (BLAS2) functions that perform matrix-vector operations.

 cublasgbmv()

cublasStatus_t cublasSgbmv(cublasHandle_t handle, cublasOperation_t trans,int m, int n, int kl, int ku,const float           *alpha,const float           *A, int lda,const float           *x, int incx,const float           *beta,float           *y, int incy)
cublasStatus_t cublasDgbmv(cublasHandle_t handle, cublasOperation_t trans,int m, int n, int kl, int ku,const double          *alpha,const double          *A, int lda,const double          *x, int incx,const double          *beta,double          *y, int incy)
cublasStatus_t cublasCgbmv(cublasHandle_t handle, cublasOperation_t trans,int m, int n, int kl, int ku,const cuComplex       *alpha,const cuComplex       *A, int lda,const cuComplex       *x, int incx,const cuComplex       *beta,cuComplex       *y, int incy)
cublasStatus_t cublasZgbmv(cublasHandle_t handle, cublasOperation_t trans,int m, int n, int kl, int ku,const cuDoubleComplex *alpha,const cuDoubleComplex *A, int lda,const cuDoubleComplex *x, int incx,const cuDoubleComplex *beta,cuDoubleComplex *y, int incy)

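This function supports the 64-bit integer interface.
This function performs the banded matrix-vector multiplication y = alpha * op(A) * x + beta * y, where A is an m x n banded matrix with kl subdiagonals and ku superdiagonals, x and y are vectors, and alpha and beta are scalars.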

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

trans

input

operation op(A) that is non- or (conj.) transpose.

m

input

number of rows of matrix A.

n

input

number of columns of matrix A.

kl

input

number of subdiagonals of matrix A.

ku

input

number of superdiagonals of matrix A.

alpha

host or device

input

scalar used for multiplication.

A

device

input

array of dimension lda x n with lda >= kl+ku+1.

lda

input

leading dimension of two-dimensional array used to store matrix A.

x

device

input

vector with n elements if trans == CUBLAS_OP_N and m elements otherwise.

incx

input

stride between consecutive elements of x.

beta

host or device

input

scalar used for multiplication, if beta==0 then y does not have to be a valid input.

y

device

in/out

vector with m elements if trans == CUBLAS_OP_N and n elements otherwise.

incy

input

stride between consecutive elements of y.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If m, n, kl, ku < 0 or

  • if lda < (kl + ku + 1) or

  • if incx = 0 or incy = 0 or

  • if trans != CUBLAS_OP_N, CUBLAS_OP_T, CUBLAS_OP_C or

  • alpha == NULL or beta == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
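
To make the banded layout concrete, here is a minimal CPU reference sketch in plain C. It is not a cuBLAS call. It assumes the usual BLAS column-major band storage, where the 0-based element A(i,j) sits at A[(ku + i - j) + j*lda], and it covers only the non-transposed, real, positive-increment case. The name ref_sgbmv_n is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for y = alpha*A*x + beta*y, with A an m x n band matrix
 * (kl subdiagonals, ku superdiagonals) held in column-major band storage:
 * A(i,j) is read from band[(ku + i - j) + j*lda], 0-based.
 * Sketch only: no transpose, positive increments. */
void ref_sgbmv_n(int m, int n, int kl, int ku, float alpha,
                 const float *band, int lda,
                 const float *x, int incx,
                 float beta, float *y, int incy)
{
    assert(m >= 0 && n >= 0 && kl >= 0 && ku >= 0);
    assert(lda >= kl + ku + 1);
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < m; ++i) {
        /* Only columns inside the band contribute to row i. */
        int jlo = (i - kl > 0) ? i - kl : 0;
        int jhi = (i + ku < n - 1) ? i + ku : n - 1;
        float acc = 0.0f;
        for (int j = jlo; j <= jhi; ++j)
            acc += band[(size_t)(ku + i - j) + (size_t)j * lda]
                 * x[(size_t)j * incx];
        float *yi = &y[(size_t)i * incy];
        /* When beta == 0 the old value of y is never read. */
        *yi = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * *yi;
    }
}
```

A result from cublasSgbmv() on small inputs can be compared against this routine to check that the band array was filled in the expected layout.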

 cublasspr()

cublasStatus_t cublasSspr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const float  *alpha,const float  *x, int incx, float  *AP)
cublasStatus_t cublasDspr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const double *alpha,const double *x, int incx, double *AP)

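This function supports the 64-bit integer interface.
This function performs the packed symmetric rank-1 update A = alpha * x * x^T + A, where A is an n x n symmetric matrix stored in packed format, x is a vector, and alpha is a scalar.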

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

AP

device

in/out

array with A stored in packed format.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx = 0 or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • alpha == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
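
The packed format referred to above stores only one triangle, column by column, without gaps. The following plain-C sketch (a CPU reference, not a cuBLAS call) shows one way to compute the 0-based offset of A(i,j) inside AP and uses it for the rank-1 update. The names packed_index and ref_sspr are hypothetical, and the index formulas are derived here for 0-based indices rather than quoted from the library documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of A(i,j) (0-based) inside a packed column-major triangle.
 * lower != 0: lower triangle, requires i >= j.
 * lower == 0: upper triangle, requires i <= j. */
size_t packed_index(int n, int lower, int i, int j)
{
    assert(0 <= i && i < n && 0 <= j && j < n);
    assert(lower ? i >= j : i <= j);
    return lower ? (size_t)i + (size_t)j * (size_t)(2 * n - j - 1) / 2
                 : (size_t)i + (size_t)j * (size_t)(j + 1) / 2;
}

/* CPU reference for A = alpha*x*x^T + A, touching the stored triangle only. */
void ref_sspr(int n, int lower, float alpha,
              const float *x, int incx, float *AP)
{
    assert(n >= 0 && incx > 0); /* sketch: positive increments only */
    for (int j = 0; j < n; ++j) {
        int ilo = lower ? j : 0;
        int ihi = lower ? n - 1 : j;
        for (int i = ilo; i <= ihi; ++i)
            AP[packed_index(n, lower, i, j)] +=
                alpha * x[(size_t)i * incx] * x[(size_t)j * incx];
    }
}
```

The packed array holds n*(n+1)/2 elements, so for n = 3 both triangles occupy six floats, only in a different order.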

 cublasspr2()

cublasStatus_t cublasSspr2(cublasHandle_t handle, cublasFillMode_t uplo,int n, const float  *alpha,const float  *x, int incx,const float  *y, int incy, float  *AP)
cublasStatus_t cublasDspr2(cublasHandle_t handle, cublasFillMode_t uplo,int n, const double *alpha,const double *x, int incx,const double *y, int incy, double *AP)

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

y

device

input

vector with n elements.

incy

input

stride between consecutive elements of y.

AP

device

in/out

array with A stored in packed format.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx = 0 or incy = 0 or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • alpha == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU

 cublassbmv()

cublasStatus_t cublasSsbmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, int k, const float  *alpha,const float  *A, int lda,const float  *x, int incx,const float  *beta, float *y, int incy)
cublasStatus_t cublasDsbmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, int k, const double *alpha,const double *A, int lda,const double *x, int incx,const double *beta, double *y, int incy)

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

k

input

number of sub- and super-diagonals of matrix A.

alpha

host or device

input

scalar used for multiplication.

A

device

input

array of dimension lda x n with lda >= k+1.

lda

input

leading dimension of two-dimensional array used to store matrix A.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

beta

host or device

input

scalar used for multiplication, if beta==0 then y does not have to be a valid input.

y

device

in/out

vector with n elements.

incy

input

stride between consecutive elements of y.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or k < 0 or

  • if incx = 0 or incy = 0 or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • if alpha == NULL or beta == NULL or

  • lda < (1 + k)

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
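
As an illustration of the symmetric band layout (lda >= k+1), here is a small CPU reference in plain C. It is a sketch rather than the library's implementation, and it assumes lower-mode storage, where the 0-based element A(i,j) with j <= i <= j+k is read from A[(i - j) + j*lda], and unit strides for x and y. The name ref_ssbmv_lower is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for y = alpha*A*x + beta*y, with A an n x n symmetric band
 * matrix (k sub- and super-diagonals) whose lower part is stored:
 * A(i,j) is read from band[(i - j) + j*lda] for j <= i <= min(n-1, j+k).
 * Sketch only: lower mode, unit strides. */
void ref_ssbmv_lower(int n, int k, float alpha,
                     const float *band, int lda,
                     const float *x, float beta, float *y)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    for (int i = 0; i < n; ++i)
        y[i] = (beta == 0.0f) ? 0.0f : beta * y[i]; /* y unread if beta == 0 */
    for (int j = 0; j < n; ++j) {
        int ihi = (j + k < n - 1) ? j + k : n - 1;
        for (int i = j; i <= ihi; ++i) {
            float a = band[(size_t)(i - j) + (size_t)j * lda];
            y[i] += alpha * a * x[j];
            if (i != j)
                y[j] += alpha * a * x[i]; /* mirrored entry A(j,i) = A(i,j) */
        }
    }
}
```

The upper part of A is never stored or read. Each off-diagonal entry is applied twice, once for its own position and once for the mirrored one.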

 cublasspmv()

cublasStatus_t cublasSspmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const float  *alpha, const float  *AP,const float  *x, int incx, const float  *beta,float  *y, int incy)
cublasStatus_t cublasDspmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const double *alpha, const double *AP,const double *x, int incx, const double *beta,double *y, int incy)

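This function supports the 64-bit integer interface.
This function performs the symmetric packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n x n symmetric matrix stored in packed format, x and y are vectors, and alpha and beta are scalars.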

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other symmetric part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

AP

device

input

array with A stored in packed format.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

beta

host or device

input

scalar used for multiplication, if beta==0 then y does not have to be a valid input.

y

device

in/out

vector with n elements.

incy

input

stride between consecutive elements of y.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx = 0 or incy = 0 or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • alpha == NULL or beta == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
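
A short CPU reference in plain C may help when checking results from the packed multiply. This sketch assumes upper-mode packing, where the columns of the upper triangle are stored back to back so that AP can be walked sequentially, and unit strides. It is not a cuBLAS call, and the name ref_sspmv_upper is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for y = alpha*A*x + beta*y, with A symmetric and its upper
 * triangle packed column by column in AP (n*(n+1)/2 elements).
 * Sketch only: upper mode, unit strides. */
void ref_sspmv_upper(int n, float alpha, const float *AP,
                     const float *x, float beta, float *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        y[i] = (beta == 0.0f) ? 0.0f : beta * y[i]; /* y unread if beta == 0 */
    size_t p = 0; /* running offset into AP */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i, ++p) {
            y[i] += alpha * AP[p] * x[j];
            if (i != j)
                y[j] += alpha * AP[p] * x[i]; /* A(j,i) = A(i,j) */
        }
    }
}
```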

cublasgemvStridedBatched()

cublasStatus_t cublasSgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const float           *alpha,const float           *A, int lda,long long int         strideA,const float           *x, int incx,long long int         stridex,const float           *beta,float                 *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasDgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const double          *alpha,const double          *A, int lda,long long int         strideA,const double          *x, int incx,long long int         stridex,const double          *beta,double                *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasCgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const cuComplex       *alpha,const cuComplex       *A, int lda,long long int         strideA,const cuComplex       *x, int incx,long long int         stridex,const cuComplex       *beta,cuComplex             *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasZgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const cuDoubleComplex *alpha,const cuDoubleComplex *A, int lda,long long int         strideA,const cuDoubleComplex *x, int incx,long long int         stridex,const cuDoubleComplex *beta,cuDoubleComplex       *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasHSHgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const float           *alpha,const __half          *A, int lda,long long int         strideA,const __half          *x, int incx,long long int         stridex,const float           *beta,__half                *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasHSSgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const float           *alpha,const __half          *A, int lda,long long int         strideA,const __half          *x, int incx,long long int         stridex,const float           *beta,float                 *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasTSTgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const float           *alpha,const __nv_bfloat16   *A, int lda,long long int         strideA,const __nv_bfloat16   *x, int incx,long long int         stridex,const float           *beta,__nv_bfloat16         *y, int incy,long long int         stridey,int batchCount)
cublasStatus_t cublasTSSgemvStridedBatched(cublasHandle_t handle,cublasOperation_t trans,int m, int n,const float           *alpha,const __nv_bfloat16   *A, int lda,long long int         strideA,const __nv_bfloat16   *x, int incx,long long int         stridex,const float           *beta,float                 *y, int incy,long long int         stridey,int batchCount)

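This function supports the 64-bit integer interface.
This function performs the matrix-vector multiplication of a batch of matrices and vectors. The batch is considered to be "uniform", i.e. all instances have the same dimensions (m, n), leading dimension (lda), increments (incx, incy) and transposition (trans) for their respective A matrix, x and y vectors. The input matrix A and vector x, and the output vector y, of each instance are located at a fixed offset, in number of elements, from their locations in the previous instance. The user passes pointers to the A matrix and the x and y vectors of the first instance, together with the offsets in number of elements (strideA, stridex and stridey) that determine the locations of the input matrices and vectors and of the output vectors in subsequent instances. For each instance i, the operation is y[i] = alpha * op(A[i]) * x[i] + beta * y[i].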

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

trans

input

operation op(A[i]) that is non- or (conj.) transpose.

m

input

number of rows of matrix A[i].

n

input

number of columns of matrix A[i].

alpha

host or device

input

scalar used for multiplication.

A

device

input

<type>* pointer to the A matrix corresponding to the first instance of the batch, with dimensions lda x n with lda>=max(1,m).

lda

input

leading dimension of two-dimensional array used to store each matrix A[i].

strideA

input

Value of type long long int that gives the offset in number of elements between A[i] and A[i+1]

x

device

input

<type>* pointer to the x vector corresponding to the first instance of the batch, with each dimension n if trans==CUBLAS_OP_N and m otherwise.

incx

input

stride of each one-dimensional array x[i].

stridex

input

Value of type long long int that gives the offset in number of elements between x[i] and x[i+1]

beta

host or device

input

scalar used for multiplication. If beta == 0, y does not have to be a valid input.

y

device

in/out

<type>* pointer to the y vector corresponding to the first instance of the batch, with each dimension m if trans==CUBLAS_OP_N and n otherwise. Vectors y[i] should not overlap; otherwise, undefined behavior is expected.

incy

input

stride of each one-dimensional array y[i].

stridey

input

Value of type long long int that gives the offset in number of elements between y[i] and y[i+1]

batchCount

input

number of GEMVs to perform in the batch.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

the parameters m, n or batchCount < 0

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
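
The stride arithmetic described above can be sketched on the CPU in plain C. This is a reference for the addressing only, not the library's implementation. It covers the non-transposed float case with positive increments and column-major A[i], and the name ref_sgemv_strided_batched_n is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for y[b] = alpha*A[b]*x[b] + beta*y[b], b = 0..batchCount-1.
 * Instance b starts at A + b*strideA, x + b*stridex and y + b*stridey.
 * Sketch only: no transpose, positive increments, column-major matrices. */
void ref_sgemv_strided_batched_n(int m, int n, float alpha,
                                 const float *A, int lda, long long strideA,
                                 const float *x, int incx, long long stridex,
                                 float beta,
                                 float *y, int incy, long long stridey,
                                 int batchCount)
{
    assert(m >= 0 && n >= 0 && batchCount >= 0);
    assert(lda >= (m > 1 ? m : 1) && incx > 0 && incy > 0);
    for (int b = 0; b < batchCount; ++b) {
        const float *Ab = A + (long long)b * strideA;
        const float *xb = x + (long long)b * stridex;
        float *yb = y + (long long)b * stridey;
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int j = 0; j < n; ++j)
                acc += Ab[(size_t)i + (size_t)j * lda] * xb[(size_t)j * incx];
            float *yi = &yb[(size_t)i * incy];
            *yi = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * *yi;
        }
    }
}
```

With densely packed instances, strideA is typically lda*n and stridey is at least the length of one y vector, which keeps the y[i] from overlapping.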

 cublasgemvBatched()

cublasStatus_t cublasSgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const float           *alpha,const float           *Aarray[], int lda,const float           *xarray[], int incx,const float           *beta,float           *yarray[], int incy,int batchCount)
cublasStatus_t cublasDgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const double          *alpha,const double          *Aarray[], int lda,const double          *xarray[], int incx,const double          *beta,double          *yarray[], int incy,int batchCount)
cublasStatus_t cublasCgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const cuComplex       *alpha,const cuComplex       *Aarray[], int lda,const cuComplex       *xarray[], int incx,const cuComplex       *beta,cuComplex       *yarray[], int incy,int batchCount)
cublasStatus_t cublasZgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const cuDoubleComplex *alpha,const cuDoubleComplex *Aarray[], int lda,const cuDoubleComplex *xarray[], int incx,const cuDoubleComplex *beta,cuDoubleComplex *yarray[], int incy,int batchCount)
cublasStatus_t cublasHSHgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const float           *alpha,const __half          *Aarray[], int lda,const __half          *xarray[], int incx,const float           *beta,__half                *yarray[], int incy,int batchCount)
cublasStatus_t cublasHSSgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const float           *alpha,const __half          *Aarray[], int lda,const __half          *xarray[], int incx,const float           *beta,float                 *yarray[], int incy,int batchCount)
cublasStatus_t cublasTSTgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const float           *alpha,const __nv_bfloat16   *Aarray[], int lda,const __nv_bfloat16   *xarray[], int incx,const float           *beta,__nv_bfloat16         *yarray[], int incy,int batchCount)
cublasStatus_t cublasTSSgemvBatched(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const float           *alpha,const __nv_bfloat16   *Aarray[], int lda,const __nv_bfloat16   *xarray[], int incx,const float           *beta,float                 *yarray[], int incy,int batchCount)

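This function supports the 64-bit integer interface.
This function performs the matrix-vector multiplication of a batch of matrices and vectors. The batch is considered to be "uniform", i.e. all instances have the same dimensions (m, n), leading dimension (lda), increments (incx, incy) and transposition (trans) for their respective A matrix, x and y vectors. The addresses of the input matrix and vector, and of the output vector, of each instance are read from arrays of pointers passed to the function by the caller. For each instance i, the operation is y[i] = alpha * op(A[i]) * x[i] + beta * y[i].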

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

trans

input

operation op(A[i]) that is non- or (conj.) transpose.

m

input

number of rows of matrix A[i].

n

input

number of columns of matrix A[i].

alpha

host or device

input

scalar used for multiplication.

Aarray

device

input

array of pointers to array, with each array of dim. lda x n with lda>=max(1,m).

All pointers must meet certain alignment criteria. Please see below for details.

lda

input

leading dimension of two-dimensional array used to store each matrix A[i].

xarray

device

input

array of pointers to array, with each dimension n if trans==CUBLAS_OP_N and m otherwise.

All pointers must meet certain alignment criteria. Please see below for details.

incx

input

stride of each one-dimensional array x[i].

beta

host or device

input

scalar used for multiplication. If beta == 0, y does not have to be a valid input.

yarray

device

in/out

array of pointers to array. It has dimensions m if trans==CUBLAS_OP_N and n otherwise. Vectors y[i] should not overlap; otherwise, undefined behavior is expected.

All pointers must meet certain alignment criteria. Please see below for details.

incy

input

stride of each one-dimensional array y[i].

batchCount

input

number of pointers contained in Aarray, xarray and yarray.

If math mode enables fast math modes when using cublasSgemvBatched(), pointers (not the pointer arrays) placed in the GPU memory must be properly aligned to avoid misaligned memory access errors. Ideally all pointers are aligned to at least 16 Bytes. Otherwise it is recommended that they meet the following rule:

  • if k % 4==0 then ensure intptr_t(ptr) % 16 == 0.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

the parameters m, n or batchCount < 0

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
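
The pointer-array form and the 16-byte alignment recommendation can be illustrated with two small helpers in plain C. Both names, is_aligned16 and fill_pointer_array, are hypothetical. They only show the address arithmetic and make no cuBLAS or CUDA calls.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Sets ptrs[i] = base + i*stride for i in [0, batchCount).
 * Shows how a regularly strided layout maps onto the pointer arrays
 * (Aarray, xarray, yarray) taken by the batched interface. */
void fill_pointer_array(const float **ptrs, const float *base,
                        long long stride, int batchCount)
{
    assert(batchCount >= 0);
    for (int i = 0; i < batchCount; ++i)
        ptrs[i] = base + (long long)i * stride;
}
```

In a real call the entries would be device pointers, and, since the table lists the memory of Aarray, xarray and yarray as device, the pointer arrays themselves would also have to be copied to GPU memory before the call.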

 cublashpr()

cublasStatus_t cublasChpr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const float *alpha,const cuComplex       *x, int incx,cuComplex       *AP)
cublasStatus_t cublasZhpr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const double *alpha,const cuDoubleComplex *x, int incx,cuDoubleComplex *AP)

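This function supports the 64-bit integer interface.
This function performs the packed Hermitian rank-1 update A = alpha * x * x^H + A, where A is an n x n Hermitian matrix stored in packed format, x is a vector, and alpha is a real scalar.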

 

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other Hermitian part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

AP

device

in/out

array with A stored in packed format. The imaginary parts of the diagonal elements are assumed and set to zero.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx == 0 or

  • if uplo != CUBLAS_FILL_MODE_UPPER, CUBLAS_FILL_MODE_LOWER or

  • alpha == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU

 cublasher2()

cublasStatus_t cublasCher2(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuComplex       *alpha,const cuComplex       *x, int incx,const cuComplex       *y, int incy,cuComplex       *A, int lda)
cublasStatus_t cublasZher2(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuDoubleComplex *alpha,const cuDoubleComplex *x, int incx,const cuDoubleComplex *y, int incy,cuDoubleComplex *A, int lda)

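This function supports the 64-bit integer interface.
This function performs the Hermitian rank-2 update A = alpha * x * y^H + conj(alpha) * y * x^H + A, where A is an n x n Hermitian matrix stored in column-major format, x and y are vectors, and alpha is a scalar.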

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other Hermitian part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

y

device

input

vector with n elements.

incy

input

stride between consecutive elements of y.

A

device

in/out

array of dimension lda x n with lda>=max(1,n). The imaginary parts of the diagonal elements are assumed and set to zero.

lda

input

leading dimension of two-dimensional array used to store matrix A.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx == 0 or incy == 0 or

  • if uplo != CUBLAS_FILL_MODE_UPPER, CUBLAS_FILL_MODE_LOWER or

  • if lda < max(1, n) or

  • alpha == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU

 

 

 cublashpmv()

cublasStatus_t cublasChpmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuComplex       *alpha,const cuComplex       *AP,const cuComplex       *x, int incx,const cuComplex       *beta,cuComplex       *y, int incy)
cublasStatus_t cublasZhpmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuDoubleComplex *alpha,const cuDoubleComplex *AP,const cuDoubleComplex *x, int incx,const cuDoubleComplex *beta,cuDoubleComplex *y, int incy)

 

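This function supports the 64-bit integer interface.
This function performs the Hermitian packed matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n x n Hermitian matrix stored in packed format, x and y are vectors, and alpha and beta are scalars.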

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other Hermitian part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

AP

device

input

array with A stored in packed format. The imaginary parts of the diagonal elements are assumed to be zero.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

beta

host or device

input

scalar used for multiplication, if beta==0 then y does not have to be a valid input.

y

device

in/out

vector with n elements.

incy

input

stride between consecutive elements of y.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx == 0 or incy == 0 or

  • if uplo != CUBLAS_FILL_MODE_UPPER, CUBLAS_FILL_MODE_LOWER or

  • alpha == NULL or beta == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU

 cublashbmv()

cublasStatus_t cublasChbmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, int k, const cuComplex       *alpha,const cuComplex       *A, int lda,const cuComplex       *x, int incx,const cuComplex       *beta,cuComplex       *y, int incy)
cublasStatus_t cublasZhbmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, int k, const cuDoubleComplex *alpha,const cuDoubleComplex *A, int lda,const cuDoubleComplex *x, int incx,const cuDoubleComplex *beta,cuDoubleComplex *y, int incy)

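This function supports the 64-bit integer interface.
This function performs the Hermitian banded matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n x n Hermitian banded matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars.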

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other Hermitian part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

k

input

number of sub- and super-diagonals of matrix A.

alpha

host or device

input

scalar used for multiplication.

A

device

input

array of dimensions lda x n, with lda>=k+1. The imaginary parts of the diagonal elements are assumed to be zero.

lda

input

leading dimension of two-dimensional array used to store matrix A.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

beta

host or device

input

scalar used for multiplication, if beta==0 then y does not have to be a valid input.

y

device

in/out

vector with n elements.

incy

input

stride between consecutive elements of y.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or k < 0 or

  • if incx = 0 or incy = 0 or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • if lda < (k + 1) or

  • alpha == NULL or beta == NULL

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU

cublashemv()

cublasStatus_t cublasChemv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuComplex       *alpha,const cuComplex       *A, int lda,const cuComplex       *x, int incx,const cuComplex       *beta,cuComplex       *y, int incy)
cublasStatus_t cublasZhemv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuDoubleComplex *alpha,const cuDoubleComplex *A, int lda,const cuDoubleComplex *x, int incx,const cuDoubleComplex *beta,cuDoubleComplex *y, int incy)

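This function supports the 64-bit integer interface.
This function performs the Hermitian matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n x n Hermitian matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars.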

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other Hermitian part is not referenced and is inferred from the stored elements.

n

input

number of rows and columns of matrix A.

alpha

host or device

input

scalar used for multiplication.

A

device

input

array of dimension lda x n, with lda>=max(1,n). The imaginary parts of the diagonal elements are assumed to be zero.

lda

input

leading dimension of two-dimensional array used to store matrix A.

x

device

input

vector with n elements.

incx

input

stride between consecutive elements of x.

beta

host or device

input

scalar used for multiplication, if beta==0 then y does not have to be a valid input.

y

device

in/out

vector with n elements.

incy

input

stride between consecutive elements of y.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx = 0 or incy = 0 or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • lda < n

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU

cublastrsv()

cublasStatus_t cublasStrsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const float           *A, int lda,float           *x, int incx)
cublasStatus_t cublasDtrsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const double          *A, int lda,double          *x, int incx)
cublasStatus_t cublasCtrsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuComplex       *A, int lda,cuComplex       *x, int incx)
cublasStatus_t cublasZtrsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuDoubleComplex *A, int lda,cuDoubleComplex *x, int incx)

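This function supports the 64-bit integer interface.
This function solves the triangular linear system with a single right-hand side op(A) * x = b, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x and b are vectors. The solution x overwrites the right-hand side b on exit.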

 

Param.

Memory

In/out

Meaning

handle

input

handle to the cuBLAS library context.

uplo

input

indicates if matrix A lower or upper part is stored, the other part is not referenced and is inferred from the stored elements.

trans

input

operation op(A) that is non- or (conj.) transpose.

diag

input

indicates if the elements on the main diagonal of matrix A are unity and should not be accessed.

n

input

number of rows and columns of matrix A.

A

device

input

array of dimension lda x n, with lda>=max(1,n).

lda

input

leading dimension of two-dimensional array used to store matrix A.

x

device

in/out

vector with n elements.

incx

input

stride between consecutive elements of x.

The possible error values returned by this function and their meanings are listed below.

Error Value

Meaning

CUBLAS_STATUS_SUCCESS

the operation completed successfully

CUBLAS_STATUS_NOT_INITIALIZED

the library was not initialized

CUBLAS_STATUS_INVALID_VALUE

  • If n < 0 or

  • if incx = 0 or

  • if trans != CUBLAS_OP_N, CUBLAS_OP_C, CUBLAS_OP_T or

  • if uplo != CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER or

  • if diag != CUBLAS_DIAG_UNIT, CUBLAS_DIAG_NON_UNIT or

  • lda < max(1, n)

CUBLAS_STATUS_EXECUTION_FAILED

the function failed to launch on the GPU
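
The in/out role of x and the meaning of diag can be seen in a forward-substitution sketch in plain C. This is a CPU reference under stated assumptions (lower triangle, no transpose, unit stride, column-major A), not the library's algorithm. The name ref_strsv_lower_n is hypothetical, and unit_diag != 0 plays the role of CUBLAS_DIAG_UNIT.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference: solves A*x = b for lower-triangular A by forward
 * substitution. On entry x holds b; on exit x holds the solution.
 * If unit_diag != 0 the diagonal is taken to be 1 and never read.
 * Sketch only: no transpose, unit stride, no singularity check. */
void ref_strsv_lower_n(int n, int unit_diag,
                       const float *A, int lda, float *x)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    for (int i = 0; i < n; ++i) {
        float s = x[i];
        for (int j = 0; j < i; ++j)
            s -= A[(size_t)i + (size_t)j * lda] * x[j]; /* x[j] already solved */
        x[i] = unit_diag ? s : s / A[(size_t)i + (size_t)i * lda];
    }
}
```

Entries above the diagonal are never read, which matches the description of uplo: only the stored triangle is referenced.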

cublastrmv()

cublasStatus_t cublasStrmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const float           *A, int lda,float           *x, int incx)
cublasStatus_t cublasDtrmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const double          *A, int lda,double          *x, int incx)
cublasStatus_t cublasCtrmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuComplex       *A, int lda,cuComplex       *x, int incx)
cublasStatus_t cublasZtrmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuDoubleComplex *A, int lda,cuDoubleComplex *x, int incx)

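This function supports the 64-bit integer interface.
This function performs the triangular matrix-vector multiplication x = op(A) * x, where A is a triangular matrix stored in lower or upper mode with or without the main diagonal, and x is a vector.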

 

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| trans | | input | operation op(A) that is non- or (conj.) transpose. |
| diag | | input | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| n | | input | number of rows and columns of matrix A. |
| A | device | input | array of dimensions lda x n, with lda >= max(1,n). |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |
| x | device | in/out | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0; or incx = 0; or trans is not one of CUBLAS_OP_N, CUBLAS_OP_C, CUBLAS_OP_T; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or diag is not one of CUBLAS_DIAG_UNIT, CUBLAS_DIAG_NON_UNIT; or lda < max(1, n) |
| CUBLAS_STATUS_ALLOC_FAILED | the allocation of internal scratch memory failed |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
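
As a rough illustration of what the triangular matrix-vector product x = op(A) x computes, here is a minimal CPU sketch in plain C for one case only: lower-triangular A, no transpose, non-unit diagonal, unit stride. The name `ref_strmv_lower` is a hypothetical helper introduced for this example; it is not a cuBLAS API and does not reflect how the library works on the GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: x = A * x, A lower triangular, column-major,
 * non-unit diagonal, no transpose, incx = 1. Entries above the diagonal
 * are never read, mirroring the "other part is not referenced" rule. */
void ref_strmv_lower(int n, const float *A, int lda, float *x)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    /* Go bottom-up so that x[j] for j <= i still holds the input value. */
    for (int i = n - 1; i >= 0; --i) {
        float sum = 0.0f;
        for (int j = 0; j <= i; ++j)
            sum += A[i + (size_t)j * lda] * x[j];
        x[i] = sum;
    }
}
```

Going by the signature listed in this section, the matching device call would be `cublasStrmv(handle, CUBLAS_FILL_MODE_LOWER, CUBLAS_OP_N, CUBLAS_DIAG_NON_UNIT, n, A, lda, x, 1)` with A and x in device memory.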

cublastpsv()

cublasStatus_t cublasStpsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const float           *AP,float           *x, int incx)
cublasStatus_t cublasDtpsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const double          *AP,double          *x, int incx)
cublasStatus_t cublasCtpsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuComplex       *AP,cuComplex       *x, int incx)
cublasStatus_t cublasZtpsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuDoubleComplex *AP,cuDoubleComplex *x, int incx)

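This function supports the 64-bit Integer Interface.
This function solves the packed triangular linear system with a single right-hand side, op(A) x = b, where A is a triangular matrix stored in packed format, and x and b are vectors. The solution x overwrites the right-hand side b on exit.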

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| trans | | input | operation op(A) that is non- or (conj.) transpose. |
| diag | | input | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| n | | input | number of rows and columns of matrix A. |
| AP | device | input | array with A stored in packed format. |
| x | device | in/out | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0; or incx = 0; or trans is not one of CUBLAS_OP_N, CUBLAS_OP_C, CUBLAS_OP_T; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or diag is not one of CUBLAS_DIAG_UNIT, CUBLAS_DIAG_NON_UNIT |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
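
The packed triangular solve can be sketched on the CPU as plain forward substitution. The example below assumes zero-based indices, a lower-triangular A packed column by column without gaps, no transpose, a non-unit diagonal, and unit stride. Both `packed_lower_index` and `ref_stpsv_lower` are hypothetical names made up for this sketch, not cuBLAS functions.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based offset of A(i,j), i >= j, when the lower triangle of an
 * n x n matrix is packed column by column (assumed layout). */
size_t packed_lower_index(int n, int i, int j)
{
    assert(0 <= j && j <= i && i < n);
    return (size_t)i + ((size_t)j * (size_t)(2 * n - j - 1)) / 2;
}

/* Hypothetical CPU reference: solve A * x = b in place, where x holds b
 * on entry and the solution on exit. No singularity check is made. */
void ref_stpsv_lower(int n, const float *AP, float *x)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        float sum = x[i];
        for (int j = 0; j < i; ++j)
            sum -= AP[packed_lower_index(n, i, j)] * x[j];
        x[i] = sum / AP[packed_lower_index(n, i, i)];
    }
}
```

Going by the signature listed in this section, the matching device call would be `cublasStpsv(handle, CUBLAS_FILL_MODE_LOWER, CUBLAS_OP_N, CUBLAS_DIAG_NON_UNIT, n, AP, x, 1)`.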

 cublastpmv()

cublasStatus_t cublasStpmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const float           *AP,float           *x, int incx)
cublasStatus_t cublasDtpmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const double          *AP,double          *x, int incx)
cublasStatus_t cublasCtpmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuComplex       *AP,cuComplex       *x, int incx)
cublasStatus_t cublasZtpmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, const cuDoubleComplex *AP,cuDoubleComplex *x, int incx)

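This function supports the 64-bit Integer Interface.
This function performs the triangular packed matrix-vector multiplication x = op(A) x, where A is a triangular matrix stored in packed format, and x is a vector.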

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| trans | | input | operation op(A) that is non- or (conj.) transpose. |
| diag | | input | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| n | | input | number of rows and columns of matrix A. |
| AP | device | input | array with A stored in packed format. |
| x | device | in/out | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0; or incx = 0; or uplo is not one of CUBLAS_FILL_MODE_UPPER, CUBLAS_FILL_MODE_LOWER; or trans is not one of CUBLAS_OP_N, CUBLAS_OP_T, CUBLAS_OP_C; or diag is not one of CUBLAS_DIAG_UNIT, CUBLAS_DIAG_NON_UNIT |
| CUBLAS_STATUS_ALLOC_FAILED | the allocation of internal scratch memory failed |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
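
For the packed triangular product, a small CPU sketch of the upper-triangular case may help show how the packed array is addressed. It assumes zero-based indices, an upper triangle packed column by column, no transpose, a non-unit diagonal, and unit stride. `packed_upper_index` and `ref_stpmv_upper` are hypothetical helper names, not part of cuBLAS.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based offset of A(i,j), i <= j, when the upper triangle is packed
 * column by column (assumed layout): column j starts at j*(j+1)/2. */
size_t packed_upper_index(int i, int j)
{
    assert(0 <= i && i <= j);
    return (size_t)i + ((size_t)j * (size_t)(j + 1)) / 2;
}

/* Hypothetical CPU reference: x = A * x with A upper triangular, packed. */
void ref_stpmv_upper(int n, const float *AP, float *x)
{
    assert(n >= 0);
    /* Go top-down so that x[j] for j >= i still holds the input value. */
    for (int i = 0; i < n; ++i) {
        float sum = 0.0f;
        for (int j = i; j < n; ++j)
            sum += AP[packed_upper_index(i, j)] * x[j];
        x[i] = sum;
    }
}
```

Packed storage needs only n(n+1)/2 elements instead of lda x n, which is the main reason to prefer the `tp` routines over the `tr` ones when memory is tight.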

cublastbsv()

cublasStatus_t cublasStbsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const float           *A, int lda,float           *x, int incx)
cublasStatus_t cublasDtbsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const double          *A, int lda,double          *x, int incx)
cublasStatus_t cublasCtbsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const cuComplex       *A, int lda,cuComplex       *x, int incx)
cublasStatus_t cublasZtbsv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const cuDoubleComplex *A, int lda,cuDoubleComplex *x, int incx)

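This function supports the 64-bit Integer Interface.
This function solves the triangular banded linear system with a single right-hand side, op(A) x = b, where A is a triangular banded matrix, and x and b are vectors. The solution x overwrites the right-hand side b on exit.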

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| trans | | input | operation op(A) that is non- or (conj.) transpose. |
| diag | | input | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| n | | input | number of rows and columns of matrix A. |
| k | | input | number of sub- and super-diagonals of matrix A. |
| A | device | input | array of dimension lda x n, with lda >= k+1. |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |
| x | device | in/out | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0 or k < 0; or incx = 0; or trans is not one of CUBLAS_OP_N, CUBLAS_OP_C, CUBLAS_OP_T; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or diag is not one of CUBLAS_DIAG_UNIT, CUBLAS_DIAG_NON_UNIT; or lda < (1 + k) |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
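
The banded triangular solve can be pictured with a CPU sketch as well. The code below assumes the conventional BLAS band layout for the lower case with zero-based indices: the main diagonal sits in row 0 of the lda x n array and the d-th subdiagonal in row d, so A(i,j) lives at `A[(i - j) + j * lda]`. It covers no transpose, non-unit diagonal, unit stride only, and `ref_stbsv_lower` is a hypothetical name, not a cuBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: solve A * x = b in place for a lower
 * triangular band matrix with k subdiagonals (assumed band layout:
 * A(i,j) stored at A[(i - j) + j * lda] for j <= i <= j + k). */
void ref_stbsv_lower(int n, int k, const float *A, int lda, float *x)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    for (int i = 0; i < n; ++i) {
        int j0 = (i - k > 0) ? i - k : 0;   /* first column inside the band */
        float sum = x[i];
        for (int j = j0; j < i; ++j)
            sum -= A[(i - j) + (size_t)j * lda] * x[j];
        x[i] = sum / A[(size_t)i * lda];    /* diagonal entry A(i,i) */
    }
}
```

Each row touches at most k earlier unknowns, so the work is proportional to n*k rather than n*n.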

 cublastbmv()

cublasStatus_t cublasStbmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const float           *A, int lda,float           *x, int incx)
cublasStatus_t cublasDtbmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const double          *A, int lda,double          *x, int incx)
cublasStatus_t cublasCtbmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const cuComplex       *A, int lda,cuComplex       *x, int incx)
cublasStatus_t cublasZtbmv(cublasHandle_t handle, cublasFillMode_t uplo,cublasOperation_t trans, cublasDiagType_t diag,int n, int k, const cuDoubleComplex *A, int lda,cuDoubleComplex *x, int incx)

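This function supports the 64-bit Integer Interface.
This function performs the triangular banded matrix-vector multiplication x = op(A) x, where A is a triangular banded matrix, and x is a vector.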

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other part is not referenced and is inferred from the stored elements. |
| trans | | input | operation op(A) that is non- or (conj.) transpose. |
| diag | | input | indicates if the elements on the main diagonal of matrix A are unity and should not be accessed. |
| n | | input | number of rows and columns of matrix A. |
| k | | input | number of sub- and super-diagonals of matrix A. |
| A | device | input | array of dimension lda x n, with lda >= k+1. |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |
| x | device | in/out | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0 or k < 0; or incx = 0; or trans is not one of CUBLAS_OP_N, CUBLAS_OP_C, CUBLAS_OP_T; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or diag is not one of CUBLAS_DIAG_UNIT, CUBLAS_DIAG_NON_UNIT; or lda < (1 + k) |
| CUBLAS_STATUS_ALLOC_FAILED | the allocation of internal scratch memory failed |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
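
For the banded triangular product, the upper case uses a slightly different offset than the lower one. The sketch below assumes the conventional BLAS band layout with zero-based indices, where the main diagonal sits in row k of the lda x n array and A(i,j) lives at `A[(k + i - j) + j * lda]`. It handles no transpose, non-unit diagonal, unit stride only; `ref_stbmv_upper` is a hypothetical helper, not a cuBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: x = A * x for an upper triangular band
 * matrix with k superdiagonals (assumed band layout:
 * A(i,j) stored at A[(k + i - j) + j * lda] for j - k <= i <= j). */
void ref_stbmv_upper(int n, int k, const float *A, int lda, float *x)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    /* Go top-down so that x[j] for j >= i still holds the input value. */
    for (int i = 0; i < n; ++i) {
        int j1 = (i + k < n - 1) ? i + k : n - 1;  /* last column in band */
        float sum = 0.0f;
        for (int j = i; j <= j1; ++j)
            sum += A[(k + i - j) + (size_t)j * lda] * x[j];
        x[i] = sum;
    }
}
```

In this layout the top-left corner of the band array (row 0 of column 0 when k = 1) is padding and is never read.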

 cublassyr2()

cublasStatus_t cublasSsyr2(cublasHandle_t handle, cublasFillMode_t uplo, int n,const float           *alpha, const float           *x, int incx,const float           *y, int incy, float           *A, int lda)
cublasStatus_t cublasDsyr2(cublasHandle_t handle, cublasFillMode_t uplo, int n,const double          *alpha, const double          *x, int incx,const double          *y, int incy, double          *A, int lda)
cublasStatus_t cublasCsyr2(cublasHandle_t handle, cublasFillMode_t uplo, int n,const cuComplex       *alpha, const cuComplex       *x, int incx,const cuComplex       *y, int incy, cuComplex       *A, int lda)
cublasStatus_t cublasZsyr2(cublasHandle_t handle, cublasFillMode_t uplo, int n,const cuDoubleComplex *alpha, const cuDoubleComplex *x, int incx,const cuDoubleComplex *y, int incy, cuDoubleComplex *A, int lda)

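This function supports the 64-bit Integer Interface.
This function performs the symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A, where A is an n x n symmetric matrix stored in column-major format, x and y are vectors, and alpha is a scalar.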

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| n | | input | number of rows and columns of matrix A. |
| alpha | host or device | input | scalar used for multiplication. |
| x | device | input | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |
| y | device | input | vector with n elements. |
| incy | | input | stride between consecutive elements of y. |
| A | device | in/out | array of dimensions lda x n, with lda >= max(1,n). |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0; or incx = 0 or incy = 0; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or alpha == NULL; or lda < max(1, n) |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
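
The symmetric rank-2 update A = alpha * (x * y^T + y * x^T) + A only needs to touch one triangle. A minimal CPU sketch for the lower-stored case with unit strides follows; `ref_ssyr2_lower` is a hypothetical name used for illustration and is not a cuBLAS function. Here alpha is passed by value for brevity, whereas the library takes a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: update only the lower triangle of the
 * column-major matrix A with alpha * (x*y^T + y*x^T). The strictly
 * upper part is left untouched. */
void ref_ssyr2_lower(int n, float alpha, const float *x, const float *y,
                     float *A, int lda)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j)
        for (int i = j; i < n; ++i)
            A[i + (size_t)j * lda] += alpha * (x[i] * y[j] + y[i] * x[j]);
}
```

Going by the signature listed in this section, the matching device call would be `cublasSsyr2(handle, CUBLAS_FILL_MODE_LOWER, n, &alpha, x, 1, y, 1, A, lda)`.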

cublassyr()

cublasStatus_t cublasSsyr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const float           *alpha,const float           *x, int incx, float           *A, int lda)
cublasStatus_t cublasDsyr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const double          *alpha,const double          *x, int incx, double          *A, int lda)
cublasStatus_t cublasCsyr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuComplex       *alpha,const cuComplex       *x, int incx, cuComplex       *A, int lda)
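cublasStatus_t cublasZsyr(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuDoubleComplex *alpha,const cuDoubleComplex *x, int incx, cuDoubleComplex *A, int lda)

This function performs the symmetric rank-1 update A = alpha * x * x^T + A, where A is an n x n symmetric matrix stored in column-major format, x is a vector, and alpha is a scalar.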

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| n | | input | number of rows and columns of matrix A. |
| alpha | host or device | input | scalar used for multiplication. |
| x | device | input | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |
| A | device | in/out | array of dimensions lda x n, with lda >= max(1,n). |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0; or incx = 0; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or lda < max(1, n); or alpha == NULL |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
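
A symmetric rank-1 update adds alpha * x * x^T to the stored triangle of A. The CPU sketch below shows the upper-stored case with unit stride; `ref_ssyr_upper` is a hypothetical helper name and not a cuBLAS function, and alpha is passed by value here for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: update only the upper triangle of the
 * column-major matrix A with alpha * x * x^T. The strictly lower part
 * is left untouched. */
void ref_ssyr_upper(int n, float alpha, const float *x, float *A, int lda)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j)
        for (int i = 0; i <= j; ++i)
            A[i + (size_t)j * lda] += alpha * x[i] * x[j];
}
```

Going by the signature listed in this section, the matching device call would be `cublasSsyr(handle, CUBLAS_FILL_MODE_UPPER, n, &alpha, x, 1, A, lda)`.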

 cublassymv()

cublasStatus_t cublasSsymv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const float           *alpha,const float           *A, int lda,const float           *x, int incx, const float           *beta,float           *y, int incy)
cublasStatus_t cublasDsymv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const double          *alpha,const double          *A, int lda,const double          *x, int incx, const double          *beta,double          *y, int incy)
cublasStatus_t cublasCsymv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuComplex       *alpha, /* host or device pointer */const cuComplex       *A, int lda,const cuComplex       *x, int incx, const cuComplex       *beta,cuComplex       *y, int incy)
cublasStatus_t cublasZsymv(cublasHandle_t handle, cublasFillMode_t uplo,int n, const cuDoubleComplex *alpha,const cuDoubleComplex *A, int lda,const cuDoubleComplex *x, int incx, const cuDoubleComplex *beta,cuDoubleComplex *y, int incy)

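This function supports the 64-bit Integer Interface.
This function performs the symmetric matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n x n symmetric matrix stored in lower or upper mode, x and y are vectors, and alpha and beta are scalars.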

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| n | | input | number of rows and columns of matrix A. |
| alpha | host or device | input | scalar used for multiplication. |
| A | device | input | array of dimension lda x n, with lda >= max(1,n). |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |
| x | device | input | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |
| beta | host or device | input | scalar used for multiplication; if beta==0 then y does not have to be a valid input. |
| y | device | in/out | vector with n elements. |
| incy | | input | stride between consecutive elements of y. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0; or incx = 0 or incy = 0; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or lda < n |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
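
The phrase "inferred from the stored elements" can be made concrete with a CPU sketch of y = alpha * A * x + beta * y that reads only the lower triangle and mirrors it for the upper half. Unit strides are assumed, and `ref_ssymv_lower` is a hypothetical name, not a cuBLAS function. The scalars are passed by value for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: symmetric matrix-vector product using only
 * the lower triangle of the column-major matrix A. */
void ref_ssymv_lower(int n, float alpha, const float *A, int lda,
                     const float *x, float beta, float *y)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    for (int i = 0; i < n; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < n; ++j) {
            /* A(i,j) == A(j,i): fetch from the stored lower half. */
            float a = (i >= j) ? A[i + (size_t)j * lda]
                               : A[j + (size_t)i * lda];
            sum += a * x[j];
        }
        /* With beta == 0 the old y is never read, so it may be garbage. */
        y[i] = (beta == 0.0f) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

Going by the signature listed in this section, the matching device call would be `cublasSsymv(handle, CUBLAS_FILL_MODE_LOWER, n, &alpha, A, lda, x, 1, &beta, y, 1)`.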

 cublassbmv()

cublasStatus_t cublasSsbmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, int k, const float  *alpha,const float  *A, int lda,const float  *x, int incx,const float  *beta, float *y, int incy)
cublasStatus_t cublasDsbmv(cublasHandle_t handle, cublasFillMode_t uplo,int n, int k, const double *alpha,const double *A, int lda,const double *x, int incx,const double *beta, double *y, int incy)

This function supports the 64-bit Integer Interface.

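This function performs the symmetric banded matrix-vector multiplication y = alpha * A * x + beta * y, where A is an n x n symmetric banded matrix with k subdiagonals and superdiagonals, x and y are vectors, and alpha and beta are scalars.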

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| uplo | | input | indicates whether the lower or upper part of matrix A is stored; the other symmetric part is not referenced and is inferred from the stored elements. |
| n | | input | number of rows and columns of matrix A. |
| k | | input | number of sub- and super-diagonals of matrix A. |
| alpha | host or device | input | scalar used for multiplication. |
| A | device | input | array of dimension lda x n, with lda >= k+1. |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |
| x | device | input | vector with n elements. |
| incx | | input | stride between consecutive elements of x. |
| beta | host or device | input | scalar used for multiplication; if beta==0 then y does not have to be a valid input. |
| y | device | in/out | vector with n elements. |
| incy | | input | stride between consecutive elements of y. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | n < 0 or k < 0; or incx = 0 or incy = 0; or uplo is not one of CUBLAS_FILL_MODE_LOWER, CUBLAS_FILL_MODE_UPPER; or alpha == NULL or beta == NULL; or lda < (1 + k) |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
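
The symmetric banded product combines the two ideas above: band storage and mirroring across the diagonal. The CPU sketch below assumes the conventional BLAS lower band layout with zero-based indices (A(i,j) for i >= j stored at `A[(i - j) + j * lda]`) and unit strides. `ref_ssbmv_lower` is a hypothetical name, not a cuBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: y = alpha*A*x + beta*y for a symmetric band
 * matrix with k sub/superdiagonals, lower band stored (assumed layout). */
void ref_ssbmv_lower(int n, int k, float alpha, const float *A, int lda,
                     const float *x, float beta, float *y)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    for (int i = 0; i < n; ++i) {
        int j0 = (i - k > 0) ? i - k : 0;
        int j1 = (i + k < n - 1) ? i + k : n - 1;
        float sum = 0.0f;
        for (int j = j0; j <= j1; ++j) {
            /* Map (i,j) onto the stored lower band via symmetry. */
            int r = (i > j) ? i : j;
            int c = (i > j) ? j : i;
            sum += A[(r - c) + (size_t)c * lda] * x[j];
        }
        y[i] = (beta == 0.0f) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

For a tridiagonal matrix (k = 1) this reads a 2 x n array instead of an n x n one.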

 cublasger()

cublasStatus_t  cublasSger(cublasHandle_t handle, int m, int n,const float           *alpha,const float           *x, int incx,const float           *y, int incy,float           *A, int lda)
cublasStatus_t  cublasDger(cublasHandle_t handle, int m, int n,const double          *alpha,const double          *x, int incx,const double          *y, int incy,double          *A, int lda)
cublasStatus_t cublasCgeru(cublasHandle_t handle, int m, int n,const cuComplex       *alpha,const cuComplex       *x, int incx,const cuComplex       *y, int incy,cuComplex       *A, int lda)
cublasStatus_t cublasCgerc(cublasHandle_t handle, int m, int n,const cuComplex       *alpha,const cuComplex       *x, int incx,const cuComplex       *y, int incy,cuComplex       *A, int lda)
cublasStatus_t cublasZgeru(cublasHandle_t handle, int m, int n,const cuDoubleComplex *alpha,const cuDoubleComplex *x, int incx,const cuDoubleComplex *y, int incy,cuDoubleComplex *A, int lda)
cublasStatus_t cublasZgerc(cublasHandle_t handle, int m, int n,const cuDoubleComplex *alpha,const cuDoubleComplex *x, int incx,const cuDoubleComplex *y, int incy,cuDoubleComplex *A, int lda)

This function supports the 64-bit Integer Interface.

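This function performs the rank-1 update A = alpha * x * y^T + A for ger() and geru(), and A = alpha * x * y^H + A for gerc(), where A is an m x n matrix stored in column-major format, x and y are vectors, and alpha is a scalar.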

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| m | | input | number of rows of matrix A. |
| n | | input | number of columns of matrix A. |
| alpha | host or device | input | scalar used for multiplication. |
| x | device | input | vector with m elements. |
| incx | | input | stride between consecutive elements of x. |
| y | device | input | vector with n elements. |
| incy | | input | stride between consecutive elements of y. |
| A | device | in/out | array of dimension lda x n, with lda >= max(1,m). |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | m < 0 or n < 0; or incx = 0 or incy = 0; or alpha == NULL; or lda < max(1, m) |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
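
The real-valued rank-1 update is simple enough to write out directly. The following CPU sketch assumes unit strides and column-major storage; `ref_sger` is a hypothetical helper name and not a cuBLAS function, and alpha is passed by value for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: A = alpha * x * y^T + A for an m x n
 * column-major matrix A, with x of length m and y of length n. */
void ref_sger(int m, int n, float alpha, const float *x, const float *y,
              float *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            A[i + (size_t)j * lda] += alpha * x[i] * y[j];
}
```

Going by the signature listed in this section, the matching device call would be `cublasSger(handle, m, n, &alpha, x, 1, y, 1, A, lda)`.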

cublasgemv()

cublasStatus_t cublasSgemv(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const float           *alpha,const float           *A, int lda,const float           *x, int incx,const float           *beta,float           *y, int incy)
cublasStatus_t cublasDgemv(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const double          *alpha,const double          *A, int lda,const double          *x, int incx,const double          *beta,double          *y, int incy)
cublasStatus_t cublasCgemv(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const cuComplex       *alpha,const cuComplex       *A, int lda,const cuComplex       *x, int incx,const cuComplex       *beta,cuComplex       *y, int incy)
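cublasStatus_t cublasZgemv(cublasHandle_t handle, cublasOperation_t trans,int m, int n,const cuDoubleComplex *alpha,const cuDoubleComplex *A, int lda,const cuDoubleComplex *x, int incx,const cuDoubleComplex *beta,cuDoubleComplex *y, int incy)

This function performs the matrix-vector multiplication y = alpha * op(A) * x + beta * y, where A is an m x n matrix stored in column-major format, x and y are vectors, and alpha and beta are scalars.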

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | handle to the cuBLAS library context. |
| trans | | input | operation op(A) that is non- or (conj.) transpose. |
| m | | input | number of rows of matrix A. |
| n | | input | number of columns of matrix A. |
| alpha | host or device | input | scalar used for multiplication. |
| A | device | input | array of dimension lda x n, with lda >= max(1,m). Before entry, the leading m by n part of the array A must contain the matrix of coefficients. Unchanged on exit. |
| lda | | input | leading dimension of two-dimensional array used to store matrix A. lda must be at least max(1,m). |
| x | device | input | vector with at least (1+(n-1)*abs(incx)) elements if trans==CUBLAS_OP_N and at least (1+(m-1)*abs(incx)) elements otherwise. |
| incx | | input | stride between consecutive elements of x. |
| beta | host or device | input | scalar used for multiplication; if beta==0 then y does not have to be a valid input. |
| y | device | in/out | vector with at least (1+(m-1)*abs(incy)) elements if trans==CUBLAS_OP_N and at least (1+(n-1)*abs(incy)) elements otherwise. |
| incy | | input | stride between consecutive elements of y. |

The possible error values returned by this function and their meanings are listed below.

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | the operation completed successfully |
| CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
| CUBLAS_STATUS_INVALID_VALUE | the parameters m,n<0 or incx,incy=0 |
| CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |
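
The way the lengths of x and y swap with `trans` is easiest to see in code. The CPU sketch below covers the non-transposed and transposed real cases with unit strides. The `transpose` flag is a stand-in for choosing between CUBLAS_OP_N and CUBLAS_OP_T, and `ref_sgemv` is a hypothetical name, not a cuBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: y = alpha * op(A) * x + beta * y for an
 * m x n column-major matrix A. transpose == 0 means op(A) = A (x has n
 * elements, y has m); otherwise op(A) = A^T (x has m, y has n). */
void ref_sgemv(int transpose, int m, int n, float alpha, const float *A,
               int lda, const float *x, float beta, float *y)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    int ylen = transpose ? n : m;
    int xlen = transpose ? m : n;
    for (int i = 0; i < ylen; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < xlen; ++j) {
            /* op(A)(i,j) is A(j,i) when transposed, A(i,j) otherwise. */
            float a = transpose ? A[j + (size_t)i * lda]
                                : A[i + (size_t)j * lda];
            sum += a * x[j];
        }
        /* With beta == 0 the old y is never read. */
        y[i] = (beta == 0.0f) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

Going by the signature listed in this section, the non-transposed device call would be `cublasSgemv(handle, CUBLAS_OP_N, m, n, &alpha, A, lda, x, 1, &beta, y, 1)`.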
