-
Notifications
You must be signed in to change notification settings - Fork 16
SVD of a real matrix
The full example is contained in the SVD.cu file.
The first step towards SVD calculation in CUDA is to initialize the cuSOLVER
, as is required for any other routine of the cuSOLVER
library:
cusolverDnHandle_t solver_handle;
cusolverDnCreate(&solver_handle);
The second step is to allocate the buffer space required by the SVD calculation routine:
cusolveSafeCall(cusolverDnDgesvd_bufferSize(solver_handle, Nrows, Ncols, &work_size));
double *work; gpuErrchk(cudaMalloc(&work, work_size * sizeof(double)));
The buffer space dimension is calculated by cusolverDnDgesvd_bufferSize
.
The third step is the final SVD calculation:
cusolveSafeCall(cusolverDnDgesvd(solver_handle, 'A', 'A', Nrows, Ncols, d_A, Nrows, d_S, d_U, Nrows, d_V, Ncols, work, work_size, NULL, devInfo));
int devInfo_h = 0; gpuErrchk(cudaMemcpy(&devInfo_h, devInfo, sizeof(int), cudaMemcpyDeviceToHost));
if (devInfo_h != 0) std::cout << "Unsuccessful SVD execution\n\n";
Note that cusolverDnDgesvd
returns the matrix V
transposed. Also, U
is the matrix of the left singular vectors, namely, the singular vectors spanning the output space of the matrix, while V
is the matrix of the right singular vectors, namely, the singular vectors spanning the input space of the matrix.
For the example at SVD.cu, the output is
Singular values
d_S[0] = 314.891390715956
d_S[1] = 14.4118456776669
d_S[2] = 0.842607258487678
d_S[3] = 0.0277423461350813
d_S[4] = 0.000710344049837318
U[0,0]=-0.026900
U[1,0]=-0.043393
U[2,0]=-0.091636
U[3,0]=-0.181098
U[4,0]=-0.319440
U[5,0]=-0.513289
U[6,0]=-0.768565
U[0,1]=-0.366199
U[1,1]=-0.428964
U[2,1]=-0.477524
U[3,1]=-0.454763
U[4,1]=-0.323752
U[5,1]=-0.055840
U[6,1]=0.372983
U[0,2]=0.747999
U[1,2]=0.316640
U[2,2]=-0.072851
U[3,2]=-0.312084
U[4,2]=-0.353727
U[5,2]=-0.162134
U[6,2]=0.293467
U[0,3]=-0.530903
U[1,3]=0.565260
U[2,3]=0.392143
U[3,3]=-0.039990
U[4,3]=-0.324668
U[5,3]=-0.264268
U[6,3]=0.260770
U[0,4]=0.152553
U[1,4]=-0.590073
U[2,4]=0.467756
U[3,4]=0.350750
U[4,4]=-0.204309
U[5,4]=-0.423010
U[6,4]=0.256984
U[0,5]=0.007732
U[1,5]=-0.054514
U[2,5]=0.051584
U[3,5]=0.281387
U[4,5]=-0.717133
U[5,5]=0.607745
U[6,5]=-0.177468
U[0,6]=-0.021961
U[1,6]=0.207773
U[2,6]=-0.618898
U[3,6]=0.677639
U[4,6]=-0.081169
U[5,6]=-0.298327
U[6,6]=0.136131
V[0,0]=-0.349562
V[0,1]=-0.395802
V[0,2]=-0.442097
V[0,3]=-0.488660
V[0,4]=-0.535638
V[1,0]=0.637619
V[1,1]=0.405149
V[1,2]=0.114078
V[1,3]=-0.224424
V[1,4]=-0.604909
V[2,0]=0.635177
V[2,1]=-0.323686
V[2,2]=-0.505308
V[2,3]=-0.213766
V[2,4]=0.436743
V[3,0]=-0.258432
V[3,1]=0.705292
V[3,2]=-0.279236
V[3,3]=-0.497617
V[3,4]=0.331934
V[4,0]=0.031788
V[4,1]=-0.277461
V[4,2]=0.676925
V[4,3]=-0.646162
V[4,4]=0.215062
These results can be compared with the following simple Matlab code:
clear all
close all
clc
Nrows = 7;
Ncols = 5;
A = zeros(Nrows, Ncols);
for j = 0 : Nrows - 1
for i = 0 : Ncols - 1
A(j + 1, i + 1) = (i + j * j) * sqrt(i + j);
end
end
[U, S, V] = svd(A);
which outputs
314.8914
14.4118
0.8426
0.0277
0.0007
U =
-0.0269 0.3662 -0.7480 0.5309 0.1526 0.0076 -0.0220
-0.0434 0.4290 -0.3166 -0.5653 -0.5901 -0.0528 0.2082
-0.0916 0.4775 0.0729 -0.3921 0.4678 0.0466 -0.6193
-0.1811 0.4548 0.3121 0.0400 0.3508 0.2868 0.6754
-0.3194 0.3238 0.3537 0.3247 -0.2043 -0.7178 -0.0754
-0.5133 0.0558 0.1621 0.2643 -0.4230 0.6053 -0.3032
-0.7686 -0.3730 -0.2935 -0.2608 0.2570 -0.1764 0.1375
V =
-0.3496 -0.6376 -0.6352 0.2584 0.0318
-0.3958 -0.4051 0.3237 -0.7053 -0.2775
-0.4421 -0.1141 0.5053 0.2792 0.6769
-0.4887 0.2244 0.2138 0.4976 -0.6462
-0.5356 0.6049 -0.4367 -0.3319 0.2151
As it can be seen, the CUDA and Matlab results coincide for the singular values, but not for the singular vectors. However, the singular vectors span the same spaces.