Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd' #82

fataltes · 2024-05-03T17:16:07Z

Hi,

Running rgcca_cv with the following options:

rgcca_cv(blocks, response = 4, method = "rgcca",
                   par_type = "ncomp",
                   par_value = grid_search_matrix,
                   init = "rand",
                   tau = 0.5,
                   prediction_model = "glmnet",
                   metric = "F1",
                   k=3, n_run = 5,
                   verbose = TRUE, 
                   quiet = FALSE)

I get the following error:

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd'
Traceback:

1. system.time(ncomp_cv_out <- ncomp_cv(blocks, grid_search_matrix, 
 .     method))
2. ncomp_cv(blocks, grid_search_matrix, method)
3. rgcca_cv(blocks, response = 4, method = cur_method, par_type = "ncomp", 
 .     par_value = grid_search_matrix, init = "rand", tau = 0.5, 
 .     prediction_model = "glmnet", metric = "F1", k = 3, n_run = 5, 
 .     verbose = TRUE, quiet = FALSE)   # at line 4-14 of file <text>
4. par_pblapply(idx, function(n) {
 .     i <- (n - 1)%/%length(v_inds) + 1
 .     j <- (n - 1)%%length(v_inds) + 1
 .     rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, 
 .         par_type = param$par_type, par_value = param$par_value[i, 
 .             ], prediction_model = model$prediction_model, ...)
 . }, n_cores = n_cores, verbose = verbose)
5. pbapply::pblapply(X, FUN, ..., cl = cl)
6. lapply(X, FUN, ...)
7. FUN(X[[i]], ...)
8. rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, par_type = param$par_type, 
 .     par_value = param$par_value[i, ], prediction_model = model$prediction_model, 
 .     ...)
9. do.call(rgcca, rgcca_args) ...

Data set is of size 927 * 986 (so number_features > number_samples)
No perfect or high colinearity in the data because I have removed highly colinear features by removing one of any pair of features having correlation >= 0.95
Since the error seems to come from SVD, I have tried replacing any arguments that thought would involve calling SVD, therefore
** replacing init="svd" with init="rand"
** and replacing prediction_model = "lda" with prediction_model = "glmnet"
And I still get the error.
I was not getting any such error using previous versions of RGCCA until I updated to the most recent version (December 2023 release) or even running rgcca function itself in current release (with one example of nComp and tau), but get the error on the rgcca_cv.

What do you think can be causing this and if you have any suggestions to try?

Thank you,
Fatemeh

The text was updated successfully, but these errors were encountered:

GFabien · 2024-05-05T15:51:41Z

Hey @fataltes!

Thank you for raising this issue. This problem is related to issues with the R svd function, but some people managed to find a fix in other packages relying on the SVD. I tried replicating their fix in https://github.com/rgcca-factory/RGCCA/tree/fix_lapack_error_with_svd. Can you try out the branch to see if it solves your problem?

Best,
Fabien

fataltes · 2024-05-06T18:39:44Z

Thanks for your quick response @GFabien,

I switched to that branch and now the error changed to a NAN case error. Here is the traceback:

Error in if (any(x$a != 0)) {: missing value where TRUE/FALSE needed
Traceback:

1. system.time(ncomp_cv_out <- ncomp_cv(blocks, grid_search_matrix, 
 .     method))
2. ncomp_cv(blocks, grid_search_matrix, method)
3. rgcca_cv(blocks, response = 4, method = cur_method, par_type = "ncomp", 
 .     par_value = grid_search_matrix, init = "rand", tau = 0.5, 
 .     prediction_model = "glmnet", metric = "F1", k = 3, n_run = 5, 
 .     verbose = TRUE, quiet = FALSE)   # at line 4-14 of file <text>
4. par_pblapply(idx, function(n) {
 .     i <- (n - 1)%/%length(v_inds) + 1
 .     j <- (n - 1)%%length(v_inds) + 1
 .     rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, 
 .         par_type = param$par_type, par_value = param$par_value[i, 
 .             ], prediction_model = model$prediction_model, ...)
 . }, n_cores = n_cores, verbose = verbose)
5. pbapply::pblapply(X, FUN, ..., cl = cl)
6. lapply(X, FUN, ...)
7. FUN(X[[i]], ...)
8. rgcca_cv_k(rgcca_args, inds = v_inds[[j]], metric = metric, par_type = param$par_type, 
 .     par_value = param$par_value[i, ], prediction_model = model$prediction_model, 
 .     ...)
9. do.call(rgcca, rgcca_args)
10. (function (blocks, connection = NULL, tau = 1, ncomp = 1, scheme = "factorial", 
  .     scale = TRUE, init = "svd", bias = TRUE, tol = 1e-08, verbose = FALSE, 
  .     scale_block = "inertia", method = "rgcca", sparsity = 1, 
  .     response = NULL, superblock = FALSE, NA_method = "na.ignore", 
  .     quiet = TRUE, n_iter_max = 1000, comp_orth = TRUE, A = NULL, 
  .     C = NULL) 
  . {
  .     if (!missing(A)) {
  .         warning("Argument A is deprecated, use blocks instead.")
  .         blocks <- A
  .     }
  .     if (!missing(C)) {
  .         warning("Argument C is deprecated, use connection instead.")
  .         connection <- C
  .     }
  .     rgcca_args <- as.list(environment())
  .     tmp <- get_rgcca_args(blocks, rgcca_args)
  .     opt <- tmp$opt
  .     rgcca_args <- tmp$rgcca_args
  .     rgcca_args$quiet <- quiet
  .     rgcca_args$verbose <- verbose
  .     blocks <- remove_null_sd(rgcca_args$blocks)$list_m
  .     if (opt$disjunction) {
  .         blocks[[rgcca_args$response]] <- as_disjunctive(blocks[[rgcca_args$response]])
  .     }
  .     tmp <- handle_NA(blocks, NA_method = rgcca_args$NA_method)
  .     na.rm <- tmp$na.rm
  .     blocks <- scaling(tmp$blocks, scale = rgcca_args$scale, bias = rgcca_args$bias, 
  .         scale_block = rgcca_args$scale_block, na.rm = na.rm)
  .     if (rgcca_args$superblock) {
  .         blocks[["superblock"]] <- Reduce(cbind, blocks)
  .         colnames(blocks[["superblock"]]) <- paste0("s-", colnames(blocks[["superblock"]]))
  .     }
  .     gcca_args <- rgcca_args[c("connection", "ncomp", "scheme", 
  .         "init", "bias", "tol", "verbose", "superblock", "response", 
  .         "n_iter_max", "comp_orth")]
  .     gcca_args[["na.rm"]] <- na.rm
  .     gcca_args[["blocks"]] <- blocks
  .     gcca_args[["disjunction"]] <- opt$disjunction
  .     gcca_args[[opt$param]] <- rgcca_args[[opt$param]]
  .     func_out <- do.call(rgcca_outer_loop, gcca_args)
  .     func_out <- format_output(func_out, rgcca_args, opt, blocks)
  .     class(func_out) <- "rgcca"
  .     invisible(func_out)
  . })(tau = c(0.5, 0.5, 0.5, 0), tol = 1e-08, init = "random", bias = TRUE, 
  .     quiet = TRUE, scale = TRUE, ncomp = c(1, 1, 319, 319), blocks = list( 
.....

The error is apparently coming from here:

RGCCA/R/block_project.R

Line 28 in eefad3b

if (any(x$a != 0)) {

Before the error I also get this warning:

Missing colnames are automatically labeled.

Missing rownames are automatically labeled.

Warning message in sqrt(t(x$alpha) %*% x$M %*% x$K %*% x$alpha):
“NaNs produced”

Which I think is coming from here:

RGCCA/R/block_project.R

Line 40 in eefad3b

t(x$alpha) %*% x$M %*% x$K %*% x$alpha

I'd appreciate if you have any suggestions or updates on the code for me to try.

Thanks,
Fatemeh

GFabien · 2024-05-07T08:30:29Z

Hi Fatemeh,

What I guess is that t(x$alpha) %*% x$M %*% x$K %*% x$alpha produces a negative value so the square root creates a NaN. From a theoretical point of view, the matrix x$M %*% x$K is symmetric and positive semi-definite, so the issue should not happen. However, it might be possible that due to numerical imprecision, the computed matrix has close to zero negative eigenvalues, which should correspond to zero eigenvalues of the theoretical x$M %*% x$K matrix.

Anyway, it means that a = t(x$x) %*% alpha is close to zero so the block does not contribute much to the objective function. I would say that you can reduce the number of components extracted for this particular block to remove the error.

A fix of the code might be to check on the value t(x$alpha) %*% x$M %*% x$K %*% x$alpha to see if it is smaller or equal to zero and set alpha to zero if so. To further investigate, I would need a reproducible example, would you be able to create one?

I hope this helps!

Best,
Fabien

fataltes · 2024-05-07T18:55:25Z

Hi @GFabien ,

Thanks for the response!

Unfortunately, I cannot share the data and am trying to both apply your suggestion and generate a case that reproduces the issue on some random data.

But if you can share some contact, maybe I can follow up with you on a 1:1 meeting and then you can update the repo and issue here with the solution.

Thank you,
Fatemeh

fataltes · 2024-05-16T19:31:11Z

Hi @GFabien ,
That would be great if you can give me some point of contact to reach out for the 1:1 meeting to go over the data and any suggested modifications/reruns by you.

GFabien · 2024-05-17T07:19:15Z

Hi @fataltes,
Sorry for the late reply, I sent you an email to set up a 1:1 meeting.
Best,
Fabien

fataltes · 2024-05-22T19:57:48Z

Hi @GFabien ,

Thanks for updating the code.
It now gives me another error which indicates x$alpha has NAN itself.
I don't know if it is possible or what should one do in such a case.

Thanks,
Fatemeh

GFabien · 2024-05-23T09:02:02Z

Hi @fataltes,

I updated the branch again, you shouldn't have NAN problems anymore. Please tell me if there are new errors or if it is fine now.

Best,
Fabien

fataltes · 2024-05-24T14:34:00Z

Hi @GFabien ,

Thank you,
I tested the code. It actually runs but never finishes.
On a fairly small dataset of size 1000*1050 with three imbalanced blocks, RGCCA_cv has been running for over 10 hours.
I don't think this is expected. right?
Could there be an infinite loop or such a thing happening?

Thank you

GFabien · 2024-05-24T14:46:02Z

Hmmm... There shouldn't be. Did you run with verbose = TRUE? If so did the progress bar stay at 0%?
Is it working if you run RGCCA without cross validation but with the same parameters?

GFabien · 2024-07-04T14:46:20Z

Hi @fataltes,
How are you? Do you still face the problems you reported?
Best,
Fabien

fataltes · 2024-07-11T15:41:39Z

Hi @GFabien ,
Unfortunately yes. I still face the issue and there is another problem which I'm facing:
It takes a loong time to run RGCCA on a small dataset if I activate parallelization on R.

GFabien mentioned this issue May 5, 2024

Create svd_wrapper and copy ginv to avoid LAPACK errors #83

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd' #82

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd' #82

fataltes commented May 3, 2024 •

edited

Loading

GFabien commented May 5, 2024

fataltes commented May 6, 2024

GFabien commented May 7, 2024

fataltes commented May 7, 2024

fataltes commented May 16, 2024

GFabien commented May 17, 2024

fataltes commented May 22, 2024 •

edited

Loading

GFabien commented May 23, 2024

fataltes commented May 24, 2024

GFabien commented May 24, 2024

GFabien commented Jul 4, 2024

fataltes commented Jul 11, 2024

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd' #82

Error in La.svd(x, nu, nv): error code 1 from Lapack routine 'dgesdd' #82

Comments

fataltes commented May 3, 2024 • edited Loading

GFabien commented May 5, 2024

fataltes commented May 6, 2024

GFabien commented May 7, 2024

fataltes commented May 7, 2024

fataltes commented May 16, 2024

GFabien commented May 17, 2024

fataltes commented May 22, 2024 • edited Loading

GFabien commented May 23, 2024

fataltes commented May 24, 2024

GFabien commented May 24, 2024

GFabien commented Jul 4, 2024

fataltes commented Jul 11, 2024

fataltes commented May 3, 2024 •

edited

Loading

fataltes commented May 22, 2024 •

edited

Loading