After collecting a test set of fragCounter coverage profiles for 4 normal samples, I attempted to run the dryclean workflow.
I encountered the following error while trying the first step of creating the PoN in prepare_detergent:
pon_detergent <- prepare_detergent(normal.table.path = "/drycleanRun/test_ton.rds",
                                   use.all = TRUE,
                                   num.cores = 2,
                                   build = "hg38",
                                   path.to.save = "drycleanRun/",
                                   nochr = T,
                                   save.pon = T)
### OUTPUT ###
Starting the preparation of Panel of Normal samples a.k.a detergent
4 samples available
Using all samples
PAR file not provided, using hg38 default. If this is not the correct build, please provide a GRange object delineating for corresponding build
PAR read
Checking for existence of files
4 files present
|=====================================================================================================================| 100%, Elapsed 07:21
Error in setattr(ans, "names", c(keep.names, paste0("V", seq_len(length(ans) - :
'names' attribute [1] must be the same length as the vector [0]
While troubleshooting, I found that others have encountered the same error, but at a different stage of the workflow (#2).
Based on the output message, the error appears to occur within the pbmclapply call at line 259, although I am not sure exactly where.
I then tested prepare_detergent with the other sample-selection options instead of using all samples.
Interestingly, both alternative options, choose.randomly = TRUE and choose.by.clustering = TRUE, ran without an error.
Here using choose.randomly = TRUE and selecting 2 of the 4 samples:
pon_detergent <- prepare_detergent(normal.table.path = "/drycleanRun/test_ton.rds",
                                   use.all = FALSE,
                                   choose.randomly = TRUE,
                                   number.of.samples = 2,
                                   choose.by.clustering = FALSE,
                                   num.cores = 2,
                                   build = "hg38",
                                   path.to.save = "drycleanRun/",
                                   nochr = T,
                                   save.pon = T)
### OUTPUT ###
Starting the preparation of Panel of Normal samples a.k.a detergent
4 samples available
Selecting 2 normal samples randomly
PAR file not provided, using hg38 default. If this is not the correct build, please provide a GRange object delineating for corresponding build
PAR read
Checking for existence of files
2 files present
|============================================================================================================| 100%, Elapsed 03:28
Starting decomposition
This is version 2
Warning: Item 1 has 3031053 rows but longest item has 15155223; recycled with remainder.
Finished making the PON or detergent and saving it to the path provided
And here using choose.by.clustering = TRUE:
pon_detergent <- prepare_detergent(normal.table.path = "/drycleanRun/test_ton.rds",
                                   use.all = FALSE,
                                   choose.randomly = FALSE,
                                   number.of.samples = 2,
                                   choose.by.clustering = TRUE,
                                   num.cores = 2,
                                   build = "hg38",
                                   path.to.save = "drycleanRun/",
                                   nochr = T,
                                   save.pon = T)
### OUTPUT ###
Starting the preparation of Panel of Normal samples a.k.a detergent
4 samples available
Starting the clustering
Starting decomposition on a small section of genome
This is version 2
Starting clustering
PAR file not provided, using hg38 default. If this is not the correct build, please provide a GRange object delineating for corresponding build
PAR read
Checking for existence of files
2 files present
|============================================================================================================| 100%, Elapsed 01:52
Starting decomposition
This is version 2
Warning: Item 1 has 3031053 rows but longest item has 15155223; recycled with remainder.
Finished making the PON or detergent and saving it to the path provided
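The "recycled with remainder" warning in both successful runs reports two different row counts, so one thing worth checking is whether every coverage file has the same number of bins. The sketch below is only a diagnostic idea, not part of dryclean: count_bins is a hypothetical helper, and the demonstration uses small stand-in data frames rather than real fragCounter output. With real data, the paths would come from the normal table (assumed here to hold them in a column such as normal_cov), and GenomicRanges should be loaded first so that GRanges objects are measured correctly.

```r
# Hypothetical helper: count the rows (bins) stored in each coverage file.
# Assumes every path points to an RDS file that readRDS() can load.
count_bins <- function(paths) {
  vapply(paths, function(p) NROW(readRDS(p)), integer(1))
}

# Toy demonstration with stand-in data frames (not real coverage profiles).
paths <- c(a = tempfile(fileext = ".rds"), b = tempfile(fileext = ".rds"))
saveRDS(data.frame(reads = 1:5), paths[["a"]])
saveRDS(data.frame(reads = 1:3), paths[["b"]])

bins <- count_bins(paths)
print(bins)

# Names of the inputs whose bin count differs from the largest one.
mismatched <- names(bins)[bins != max(bins)]
print(mismatched)

# With real data (column name is an assumption about the normal table):
# paths <- readRDS("/drycleanRun/test_ton.rds")$normal_cov
```

If the counts differ between samples, that would be useful detail to add to this report; if they all match, the warning probably has a different origin.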
The output detergent.rds is in working order as I was able to run start_wash_cycle without any problems.
I will likely use the clustering method for further analysis but wanted to point out this issue for others who encounter it.
Best,
Patrick
Thanks for letting us know about the error. I have not encountered this before on our samples. What happens if you set number.of.samples to the total number of available samples when choosing randomly?
I finally had some time to test your suggestion. Unfortunately, using choose.randomly with number.of.samples set to the total number of samples leads to the same error as use.all.
Furthermore, choose.randomly works when I set the number of samples to 2 out of 4 but it fails when I use 3 out of 4.
The same occurs with choose.by.clustering.
I'll keep testing to see if I can determine a pattern or give more information for debugging if others experience the same issue. I plan to greatly increase the input sample size so this may help resolve this as well.
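Since 2 of 4 samples works but 3 of 4 fails, one way to look for a pattern is to try every subset of a given size and record which ones error out. The following is a rough sketch of that idea, not dryclean functionality: test_subsets and mock_run are hypothetical names, and the mock simply fails whenever a made-up sample "s3" is included. In practice, run would be a wrapper that writes the chosen rows of the normal table to a temporary .rds and calls prepare_detergent on it, which is an assumption about how one would drive the real function.

```r
# Hypothetical helper: try every k-sample subset and record which ones fail.
# `run` is any function that takes a character vector of sample names and
# signals an error on failure.
test_subsets <- function(samples, k, run) {
  subsets <- combn(samples, k, simplify = FALSE)
  failed <- vapply(subsets, function(s) {
    tryCatch({
      run(s)
      FALSE
    }, error = function(e) TRUE)
  }, logical(1))
  data.frame(
    subset = vapply(subsets, paste, character(1), collapse = ","),
    failed = failed
  )
}

# Mock stand-in for the real call: fails whenever sample "s3" is included.
mock_run <- function(s) {
  if ("s3" %in% s) stop("simulated failure")
  invisible(TRUE)
}

res <- test_subsets(c("s1", "s2", "s3", "s4"), 3, mock_run)
print(res)
```

If the failures always involve the same sample, the problem is likely in that sample's coverage file; if they depend only on the subset size, it points back at the function itself.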