Running block=TRUE crashes during permutations with "Error in .start_as_unnamed_integer(start)" #64
The input command was: I previously ran it with the default values for minInSpan, bpSpan, and maxGapSmooth, but it gave a warning to increase them. (Also, what values are recommended when running with block=TRUE?) |
Hi @mpiersonsmela, Thanks for reporting this issue. I'm not able to determine the root cause from the output alone. Can you let me know what (if any) filtering you've done on the input object (e.g. removing loci with low coverage across samples)? If you're able to share a small portion of your input data that reproduces the output (e.g. chromosome 5 in the example above - you can use Regarding your question about parameter settings when using |
OK, I'll upload the example data tomorrow. Regarding the filtering, I'm removing all the loci that have zero reads for all samples in a group. This filtering strategy worked OK when not running block=TRUE. This is the filtering code I'm using:
|
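The filtering code itself is not preserved in this copy of the thread. As a rough illustration of the strategy described (drop every locus that has zero reads in all samples of some group), here is a minimal base R sketch. The coverage matrix, sample names, and group labels are hypothetical toy data, not the poster's input, and the real analysis would operate on a BSseq object rather than a plain matrix.

```r
# Toy coverage matrix: rows are loci, columns are samples (hypothetical data).
cov <- matrix(c(5, 3, 0, 0,
                0, 0, 4, 2,
                1, 0, 0, 7,
                0, 0, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("locus", 1:4), paste0("s", 1:4)))
group <- c("A", "A", "B", "B")

# For each group, total reads per locus across that group's samples.
reads_by_group <- sapply(unique(group), function(g) {
  rowSums(cov[, group == g, drop = FALSE])
})

# Drop a locus if any group has zero reads in all of its samples.
keep <- rowSums(reads_by_group == 0) == 0
filtered <- cov[keep, , drop = FALSE]
print(rownames(filtered))
```

In this toy example only locus3 has at least one read in both groups, so it is the only locus retained.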
OK, I uploaded the files to your Dropbox link. I wasn't able to figure out how to take a small portion that replicates the results, so I just uploaded the whole thing. |
Thanks again for reporting this issue and for sending me your files to debug. It took me a while to track it down, but I found the line responsible: during the construction of candidate blocks in the permutations, it didn't properly filter loci with zero coverage in all samples of one permuted condition. I suspect I never encountered this before because I typically use more stringent coverage filters. I pushed a fix and was able to run your code successfully. You can access this version on the devel branch of GitHub immediately (remotes::install_github("kdkorthauer/dmrseq", ref = "devel")), and it will also be available in Bioconductor Devel (version 3.19) in either today's or Monday's build. Assuming it passes all checks, I'll port it over to the current Bioc release version (version 3.18) early next week. As a side note, the number of loci in your dataset and the high number of consecutive genomic coordinates suggest that you may be including loci on both strands separately. You may consider collapsing strands. Happy to provide some more information on that if you're interested. Thanks again for helping me to improve the software. |
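To see why a locus can pass the filter on the observed labels and still end up with zero coverage in one permuted condition, consider this minimal base R sketch. The data, the helper name `has_reads_in_every_group`, and the label vectors are all hypothetical; this is an illustration of the failure mode, not the dmrseq internals.

```r
# Toy coverage matrix (hypothetical): locus1 only has reads in s1 and s4.
cov <- matrix(c(4, 0, 0, 6,
                2, 3, 1, 5),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("locus1", "locus2"), paste0("s", 1:4)))

# Hypothetical helper: TRUE for loci with at least one read in every group.
has_reads_in_every_group <- function(cov, group) {
  by_group <- sapply(unique(group), function(g) {
    rowSums(cov[, group == g, drop = FALSE])
  })
  rowSums(by_group == 0) == 0
}

observed <- c("A", "A", "B", "B")  # original sample labels
permuted <- c("A", "B", "B", "A")  # one possible permutation of the labels

ok_observed <- has_reads_in_every_group(cov, observed)
ok_permuted <- has_reads_in_every_group(cov, permuted)
print(rbind(observed = ok_observed, permuted = ok_permuted))
```

Here locus1 passes under the observed labels, but under the permuted labels group B consists of s2 and s3, which have no reads at that locus. A filter applied once before permuting therefore does not guarantee coverage in every permuted condition.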
Thanks! I'll follow up with you by email about collapsing strands. |
Regarding collapsing strands, does this basically mean treating each CpG
site as one genomic coordinate? I am not sure this would be appropriate for
me, since I expect a substantial amount of hemimethylated sites in my data.
But I am pretty new to methylation analysis so I would appreciate it if you
could explain more. Thanks for all your help.
…On Sat, Feb 17, 2024 at 1:07 PM Keegan Korthauer ***@***.***> wrote:
Thanks again for reporting this issue and for sending me your files to
debug. It took me a while to track it down, but I found the implicated line
that didn't properly filter loci with zero coverage in all samples of one
permuted condition in the permutations during the construction of candidate
blocks. I suspect I never encountered this before because I typically use
more stringent filters for coverage. I pushed a fix, and was able to run
your code successfully. You can access this version on the devel branch of
GitHub immediately (remotes::install_github("kdkorthauer/dmrseq",
ref = "devel")), and it will also be available in Bioconductor Devel (version
3.19) in either today's or Monday's build. Assuming it passes all checks,
I'll port it over to the current Bioc release version (version 3.18) early
next week.
As a side note, I noticed that the number of loci in your dataset and the
high number of consecutive genomic coordinates look like you may be
including loci on both strands separately. You may consider collapsing
strands. Happy to provide some more information on that if you're
interested.
Thanks again for helping me to improve the software.
|
Yes, exactly - collapsing would treat each CpG as one observation. If you are interested in hemimethylation, then it would not be appropriate to collapse. |
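As a rough picture of what collapsing does, the following base R sketch merges the two cytosines of each CpG into one observation by summing their counts. It assumes 1-based coordinates in which the minus-strand cytosine sits one base after the plus-strand cytosine. The table and its column names (`M` for methylated reads, `Cov` for coverage, `cpg_pos`) are hypothetical toy data.

```r
# Toy per-strand counts (hypothetical): M = methylated reads, Cov = total reads.
loci <- data.frame(
  chr    = "chr1",
  pos    = c(100, 101, 250, 251, 400),
  strand = c("+", "-", "+", "-", "+"),
  M      = c(3, 2, 0, 5, 1),
  Cov    = c(4, 4, 6, 5, 2)
)

# Shift minus-strand cytosines back by one base so that both cytosines of a
# CpG share the plus-strand coordinate.
loci$cpg_pos <- ifelse(loci$strand == "-", loci$pos - 1, loci$pos)

# Sum the counts of the two strands for each CpG.
collapsed <- aggregate(cbind(M, Cov) ~ chr + cpg_pos, data = loci, FUN = sum)
print(collapsed)
```

The CpG at position 250 shows what is lost: before collapsing it is unmethylated on the plus strand (0 of 6 reads) and fully methylated on the minus strand (5 of 5 reads), but after collapsing it appears as a single site with 5 of 11 reads methylated. That is why collapsing is not appropriate when hemimethylation is of interest.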
I am consistently getting the following error while running permutations. It doesn't always happen on the same permutation.
Example output:
Beginning permutation 1
...Chromosome chr1: Smoothed (1.26 min). 38 CpG(s) excluded due to zero coverage. 72 regions scored (0.07 min).
...Chromosome chr2: Smoothed (1.87 min). 10 regions scored (0.04 min).
...Chromosome chr3: Smoothed (1.34 min). 6 regions scored (0.03 min).
...Chromosome chr4: Smoothed (0.68 min). 11 regions scored (0.04 min).
...Chromosome chr5: Smoothed (1.65 min). 2 CpG(s) excluded due to zero coverage. Error in .start_as_unnamed_integer(start) :
each range must have a start that is < 2^31 and > - 2^31
Calls: dmrseq ... .new_IRanges_from_start_end -> .start_as_unnamed_integer
In addition: Warning messages:
1: In FUN(X[[i]], ...) : no non-missing arguments to min; returning Inf
2: In FUN(X[[i]], ...) : no non-missing arguments to max; returning -Inf
Execution halted
I am using the latest version of all packages.
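The two warnings in the log are consistent with `min()` and `max()` being called on an empty set of positions, which would yield infinite range boundaries. The base R sketch below reproduces that behaviour in isolation. The positions and the `covered` vector are hypothetical, and the finite check only mimics the kind of bound that the error message describes; it is not the actual IRanges or dmrseq code.

```r
# Hypothetical positions in a candidate region, none of which has coverage.
pos     <- c(1200L, 1350L, 1410L)
covered <- c(FALSE, FALSE, FALSE)
remaining <- pos[covered]  # integer(0)

# min()/max() of an empty vector return Inf/-Inf and emit the warnings
# "no non-missing arguments to min/max"; suppressed here for a clean run.
start <- suppressWarnings(min(remaining))  # Inf
end   <- suppressWarnings(max(remaining))  # -Inf

# An infinite start cannot satisfy a bound such as |start| < 2^31.
valid <- is.finite(start) && abs(start) < 2^31

# A defensive pattern: skip the region when nothing remains.
region <- if (length(remaining) > 0) range(remaining) else NULL
print(c(start = start, end = end))
```

Checking `length(remaining) > 0` before building a range is one simple way to avoid passing infinite boundaries downstream.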