This bug has two sides - one of which was reported as #92 - but there seems to be more to this, so I'm opening a new issue.
Bug description
1) Creation of List columns
Currently the reduce_ranges function implicitly creates list-columns when reducing any metadata column without any other functions (reduce_ranges(col1 = col1)).
If one tries to create list columns explicitly (reduce_ranges(col1 = list(col1))), as one would in, e.g., tidyverse functions, then each entry of the list contains the whole metadata column. This is the case even when using group_by.
Based on the tutorial referred to in #92 I assume that this current behavior has not always been the case and is unintended or a bug. Given that plyranges::summarise does create list columns this way when used after group_by, but not when used alone, there is at least some inconsistency here, even if the current behavior is intended.
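For comparison, here is a minimal sketch of the tidyverse behavior I have in mind, using dplyr on a plain data frame rather than a GRanges object. The data frame `df` is made up for illustration and only mirrors the metadata columns of the example further down; it says nothing about how plyranges evaluates the expression internally.

```r
suppressMessages(library(dplyr))

# Illustrative data only: the same gene/genetype values as in the reprex,
# without any range information.
df <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF"),
  genetype = c(rep("coding", 5), "non-coding"),
  stringsAsFactors = FALSE
)

# Grouped: list(gene) yields one list element per group,
# holding only that group's values.
grouped <- df %>%
  group_by(genetype) %>%
  summarise(gene = list(gene))

# Ungrouped: list(gene) yields a single list element with the whole column.
ungrouped <- df %>%
  summarise(gene = list(gene))
```

With dplyr, `grouped$gene` has two elements (five coding genes, then geneF), which is what I would have expected reduce_ranges(gene = list(gene)) to do per reduced range.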
2) Trying to create list columns crashes when no ranges are reduced
Since reduce_ranges(col1 = col1) can create list-columns I tried to use it that way. However, this leads to an error when nothing would be reduced but metadata columns need to be handled somehow.
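To make the expected result concrete, here is a rough base R sketch of the output I would expect, independent of plyranges. `cluster_id` is a hypothetical helper written only for this illustration, not a plyranges or GenomicRanges function. It assumes closed integer intervals on a single sequence and strand, and it assumes that directly adjacent ranges are merged as well.

```r
# Hypothetical helper (not part of plyranges): assigns every interval the
# index of the merged range it would fall into.
cluster_id <- function(starts, ends) {
  ord <- order(starts, ends)
  s <- starts[ord]
  e <- ends[ord]
  # Largest end position seen before each interval (in sorted order).
  prev_max_end <- c(-Inf, head(cummax(e), -1))
  # A new cluster begins when an interval starts beyond the previous
  # cluster's end; "+ 1" encodes the assumption that adjacent ranges merge.
  id_sorted <- cumsum(s > prev_max_end + 1)
  id <- integer(length(starts))
  id[ord] <- id_sorted
  id
}

starts <- c(1, 5, 25, 45, 60, 65)
ends   <- c(10, 20, 35, 55, 70, 75)
gene   <- c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF")

# All six ranges: two pairs overlap, so two list entries hold two genes.
all_six <- unname(split(gene, cluster_id(starts, ends)))

# Ranges 2 to 5: nothing overlaps, so every list entry holds one gene.
keep <- 2:5
no_merge <- unname(split(gene[keep], cluster_id(starts[keep], ends[keep])))
```

In the second case I would expect a list column with one single-gene entry per range, not an error.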
Reproducible code example
suppressMessages(library(plyranges))
gr <- data.frame(seqnames = "chr1",
                 start = c(1, 5, 25, 45, 60, 65),
                 end = c(10, 20, 35, 55, 70, 75),
                 strand = c("*", "*", "*", "*", "*", "*"),
                 gene = c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF"),
                 genetype = c('coding', 'coding', 'coding',
                              'coding', 'coding', 'non-coding')) %>%
  as_granges()

## This creates list columns from the overlapping genes
gr %>% reduce_ranges(gene=gene)
#> GRanges object with 4 ranges and 1 metadata column:
#>       seqnames    ranges strand |            gene
#>          <Rle> <IRanges>  <Rle> | <CharacterList>
#>   [1]     chr1      1-20      * |     geneA,geneB
#>   [2]     chr1     25-35      * |           geneC
#>   [3]     chr1     45-55      * |           geneD
#>   [4]     chr1     60-75      * |     geneE,geneF
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## This puts all genes into each list entry
gr %>% reduce_ranges(gene=list(gene))
#> GRanges object with 4 ranges and 1 metadata column:
#>       seqnames    ranges strand |                        gene
#>          <Rle> <IRanges>  <Rle> |                      <List>
#>   [1]     chr1      1-20      * | geneA,geneB,geneC,geneD,...
#>   [2]     chr1     25-35      * | geneA,geneB,geneC,geneD,...
#>   [3]     chr1     45-55      * | geneA,geneB,geneC,geneD,...
#>   [4]     chr1     60-75      * | geneA,geneB,geneC,geneD,...
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## Nothing overlaps anymore: error
gr[2:5] %>% reduce_ranges(gene=gene)
#> Error in summarize_rng(.data, dots): length(ans[[i]]) == nr is not TRUE

## without metadata columns this does not happen
gr[2:5] %>% reduce_ranges()
#> GRanges object with 4 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr1      5-20      *
#>   [2]     chr1     25-35      *
#>   [3]     chr1     45-55      *
#>   [4]     chr1     60-70      *
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## group_by works
gr %>% group_by(genetype) %>% reduce_ranges(gene=gene)
#> GRanges object with 5 ranges and 2 metadata columns:
#>       seqnames    ranges strand |    genetype            gene
#>          <Rle> <IRanges>  <Rle> | <character> <CharacterList>
#>   [1]     chr1      1-20      * |      coding     geneA,geneB
#>   [2]     chr1     25-35      * |      coding           geneC
#>   [3]     chr1     45-55      * |      coding           geneD
#>   [4]     chr1     60-70      * |      coding           geneE
#>   [5]     chr1     65-75      * |  non-coding           geneF
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## but has the same issue when explicitly creating lists
gr %>% group_by(genetype) %>% reduce_ranges(gene=list(gene))
#> GRanges object with 5 ranges and 2 metadata columns:
#>       seqnames    ranges strand |    genetype                        gene
#>          <Rle> <IRanges>  <Rle> | <character>                      <List>
#>   [1]     chr1      1-20      * |      coding geneA,geneB,geneC,geneD,...
#>   [2]     chr1     25-35      * |      coding geneA,geneB,geneC,geneD,...
#>   [3]     chr1     45-55      * |      coding geneA,geneB,geneC,geneD,...
#>   [4]     chr1     60-70      * |      coding geneA,geneB,geneC,geneD,...
#>   [5]     chr1     65-75      * |  non-coding geneA,geneB,geneC,geneD,...
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

## if nothing is reduced due to groups, the error is still there
gr[2:6] %>% group_by(genetype) %>% reduce_ranges(gene=gene)
#> Error in summarize_rng(.data, dots): length(ans[[i]]) == nr is not TRUE

## without metadata columns again no error
gr[2:6] %>% group_by(genetype) %>% reduce_ranges()
#> GRanges object with 5 ranges and 1 metadata column:
#>       seqnames    ranges strand |    genetype
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1      5-20      * |      coding
#>   [2]     chr1     25-35      * |      coding
#>   [3]     chr1     45-55      * |      coding
#>   [4]     chr1     60-70      * |      coding
#>   [5]     chr1     65-75      * |  non-coding
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
Created on 2023-03-01 with reprex v2.0.2