-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should GenomicRanges
and GenomicRangesList
check that the row_names of mcols
is the same as the name of the ranges in the parent object?
#121
Comments
If I understand correctly, R does not enforce this either. Users are allowed to modify row names, and there are no built-in checks to ensure that the row names of IRanges and GenomicRanges objects match. It is assumed that there is a 1-to-1 index mapping between these two structures. |
I did not look at the Bioconductor code code before this. What I see is that |
Weirdly, it doesn't seem to prevent creation of a genomic ranges object. It seems to elevate the mcols of iranges and merge it with genomicranges. I think this is confusing when objects magically change things and I prefer explicit construction of these objects. > gr <- GRanges(
+ seqnames = c("chr1", "chr1"),
+ ranges = IRanges(c(2, 9), width=c(6, 11), mcols=DataFrame(a = 1:2)),
+ strand = c("+", "-"))
> gr
GRanges object with 2 ranges and 1 metadata column:
seqnames ranges strand | mcols.a
<Rle> <IRanges> <Rle> | <integer>
[1] chr1 2-7 + | 1
[2] chr1 9-19 - | 2
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> ranges(gr)
IRanges object with 2 ranges and 0 metadata columns:
start end width
<integer> <integer> <integer>
[1] 2 7 6
[2] 9 19 11 |
Transformation stuff happens here: https://code.bioconductor.org/browse/GenomicRanges/blob/RELEASE_3_19/R/GRanges-class.R#L187 through line 197. I don't disagree, its just handling these differences when trying to read from or write to disk is very challenging. |
Currently,
GenomicRanges
andGenomicRangesList
only checks that themcols
BiocFrame
has the same number of rows as there are ranges. However, it seems prudent to also ensure thatmcols
has the samerow_names
as the names of the ranges in the parent object. The lack of this guarantee makes it impossible to select mcols rows from a subset of ranges rows and many other operations. This also applies to the names of theIRanges
object inGenomicRanges
.The text was updated successfully, but these errors were encountered: