Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can DelayedArray:::.rec_showtree be a generic? #119

Closed
LTLA opened this issue Nov 20, 2024 · 6 comments
Closed

Can DelayedArray:::.rec_showtree be a generic? #119

LTLA opened this issue Nov 20, 2024 · 6 comments

Comments

@LTLA
Copy link
Contributor

LTLA commented Nov 20, 2024

For custom seeds, I would like to be able to make showtree() more informative. For example, I might have a seed class where one or more of the slots is another DelayedArray, and I would like to show the delayed operations of the component DelayedArrays rather than just terminating the tree at my seed.

More specifically, alabaster.matrix has a ReloadedArray class that could potentially store a DelayedOp in its seed slot. Currently, showtree() terminates at the ReloadedArraySeed:

library(DelayedArray)
y <- DelayedArray(rsparsematrix(1000, 100, 0.05))
y <- log1p(abs(y) / 1:100) # adding some delayed ops.
showtree(y) # nice
## 1000x100 double, sparse: DelayedMatrix object
## └─ 1000x100 double, sparse: Stack of 1 unary iso op(s)
##    └─ 1000x100 double, sparse: Unary iso op with args
##       └─ 1000x100 double, sparse: Stack of 1 unary iso op(s)
##          └─ 1000x100 double, sparse: [seed] dgCMatrix object

library(alabaster.matrix)
z <- ReloadedArray(path=tempdir(), y)
showtree(z) # not so informative
## 1000x100 double, sparse: ReloadedMatrix object
## └─ 1000x100 double, sparse: [seed] ReloadedArraySeed object

I would like to be able to write a .rec_showtree() method so that I get something like:

## 1000x100 double, sparse: ReloadedMatrix object
## └─ 1000x100 double, sparse: ReloadedArraySeed object
##    └─ seed slot: 1000x100 double, sparse: Stack of 1 unary iso op(s)
##       └─ 1000x100 double, sparse: Unary iso op with args
##          └─ 1000x100 double, sparse: Stack of 1 unary iso op(s)
##              └─ 1000x100 double, sparse: [seed] dgCMatrix object

Even better if I can just return some kind of nested list and showtree() handles the pretty printing.

Session information
R Under development (unstable) (2024-10-30 r87277)
Platform: aarch64-apple-darwin22.6.0
Running under: macOS Ventura 13.7

Matrix products: default
BLAS:   /Users/luna/Software/R/trunk/lib/libRblas.dylib
LAPACK: /Users/luna/Software/R/trunk/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] alabaster.matrix_1.5.11 alabaster.base_1.7.2    DelayedArray_0.33.2
 [4] SparseArray_1.7.2       S4Arrays_1.7.1          abind_1.4-8
 [7] IRanges_2.41.1          S4Vectors_0.45.2        MatrixGenerics_1.19.0
[10] matrixStats_1.4.1       BiocGenerics_0.53.3     generics_0.1.3
[13] Matrix_1.7-1

loaded via a namespace (and not attached):
 [1] zlibbioc_1.53.0         lattice_0.22-6          rhdf5filters_1.19.0
 [4] XVector_0.47.0          Rhdf5lib_1.29.0         HDF5Array_1.35.1
 [7] grid_4.5.0              rhdf5_2.51.0            compiler_4.5.0
[10] tools_4.5.0             alabaster.schemas_1.7.0 Rcpp_1.0.13-1
[13] jsonlite_1.8.9          crayon_1.5.3
@hpages
Copy link
Contributor

hpages commented Nov 20, 2024

Isn't WrapperArraySeed a no-op from a delayed op point of view? If that's the case, how about having WrapperArraySeed extend DelayedUnaryIsoOp?

setClass("WrapperArraySeed", contains=c("VIRTUAL", "DelayedUnaryIsoOp"), slots=c(seed="ANY"))

showtree()'s mission is to display the tree of delayed ops and to stop on "leaf seeds" i.e. on seeds that are not delayed ops. So if WrapperArraySeed is seen as a delayed op, then showtree() will treat it as an "inner seed" instead of a "leaf seed":

showtree(z)
# 1000x100 double, sparse: ReloadedMatrix object
# └─ 1000x100 double, sparse: ReloadedArraySeed object
#    └─ 1000x100 double, sparse: Stack of 1 unary iso op(s)
#       └─ 1000x100 double, sparse: Unary iso op with args
#          └─ 1000x100 double, sparse: Stack of 1 unary iso op(s)
#             └─ 1000x100 double, sparse: [seed] dgCMatrix object

Another benefit of doing this is that you no longer need to define all the trivial dim(), dimnames(), extract_array(), is_sparse(), and extract_sparse_array() methods that you are currently defining for WrapperArraySeed, because they are already defined for DelayedUnaryIsoOp.

@hpages
Copy link
Contributor

hpages commented Nov 20, 2024

Edit: Please ignore this.

Furthermore, right now you have the following class hierarchy in alabaster.matrix:

           DelayedArray
                 ^
                 |
...........................................................................
                 |
                 |
VIRTUAL:   WrapperArray     <---[seed is a]---   WrapperArraySeed
           (slots: seed)                         (slots: seed)
                 ^                                     ^
                 |                                     |
           ReloadedArray    <---[seed is a]---   ReloadedArraySeed
           (slots: seed)                         (slots: path, seed)
                 ^                                   /
                 |                                  /
           ReloadedMatrix   <---[seed is a]-------/
           (slots: seed)

That sounds like a lot of classes! In particular I'm not sure about the need for the *ArraySeed classes when you could just do:

           DelayedArray
                 ^
                 |
...........................................................................
                 |
                 |
VIRTUAL:   WrapperArray
           (slots: seed)
                 ^
                 |
           ReloadedArray
           (slots: path, seed)
                 ^
                 |
           ReloadedMatrix
           (slots: path, seed)

That is: add the path slot to the ReloadedArray class and get rid of the *ArraySeed classes. Sounds like this would make the whole thing A LOT simpler.

I'm probably overlooking the reasons that motivated the current approach but wanted to mention this simpler approach just in case.

H.

@hpages
Copy link
Contributor

hpages commented Nov 20, 2024

Hmm.. yeah of course there are good reasons for having the *ArraySeed classes. My over-simplified proposal above doesn't work. Please ignore.

@hpages
Copy link
Contributor

hpages commented Nov 27, 2024

Have you tried to make WrapperArraySeed a DelayedUnaryIsoOp derivative? Can we close this?

@LTLA
Copy link
Contributor Author

LTLA commented Nov 27, 2024

Oh sorry, I only read your "please ignore this" message and ignored the entire thread.

Just tried it out and it seems reasonable. Might even be able to get rid of WrapperArraySeed altogether and just have ReloadedArraySeed inherit directly from DelayedUnaryIsoOp; the only purpose of the WrapperArraySeed is to implement all those trivial methods.

@hpages
Copy link
Contributor

hpages commented Nov 28, 2024

Thanks for the update. I'll close then. Please re-open or create a new issue if you think we should discuss this further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants