[r] Port resume-mode to R (#2405)
* [r] Port resume-mode to R

Implement resume-mode ingestion in the R API

This PR parallels #664; it adds support for resume-mode in factory
functions, which allows `SOMA*Create()` to check for an already existing
TileDB object at `uri` and, if one exists, simply connect to it rather than
try to re-create it.
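
The create-or-connect behaviour described above can be sketched in base R. This is only an illustration, not the package's implementation: `create_or_open()` is a hypothetical stand-in for a SOMA factory function, a plain directory plays the role of the TileDB object at `uri`, and the mode names `"write"` and `"resume"` are assumed values for `ingest_mode`.

```r
# Hypothetical stand-in for a SOMA factory function; a directory on disk
# plays the role of the TileDB object at `uri`.
create_or_open <- function(uri, ingest_mode = c("write", "resume")) {
  ingest_mode <- match.arg(ingest_mode)
  if (dir.exists(uri)) {
    if (ingest_mode == "resume") {
      # Resume: the object already exists, so connect to it as-is
      return(list(uri = uri, created = FALSE))
    }
    # Plain write mode refuses to clobber an existing object
    stop("An object already exists at ", uri, call. = FALSE)
  }
  # Nothing at `uri` yet: create it in either mode
  dir.create(uri, recursive = TRUE)
  list(uri = uri, created = TRUE)
}

uri <- file.path(tempdir(), "resume-demo")
unlink(uri, recursive = TRUE)  # start from a clean slate
first <- create_or_open(uri, ingest_mode = "write")
second <- create_or_open(uri, ingest_mode = "resume")
```

In this sketch the second call returns a handle to the existing object instead of failing, which is the behaviour resume-mode is meant to provide for interrupted ingestions.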

It also adds support for resume-mode in `write_soma()` methods; these
methods compare the `soma_joinids` present in the input data with those
that already exist on disk, and write data only for `soma_joinids` that
are missing from disk.
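
The join-ID comparison described above amounts to a set difference. A minimal base R sketch, assuming the IDs live in a `soma_joinid` column; `rows_to_write()` is a hypothetical helper for illustration and is not part of the package:

```r
# Hypothetical helper: keep only the rows whose join ID is not yet on disk
rows_to_write <- function(df, existing_ids) {
  df[!df$soma_joinid %in% existing_ids, , drop = FALSE]
}

incoming <- data.frame(
  soma_joinid = 0:4,
  value = c("a", "b", "c", "d", "e")
)
on_disk <- c(0, 1, 3)  # join IDs assumed to have been written by an earlier run
pending <- rows_to_write(incoming, on_disk)
```

Here `pending` holds only the rows for join IDs 2 and 4, so re-running an interrupted ingestion would skip the rows that were already written.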

The following SOMA functions have been modified with an `ingest_mode`
parameter:
 - `SOMADataFrameCreate()`
 - `SOMASparseNDArrayCreate()`
 - `SOMACollectionCreate()`
 - `SOMAMeasurementCreate()`
 - `SOMAExperimentCreate()`

The following methods of `write_soma()` have been modified with an
`ingest_mode` parameter:
 - `write_soma.character()`/`write_soma.data.frame()`/`write_soma.DataFrame()`
 - `write_soma.matrix()`/`write_soma.TsparseMatrix()`
 - `write_soma.Seurat()` and other Seurat-subobject methods
 - `write_soma.SummarizedExperiment()`/`write_soma.SingleCellExperiment()`

resolves #1399

* Simplify testing suite

* Helper function to read SOMA join IDs from an array

* Add support for resume-mode for `data.frame`s

* Add tests for dense arrays

* Better checking for SOMA array registration

* Update docs to include resume-mode for collections

* Add support for resume-mode with sparse arrays

* Update docs

* Update docs

* Plumb resume-mode through `write_soma.Seurat()` and helper methods

* Workaround for a bug in SeuratObject

* More workaround

* Plumb resume-mode through SCE write path

* Improvements to `.register_soma_object`

* Better handling of arrays/collections for resume-mode on TileDB Cloud

* Allow force-reopen

* Improve tests

* Update docs

* Update `write_soma()` for Bioc objects w/ resume mode on TileDB Cloud

* Update changelog
Bump develop version
mojaveazure authored Apr 24, 2024
1 parent 4eb276e commit 6b37895
Showing 29 changed files with 2,059 additions and 389 deletions.
2 changes: 1 addition & 1 deletion apis/r/DESCRIPTION
@@ -6,7 +6,7 @@ Description: Interface for working with 'TileDB'-based Stack of Matrices,
 like those commonly used for single cell data analysis. It is documented at
 <https://github.com/single-cell-data>; a formal specification available is at
 <https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md>.
-Version: 1.10.99
+Version: 1.10.99.1
 Authors@R: c(
 person(given = "Aaron", family = "Wolen",
 role = c("cre", "aut"), email = "[email protected]",
2 changes: 2 additions & 0 deletions apis/r/NAMESPACE
@@ -2,6 +2,8 @@

 S3method("[[",MappingBase)
 S3method("[[<-",MappingBase)
+S3method(.read_soma_joinids,SOMADataFrame)
+S3method(.read_soma_joinids,SOMASparseNDArray)
 S3method(as.list,CoordsStrider)
 S3method(as.list,MappingBase)
 S3method(iterators::nextElem,CoordsStrider)
1 change: 1 addition & 0 deletions apis/r/NEWS.md
@@ -7,6 +7,7 @@
 * Add support for reading `*m` and `*p` layers from `SOMAExperimentAxisQuery`
 * Add support for blockwise iteration
 * Make `reopen()` a public method for all `TileDBObjects`
+* Add support for resume-mode in `write_soma()`

 # 1.7.0
