[python/c++] COO to CSX conversion optimization #3304

Merged Nov 19, 2024 · 49 commits from bkmartinjr/fastercsx into main
49 commits
121d2a2
first cut at fast CSX conversion
bkmartinjr Nov 6, 2024
0e402b9
fix compile time warning
bkmartinjr Nov 6, 2024
c68f736
build python bindings -O3
bkmartinjr Nov 6, 2024
d3ab2e0
typos
bkmartinjr Nov 7, 2024
7a642a3
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 7, 2024
48e3dbc
fix clang compile issue
bkmartinjr Nov 7, 2024
d6f6a20
reformat to repo C++ standards
bkmartinjr Nov 7, 2024
b67d297
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 7, 2024
3733f4e
fix clang warning about unused captures
bkmartinjr Nov 7, 2024
27810a3
more clang fixes
bkmartinjr Nov 7, 2024
7d05331
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 7, 2024
9d0b7ba
fix build error found during CI
bkmartinjr Nov 7, 2024
70bf7ca
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 7, 2024
9b23b02
lint
bkmartinjr Nov 7, 2024
fa44160
more lint
bkmartinjr Nov 7, 2024
ff1ed9e
lint chase
bkmartinjr Nov 7, 2024
1f273e6
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 7, 2024
9e2826a
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 8, 2024
91f63d9
cleanup include statements
bkmartinjr Nov 8, 2024
3ddd252
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 8, 2024
0abb9f3
debugging R build
bkmartinjr Nov 8, 2024
48030ab
lint
bkmartinjr Nov 8, 2024
12c7565
cleanup GHA for interop testing
bkmartinjr Nov 8, 2024
bae3a03
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 8, 2024
8f3eb7d
add -mavx2 for x86 build
bkmartinjr Nov 9, 2024
ae6b533
more tests
bkmartinjr Nov 9, 2024
2d9050c
comment
bkmartinjr Nov 9, 2024
866c4f5
add bounds check for second dimension coordinate
bkmartinjr Nov 9, 2024
98529f8
lint
bkmartinjr Nov 9, 2024
d1b8e32
test / bug fix argument handling
bkmartinjr Nov 9, 2024
875d6dd
hand code AVX specializations for speed eval
bkmartinjr Nov 11, 2024
35be47b
reduce template overhead
bkmartinjr Nov 11, 2024
7532a45
do not use experimental value width dispatching
bkmartinjr Nov 11, 2024
2c33838
lint
bkmartinjr Nov 11, 2024
19a545b
improve exception discrimination
bkmartinjr Nov 11, 2024
841046f
cleanup size-based value templating
bkmartinjr Nov 12, 2024
8314b67
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 12, 2024
8e1bfdc
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 14, 2024
0858763
cleanup
bkmartinjr Nov 14, 2024
d014089
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 15, 2024
6a7e26e
clean up C++ namespace
bkmartinjr Nov 15, 2024
3e8672d
revert cleanup on request
bkmartinjr Nov 15, 2024
1986ad5
PR feedback (thanks John!)
bkmartinjr Nov 15, 2024
bc866f8
incorporate more PR fb
bkmartinjr Nov 15, 2024
6242085
Merge branch 'main' into bkmartinjr/fastercsx
bkmartinjr Nov 15, 2024
de34a42
lint
bkmartinjr Nov 15, 2024
e5ef34d
comments
bkmartinjr Nov 16, 2024
52933f2
fix compile warnings
bkmartinjr Nov 17, 2024
ba3f69e
PR f/b
bkmartinjr Nov 18, 2024
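For orientation: before this PR, the read path built an intermediate `scipy.sparse.coo_matrix` and converted it with `.tocsr()`; the PR replaces that with a parallel C++ kernel (`compress_coo` and friends). A minimal sketch of the baseline SciPy conversion being optimized (data here is synthetic, for illustration only):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
nnz, shape = 1_000, (100, 100)
i = rng.integers(0, shape[0], nnz)   # COO row coordinates
j = rng.integers(0, shape[1], nnz)   # COO column coordinates
d = rng.random(nnz)                  # COO values

# Baseline: materialize an intermediate COO matrix, then compress to CSR.
# The intermediate object and its single-threaded sort are what the
# native fastercsx path avoids.
csr = sparse.coo_matrix((d, (i, j)), shape=shape).tocsr()
```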
22 changes: 6 additions & 16 deletions .github/workflows/r-python-interop-testing.yml
@@ -11,7 +11,7 @@ on:
push:
branches:
- main
- 'release-*'
- "release-*"
workflow_dispatch:

jobs:
@@ -27,19 +27,16 @@ jobs:
- name: Checkout TileDB-SOMA
uses: actions/checkout@v4
with:
fetch-depth: 0 # ensure we get all tags to inform package version determination
fetch-depth: 0 # ensure we get all tags to inform package version determination

- name: Bootstrap
run: cd apis/r && tools/r-ci.sh bootstrap

- name: Dependencies
run: cd apis/r && Rscript -e "remotes::install_deps(dependencies = TRUE, upgrade = FALSE)"

- name: CMake
uses: lukka/get-cmake@latest

- name: MkVars
run: mkdir ~/.R && echo "CXX17FLAGS=-Wno-deprecated-declarations -Wno-deprecated" > ~/.R/Makevars
- name: Build Package
run: cd apis/r && R CMD build --no-build-vignettes --no-manual .

# Uncomment these next two stanzas as needed whenever we've just released a new tiledb-r for
# which source is available but CRAN releases (and hence update r2u binaries) are not yet:
@@ -66,15 +63,8 @@ jobs:
# if: ${{ matrix.os != 'macOS-latest' }}
# run: cd apis/r && Rscript -e "options(bspm.version.check=TRUE); install.packages('tiledb', repos = c('https://eddelbuettel.r-universe.dev/bin/linux/jammy/4.3/', 'https://cloud.r-project.org'))"

- name: Build and install libtiledbsoma
run: sudo scripts/bld --prefix=/usr/local --no-tiledb-deprecated=true --werror=true && sudo ldconfig

- name: Install R-tiledbsoma
run: |
cd apis/r
R CMD build --no-build-vignettes --no-manual .
FILE=$(ls -1t *.tar.gz | head -n 1)
R CMD INSTALL $FILE
- name: Install Package
run: cd apis/r && R CMD INSTALL $(ls -1tr *.tar.gz | tail -1)

- name: Show R package versions
run: Rscript -e 'tiledbsoma::show_package_versions()'
7 changes: 6 additions & 1 deletion apis/python/setup.py
@@ -16,6 +16,7 @@
import ctypes
import os
import pathlib
import platform
import shutil
import subprocess
import sys
@@ -242,7 +243,10 @@ def run(self):
str(tiledb_dir / "lib"),
]

CXX_FLAGS = []
CXX_FLAGS = ["-O3"]

if platform.machine() == "x86_64":
CXX_FLAGS.append("-mavx2")

if os.name != "nt":
CXX_FLAGS.append(f'-Wl,-rpath,{str(tiledbsoma_dir / "lib")}')
@@ -301,6 +305,7 @@ def run(self):
"tiledbsoma.pytiledbsoma",
[
"src/tiledbsoma/common.cc",
"src/tiledbsoma/fastercsx.cc",
"src/tiledbsoma/reindexer.cc",
"src/tiledbsoma/query_condition.cc",
"src/tiledbsoma/soma_vfs.cc",
270 changes: 270 additions & 0 deletions apis/python/src/tiledbsoma/_fastercsx.py
@@ -0,0 +1,270 @@
# Copyright (c) 2024 The Chan Zuckerberg Initiative Foundation
# Copyright (c) 2024 TileDB, Inc.
#
# Licensed under the MIT License.

from __future__ import annotations

import collections.abc
import math
from typing import Any, List, Literal, Sequence, Tuple, Union, cast

import numpy as np
import numpy.typing as npt
import pyarrow as pa
import scipy.sparse
from typing_extensions import TypeAlias

from .options._soma_tiledb_context import SOMATileDBContext
from .pytiledbsoma.fastercsx import compress_coo, copy_csx_to_dense, sort_csx_indices

NDArrayIndex: TypeAlias = npt.NDArray[np.integer[Any]]
NDArrayNumber: TypeAlias = npt.NDArray[Union[np.integer[Any], np.floating[Any]]]


class CompressedMatrix:
"""Fast compressed matrix construction from COO data.

Private class intended _only_ for compressed matrix construction, for package-internal use.
Export constructed matrix to SciPy (or comparable package) for use of the result.
"""

__slots__ = ("indptr", "indices", "data", "shape", "format", "sorted", "context")

def __init__(
self,
indptr: NDArrayIndex,
indices: NDArrayIndex,
data: NDArrayNumber,
shape: Tuple[int, int],
format: Literal["csc", "csr"],
sorted: bool,
context: SOMATileDBContext,
) -> None:
"""Construct from PJD (pointer, index, data) format. Not intended for direct use - use instead the
static factory methods `from_ijd`, `from_pjd` and `from_soma`.
"""
assert len(data) == len(indices)
assert len(data) <= np.iinfo(indptr.dtype).max
assert shape[1] <= np.iinfo(indices.dtype).max
assert indptr[-1] == len(data) and indptr[0] == 0

self.shape = shape
self.indptr = indptr
self.indices = indices
self.data = data
self.format = format
self.sorted = sorted
self.context = context

@staticmethod
def from_ijd(
i: NDArrayIndex | Sequence[NDArrayIndex],
j: NDArrayIndex | Sequence[NDArrayIndex],
d: NDArrayNumber | Sequence[NDArrayNumber],
shape: Tuple[int, int],
format: Literal["csc", "csr"],
make_sorted: bool,
context: SOMATileDBContext,
) -> CompressedMatrix:
"""Factory method accepting COO points stored in IJD vectors."""
i = i if isinstance(i, collections.abc.Sequence) else (i,)
j = j if isinstance(j, collections.abc.Sequence) else (j,)
d = d if isinstance(d, collections.abc.Sequence) else (d,)

if format == "csc":
i, j = j, i
n_major, n_minor = shape[1], shape[0]
else:
n_major, n_minor = shape[0], shape[1]

nnz = sum(len(d_) for d_ in d)
index_dtype = CompressedMatrix._smallest_index_dtype(nnz)
indptr = np.zeros((n_major + 1), dtype=index_dtype)
indices = np.empty((nnz,), dtype=index_dtype)
data = np.empty((nnz,), dtype=d[0].dtype)
compress_coo(
context.native_context, (n_major, n_minor), i, j, d, indptr, indices, data
)
if make_sorted:
sort_csx_indices(context.native_context, indptr, indices, data)
return CompressedMatrix(
indptr, indices, data, shape, format, make_sorted, context
)
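`compress_coo` is implemented natively; conceptually it is a counting sort on the major coordinate. A pure-NumPy sketch of the same logic for a single chunk, without the native code's parallelism (`compress_coo_numpy` is an illustrative name, not part of the package):

```python
import numpy as np

def compress_coo_numpy(n_major, i, j, d):
    """Counting-sort COO triples into CSR-style (indptr, indices, data)."""
    indptr = np.zeros(n_major + 1, dtype=np.int64)
    np.add.at(indptr, i + 1, 1)            # histogram rows into indptr[1:]
    np.cumsum(indptr, out=indptr)          # prefix sum -> row start offsets
    order = np.argsort(i, kind="stable")   # stable sort groups entries by row
    return indptr, j[order], d[order]
```

Within each row, column indices keep their input order, which is why `from_ijd` optionally calls `sort_csx_indices` afterwards.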

@staticmethod
def from_pjd(
p: NDArrayIndex,
j: NDArrayIndex,
d: NDArrayNumber,
shape: Tuple[int, int],
format: Literal["csc", "csr"],
make_sorted: bool,
context: SOMATileDBContext,
) -> CompressedMatrix:
"""Factory method accepting pointers, indices and data vectors."""
return CompressedMatrix(p, j, d, shape, format, make_sorted, context)

@staticmethod
def from_soma(
tables: pa.Table | Sequence[pa.Table],
shape: Tuple[int, int],
format: Literal["csc", "csr"],
make_sorted: bool,
context: SOMATileDBContext,
) -> CompressedMatrix:
"""Factory method accepting a sequence of Arrow tables containing SOMA sparse matrix data.

Tables must conform to the standard SOMA schema, i.e., have three columns named
``soma_dim_0``, ``soma_dim_1`` and ``soma_data``. Within each table, the three columns must
have identical chunk counts/sizes, and the dimension columns must be int64.
"""
tables = tables if isinstance(tables, collections.abc.Sequence) else (tables,)

def chunks(a: pa.Array | pa.ChunkedArray) -> List[pa.Array]:
return (
list(a) if isinstance(a, pa.Array) else cast(List[pa.Array], a.chunks)
)

i = tuple(a.to_numpy() for tbl in tables for a in chunks(tbl["soma_dim_0"]))
j = tuple(a.to_numpy() for tbl in tables for a in chunks(tbl["soma_dim_1"]))
d = tuple(a.to_numpy() for tbl in tables for a in chunks(tbl["soma_data"]))
return CompressedMatrix.from_ijd(i, j, d, shape, format, make_sorted, context)

@property
def nnz(self) -> int:
return len(self.indices)

@property
def nbytes(self) -> int:
return int(self.indptr.nbytes + self.indices.nbytes + self.data.nbytes)

@property
def dtype(self) -> npt.DTypeLike:
return self.data.dtype

def to_scipy(
self, index: slice | None = None
) -> scipy.sparse.csr_matrix | scipy.sparse.csc_matrix:
"""Extract a slice on the compressed dimension and return as a
:class:`scipy.sparse.csr_matrix` or
:class:`scipy.sparse.csc_matrix`.

Optionally allows slicing on compressed dimension during conversion, in which case
an extra memory copy is avoided.
"""
index = index or slice(None)
assert isinstance(index, slice)
assert index.step in (1, None)
idx_start, idx_end, _ = index.indices(self.indptr.shape[0] - 1)
n_major = max(idx_end - idx_start, 0)

if n_major == self.indptr.shape[0] - 1:
assert idx_start == 0 and n_major == idx_end
indptr = self.indptr
indices = self.indices
data = self.data

elif n_major == 0:
return (
scipy.sparse.csr_matrix((0, self.shape[1]), dtype=self.dtype)
if self.format == "csr"
else scipy.sparse.csc_matrix((self.shape[0], 0), dtype=self.dtype)
)

else:
indptr = (self.indptr[idx_start : idx_end + 1]).copy()
indices = self.indices[indptr[0] : indptr[-1]].copy()
data = self.data[indptr[0] : indptr[-1]].copy()
indptr -= indptr[0] # type: ignore[misc]

shape = (
(n_major, self.shape[1])
if self.format == "csr"
else (self.shape[0], n_major)
)
return CompressedMatrix._to_scipy(
indptr, indices, data, shape, self.format, self.sorted
)
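The partial-slice branch above copies only the touched range of `indices`/`data` and rebases `indptr` so the extracted block starts at offset zero. The same idea applied to a plain SciPy CSR matrix (helper name is illustrative):

```python
import numpy as np
from scipy import sparse

def slice_csr_rows(m, start, end):
    """Slice rows [start, end) by indptr arithmetic, copying only that range."""
    indptr = m.indptr[start : end + 1].copy()
    indices = m.indices[indptr[0] : indptr[-1]].copy()
    data = m.data[indptr[0] : indptr[-1]].copy()
    indptr -= indptr[0]  # rebase so the first sliced row starts at offset 0
    return sparse.csr_matrix((data, indices, indptr), shape=(end - start, m.shape[1]))
```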

def to_numpy(self, index: slice | None = None) -> NDArrayNumber:
"""Extract a slice on the compressed dimension and return as a dense :class:`numpy.ndarray`."""
index = index or slice(None)
assert isinstance(index, slice)
assert index.step in (1, None)
major_idx_start, major_idx_end, _ = index.indices(self.indptr.shape[0] - 1)
n_major = max(major_idx_end - major_idx_start, 0)

out_shape = (
(n_major, self.shape[1])
if self.format == "csr"
else (self.shape[0], n_major)
)
out = np.zeros(math.prod(out_shape), dtype=self.data.dtype)
copy_csx_to_dense(
self.context.native_context,
major_idx_start,
major_idx_end,
self.shape,
self.format,
self.indptr,
self.indices,
self.data,
out,
)
return out.reshape(out_shape)
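`copy_csx_to_dense` is the native kernel; its serial equivalent is a row-by-row scatter. A sketch with an illustrative name (note that, like a plain scatter, it overwrites rather than sums duplicate coordinates):

```python
import numpy as np

def csr_rows_to_dense(indptr, indices, data, n_minor, start, end):
    """Scatter rows [start, end) of a CSR matrix into a dense array."""
    out = np.zeros((end - start, n_minor), dtype=data.dtype)
    for r in range(start, end):
        sl = slice(indptr[r], indptr[r + 1])   # nonzeros belonging to row r
        out[r - start, indices[sl]] = data[sl]
    return out
```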

@classmethod
def _smallest_index_dtype(cls, max_val: int) -> npt.DTypeLike:
"""NB: the underlying C++ code supports other index types, including uint16 and
uint32. This helper method uses only int32/int64 to retain compatibility with
the SciPy sparse matrix/array package.
"""
candidate_index_types: list[npt.DTypeLike] = [np.int32, np.int64]
for dt in candidate_index_types:
if max_val <= np.iinfo(dt).max:
return dt

raise ValueError(
"Unable to find index type sufficiently large for max index value."
)

@staticmethod
def _to_scipy(
indptr: NDArrayNumber,
indices: NDArrayNumber,
data: NDArrayNumber,
shape: Tuple[int, int],
format: Literal["csc", "csr"],
sorted: bool,
) -> scipy.sparse.csr_matrix | scipy.sparse.csc_matrix:
"""
This is to bypass the O(N) scan that :meth:`sparse._cs_matrix.__init__`
performs when a new compressed matrix is created.

See `SciPy bug 11496 <https://github.com/scipy/scipy/issues/11496>`_ for details.

Conceptually, this is identical to:
sparse.csr_matrix((data, indices, indptr), shape=shape)
(or the equivalent for csc_matrix)
"""
if sorted:
matrix = (
scipy.sparse.csr_matrix.__new__(scipy.sparse.csr_matrix)
if format == "csr"
else scipy.sparse.csc_matrix.__new__(scipy.sparse.csc_matrix)
)
matrix.data = data
matrix.indptr = indptr
matrix.indices = indices
matrix._shape = shape
matrix._has_sorted_values = True
matrix._has_canonical_format = True
return matrix

else:
return (
scipy.sparse.csr_matrix((data, indices, indptr), shape=shape)
if format == "csr"
else scipy.sparse.csc_matrix((data, indices, indptr), shape=shape)
)
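The `__new__` trick above skips `__init__` and its O(N) validation pass, but it leans on SciPy internals (the `_shape` attribute and private flags), so it can break across SciPy versions. A minimal standalone sketch of the bypass, under that caveat:

```python
import numpy as np
from scipy import sparse

def csr_unchecked(data, indices, indptr, shape):
    """Assemble a csr_matrix without the validation pass in __init__.
    Caller must guarantee the three arrays are mutually consistent."""
    m = sparse.csr_matrix.__new__(sparse.csr_matrix)
    m.data = data
    m.indices = indices
    m.indptr = indptr
    m._shape = shape  # private SciPy attribute backing the .shape property
    return m
```

The payoff is proportional to nnz: the normal constructor scans all of `indices` to validate bounds, which this skips.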
22 changes: 15 additions & 7 deletions apis/python/src/tiledbsoma/_read_iters.py
@@ -12,6 +12,7 @@
from concurrent.futures import ThreadPoolExecutor
from typing import (
TYPE_CHECKING,
Any,
Dict,
Iterator,
List,
@@ -36,6 +37,7 @@

from . import _util
from ._exception import SOMAError
from ._fastercsx import CompressedMatrix
from ._indexer import IntIndexer
from ._types import NTuple
from .options import SOMATileDBContext
@@ -56,7 +58,8 @@

IndicesType = Tuple[npt.NDArray[np.int64], npt.NDArray[np.int64]]
IJDType = Tuple[
Tuple[npt.NDArray[np.int64], npt.NDArray[np.int64]], npt.NDArray[np.generic]
Tuple[npt.NDArray[np.int64], npt.NDArray[np.int64]],
npt.NDArray[Union[np.integer[Any], np.floating[Any]]],
]


@@ -406,12 +409,17 @@ def _cs_reader(
):
major_coords = indices[self.major_axis]
minor_coords = indices[self.minor_axis]
cls = sparse.csr_matrix if self.major_axis == 0 else sparse.csc_matrix
sp = cls(
sparse.coo_matrix(
(d, (i, j)), shape=self._mk_shape(major_coords, minor_coords)
)
)
shape = self._mk_shape(major_coords, minor_coords)
assert self.context is not None
sp = CompressedMatrix.from_ijd(
i,
j,
d,
shape=shape,
format="csr" if self.major_axis == 0 else "csc",
make_sorted=True,
context=self.context,
).to_scipy()
yield sp, indices

