Skip to content

Commit

Permalink
Improve test coverage for storage classes (#2693)
Browse files Browse the repository at this point in the history
* Run Store tests on logging

* Run store tests on wrapper

* Add read only open tests to WrapperStore

* Ignore new coverage files

* Simplify wrapper tests

* Fix __eq__ method in WrapperStore

* Implement __repr__ for WrapperStore

* Allow separate open and init kwargs

* Add open class method to LoggingStore

* Add __str__ to WrapperStore

* Add repr test for LoggingStore

* Fix __eq__ in LoggingStore

* Test getsize for stores

* Test for invalid ByteRequest

* Use stdout rather than stderr as the default logging stream

* Test default logging stream

* Add test for getsize_prefix

* Document buffer prototype parameter

* Add test for invalid modes in StorePath.open()

* Add test for contains_group

* Add tests for contains_array

* Test for invalid root type for LocalStore

* Test LocalStore.get with default prototype

* Test for invalid set buffer arguments

* Test get and set on closed stores

* Test using stores in a context manager

* Specify abstract methods for StoreTests

* Apply suggestions from code review

Co-authored-by: Davis Bennett <[email protected]>

* Lint

* Fix typing for LoggingStore

Co-authored-by: Davis Bennett <[email protected]>

* Match specific Errors in tests

Co-authored-by: Davis Bennett <[email protected]>

* Add docstring

Co-authored-by: Davis Bennett <[email protected]>

* Parametrize tests

Co-authored-by: Davis Bennett <[email protected]>

* Test for contains group/array at multiple heirarchies

Co-authored-by: Davis Bennett <[email protected]>

* Update TypeError on GpuMemoryStore

* Don't implement _is_open setter on wrapped stores

* Update reprs for LoggingStore and WrapperStore

* Test check_writeable and close for WrapperStore

* Update pull request template (#2717)

* Add release notes

* Comprehensive changelog entry

* Match error message

* Apply suggestions from code review

Co-authored-by: David Stansby <[email protected]>

* Update 2693.bugfix.rst

---------

Co-authored-by: Davis Bennett <[email protected]>
Co-authored-by: Hannes Spitz <[email protected]>
Co-authored-by: David Stansby <[email protected]>
Co-authored-by: Joe Hamman <[email protected]>
  • Loading branch information
5 people authored Jan 25, 2025
1 parent 9fd4545 commit 80aea2a
Show file tree
Hide file tree
Showing 14 changed files with 406 additions and 51 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ src/zarr/_version.py
data/*
src/fixture/
fixture/
junit.xml

.DS_Store
tests/.hypothesis
Expand Down
13 changes: 13 additions & 0 deletions changes/2693.bugfix.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
Implement open() for LoggingStore
LoggingStore is now a generic class.
Use stdout rather than stderr as the default stream for LoggingStore
Ensure that ZipStore is open before getting or setting any values
Update equality for LoggingStore and WrapperStore such that 'other' must also be a LoggingStore or WrapperStore respectively, rather than only checking the types of the stores they wrap.
Indicate StoreTest's `test_store_repr`, `test_store_supports_writes`, `test_store_supports_partial_writes`, and `test_store_supports_listing` need to be implemented using `@abstractmethod` rather than `NotImplementedError`.
Separate instantiating and opening a store in StoreTests
Test using Store as a context manager in StoreTests
Match the errors raised by read only stores in StoreTests
Test that a ValueError is raise for invalid byte range syntax in StoreTests
Test getsize() and getsize_prefix() in StoreTests
Test the error raised for invalid buffer arguments in StoreTests
Test that data can be written to a store that's not yet open using the store.set method in StoreTests
6 changes: 4 additions & 2 deletions src/zarr/abc/store.py
Original file line number Diff line number Diff line change
Expand Up @@ -176,10 +176,10 @@ async def get(
Parameters
----------
key : str
prototype : BufferPrototype
The prototype of the output buffer. Stores may support a default buffer prototype.
byte_range : ByteRequest, optional
ByteRequest may be one of the following. If not provided, all data associated with the key is retrieved.
- RangeByteRequest(int, int): Request a specific range of bytes in the form (start, end). The end is exclusive. If the given range is zero-length or starts after the end of the object, an error will be returned. Additionally, if the range ends after the end of the object, the entire remainder of the object will be returned. Otherwise, the exact requested range will be returned.
- OffsetByteRequest(int): Request all bytes starting from a given byte offset. This is equivalent to bytes={int}- as an HTTP header.
- SuffixByteRequest(int): Request the last int bytes. Note that here, int is the size of the request, not the byte offset. This is equivalent to bytes=-{int} as an HTTP header.
Expand All @@ -200,6 +200,8 @@ async def get_partial_values(
Parameters
----------
prototype : BufferPrototype
The prototype of the output buffer. Stores may support a default buffer prototype.
key_ranges : Iterable[tuple[str, tuple[int | None, int | None]]]
Ordered set of key, range pairs, a key may occur multiple times with different ranges
Expand Down
7 changes: 6 additions & 1 deletion src/zarr/storage/_fsspec.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,14 +10,15 @@
Store,
SuffixByteRequest,
)
from zarr.core.buffer import Buffer
from zarr.storage._common import _dereference_path

if TYPE_CHECKING:
from collections.abc import AsyncIterator, Iterable

from fsspec.asyn import AsyncFileSystem

from zarr.core.buffer import Buffer, BufferPrototype
from zarr.core.buffer import BufferPrototype
from zarr.core.common import BytesLike


Expand Down Expand Up @@ -264,6 +265,10 @@ async def set(
if not self._is_open:
await self._open()
self._check_writable()
if not isinstance(value, Buffer):
raise TypeError(
f"FsspecStore.set(): `value` must be a Buffer instance. Got an instance of {type(value)} instead."
)
path = _dereference_path(self.path, key)
# write data
if byte_range:
Expand Down
6 changes: 4 additions & 2 deletions src/zarr/storage/_local.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ def __init__(self, root: Path | str, *, read_only: bool = False) -> None:
root = Path(root)
if not isinstance(root, Path):
raise TypeError(
f'"root" must be a string or Path instance. Got an object with type {type(root)} instead.'
f"'root' must be a string or Path instance. Got an instance of {type(root)} instead."
)
self.root = root

Expand Down Expand Up @@ -169,7 +169,9 @@ async def _set(self, key: str, value: Buffer, exclusive: bool = False) -> None:
self._check_writable()
assert isinstance(key, str)
if not isinstance(value, Buffer):
raise TypeError("LocalStore.set(): `value` must a Buffer instance")
raise TypeError(
f"LocalStore.set(): `value` must be a Buffer instance. Got an instance of {type(value)} instead."
)
path = self.root / key
await asyncio.to_thread(_put, path, value, start=None, exclusive=exclusive)

Expand Down
26 changes: 18 additions & 8 deletions src/zarr/storage/_logging.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,11 @@

import inspect
import logging
import sys
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import TYPE_CHECKING, Any
from typing import TYPE_CHECKING, Any, Self, TypeVar

from zarr.abc.store import Store
from zarr.storage._wrapper import WrapperStore
Expand All @@ -18,8 +19,10 @@

counter: defaultdict[str, int]

T_Store = TypeVar("T_Store", bound=Store)

class LoggingStore(WrapperStore[Store]):

class LoggingStore(WrapperStore[T_Store]):
"""
Store wrapper that logs all calls to the wrapped store.
Expand All @@ -42,7 +45,7 @@ class LoggingStore(WrapperStore[Store]):

def __init__(
self,
store: Store,
store: T_Store,
log_level: str = "DEBUG",
log_handler: logging.Handler | None = None,
) -> None:
Expand All @@ -67,7 +70,7 @@ def _configure_logger(

def _default_handler(self) -> logging.Handler:
"""Define a default log handler"""
handler = logging.StreamHandler()
handler = logging.StreamHandler(stream=sys.stdout)
handler.setLevel(self.log_level)
handler.setFormatter(
logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
Expand All @@ -94,6 +97,14 @@ def log(self, hint: Any = "") -> Generator[None, None, None]:
end_time = time.time()
self.logger.info("Finished %s [%.2f s]", op, end_time - start_time)

@classmethod
async def open(cls: type[Self], store_cls: type[T_Store], *args: Any, **kwargs: Any) -> Self:
log_level = kwargs.pop("log_level", "DEBUG")
log_handler = kwargs.pop("log_handler", None)
store = store_cls(*args, **kwargs)
await store._open()
return cls(store=store, log_level=log_level, log_handler=log_handler)

@property
def supports_writes(self) -> bool:
with self.log():
Expand Down Expand Up @@ -126,8 +137,7 @@ def _is_open(self) -> bool:

@_is_open.setter
def _is_open(self, value: bool) -> None:
with self.log(value):
self._store._is_open = value
raise NotImplementedError("LoggingStore must be opened via the `_open` method")

async def _open(self) -> None:
with self.log():
Expand All @@ -151,11 +161,11 @@ def __str__(self) -> str:
return f"logging-{self._store}"

def __repr__(self) -> str:
return f"LoggingStore({repr(self._store)!r})"
return f"LoggingStore({self._store.__class__.__name__}, '{self._store}')"

def __eq__(self, other: object) -> bool:
with self.log(other):
return self._store == other
return type(self) is type(other) and self._store.__eq__(other._store) # type: ignore[attr-defined]

async def get(
self,
Expand Down
9 changes: 6 additions & 3 deletions src/zarr/storage/_memory.py
Original file line number Diff line number Diff line change
Expand Up @@ -111,7 +111,9 @@ async def set(self, key: str, value: Buffer, byte_range: tuple[int, int] | None
await self._ensure_open()
assert isinstance(key, str)
if not isinstance(value, Buffer):
raise TypeError(f"Expected Buffer. Got {type(value)}.")
raise TypeError(
f"MemoryStore.set(): `value` must be a Buffer instance. Got an instance of {type(value)} instead."
)

if byte_range is not None:
buf = self._store_dict[key]
Expand Down Expand Up @@ -231,8 +233,9 @@ async def set(self, key: str, value: Buffer, byte_range: tuple[int, int] | None
self._check_writable()
assert isinstance(key, str)
if not isinstance(value, Buffer):
raise TypeError(f"Expected Buffer. Got {type(value)}.")

raise TypeError(
f"GpuMemoryStore.set(): `value` must be a Buffer instance. Got an instance of {type(value)} instead."
)
# Convert to gpu.Buffer
gpu_value = value if isinstance(value, gpu.Buffer) else gpu.Buffer.from_buffer(value)
await super().set(key, gpu_value, byte_range=byte_range)
16 changes: 15 additions & 1 deletion src/zarr/storage/_wrapper.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,14 @@ async def _ensure_open(self) -> None:
async def is_empty(self, prefix: str) -> bool:
return await self._store.is_empty(prefix)

@property
def _is_open(self) -> bool:
return self._store._is_open

@_is_open.setter
def _is_open(self, value: bool) -> None:
raise NotImplementedError("WrapperStore must be opened via the `_open` method")

async def clear(self) -> None:
return await self._store.clear()

Expand All @@ -67,7 +75,13 @@ def _check_writable(self) -> None:
return self._store._check_writable()

def __eq__(self, value: object) -> bool:
return type(self) is type(value) and self._store.__eq__(value)
return type(self) is type(value) and self._store.__eq__(value._store) # type: ignore[attr-defined]

def __str__(self) -> str:
return f"wrapping-{self._store}"

def __repr__(self) -> str:
return f"WrapperStore({self._store.__class__.__name__}, '{self._store}')"

async def get(
self, key: str, prototype: BufferPrototype, byte_range: ByteRequest | None = None
Expand Down
10 changes: 9 additions & 1 deletion src/zarr/storage/_zip.py
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,8 @@ def _get(
prototype: BufferPrototype,
byte_range: ByteRequest | None = None,
) -> Buffer | None:
if not self._is_open:
self._sync_open()
# docstring inherited
try:
with self._zf.open(key) as f: # will raise KeyError
Expand Down Expand Up @@ -193,6 +195,8 @@ async def get_partial_values(
return out

def _set(self, key: str, value: Buffer) -> None:
if not self._is_open:
self._sync_open()
# generally, this should be called inside a lock
keyinfo = zipfile.ZipInfo(filename=key, date_time=time.localtime(time.time())[:6])
keyinfo.compress_type = self.compression
Expand All @@ -206,9 +210,13 @@ def _set(self, key: str, value: Buffer) -> None:
async def set(self, key: str, value: Buffer) -> None:
# docstring inherited
self._check_writable()
if not self._is_open:
self._sync_open()
assert isinstance(key, str)
if not isinstance(value, Buffer):
raise TypeError("ZipStore.set(): `value` must a Buffer instance")
raise TypeError(
f"ZipStore.set(): `value` must be a Buffer instance. Got an instance of {type(value)} instead."
)
with self._lock:
self._set(key, value)

Expand Down
Loading

0 comments on commit 80aea2a

Please sign in to comment.