Sharding storage transformer for v3 #1111

Merged
68 commits
605620b
add storage_transformers and get/set_partial_values
jstriebel Jul 28, 2022
566e4b0
formatting
jstriebel Jul 28, 2022
5f85439
add docs and release notes
jstriebel Jul 28, 2022
3c38d57
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Jul 28, 2022
dd7fedb
add test_core testcase
jstriebel Jul 29, 2022
e33b365
Update zarr/creation.py
jstriebel Jul 29, 2022
81ebf68
apply PR feedback
jstriebel Jul 29, 2022
ca28471
add comment that storage_transformers=None is the same as storage_tra…
jstriebel Jul 29, 2022
85f3309
use empty tuple as default for storage_transformers
jstriebel Aug 1, 2022
03de894
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Aug 1, 2022
41eaafb
make mypy happy
jstriebel Aug 1, 2022
5d7be76
better coverage, minor fix, adding rmdir
jstriebel Aug 1, 2022
46229ad
add missing rmdir to test
jstriebel Aug 1, 2022
3a9f7cc
increase coverage
jstriebel Aug 2, 2022
efa4e07
improve test coverage
jstriebel Aug 3, 2022
b4668a8
fix TestArrayWithStorageTransformersV3
jstriebel Aug 3, 2022
e4a4853
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Aug 5, 2022
e454046
Update zarr/creation.py
jstriebel Aug 8, 2022
a3c7f74
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Aug 8, 2022
92ce212
add sharding storage transformer
jstriebel Aug 18, 2022
f6c87b4
add actual transformer
jstriebel Aug 18, 2022
df2dd71
fixes, and allow partial reads for uncompressed v3 arrays
jstriebel Aug 22, 2022
c041dd8
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Aug 22, 2022
06ce675
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 22, 2022
696d5ca
pick generic storage transformer changes from #1111
jstriebel Aug 22, 2022
4c0807e
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 22, 2022
c099440
increase coverage
jstriebel Aug 22, 2022
61db74a
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 22, 2022
83c9389
make lgtm happy
jstriebel Aug 22, 2022
fde61e8
add release note
jstriebel Aug 22, 2022
de4de18
better coverage
jstriebel Aug 23, 2022
0deb2b6
fix hexdigest
jstriebel Aug 23, 2022
d3eda71
improve tests
jstriebel Aug 23, 2022
093926c
fix order of storage transformers
jstriebel Aug 24, 2022
be98c01
fix order of storage transformers
jstriebel Aug 24, 2022
6e2790c
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 24, 2022
7c2767a
retrigger CI
jstriebel Aug 25, 2022
9257b85
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Aug 25, 2022
e7b14b7
minor test improvement
jstriebel Aug 25, 2022
a52300c
minor test update
jstriebel Aug 25, 2022
a960481
apply PR feedback
jstriebel Sep 8, 2022
146c30a
Merge remote-tracking branch 'origin/main' into storage-transformers-…
jstriebel Dec 12, 2022
6bc1025
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 12, 2022
59cca8b
minor fixes
jstriebel Dec 12, 2022
92a48d8
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 12, 2022
c2dc0d6
make flake8 happy
jstriebel Dec 12, 2022
12dc1ae
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 12, 2022
91f10ff
call ensure_bytes in sharding transformer
jstriebel Dec 12, 2022
73fb0a5
minor fixes
jstriebel Dec 12, 2022
7402262
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Dec 19, 2022
b9d8177
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Dec 21, 2022
91f0c2c
apply PR feedback
jstriebel Dec 22, 2022
e68c97f
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Dec 22, 2022
490b962
Merge branch 'storage-transformers-and-partial-get-set' into sharding…
jstriebel Dec 22, 2022
e1960a1
adapt to supports_efficient_get_partial_values property
jstriebel Dec 22, 2022
c1bc26d
add ZARR_V3_SHARDING flag for sharding usage
jstriebel Dec 22, 2022
6f5b35a
fix release notes
jstriebel Dec 22, 2022
070c02c
fix release notes
jstriebel Dec 22, 2022
ef5c020
Merge remote-tracking branch 'scm/storage-transformers-and-partial-ge…
jstriebel Dec 22, 2022
a7e4d89
Merge remote-tracking branch 'origin/main' into storage-transformers-…
joshmoore Jan 16, 2023
fcb9ba0
Merge pull request #3 from joshmoore/storage-transformers-and-partial…
jstriebel Jan 16, 2023
b6588e7
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Jan 16, 2023
eba9006
Merge branch 'main' into storage-transformers-and-partial-get-set
jstriebel Jan 16, 2023
652653d
Merge remote-tracking branch 'scm/storage-transformers-and-partial-ge…
jstriebel Jan 19, 2023
1ccf052
Merge remote-tracking branch 'origin/main' into sharding-storage-tran…
jstriebel Jan 19, 2023
8bb79ef
Merge branch 'main' into sharding-storage-transformer
jstriebel Jan 23, 2023
a2eb332
Merge branch 'main' into sharding-storage-transformer
jstriebel Jan 26, 2023
dbf9fff
Merge branch 'main' into sharding-storage-transformer
jstriebel Feb 2, 2023
2 changes: 2 additions & 0 deletions .github/workflows/minimal.yml
@@ -24,6 +24,7 @@ jobs:
shell: "bash -l {0}"
env:
ZARR_V3_EXPERIMENTAL_API: 1
ZARR_V3_SHARDING: 1
run: |
conda activate minimal
python -m pip install .
@@ -32,6 +33,7 @@ jobs:
shell: "bash -l {0}"
env:
ZARR_V3_EXPERIMENTAL_API: 1
ZARR_V3_SHARDING: 1
run: |
conda activate minimal
rm -rf fixture/
1 change: 1 addition & 0 deletions .github/workflows/python-package.yml
@@ -70,6 +70,7 @@ jobs:
ZARR_TEST_MONGO: 1
ZARR_TEST_REDIS: 1
ZARR_V3_EXPERIMENTAL_API: 1
ZARR_V3_SHARDING: 1
run: |
conda activate zarr-env
mkdir ~/blob_emulator
1 change: 1 addition & 0 deletions .github/workflows/windows-testing.yml
@@ -52,6 +52,7 @@ jobs:
env:
ZARR_TEST_ABS: 1
ZARR_V3_EXPERIMENTAL_API: 1
ZARR_V3_SHARDING: 1
- name: Conda info
shell: bash -l {0}
run: conda info
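The workflow changes above export `ZARR_V3_SHARDING: 1` next to `ZARR_V3_EXPERIMENTAL_API: 1`, so the test jobs run with sharding switched on. How the library reads the flag is not shown in this part of the diff. The sketch below assumes a simple convention in which an unset variable, `0`, and `false` all mean "disabled". `flag_enabled` is a hypothetical helper for illustration, not a zarr function.

```python
import os
from typing import Mapping, Optional


def flag_enabled(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: treat unset, "0" and "false" as disabled.

    The parsing rule is an assumption of this sketch; the actual check
    used by the library is not part of the hunks shown here.
    """
    env = os.environ if environ is None else environ
    return env.get(name, "0").lower() not in ("0", "false")


# Passing a mapping explicitly keeps the example independent of the real environment.
ci_env = {"ZARR_V3_EXPERIMENTAL_API": "1", "ZARR_V3_SHARDING": "1"}
sharding_on = flag_enabled("ZARR_V3_SHARDING", ci_env)
sharding_off = flag_enabled("ZARR_V3_SHARDING", {"ZARR_V3_EXPERIMENTAL_API": "1"})
```

Gating the feature behind its own variable would let the experimental v3 API be enabled without also opting in to sharding.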
11 changes: 7 additions & 4 deletions docs/release.rst
@@ -14,13 +14,16 @@ Unreleased
# .. warning::
# Pre-release! Use :command:`pip install --pre zarr` to evaluate this release.


Major changes
~~~~~~~~~~~~~

* Improve `Zarr V3 support <https://zarr-specs.readthedocs.io/en/latest/core/v3.0.html>`_
adding partial store read/write and storage transformers.
By :user:`Jonathan Striebel <jstriebel>`; :issue:`1096`.
* Improve Zarr V3 support, adding partial store read/write and storage transformers.
  Add the following features related to the `v3 spec <https://zarr-specs.readthedocs.io/en/latest/core/v3.0.html>`_:

    * storage transformers
    * `get_partial_values` and `set_partial_values`
    * efficient `get_partial_values` implementation for `FSStoreV3`
    * sharding storage transformer

  By :user:`Jonathan Striebel <jstriebel>`; :issue:`1096`, :issue:`1111`.
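The release note names a sharding storage transformer without showing what sharding does to chunk addressing. As a conceptual sketch only, assuming a shard groups a fixed number of chunks along each dimension: a chunk's shard is found by integer division, and its position inside the shard by the remainder. The function name `chunk_to_shard` and the parameter name `chunks_per_shard` are illustrative assumptions and are not taken from the code in this diff.

```python
from typing import Tuple

Coords = Tuple[int, ...]


def chunk_to_shard(chunk_coords: Coords, chunks_per_shard: Coords) -> Tuple[Coords, Coords]:
    """Map chunk grid coordinates to (shard coordinates, position inside the shard).

    Conceptual illustration of sharding, not the transformer added in this PR.
    """
    if len(chunk_coords) != len(chunks_per_shard):
        raise ValueError("chunk_coords and chunks_per_shard must have the same length")
    # Integer division selects the shard along each dimension.
    shard = tuple(c // n for c, n in zip(chunk_coords, chunks_per_shard))
    # The remainder is the chunk's position within that shard.
    local = tuple(c % n for c, n in zip(chunk_coords, chunks_per_shard))
    return shard, local


# With 2 x 2 chunks per shard, chunk (5, 3) lives in shard (2, 1) at local position (1, 1).
shard, local = chunk_to_shard((5, 3), (2, 2))
```

Because several chunks share one stored object under this scheme, reading a single chunk means reading a byte range of the shard, which is where partial reads come in.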


Bug fixes
29 changes: 29 additions & 0 deletions zarr/_storage/v3.py
@@ -182,6 +182,35 @@ def rmdir(self, path=None):
        if self.fs.isdir(store_path):
            self.fs.rm(store_path, recursive=True)

    @property
    def supports_efficient_get_partial_values(self):
        return True

    def get_partial_values(self, key_ranges):
        """Get multiple partial values.
        key_ranges can be an iterable of key, range pairs,
        where a range specifies two integers range_start and range_length
        as a tuple, (range_start, range_length).
        range_length may be None to indicate to read until the end.
        range_start may be negative to start reading range_start bytes
        from the end of the file.
        A key may occur multiple times with different ranges.
        Inserts None for missing keys into the returned list."""
        results = []
        for key, (range_start, range_length) in key_ranges:
            key = self._normalize_key(key)
            path = self.dir_path(key)
            try:
                if range_start is None or range_length is None:
                    end = None
                else:
                    end = range_start + range_length
                result = self.fs.cat_file(path, start=range_start, end=end)
            except self.map.missing_exceptions:
                result = None
            results.append(result)
        return results
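The docstring above describes the `key_ranges` contract: `(range_start, range_length)` pairs, a `None` length meaning "read to the end", a negative start counted from the end, and `None` returned for missing keys. A minimal, self-contained sketch of those semantics over an in-memory dict may make them easier to follow. `DictPartialStore` is a hypothetical class, not part of zarr. Unlike the diff, it resolves a negative start to an absolute offset before adding the length; that is a choice of this example, not a description of `FSStoreV3`.

```python
from typing import Dict, Iterable, List, Optional, Tuple

KeyRange = Tuple[str, Tuple[int, Optional[int]]]


class DictPartialStore:
    """Hypothetical in-memory store illustrating the key_ranges contract."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = data

    def get_partial_values(self, key_ranges: Iterable[KeyRange]) -> List[Optional[bytes]]:
        results: List[Optional[bytes]] = []
        for key, (range_start, range_length) in key_ranges:
            value = self._data.get(key)
            if value is None:
                # Missing keys produce None in the same position.
                results.append(None)
                continue
            # Resolve a negative start relative to the end of the value.
            start = range_start if range_start >= 0 else max(len(value) + range_start, 0)
            # A length of None means "read until the end".
            end = None if range_length is None else start + range_length
            results.append(value[start:end])
        return results


store = DictPartialStore({"c0": b"0123456789"})
parts = store.get_partial_values([
    ("c0", (2, 3)),        # three bytes starting at offset 2
    ("c0", (-4, None)),    # the last four bytes
    ("c0", (-4, 2)),       # two bytes starting four bytes from the end
    ("missing", (0, 1)),   # key that is not in the store
])
```

The same key appears three times with different ranges, and the results come back in request order, one entry per requested range.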


class MemoryStoreV3(MemoryStore, StoreV3):
