Are write operations with the zarr Driver guaranteed to be thread- and process-safe? #198

Open
xantho09 opened this issue Sep 30, 2024 · 3 comments

@xantho09

Suppose I have an existing on-disk Zarr array. If I were to have two separate processes that:

  1. Open this Zarr array via tensorstore.open
  2. Write to separate regions that potentially share the same chunks within the Zarr array

Are these two write operations guaranteed to write correctly?

For example, suppose my.zarr has a chunk shape of (64,64,64).

  • Process 1 writes to (0,0,0):(64,64,32)
  • Process 2 writes to (0,0,32):(64,64,64)
# Process 1
import tensorstore as ts

path = "path/to/my.zarr"
arr = ts.open(
    {
        "driver": "zarr",
        "kvstore": {"driver": "file", "path": path},
    },
    open=True,
    read=True,
    write=True,
    create=False,
).result()

arr[(0,0,0):(64,64,32)] = 100
# Process 2
path = "path/to/my.zarr"
arr = ts.open(...) # Same as Process 1

arr[(0,0,32):(64,64,64)] = 200

The only mention I could find was on the homepage, under the list of highlights.

Supports safe, efficient access from multiple processes and machines via optimistic concurrency.

And some basic testing seems to suggest that this is indeed true.

However, is this guaranteed to be the case? Is there anything within the documentation that provides this guarantee?

P.S. Out of curiosity, how is the OCC actually implemented? Checking the last-modified date of the Zarr chunk being written, or something along those lines?
P.P.S. Great library, by the way

@laramiel
Collaborator

To achieve this you need to use transactions. Note, however, that this will not work with the s3 driver.

@xantho09
Author

xantho09 commented Sep 30, 2024

I see...

So something like this would be sufficient. Is that correct?

# Process 1
import tensorstore as ts

path = "path/to/my.zarr"
arr = ts.open(
    {
        "driver": "zarr",
        "kvstore": {"driver": "file", "path": path},
    },
    open=True,
    read=True,
    write=True,
    create=False,
).result()

with ts.Transaction() as txn:
    arr.with_transaction(txn)[(0,0,0):(64,64,32)] = 100
# Process 2
path = "path/to/my.zarr"
arr = ts.open(...) # Same as Process 1

with ts.Transaction() as txn:
    arr.with_transaction(txn)[(0,0,32):(64,64,64)] = 200

I do have some additional related questions:

  • Suppose the two processes write to the regions (0,0,0):(900,900,650) and (0,0,650):(900,900,1280), where the regions are fairly large and only share a comparatively small number of common chunks. I'll refer to these two transactions as T1 and T2.

    If the transaction commits happened to clash and T2 needed to be rolled back and retried, does the entirety of T2 get rolled back and retried, or is this limited to the overlapping region of (0,0,640):(900,900,704)?

  • If process-safety isn't a requirement, roughly how much slower is performing write operations via transactions versus calling TensorStore.write() directly?

    For simplicity, I'm limiting the scope to only the Zarr driver with an on-disk Zarr array.

Please and thank you.

@jbms
Collaborator

jbms commented Oct 21, 2024

You don't need to use a transaction to ensure that concurrent writes by different processes to disjoint portions of the same chunk are not lost --- this is always ensured, provided that you use a kvstore that supports atomic writes, like file or gcs (but unlike s3).
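A minimal sketch of that guarantee, assuming a local "file" kvstore, a hypothetical path /tmp/my.zarr, a (64,64,64) chunk shape, and uint16 data: two processes write disjoint halves of the same chunk without any transaction, and neither write is lost.

# Sketch under the assumptions stated above; the path, shape, and dtype are
# illustrative, not from the original thread.
import multiprocessing

import numpy as np
import tensorstore as ts

ARRAY_PATH = "/tmp/my.zarr"  # hypothetical path

def open_array():
    return ts.open(
        {
            "driver": "zarr",
            "kvstore": {"driver": "file", "path": ARRAY_PATH},
        },
        open=True,
        read=True,
        write=True,
        create=False,
    ).result()

def writer(value, z_slice):
    # Each process writes to its own half of the single (64, 64, 64) chunk.
    arr = open_array()
    arr[0:64, 0:64, z_slice] = value

if __name__ == "__main__":
    # Create the array once before starting the concurrent writers.
    ts.open(
        {
            "driver": "zarr",
            "kvstore": {"driver": "file", "path": ARRAY_PATH},
            "metadata": {"shape": [64, 64, 64], "chunks": [64, 64, 64], "dtype": "<u2"},
        },
        create=True,
        delete_existing=True,
    ).result()

    p1 = multiprocessing.Process(target=writer, args=(100, slice(0, 32)))
    p2 = multiprocessing.Process(target=writer, args=(200, slice(32, 64)))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    # Both halves of the shared chunk should be present.
    result = open_array().read().result()
    assert np.all(result[:, :, :32] == 100)
    assert np.all(result[:, :, 32:] == 200)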

Both with and without use of an explicit transaction, only the conflicting chunks will be retried.

In the case of the file driver, optimistic concurrency is actually implemented by using filesystem advisory locks.
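As a conceptual illustration only (not TensorStore's actual C++ implementation), a chunk read-modify-write guarded by a filesystem advisory lock looks roughly like this: two processes updating the same chunk file serialize their updates instead of clobbering each other's partial writes.

# Conceptual sketch of advisory locking via fcntl.flock; the helper name and
# structure are illustrative, not TensorStore code.
import fcntl

def update_chunk(chunk_path, modify):
    with open(chunk_path, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # advisory exclusive lock on the chunk file
        try:
            data = f.read()                # read the current chunk contents
            f.seek(0)
            f.write(modify(data))          # apply this process's partial update
            f.truncate()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)  # release the lock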

We are planning to add an option to disable the locking (such an option does not currently exist), but I would expect the locking to add very little overhead on a local filesystem when there is no contention. In general, I expect it would only be noticeable with very small chunks, since otherwise the actual I/O would surely dominate.

What can have a large impact is disabling fsync (via https://google.github.io/tensorstore/kvstore/file/index.html#durability-of-writes).
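For example, a sketch based on the option described at the linked docs page (verify the exact spec member there): fsync can be disabled in the file kvstore spec, trading durability against power loss for faster writes.

# Sketch assuming the "file_io_sync" member documented for the file kvstore.
import tensorstore as ts

arr = ts.open(
    {
        "driver": "zarr",
        "kvstore": {
            "driver": "file",
            "path": "path/to/my.zarr",
            "file_io_sync": False,  # skip fsync after each chunk write
        },
    },
    open=True,
    read=True,
    write=True,
).result()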
