
dask.array writing with distributed / multiprocessing schedulers #185

Open
rabernat opened this issue Oct 11, 2024 · 3 comments

Comments

@rabernat (Contributor)

Icechunk does support distributed writing of arrays. However, Icechunk does not currently allow writing arrays via dask.array.store with the distributed or multiprocessing schedulers.

This is because Icechunk must gather the results of each distributed write operation (metadata about the chunks that were written) back to the client in order to commit the transaction. Without this step, each worker's writes would be lost.

How it works today with regular Zarr

```mermaid
sequenceDiagram
    Client->>Worker0: Compute Chunk 0
    Client->>Worker1: Compute Chunk 1
    Client->>Worker2: Compute Chunk 2
    Worker0->>Object Store: Store Chunk 0
    Worker1->>Object Store: Store Chunk 1
    Worker2->>Object Store: Store Chunk 2
```

Once each worker has written its piece of data to storage, the job is done.
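This fire-and-forget pattern can be sketched with stdlib threads standing in for dask workers and a dict standing in for the object store (the chunk key layout is illustrative, not Zarr's exact scheme):

```python
from concurrent.futures import ThreadPoolExecutor

object_store = {}  # stand-in for S3/GCS

def store_chunk(i):
    # Plain Zarr: each worker writes to a deterministic, well-known key
    # and returns nothing -- no state has to flow back to the client.
    object_store[f"array/c/{i}"] = f"data-{i}".encode()

with ThreadPoolExecutor(max_workers=3) as pool:
    list(pool.map(store_chunk, range(3)))

# The write is complete as soon as every task finishes; nothing remains
# for the client to do.
assert set(object_store) == {"array/c/0", "array/c/1", "array/c/2"}
```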

With Icechunk, by contrast, the flow needs to look like this:

```mermaid
sequenceDiagram
    Client->>Worker0: Compute Chunk 0
    Client->>Worker1: Compute Chunk 1
    Client->>Worker2: Compute Chunk 2
    Worker0->>Object Store: Store Chunk QYNXD
    Worker0->>Client:  Chunk 0 stored at QYNXD
    Worker1->>Object Store: Store Chunk PQM3C
    Worker1->>Client:  Chunk 1 stored at PQM3C
    Worker2->>Object Store: Store Chunk BUZXP
    Worker2->>Client:  Chunk 2 stored at BUZXP
    Note right of Client: Commit Snapshot
    Client->>Object Store: Write manifest
```
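A minimal sketch of that round-trip, again using stdlib threads in place of dask workers; the content keys and manifest format here are made up for illustration and are not Icechunk's real layout:

```python
from concurrent.futures import ThreadPoolExecutor

object_store = {}  # stand-in for S3/GCS

def store_chunk(chunk_id):
    # Icechunk-style write: the chunk lands at a content-addressed key,
    # and the worker must report that key back to the client.
    key = f"chunk-{chunk_id:04d}"  # hypothetical content id
    object_store[key] = f"data-{chunk_id}".encode()
    return {"chunk": chunk_id, "key": key}

with ThreadPoolExecutor(max_workers=3) as pool:
    chunk_infos = list(pool.map(store_chunk, range(3)))

# Client side: only after gathering every worker's metadata can the
# snapshot be committed, by writing a manifest referencing the chunks.
manifest = {info["chunk"]: info["key"] for info in chunk_infos}
object_store["manifest"] = manifest
```

If the metadata from any worker is lost (as happens when each process mutates its own pickled copy of the store), the manifest is incomplete and those writes are silently dropped.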

The dask.array.store code does not provide a way to return metadata from each write task back to the client:

https://github.com/dask/dask/blob/20eeeda610260287a0dd50e4dd7b6a3cd8e007f3/dask/array/core.py#L1067

To overcome this, we need to either:

  • Update dask.array to accommodate our scenario
  • Write a custom dask function which generates the appropriate graph for an icechunk distributed write
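For the second option, here is a sketch of the graph shape such a custom function could generate: store tasks that return chunk metadata instead of None, plus a final commit task that depends on all of them. A tiny recursive evaluator stands in for the dask scheduler so the example is self-contained; the task-dict format mirrors dask's `{key: (func, *args)}` convention, but all names here are hypothetical:

```python
object_store = {}

def store_chunk(chunk_id, data):
    key = f"chunk-{chunk_id}"  # in Icechunk this would be a content id
    object_store[key] = data
    return {"chunk": chunk_id, "key": key}  # metadata flows back

def commit(*chunk_infos):
    # Runs on the client once every store task has completed.
    manifest = {info["chunk"]: info["key"] for info in chunk_infos}
    object_store["manifest"] = manifest
    return manifest

# Dask-style task graph: the commit task depends on every store task.
graph = {
    "store-0": (store_chunk, 0, b"aaa"),
    "store-1": (store_chunk, 1, b"bbb"),
    "commit": (commit, "store-0", "store-1"),
}

def get(dsk, key):
    # Minimal recursive evaluator (no caching) standing in for a
    # dask scheduler; with dask, the same dict could be executed
    # by its schedulers directly.
    func, *args = dsk[key]
    resolved = [get(dsk, a) if isinstance(a, str) and a in dsk else a
                for a in args]
    return func(*resolved)

manifest = get(graph, "commit")
```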

This does work with the dask threaded scheduler, because the same icechunk store object can be shared in memory between threads.
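The thread-scheduler caveat can be seen with a toy store: every thread mutates the same in-memory object, so the client still holds all the chunk references afterward. With processes, each worker would mutate its own pickled copy and the references would be lost. (`ToyStore` is an illustrative stand-in, not Icechunk's API.)

```python
from concurrent.futures import ThreadPoolExecutor

class ToyStore:
    def __init__(self):
        self.chunk_refs = {}  # the change log the commit step needs

    def write(self, chunk_id, data):
        self.chunk_refs[chunk_id] = f"obj-{chunk_id}"

store = ToyStore()
with ThreadPoolExecutor(max_workers=3) as pool:
    list(pool.map(lambda i: store.write(i, b"data"), range(3)))

# Because every thread mutated the *same* store object, the client
# sees all three chunk references and could now commit.
assert store.chunk_refs == {0: "obj-0", 1: "obj-1", 2: "obj-2"}
```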

There are some parallels here to what is required to allow Dask to write iceberg tables (apache/iceberg#5800).

@dcherian (Contributor)

I brought up the fact that to_zarr with a distributed Client is almost always a major footgun (#383) at today's Xarray meeting.

@shoyer suggested marking the store as read-only during unpickling to loudly fail in this scenario. That would allow thread-only parallelism and fail for anything else.

I like this idea, with one modification. We can have the user explicitly opt-in to receiving a writeable store after pickling.

```python
with store.enable_distributed_writes():
    do_smart_things
```

That way they know they need to be careful. In our own to_icechunk, we can opt-in for the user.

The following will still be a footgun

```python
with store.enable_distributed_writes():
    ds.to_zarr(store, ...)
```

but we can have the error message suggest using to_icechunk instead of to_zarr as a solution.

Thoughts?
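A sketch of how this proposal could fit together (the class and method names follow the comment above but are hypothetical; Icechunk's real API may differ):

```python
import pickle
from contextlib import contextmanager

class IcechunkStoreSketch:
    """Toy model of the proposal: unpickled copies come back read-only
    by default, unless the user explicitly opted in to distributed
    writes before the store was pickled."""

    def __init__(self):
        self.read_only = False
        self._allow_pickled_writes = False

    @contextmanager
    def enable_distributed_writes(self):
        # Explicit opt-in: copies pickled inside this block stay writeable.
        self._allow_pickled_writes = True
        try:
            yield self
        finally:
            self._allow_pickled_writes = False

    def __getstate__(self):
        return {"_allow_pickled_writes": self._allow_pickled_writes}

    def __setstate__(self, state):
        self._allow_pickled_writes = state["_allow_pickled_writes"]
        # Loudly fail: a copy shipped to another process is read-only.
        self.read_only = not self._allow_pickled_writes

    def set(self, key, value):
        if self.read_only:
            raise ValueError(
                "store is read-only after unpickling; use to_icechunk, "
                "or opt in via enable_distributed_writes()"
            )

store = IcechunkStoreSketch()
worker_copy = pickle.loads(pickle.dumps(store))  # what to_zarr would ship
assert worker_copy.read_only                     # writes fail loudly

with store.enable_distributed_writes():
    opted_in_copy = pickle.loads(pickle.dumps(store))
assert not opted_in_copy.read_only               # explicit opt-in
```

The key design point is that the pickle round-trip itself carries the opt-in flag, so no coordination with the workers is needed: the decision is made once, on the client, before the graph is submitted.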

@rabernat (Contributor, Author)

I think this is a very reasonable approach, Deepak. 👍 to what you have proposed above.

dcherian added a commit that referenced this issue Nov 21, 2024
dcherian added a commit that referenced this issue Nov 21, 2024
dcherian added a commit that referenced this issue Nov 22, 2024
* Set store to read only after unpickling

Closes #383
xref #185

* typing