Allow uploading ZARR files #6205

Closed
philippotto opened this issue May 12, 2022 · 5 comments · Fixed by #7397
Comments

@philippotto
Member

philippotto commented May 12, 2022

One of the following would probably make sense to have for better zarr support:

  • Support arbitrary zarr files (these should probably be converted with wkcuber so that a mag hierarchy and metadata exist)
  • Support zarr files which already follow our format
@jstriebel
Contributor

see also #6120

@normanrz normanrz changed the title Allow importing ZARR files Allow uploading ZARR files Jun 10, 2022
@philippotto philippotto assigned frcroth and unassigned normanrz Sep 18, 2023
@fm3 fm3 added the discussion label Sep 18, 2023
@fm3
Member

fm3 commented Oct 16, 2023

I’d say it makes sense to run a conversion job for zarr uploads to do re-chunking, sharding, etc.

We could also skip that in case there is already a datasource-properties.json (assuming that the user used the libs to create the zarr dataset already with optimal parameters). In this case, the backend also does not have to infer anything, but can just put the dataset on disk as it comes.

@normanrz @philippotto Do you think the existence of a datasource-properties.json (maybe together with the format-identifying zarr.json) is a good enough heuristic here? My guess is that we would always want to do re-chunking for zarr2 because it does not support sharding?
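
As a rough illustration of the heuristic proposed in this comment, the check could look something like the sketch below. The function and type names (`decideZarrUpload`, `ZarrUploadDecision`) are hypothetical and not taken from the code base. It assumes the upload is available as a flat list of `/`-separated paths, that Zarr v3 is recognized by `zarr.json`, and that Zarr v2 is recognized by `.zarray` or `.zgroup`.

```typescript
// Hypothetical decision type; not part of the actual code base.
type ZarrUploadDecision = "ingest-as-is" | "run-conversion" | "not-zarr";

// Returns the last segment of a "/"-separated path.
function baseName(path: string): string {
  return path.slice(path.lastIndexOf("/") + 1);
}

// True if any uploaded path ends in a file with exactly this name.
function hasFileNamed(paths: string[], name: string): boolean {
  return paths.some((p) => baseName(p) === name);
}

function decideZarrUpload(paths: string[]): ZarrUploadDecision {
  const isZarr3 = hasFileNamed(paths, "zarr.json");
  const isZarr2 = hasFileNamed(paths, ".zarray") || hasFileNamed(paths, ".zgroup");
  if (!isZarr3 && !isZarr2) return "not-zarr";
  // Zarr v2 has no sharding, so under the guess above it is always converted
  // (this also covers uploads that mix v2 and v3 metadata files).
  if (isZarr2) return "run-conversion";
  // Zarr v3: an existing datasource-properties.json is taken as a sign that
  // the dataset was already written with suitable parameters.
  return hasFileNamed(paths, "datasource-properties.json")
    ? "ingest-as-is"
    : "run-conversion";
}
```

Matching on file names alone keeps the check cheap, since no file contents have to be read before deciding whether a conversion job is needed.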

@normanrz
Member

We don't have a rechunking job yet. So, maybe just ingest the zarr as is and write a datasource-properties.json?
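
To give an idea of what can be inferred from the files alone, here is a minimal sketch that reads the array metadata a Zarr v2 array stores in its `.zarray` file (`zarr_format`, `shape`, `chunks`, `dtype`). The `LayerSummary` type and `summarizeZarray` function are hypothetical and only stand in for whatever ends up in datasource-properties.json; the real schema is not reproduced here.

```typescript
// Subset of the array metadata that the Zarr v2 spec stores in `.zarray`.
interface ZarrayV2 {
  zarr_format: number;
  shape: number[];
  chunks: number[];
  dtype: string; // NumPy-style type string, e.g. "|u1" or "<f4"
}

// Hypothetical summary for illustration only; this is NOT the
// datasource-properties.json schema.
interface LayerSummary {
  name: string;
  shape: number[];
  chunkShape: number[];
  elementType: string;
}

// Maps a dtype code (byte-order prefix removed) to a readable element type.
const ELEMENT_TYPES: { [code: string]: string } = {
  u1: "uint8", u2: "uint16", u4: "uint32", u8: "uint64",
  i1: "int8", i2: "int16", i4: "int32", i8: "int64",
  f4: "float32", f8: "float64",
};

function summarizeZarray(name: string, zarrayJson: string): LayerSummary {
  const meta = JSON.parse(zarrayJson) as Partial<ZarrayV2>;
  const { shape, chunks, dtype } = meta;
  if (meta.zarr_format !== 2) throw new Error("not Zarr v2 array metadata");
  if (!Array.isArray(shape) || !Array.isArray(chunks) || shape.length !== chunks.length) {
    throw new Error("shape and chunks must be arrays of equal length");
  }
  if (typeof dtype !== "string") throw new Error("dtype must be a simple type string");
  // Drop the byte-order character ("<", ">" or "|") before the lookup.
  const code = dtype.replace(/^[<>|]/, "");
  const elementType = Object.prototype.hasOwnProperty.call(ELEMENT_TYPES, code)
    ? ELEMENT_TYPES[code]
    : undefined;
  if (elementType === undefined) throw new Error("unsupported dtype: " + dtype);
  return { name, shape, chunkShape: chunks, elementType };
}
```

For example, a `.zarray` with `"shape": [1024, 1024, 512]`, `"chunks": [32, 32, 32]` and `"dtype": "|u1"` would be summarized as a `uint8` layer with chunk shape `[32, 32, 32]`.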

@fm3
Member

fm3 commented Oct 16, 2023

Fair enough. @frcroth I guess a good spot for this would be postProcessUploadedDataSource in UploadService.scala.

It would be nice if you could reuse some of the Explorer code to create the json from the files. I’m not sure how to do that in the datastore. Maybe you can figure that out. The Explorer classes will probably need to be moved. Also have a look at #7389 for recent changes of the FileSystemDataVault.

The frontend should also be adapted to not set needsConversion=true in this case. You can find a heuristic in the frontend (I think it checks for wkw files being present?).

Please let us know if you need further information!

@philippotto
Member Author

> You can find a heuristic in the frontend (I think it checks for wkw files being present?).

Yes, if the uploaded files contain a WKW file (or a ZIP which contains a WKW file), it is assumed that no conversion is needed. This is implemented in this method:

validateFiles = async (files: FileWithPath[]) => {
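
The method body is not reproduced in this thread. As a rough illustration of the heuristic described above, and not the actual `validateFiles` implementation, the check might be sketched as follows. `UploadedFile`, `zipEntries`, `containsWkw` and `needsConversion` are hypothetical names; the sketch assumes that the entry listing of a ZIP has already been read into a list of paths.

```typescript
// Hypothetical stand-in for an uploaded file. `zipEntries` lists the paths
// inside the archive when the file is a ZIP whose listing has been read.
interface UploadedFile {
  path: string;
  zipEntries?: string[];
}

// True if the path ends in ".wkw" (case-insensitive).
function isWkwPath(path: string): boolean {
  return /\.wkw$/i.test(path);
}

// A file counts as WKW if it is a WKW file itself or a ZIP containing one.
function containsWkw(file: UploadedFile): boolean {
  if (isWkwPath(file.path)) return true;
  return (file.zipEntries ?? []).some(isWkwPath);
}

// Conversion is assumed to be unnecessary as soon as any WKW file is present.
function needsConversion(files: UploadedFile[]): boolean {
  return !files.some(containsWkw);
}
```

A Zarr-aware variant would add a second condition next to `containsWkw`, so that uploads which can be ingested as they are also end up with `needsConversion` set to false.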
