Consistent use of 'Zarr format 2 or 3' #2645
I think "zarr format 2/3" works in some places, when talking about the spec, but when talking about data I think it's a bit odd. How about "zarr v2 data" or "zarr v3 data" when talking about actual data? I left an example suggestion for this below to show what I mean.
Well, the whole point of this PR is to make it consistent. To me, "v2 data" or "v3 data" doesn't make any sense, because the data are n-dimensional arrays. Only through encoding in the codec pipeline do they become "Zarr format {2,3} arrays". That is probably another inaccuracy that needs fixing.
That makes sense - still not a huge fan of "format", but it's an improvement and I don't have any better suggestions!
I left a couple of minor unrelated suggestions that I found when reviewing, feel free to take or leave.
"zarr format 2" reads less elegantly than "Zarr v2". If we are consistent about always referring to the library as And because it's sensible for someone to say "I saved my data using zarr v2", I think it's also sensible for someone to say "zarr v2 data", because they are denoting the stored representation of their n-dimensional arrays + groups.
Co-authored-by: David Stansby <[email protected]>
I agree that it sounds more elegant. But the kwarg for specifying the "version" is
Using the keyword argument "zarr_format" makes sense in the context of a function for creating data, but to me that's pretty separate from the context of documentation about the library / format. For the docs, we should use the same language we expect people who save their data in zarr to use. I personally would say things like "I saved my data in zarr v2 and v3" (or simply "zarr 2" and "zarr 3"), definitely not "I saved my data in zarr format 2 and zarr format 3". I agree that we need to avoid confusion between Zarr (the format) and
To be fair, most of the changes I made were in docstrings, in the context of creating arrays.
I assume most users pop in and out of the docs through Google searches or direct links. I don't think it would help to have this section if nobody would find it.
Even if the language doesn't please everyone, +1 to making things consistent. We can always tweak it later.
I went through the docstrings, docs and user-facing error messages to make consistent use of "Zarr format 2 or 3" instead of v2 and v3. I think this is less confusing, because it clearly refers to the Zarr format spec version. Otherwise, it would be ambiguous whether the format version or library version is meant.
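For anyone wanting to repeat this kind of pass, here is a minimal sketch of how the remaining "v2"/"v3" spellings could be located and rewritten. It is not the tooling used for this PR: the pattern `VERSION_STYLE` and the helpers `find_inconsistent` and `normalize` are hypothetical names made up for illustration, and the regex is only an assumption about which spellings count as inconsistent. It skips dotted numbers such as "zarr 2.18", since those most likely refer to a library release rather than the format.

```python
import re

# Hypothetical pattern: matches "Zarr v2", "zarr V3", "zarr 2", and so on.
# Text that already reads "Zarr format 2" does not match, and the negative
# lookahead skips dotted library versions such as "zarr 2.18".
VERSION_STYLE = re.compile(r"\bzarr\s+v?([23])\b(?!\.\d)", re.IGNORECASE)


def find_inconsistent(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) for each v2/v3-style mention."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in VERSION_STYLE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def normalize(text: str) -> str:
    """Rewrite v2/v3-style mentions as 'Zarr format N'."""
    return VERSION_STYLE.sub(lambda m: f"Zarr format {m.group(1)}", text)


sample = (
    "Open a Zarr v2 array.\n"
    "Zarr format 3 is the default.\n"
    "Requires zarr 2.18 or later."
)
print(find_inconsistent(sample))
print(normalize(sample))
```

A blind substitution like `normalize` would still need manual review, since some mentions may legitimately refer to the library version.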