ZEP9 (phase 1): add clarifications for extension naming #330
@joshmoore - really glad you got this started! 🙌 My feedback is that the PR is hard to review. It touches 15 files, including a ton of minor, unrelated formatting changes to the core spec document. If we want folks to engage and give meaningful feedback, we need to make it easier to review. I'd recommend starting fresh with a minimal PR in which the diffs are reflective exclusively of the actual proposed changes.
Remaining text blocks are likely to be re-used under the more general "Extension points" section. see: zarr-developers#312
You're right. I've extracted out #331.
I disagree that they are unrelated. Take a look. The sections I've modified were basically already un-parseable. Since I was adding sections, the outline was getting more convoluted.
👍 Give it a look and let me know what you think.
Thanks for all of your work on this! My current understanding of the practical effect of the proposal is as follows:
- Raw names will be granted fairly easily, e.g. zstd, bfloat16, and others I've proposed would be assigned to me; the ones that zarr-python has started using (string, bytes, vlen-utf8, etc.) would be assigned to someone from zarr-python.
- URL names will be used only for really experimental stuff; all commonly-used extensions will have raw names since they will be minimal effort. Therefore, the verbosity of the URLs is not really a problem in practice.
The lack of basically any review worries me a bit. But ultimately I'm in favor of this proposal because I think it reflects the reality that the ZEP process isn't working for the existing extension points, and it would be better to just rely on a less formal process.
I share your concerns to some degree. I think we can adapt the governance structure for extensions in the future, if we think that a more thorough review process would be necessary. We are thinking of forming a zarr specs team that could take on that responsibility.
thanks so much for working on this josh! I have a few high-level comments: URIs as names still feels unmotivated. How would we explain to someone developing a new data type why they would need to use a URI for the name? I don't think I could give a good justification for this decision right now, and that's a problem.
On a practical level, I think it would be good to have a guide for people who want to make their own codecs / datatypes / chunk grids. Should they use a "raw name" or a URI name? That isn't clear in the text right now.
It feels like we have given an explanation for the reasons of the URI names a few times now. Let me reiterate one more time:
We know that there might be other options here, but that is the design we landed on.
In this comment I asked several design questions which are all posed in the following form: "if we want |
Stores are *not* extension points since they define the mechanism
for loading metadata documents such that extensions can be loaded.
If "extensibility" is defined as a property of a field in a metadata document, then we don't need this note, because stores are not defined in metadata.
another very basic question about, e.g., a new data type that uses a URI for its name. That new data type should have a standard handle (like "complex256") that can be used as an identifier in most programming languages, which rules out a URI. Should the URI be defined such that the final path component of the URI sans "."-delimited suffixes is the handle for the data type? e.g. "http://foo.com/complex256.schema.json" would be a URI for a data type called "complex256"? Without a constraint like this, it's not clear that an extension has a human-and-software-friendly name, but I think this is an important feature. |
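To make the question above concrete, here is a minimal sketch of the constraint being asked about: take the final path component of the URI and drop everything from the first "." onward. The function name `handle_from_uri` is hypothetical and is not part of the spec or of any zarr implementation; the PR does not define such a rule.

```python
from posixpath import basename
from urllib.parse import urlparse


def handle_from_uri(uri: str) -> str:
    """Return the final path component of `uri`, minus "."-delimited suffixes.

    Hypothetical helper: illustrates the rule floated in the comment above,
    not a rule that the spec currently states.
    """
    path = urlparse(uri).path
    # Ignore a trailing slash so ".../complex256/" still yields a component.
    last = basename(path.rstrip("/"))
    # Keep only the part before the first "." (drops ".schema.json" etc.).
    handle = last.split(".", 1)[0]
    if not handle:
        raise ValueError(f"cannot derive a handle from {uri!r}")
    return handle


print(handle_from_uri("http://foo.com/complex256.schema.json"))
```

With the example URI from the comment this prints `complex256`. A URI with no usable final component, such as `http://foo.com/`, raises `ValueError`, which is exactly the case where an extension would have no human-and-software-friendly name.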
Thanks for the various suggestions, @d-v-b. I've pushed a commit for the comments that I've resolved to this point. Based on the discussion above, I went ahead and restricted this PR as phase 1 of ZEP9 to just discuss URLs and depending on that discussion phase 3 can address the issue of URIs if at all. That leaves as the other major next steps:
I think my concerns in the discussion above apply equally to URLs and URIs alike.
How should implementations interpret the requirement that extension names be either a string registered on a zarr extensions github repo, or a URL? Suppose a user is working on a new dtype. It's unpublished code on a single computer; it has no spec, and no URL. Should I think it's really important we support this use case, because solo tinkering and experimentation is where many new dtypes / codecs / etc come from. I think this argues against stating that extension identifiers MUST be registered on github or a URL. More broadly, I don't think the spec should make any MUST statements about things that cannot be locally evaluated at runtime, which excludes any dynamic online registry lookups. Of course it's vital that extensions are discoverable, documented properly, maintained, etc. But IMO the rules for this process should be defined outside the core spec. Otherwise we will make normative statements that are very hard for implementations to work with. |
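As an illustration of the "locally evaluated at runtime" point, the sketch below classifies an extension name using only its form, with no network or registry lookup. It is an assumption-laden example: `classify_extension_name` and the `RAW_NAME` pattern are hypothetical, the PR text quoted here does not define a syntax for raw names, and whether a raw name is actually registered cannot be decided by a check like this.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern for a "raw" name; chosen for illustration only.
RAW_NAME = re.compile(r"[a-z0-9][a-z0-9_.-]*")


def classify_extension_name(name: str) -> str:
    """Classify `name` as "url", "raw", or "invalid" using only local checks."""
    parsed = urlparse(name)
    # Treat only http(s) URLs with a host as URL-style names (an assumption).
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if RAW_NAME.fullmatch(name):
        return "raw"
    return "invalid"


for candidate in ("zstd", "vlen-utf8", "https://example.com/my-dtype", "My Dtype"):
    print(candidate, "->", classify_extension_name(candidate))
```

Under these assumptions an unpublished, single-machine dtype with a name like `my-dtype` would pass as "raw" in form, which shows why a registration requirement is not something an implementation can verify at runtime.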
I don't see that as a practical issue. The spec defines spec-compliant metadata and behavior with the intention of organizing interoperability. Here is what I think zarr-python might do:
This PR clarifies the extension mechanism concept in the v3 specification. Comments on any changes which will break existing implementations are STRONGLY encouraged. Please see zarr-developers/zeps#65 for background material.
TODOs:
Post-merge: