NGFF json validation #69
I don't see how to make a field 'optional' to fit with the
EDIT - fixed below
Opened issue common-workflow-language/schema_salad#460 to ask about optional fields needed at #69 (comment) - EDIT: fixed!
That last commit was testing JSON_schema for validation (see #75). Need to serve the schemas at their URLs. e.g.
Then in a different Terminal:
Avoids the need to serve omero.schema via http.server
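One way to avoid serving schemas over HTTP is to load them from disk and index them by their `$id`, so that references can be resolved from memory. The sketch below only illustrates that idea and is not taken from the commit; the function name `build_schema_store` and the `*.schema` file pattern are assumptions.

```python
import json
from pathlib import Path


def build_schema_store(schema_dir, pattern="*.schema"):
    """Map each schema's "$id" to its parsed content, read from local files.

    Hypothetical helper: the directory layout and file pattern are assumptions.
    """
    store = {}
    for path in sorted(Path(schema_dir).glob(pattern)):
        with open(path) as f:
            schema = json.load(f)
        schema_id = schema.get("$id")
        if schema_id:
            # Schemas without an "$id" cannot be referenced by URL, so skip them.
            store[schema_id] = schema
    return store
```

A mapping like this could then be handed to whatever reference-resolution mechanism the chosen validator library provides, in place of a running local server.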
Fails to find no_multiscales, no_axes and no_datasts as invalid
examples/invalid/missing_name.json
],
"type": "gaussian",
"metadata": {
"method": "skimage.transform.pyramid_gaussian",
Out of curiosity: since this is very Python-focused, someone reading the file in Java, for example, would need to find a similar function.
I think that's just documenting how the file was created. It doesn't affect how you'd read the data in Python or Java.
I was wondering whether we should follow a similar strategy for the rendering settings, e.g. Fiji.salt&Pepper. cc @dominikl
@joshmoore I added a couple more examples.
Branch updated so all files live under 0.3 and tests are run automatically. The general plan would be:
Description will need updating.
Hmm... ok. I had tried to make v0.2 disallowed in the shacl ("if version != 0.3, fail()"), but I guess that was failing for some reason.
Yes, the v0.2 file fails for shacl (which is not skipped).
Guess I'm not overly convinced that we want to consider a v0.2 versioned file as a valid v0.3. i.e. I'd almost (even in this PR) copy the schema to v0.2 and do that validation under
OK, that comes back to my previous question of how to handle versions. Does the validation code need to inspect the file, read the version and pick a schema, or can a single top-level schema handle that?
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "collection_schema.json",
"name": "JSON Schema for NGFF Collection",
"description": "Attempt at a JSON Schema to create valid JSON-LD descriptions. Limited to using a few schema.org properties.",
I know this was used to drive the validation, but having a collection.schema under the 0.3 namespace feels fairly incorrect. Can we split it into a separate branch or potentially a separate directory, e.g. {draft,in-progress}?
I'll remove it for now and keep a local copy until we're ready to work on this again.
@@ -0,0 +1,37 @@
{
"$id": "http://localhost:8000/omero.schema.json",
Why is this not 0.3/schemas/json_schema/omero.schema to match the previous naming scheme?
This isn't actually used for validation any more, so I'll remove it.
I think it will be the former. Certainly in the JSON-LD space, you use the context to decide what version it is. For the moment, ome-zarr-py already detects and can then use the correct json-schema file.
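The "inspect the file, read the version and pick a schema" approach could look roughly like the sketch below. It is not how ome-zarr-py implements detection; the helper names, the `image.schema` file name and the set of known versions are assumptions made for illustration, while the `<version>/schemas/json_schema/` layout follows the one discussed in this PR.

```python
from pathlib import Path

# Assumed set of versions that have a schema directory in the repository.
KNOWN_VERSIONS = {"0.1", "0.2", "0.3"}


def detect_version(zattrs):
    """Return the version string of the first multiscales entry, or None."""
    for entry in zattrs.get("multiscales") or []:
        version = entry.get("version")
        if version is not None:
            return str(version)
    return None


def schema_path_for(zattrs, root="."):
    """Pick the schema file matching the version found in the metadata."""
    version = detect_version(zattrs)
    if version not in KNOWN_VERSIONS:
        raise ValueError(f"no schema available for version {version!r}")
    # "image.schema" is a hypothetical file name.
    return Path(root) / version / "schemas" / "json_schema" / "image.schema"
```

With this kind of dispatch a v0.2 file is validated against a v0.2 schema rather than being accepted or rejected by the v0.3 one.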
Overall, the addition of a first set of validation schemas together with valid/invalid examples as well as the validation code makes sense and I am all for getting this merged so that more can be added.
The newly introduced validation GitHub workflow is invalid as noted inline. Can we get this CI check to pass before merging?
A few other longer term thoughts:
- are the additional validation todos from the description going to be turned into an issue?
- I assume the <version>/schemas/{json_schema,jsonld,salad_schema} hierarchy is largely justified by the fact that we are still investigating validation frameworks. I assume we would simply put schemas under <version>/schemas/ in the long run
As a final point, the new schemas effectively start defining an ngff:Image concept, which will probably force us to clarify certain aspects of the specification. It certainly raises a few immediate questions, e.g. must an ngff:Image always contain multiscales metadata, or are label images subclasses of ngff:Image?
Likely, unless we definitely will be keeping multiple over time.
You mean from the shacl work? I think that would be introduced in 0.4+, they are only in the examples so that the validation works at all. (But generally, yes: I assume images always have multiscales and Labels are subclasses of Image)
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-2022-01-27/60885/11
In this PR we have evaluated 3 options for validating the JSON data within OME-NGFF files:
This PR includes a bunch of example files, both valid and invalid samples used in the tests.
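As a rough illustration of how valid and invalid example files can drive tests, the sketch below walks two folders and reports every file whose validation result contradicts the folder it lives in. The `valid`/`invalid` folder names, the `check_examples` helper and the `is_valid` callback are assumptions; the callback stands in for whichever validator (JSON-schema, shacl, SALAD) is being tested.

```python
import json
from pathlib import Path


def check_examples(examples_dir, is_valid):
    """Return (file name, problem) pairs for examples that behave unexpectedly.

    `is_valid` is a stand-in callable taking parsed JSON and returning a bool.
    """
    failures = []
    for expected, folder in ((True, "valid"), (False, "invalid")):
        for path in sorted((Path(examples_dir) / folder).glob("*.json")):
            with open(path) as f:
                data = json.load(f)
            if is_valid(data) != expected:
                label = "valid" if expected else "invalid"
                failures.append((path.name, f"expected {label}"))
    return failures
```

An empty result means every example was classified as its folder claims; a validator that accepts everything would show up as a failure for each file under `invalid`.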
Tests for JSON-schema and shacl are run with:
SALAD
JSON-schema
This is a checklist of rules for multiscales that a valid OME-Zarr MUST obey to satisfy the current 0.3 schema (not including HCS). Checked indicates they are checked by this validation. (NB: there are lots of rules that are not indicated by a MUST in the spec):
To discuss or look into...
- SHOULD rules, although we do check the data structure.
- ome-zarr-py picks the correct schema to validate with, based on the version number. This wouldn't be able to handle different version numbers for different parts of the spec. I don't see a way to do "if version==0.3 then axes must be a list of strings, but it's optional for version 0.1 and 0.2 and for version 0.4 it should be a list of axes objects"
- omero schema
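The version-dependent `axes` rule quoted above can at least be written down as a plain Python check, independent of any schema language. This is only a sketch of the rule as stated in that comment, not part of the PR; the function name `check_axes` and the error messages are made up for illustration.

```python
def check_axes(multiscale):
    """Return a list of error strings for the "axes" field of one multiscales entry."""
    version = str(multiscale.get("version"))
    axes = multiscale.get("axes")

    if version in ("0.1", "0.2"):
        # axes is optional for these versions, so nothing is enforced here.
        return []
    if version == "0.3":
        if not isinstance(axes, list) or not all(isinstance(a, str) for a in axes):
            return ["axes must be a list of strings for version 0.3"]
        return []
    if version == "0.4":
        if not isinstance(axes, list) or not all(isinstance(a, dict) for a in axes):
            return ["axes must be a list of axes objects for version 0.4"]
        return []
    return [f"unsupported version: {version}"]
```

JSON Schema draft-07 also defines `if`/`then`/`else` keywords, which may be worth investigating for expressing the same condition inside a single schema.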