-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add API for writing HCS metadata #153
Conversation
Define two new methods write_plate_metadata and write_well_metadata including internal validation methods
Codecov Report
@@ Coverage Diff @@
## master #153 +/- ##
==========================================
+ Coverage 71.28% 80.49% +9.20%
==========================================
Files 11 11
Lines 1132 1174 +42
==========================================
+ Hits 807 945 +138
+ Misses 325 229 -96
Continue to review full report at Codecov.
|
For writing HCS datasets where group names are case-sensitive (e.g. A), this option alters the name and creates a mismatch with the plate metadata.
After some quick investigation of https://github.com/ome/ome-zarr-py/runs/4714215369?check_suite_focus=true, the failure is unrelated to the Zarr development branch but the tests were simply failing on Ubuntu. Reproducing them in a Docker environment, I found the issue comes from the key normalization which leads to the creation of Zarr group |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 but one "future warning" so to speak.
ome_zarr/writer.py
Outdated
images[index] = {"path": str(image)} | ||
elif isinstance(image, dict): | ||
if not all(e in VALID_KEYS for e in image.keys()): | ||
raise ValueError(f"{image} contains invalid keys") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec has been updated with a "MUST NOT" have other keys? Eventually, that could be problematic. Perhaps the prefix mechanism that @will-moore found in json-schema that leaves them unchecked could eventually be introduced here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking up on this. I have not found a clear statement regarding the validity of additional keys in the ngff spec. This strict implementation is probably derived from my reading of multiscales
where I assume metadata
is the single key that tried to capture extra arguments.
That being said, I totally see this MUST NOT interpretation is and will be limiting both for the extension of the specification itself as well as for supporting external metadata not covered by the spec.
The well
metadata is a perfect example as ome/ngff#24 defines more keys. Assuming someone wanted to write a version 0.3 with these new keys populated, is it legit and desired to let the writer implementation write this metadata? If so, is it worth a logging statement at WARN, INFO or DEBUG level? Or should this writer only care about the keys defined in the spec and ignore anything extra?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was salad schema
where you could use a prefix:
to avoid validation, but I think for json-schema you don't need to do that and it won't fail for unrecognised attributes.
Pushed 12b341d in response to the concerns expressed in #153 (comment) about unspecified keys. With this commit, the writer handles leniently unspecified keys are present in lists of dictionaries (acquisitions, images). The unit tests are adjusted to ensure the group creation is successful with all keys been written. If we agree with this approach, I think it makes sense to implement immediately the proposal made at the end of this PR description i.e. update the |
Good catch. ff197a0 should unify both behaviors. Since the behavior is a simple logging statement, that's not something that the unit tests picked up |
@sbesson Thanks PR looks good. Updating |
Originally, that's what I thought but with different fires coming up, I feel this will get pushed. If you and @joshmoore are happy with the state of the latest proposed API, I'd propose to get this merged and have the new tests as well as the re-activated Ubuntu tests merged into the mainline. I'll handle the new wells type handling in a follow-up PR which should be self-contained and easier to review |
Follow-up of #149, this adds two new top-level APIs to the
ome_zarr.writer
module:write_plate_metadata
andwrite_well_metadata
for creating plate and well specifications.Unit tests are added to cover the combinations of parameters both in writing mode (
test_writer
) as well as in writing/reading mode (test_node/test_reader
). This should significantly increase the coverage of the HCS codepath especially inreader.py
.The
test_plate_2D5D
unit test reproduces the issue raised in #145 and is currently marked asxfail
. As the bug should be addressed in #148, I would propose to prioritize the review and integration of this PR first and remove the test marker in #148 together with the fix.Thinking of the upcoming sepcification changes, the biggest potential implication for this new API are the changes proposed in ome/ngff#24 to the
wells
attribute of theplate
specification with the addition ofrowIndex
andcolumnIndex
attributes. A proposal would be to preemptively change type of thewells
argument toList[dict]
and add an internal API similar to_validate_plate_acquisitions
to inspect the required fields conditionally to the format version.