Add class annotations and/or other metadata properties to labels #60
Comments
This issue has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/multi-scale-image-labels-v0-1/43483/3 |
Hi @DragaDoncila. Sorry for the slow response. Took some time to get caught up after the call. 😉 Having this conversation kicked off is great! And I certainly like what you're proposing with 3, but even though the v0.1 proposal hasn't really been officially released, there are a number of repositories that are already implementing it. A few options I can imagine are:
I should add that I think another similar breaking change may come when tabular data is supported, in which case we may move some of this metadata into arrays to deal with very large numbers of labels. |
Option 3 looks the cleanest, but a big disadvantage is that future additions to the spec may use property names that clash with the user-defined ones, unless there is some way to indicate reserved names. In this respect Option 2 seems better despite the duplication of |
Option 4 could be a variant of 3 where the user properties are under a dedicated subkey (I can't think of a good name so I've called it "extra-properties"):
"properties": [
{
"label-value": 1,
"rgba": [
255,
100,
100,
255
],
"extra-properties": {
"class": "Urban",
"area_m2": "400",
"other": [1, 2, 3, 4]
}
}, |
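A minimal Python sketch of how a reader might consume the Option 4 layout above, assuming the entries sit in a list under "properties" as in the snippet. The function name split_label_entry and the set of reserved keys are hypothetical, chosen only for illustration; neither is part of the spec.

```python
import json

# Hypothetical set of spec-reserved keys, taken from the example above.
RESERVED_KEYS = {"label-value", "rgba"}


def split_label_entry(entry):
    """Separate spec-defined fields from user-defined ones in one label entry."""
    spec_fields = {k: v for k, v in entry.items() if k in RESERVED_KEYS}
    # User properties live under their own subkey, so they cannot shadow spec keys.
    user_fields = dict(entry.get("extra-properties", {}))
    return spec_fields, user_fields


metadata = json.loads("""
{
  "properties": [
    {
      "label-value": 1,
      "rgba": [255, 100, 100, 255],
      "extra-properties": {"class": "Urban", "area_m2": "400", "other": [1, 2, 3, 4]}
    }
  ]
}
""")

for entry in metadata["properties"]:
    spec_fields, user_fields = split_label_entry(entry)
    print(spec_fields["label-value"], user_fields["class"])
```

With this layout a user property that happens to be called "rgba" would stay inside "extra-properties" and could not collide with a field the spec adds later.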
I think I prefer Option 2! |
Hi everyone, sorry for the late response - I've been finishing my honours thesis over the last few days so it's been packed. Thanks for all the input! Having read through the suggestions here, I think @manics' concern about clashes with future reserved names is the biggest disadvantage of Option 3. Despite initially thinking Option 3 was the way to go, I now agree with @will-moore that Option 2 seems preferable, as it fully separates spec properties and user-defined properties. @joshmoore how does that mesh with your longer-term view of tabular metadata? |
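To make the trade-off concrete, here is a rough Python sketch of reading an Option 2 style layout, assuming spec fields and user properties live in two separate lists that both repeat label-value. The key names "colors" and "properties" and the helper merge_by_label are assumptions for illustration, not settled spec.

```python
from collections import defaultdict


def merge_by_label(colors, properties):
    """Join two per-label lists on their shared "label-value" key."""
    merged = defaultdict(dict)
    for entry in colors:
        merged[entry["label-value"]]["rgba"] = entry["rgba"]
    for entry in properties:
        # Everything except the join key is treated as a user-defined property.
        user = {k: v for k, v in entry.items() if k != "label-value"}
        merged[entry["label-value"]]["properties"] = user
    return dict(merged)


# Hypothetical example data; label-value appears once in each list.
colors = [{"label-value": 1, "rgba": [255, 100, 100, 255]}]
properties = [{"label-value": 1, "class": "Urban", "area_m2": 400}]

print(merge_by_label(colors, properties))
```

The duplication of label-value is what makes the join possible, and it keeps user-defined names out of the namespace the spec controls.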
I don't think we necessarily need to constrain ourselves to the equivalent of tabular metadata for the |
This issue has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/multi-scale-image-labels-v0-1/43483/9 |
I was thinking about the reverse. I can see having the JSON keys be "deeper", but what does one do when one wants to add gigabytes of tabular data? It's not required to solve that now but it will come up eventually. For what it's worth, https://www.w3.org/TR/csv2json/ has some examples. Looks like the method there is a top-level object per row. All that being said, I can definitely still see option 2 as a first non-breaking change that we iterate on. cc: @manzt |
Hello, we were also thinking about image regions where objects overlap, see discussion here: https://forum.image.sc/t/multi-scale-image-labels-v0-1/43483/7 I am not sure, but maybe this could be tackled by something like:
This would mean that label 3 is a region where labels 1 and 2 overlap. The "associated-label-values" is redundant with the "child-labels" and maybe should be removed. @constantinpape, do you maybe have comments or suggestions? |
@tischi yes, I think this could be a good solution for overlapping labels. I think this opens up a few more questions that are maybe also relevant for the overall discussion of the label properties:
|
I would say that if we go for the list-based approach above, we should not require a field to be present for all labels. If the storage layout were more table-based, then, I guess, yes, we would have to. I think the list-based approach is nice as it provides a lot of flexibility in terms of different labels having more or less information attached to them. The disadvantage that I see compared with a table-based approach is that it will require more storage space and could thus be quite slow to download and parse in order to e.g. build a table from it. Thus for use cases with millions of labels I am a bit worried about performance. |
I think we'll want both options: JSON-style nested dictionaries for arbitrary properties and support for tabular data. In the short term JSON dictionaries are relatively easy to add to the spec so it makes sense to start there. |
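As a small illustration of how the two representations relate, the sketch below converts a list of per-label dictionaries into column lists, filling in None where a label lacks a field. The function name to_columns and the sample entries are hypothetical; this only shows the shape of the conversion, not a proposed storage format.

```python
def to_columns(label_entries):
    """Turn a list of per-label dicts into column lists, filling gaps with None."""
    # A dict keeps field names unique while preserving first-seen order.
    fields = {}
    for entry in label_entries:
        for name in entry:
            fields.setdefault(name, None)
    return {name: [entry.get(name) for entry in label_entries] for name in fields}


# Hypothetical entries: the second label has no "area_m2" field.
entries = [
    {"label-value": 1, "class": "Urban", "area_m2": 400},
    {"label-value": 2, "class": "Water"},
]

table = to_columns(entries)
print(table)
```

The row-wise form repeats every key name for each label, which is the storage overhead raised above; the column form stores each name once but has to represent missing values explicitly.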
Whew. Ok. So it sounds like we have some points for future discussion, but generally a consensus that we could start building, no? @DragaDoncila, have you already started on a branch anywhere? If not but were looking to start, do you think you have everything you need for a first pass? |
@joshmoore I've started a branch, which has Option 1 already implemented. From what I read here, Option 2 is the consensus to start with, before we move on to adding support for tabular data. I think I have everything I need for a first pass, so I'll put up a WIP PR by Monday afternoon if that timeline is okay |
Sounds amazing. Thanks, @DragaDoncila ! |
This issue has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/ome-zarr-mask-label-metadata-class/103630/2 |
Currently the labels spec supports the declaration of a label-value and its associated color.
Commonly, label values have other associated information including the most obvious, the class name. napari also supports display of label properties, so this would be a nice additional feature for the reader plugin.
I think the critical requirements for these properties should be:
label-value/s it is associated with
There are three ways I can see the spec supporting these additional properties:
label-value e.g.
I think this is least explicit, and less intuitive than the next approaches.
label-value has its own associated properties:
This is explicit, but has the disadvantage of duplicating the label-value definitions.
This doesn't duplicate label-values, and has the benefit of keeping all properties associated with a particular label-value in one spot.
On the implementation side, I think the differences in parsing the properties are negligible.
I'd love to hear what other people think are appropriate ways to represent the properties in the label metadata, or what they think the best option is.