Data dimensionality and axes metadata #35
Comments
TCZYX. ;)
Yes please! And in fact the axis names should be whatever; we should not be limited to subsets of "TCZYX". E.g. they could be ["lat", "lon"] or ["left-right", "superior-inferior", "anterior-posterior"]. |
@jni: what behavior would you expect for an array with no x or y? |
Based on how this discussion evolved: #28 I guess the axis names may be part of the specification of the transformation from data space to physical space, is it? |
@joshmoore What do you mean by "behavior", maybe "how it would be rendered in a viewer"? |
In my opinion, a generic image viewer should have no intrinsic opinion about the particular axis names of the data it displays. If the user has 2D data with axes labelled X and B, then the viewer should display the data (with a default, but overrideable, mapping from data coordinates to viewer coordinates) as an image with one axis labelled "X" and the other axis labelled "B". If the data axis labelled "X" happens to be mapped to a display axis also called "X", then that is just a happy coincidence. A general-purpose data visualization tool should not assign any "meaning" to an axis name like "X" or "T". A more specialized tool might have an opinion about axis names, though. |
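The "default, but overrideable" mapping described above could be sketched roughly as follows. This is only an illustration: the function name `map_axes_to_display`, the display axis names `horizontal` and `vertical`, and the rule "first data axes fill the display axes in order" are assumptions, not anything agreed in this thread.

```python
def map_axes_to_display(axis_names, display_axes=("horizontal", "vertical"), override=None):
    """Map display axes to data axes without interpreting the axis names.

    All names here are hypothetical; the default simply pairs display axes
    with the first data axes in order.
    """
    if len(axis_names) < len(display_axes):
        raise ValueError("not enough data axes to fill the display")
    # Default mapping: purely positional, the names carry no meaning.
    mapping = dict(zip(display_axes, axis_names))
    # The user may reassign any display axis to any data axis.
    for display_axis, data_axis in (override or {}).items():
        if display_axis not in mapping:
            raise KeyError(f"unknown display axis: {display_axis}")
        if data_axis not in axis_names:
            raise ValueError(f"unknown data axis: {data_axis}")
        mapping[display_axis] = data_axis
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("a data axis is assigned to more than one display axis")
    return mapping


# 2D data with axes labelled "X" and "B": the viewer just shows both labels.
print(map_axes_to_display(["X", "B"]))
print(map_axes_to_display(["X", "B"], override={"horizontal": "B", "vertical": "X"}))
```

Under this sketch a data axis called "X" ending up on the horizontal display axis is a consequence of its position only, not of its name.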
The way I interpreted the status of our discussion at #28 is that there is no default mapping, but a mapping must be always provided, or did I get this wrong? |
Ah, sorry for causing confusion (and maybe we are straying away from the original question @joshmoore posed) -- Yes, I have the same interpretation of the discussion in #28. My (confusingly stated) point in the comment above was just that general purpose data visualization tools shouldn't have an opinion / preference for specific axis names in the transform metadata. |
I think this is still up for discussion. @axtimwalde made the point that no transformation could just be interpreted as the identity transform. And no axes labels would mean that the data stays in pixel space. @joshmoore what do you think about also allowing 2d, 3d and 4d data to be saved? I think this is the first important decision to drive #28 (and probably also other discussions) forward. |
@jni based on the state of the discussion in #28 I wonder now whether your comment is about axis names in data space or in physical space. Currently, I would think we simply have no axis names at all in data space. In physical space I think it is nice to know which axis should be the "x" axis such that the viewer can display the data accordingly. Thus I think this information should be there. What we could think of, on top of the specification of which one is the "x" axis, is to have something like optional axis_names metadata:
Would that work for you? |
I think I'd prefer that it is required to specify the axes labels, because in practice it makes a big difference whether one displays a 3D data as xyz or xyc 😉 Unless we agree that specifying nothing defaults to axes of |
@tischi as mentioned on #28 we do not want to prescribe here where physical axes go on the screen. There is a third space, which is the screen space, and all kinds of transformations can happen between physical/world space and screen space, not least of which is a 3D -> 2D projection. I also don't think axis label specification should be a requirement, but a strongly encouraged metadata. As mentioned by others, requirement makes the spec not backward-compatible. Indeed, treating channels as spatial by default is fine: most viewers have the ability to separate out channels. (napari notably doesn't 😅 but we are definitely planning it!) |
OK, I guess I could live with "strongly encouraged" 😉 |
@jni I get the point about requirements and backwards compatibility. But, in practice, let's say the vision is to be able to chain a set of napari plugins into an image processing workflow. My feeling is that it may be necessary to require to know which axes are spatial and which axis is the channel axis. What do you think? |
I've been working under the assumption that it would eventually be necessary (cf. the IMS file structure). It certainly has the potential to complicate and possibly slow down implementations, so I'd just urge balancing how soon it's introduced against immediate need. |
On the topic of XYZ or not necessarily XYZ, I have some concern that not having these takes us outside the realm of OME-* specs and closer to underlying numpy/zarr/n5/etc. specs, which is fine, but is something we should consider. If the axes are named arbitrarily, then quite possibly the axes metadata SHOULD additionally define which are orthogonal to one another and in what right-handed order cf. (har) http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#coordinate-types Edit: ah, I see while working through issues that this also came up in #28 (comment) |
As this is quite a big change and has implications for other parts of the spec, I would argue that this change should be done sooner rather than later if deemed necessary.
I personally also think we shouldn't allow for arbitrary axis naming and stick to XYZCT. |
To be clear, I can certainly imagine having additional axes. But if there is no traditional X, Y, or Z axes in a given zarray, I don't know if I would consider it an image in the sense that is currently defined in this repository. (If anyone has a counter-example, I'd love to hear it.) |
Medical imaging often uses anatomical coordinates, which do not involve the letters "X", "Y", or "Z": https://www.slicer.org/wiki/Coordinate_systems |
@d-v-b: I guess I'm less concerned with naming, that's "just metadata". ;) But in all three you are in a 3D, right-handed coordinate system, right? I guess in my head (forgive me if I'm being biased) the ALS and IJK coordinate systems from slicer.org could be equated to XYZ and then one need just provide which system one is under. For comparison, in the high-content screening case, there are rows and plates but there's additionally metadata to say that the rows are letters and the columns are numbers. |
I think the axes metadata part of this issue became quite overlapping with the discussion in this issue: #28, where the last posts were also about the handedness of the coordinate system and how much we want to commit to x, y, and z. Could it therefore make sense to continue this discussion on axes metadata in #28 and here just discuss how many data dimensions we would like to support? |
A data format that supports only 5 dimensions is asking to be obsolete within 2 weeks ;). |
As a concrete case of more-than-5-D data, a team here is developing polarization microscopy, so each pixel has 7 coordinates: 3 spatial, the 3 components of the polarization vector, and time. Of course you can store the polarization as channels, but it gets tricky to encode a transformation then, as for example a rotation needs to apply to both the spatial and polarization coordinates. |
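To illustrate why storing polarization as plain channels makes transformations awkward, here is a minimal sketch of a rotation that has to act on both the spatial position and the polarization vector. It assumes Cartesian polarization components and a rotation about the z axis; `rotate_z` and `rotate_sample` are hypothetical helper names, not part of any spec.

```python
import math


def rotate_z(vector, angle):
    """Rotate a 3-component vector about the z axis by `angle` radians."""
    x, y, z = vector
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def rotate_sample(position, polarization, angle):
    """Apply the same rotation to the spatial and polarization coordinates.

    If polarization were stored as three unrelated channels, a purely
    spatial transformation would move `position` but leave `polarization`
    pointing in the old direction.
    """
    return rotate_z(position, angle), rotate_z(polarization, angle)


position, polarization = rotate_sample((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), math.pi / 2)
print(position, polarization)
```

The point of the sketch is only that the transformation needs to know which axes form the polarization vector, which plain channel metadata does not express.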
Ok, so I think dropping the requirement for 5d is not really controversial, whereas there's still some discussion about the axes labels. I have been thinking a bit about how to drive the spec forward, and I think it would make most sense to start with a rather small change:
What do you think @joshmoore? I can start working on this. |
I want this on a 👕 😉
How would you optionally encode them?
💯 |
{
"axes": ["x", "y", "z", "rho", "theta", "phi", "t"],
"units": ["micrometer", "micrometer", "micrometer", "radians", "radians", "radians"]
} |
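With parallel `axes` and `units` lists, a reader would probably want a consistency check. The following is a minimal sketch, assuming only the two keys shown in the example above; `check_axes_units` is a hypothetical helper. Note that the example as posted lists seven axes but six units (there is no unit for `t`), which such a check would flag.

```python
import json


def check_axes_units(metadata):
    """Return a list of problems found in "axes"/"units" metadata (hypothetical helper)."""
    axes = metadata.get("axes", [])
    units = metadata.get("units", [])
    problems = []
    if len(set(axes)) != len(axes):
        problems.append("axis names are not unique")
    if len(units) != len(axes):
        problems.append(f"{len(axes)} axes but {len(units)} units")
    return problems


example = json.loads("""
{
  "axes": ["x", "y", "z", "rho", "theta", "phi", "t"],
  "units": ["micrometer", "micrometer", "micrometer", "radians", "radians", "radians"]
}
""")

print(check_axes_units(example))  # ['7 axes but 6 units']
```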
@glyg That's an interesting use case! As mentioned above, I think this may be quite overlapping with #28 where we discuss how to map from data space (no units, just dimensions) to physical space (e.g. spatial or possibly angles). So maybe it could be useful to look at this issue and maybe re-post your example there. |
#39 now introduces axes as a MUST field in To summarize, I think we have discussed the following possible changes (relative to 0.2):
As far as I can see none of these changes would be breaking with the 0.2 proposal. |
Hi - adding in a few cents here as well... When I was reading it, I was thinking about what viewers would like best. I think this issue/discussion should allow a complete newcomer to design a super simple viewer that enables rudimentary viewing of all data that claims to be ngff. One of the reasons people still go around using pngs, jpgs, tifs and the like is that they can view them with their system image viewer, by simply drag and drop. Ever tried this with an hdf5 with the de-facto image viewer of the bioimage community - Fiji?! No dice. If the outcome of this discussion is that we allow arbitrary data with arbitrary axes, then this is as good as doing nothing. No new developer will be able to come up with a viewer that makes sense based on the specification. I think this encourages fragmentation. No one would be able to "understand" the data. With a fixed, limited set of axes in the data/pixel/image/voxel space you could truly have a format that all viewers could support, where looking into the image space will look more or less the same in all. Isn't this one of the goals? The semantic meaning of the axes, units and the like can be handled by smarter viewers: depending on the application they might use the transformation (as discussed in #28). |
Adding to the comment above: I think some axes should have fixed meaning and name: |
See the new PR at #46 |
This issue has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-early-september-2021/55333/14 |
To summarize the current state:
I think it's straightforward to also add an optional field In addition, I can see two more controversial potential changes that lift the restrictions above:
I am personally more in favor of keeping the spec more restrictive, but we need to see if there are some important use-cases that cannot be covered with the current spec. This is also very relevant for the issue of specifying transformations. |
Note also the proposal by @bogovicj and @axtimwalde here, which introduces a label, type and unit per dimension with a list of objects (=map/dict). |
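As a rough illustration of the "list of objects" layout mentioned above, the sketch below converts three parallel lists into one object per dimension. The key names `label`, `type` and `unit` are taken from the wording of this comment and are an assumption; the actual proposal may use different names, and `axes_as_objects` is a hypothetical helper.

```python
def axes_as_objects(labels, types, units):
    """Build one dict per dimension from parallel lists (hypothetical key names)."""
    if not (len(labels) == len(types) == len(units)):
        raise ValueError("labels, types and units must have the same length")
    return [
        {"label": label, "type": axis_type, "unit": unit}
        for label, axis_type, unit in zip(labels, types, units)
    ]


axes = axes_as_objects(
    ["t", "y", "x"],
    ["time", "space", "space"],
    ["second", "micrometer", "micrometer"],
)
print(axes[0])
```

One possible advantage of this layout is that the information for a dimension stays together, so a per-axis list cannot end up shorter than the list of axes.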
Once you've specified a unit (assuming it's an SI unit), you have basically already specified the axis type, no? So it seems like axis_type is unnecessary (and potentially confusing, if someone accidentally does something like |
The below was discussed in the ngff meeting on 01 Sept 2020. A counter example might be channels acquired at different wavelengths (physical unit), which clashes with the spatial domain.
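The wavelength counter example can be made concrete with a small sketch: if the axis type were inferred from the unit alone, a length unit would be ambiguous between a spatial axis and a channel axis indexed by wavelength. The lookup table and the type names `space`, `channel` and `time` are assumptions for illustration only.

```python
# Hypothetical mapping from unit name to physical dimension.
UNIT_DIMENSION = {
    "micrometer": "length",
    "nanometer": "length",
    "second": "time",
}


def candidate_axis_types(unit):
    """Return the axis types that are compatible with a unit (illustrative only)."""
    dimension = UNIT_DIMENSION.get(unit)
    if dimension == "length":
        # A length can describe a spatial axis or a wavelength-indexed channel axis.
        return {"space", "channel"}
    if dimension == "time":
        return {"time"}
    return set()


print(candidate_axis_types("nanometer"))
print(candidate_axis_types("second"))
```

Because more than one type can be compatible with a unit, an explicit type field would carry information that the unit alone does not.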
Maybe the word |
Follow up from last week's ngff meeting: there was fairly broad consensus that the axes label should be decoupled from the semantic meaning and that, in consequence, a new field is needed for the "semantic" axes type (time, space, channel, or similar; see comment by @tischi above). In addition, we want to add I will start to work on spec v0.4 now and begin by making a PR for the changes laid out above; I will implement the solution that seems best to my judgment and try to lay out all discussion points I can see in the PR. We will announce once the PR is ready to be discussed on github and on image.sc. |
Hi @constantinpape et al. Just wanted to make you aware of some of the discussion around axes metadata in this neuroglancer issue. It'd be good to know how some of the discussions therein could be fed into the discussion/proposal process for the ome-ngff specs on axes metadata. |
as a slight aside: regarding units as text we have found this text representation quite useful: https://people.csail.mit.edu/jaffer/MIXF/CMIXF-12 and we adopted this in the BIDS standard (https://bids-specification.readthedocs.io/en/stable/99-appendices/05-units.html). here is a python library to support parsing: https://github.com/sensein/cmixf |
I have started to put something together for the new axes metadata based on the discussions here in #57. |
This is now implemented with v0.4 :). |
In last week's meeting the question of data dimensionality came up again (in the morning it was raised by @jni, and I think it came up in the afternoon as well).
Currently, the spec demands that all data is 5 dimensional (I think with axis order TCXYZ, but I am not quite sure).
Do we want to lift the restriction and allow data of lower dimensionality? In this case, we would add metadata in multiscales to describe the axes (e.g. "axes": ["x", "y", "z"]).
Note that this is also important for the transformation spec #28, where we need to clarify which axes a transformation applies to.
Independent of the decisions, we should add a field that describes the physical units of the axes, e.g. "units": ["micrometer", "micrometer", "micrometer"].