From 4703d89c518190ee78d95c6543d5125492bdd1f0 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Tue, 7 Sep 2021 15:52:16 +0200 Subject: [PATCH 01/18] Start extending the axes related fields in multiscales --- latest/index.bs | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index e82eee26..5c0176a3 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -219,12 +219,15 @@ Each dictionary in "datasets" MUST contain the field "path", whose value contain to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. It MUST contain the field "axes", which is a list of dimension names of the axes. -The values MUST be unique and one of `{"t", "c", "z", "y", "x"}`. -The number of values MUST be the same as the number of dimensions of the arrays corresponding to this image. -In addition, the "axes" values MUST be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups -(i.e. groups containing arrays with the multiscale data). +The values provide axes labels and MUST be unique, the number of values MUST be the same as the number of dimensions of the arrays corresponding to this image. +In addition, the "axes" values MUST be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). +It MUST contain the field "axes_types", which is a list that describes the semantic type of each axis. It MUST contain only the elements "space", "time" and "channel" and MUST have the same length as "axes". +These types are hints TODO + +It MUST contain the field "units" TODO + It SHOULD contain the field "name". It SHOULD contain the field "version", which indicates the version of the From e72ee48802554614f42629828d8ffa382e66a2b3 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Fri, 1 Oct 2021 10:27:08 +0200 Subject: [PATCH 02/18] Move axes to its own section --- latest/index.bs | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index 5c0176a3..b495691f 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -205,6 +205,18 @@ Metadata {#metadata} The various `.zattrs` files throughout the above array hierarchy may contain metadata keys as specified below for discovering certain types of data, especially images. +"axes" metadata {#axes-md} +-------------------------- + +Describes axes of a physical coordinate space. It is a dictionary, which MUST contain the fields: +- "labels": list of strings that specify the name per dimension. The values MUST be unique. +- "types": list of strings that specify the type per dimension. The values SHOULD be one of "space", "channel", "time" or "". Use "" if none of the other options apply. +- "units": list of strings that specify the unit per dimension. + +The three lists MUST have the same length. +If part of [[#multiscale-md]], the length MUST be equal to the array. + + "multiscales" metadata {#multiscale-md} --------------------------------------- @@ -218,9 +230,8 @@ the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. -It MUST contain the field "axes", which is a list of dimension names of the axes. -The values provide axes labels and MUST be unique, the number of values MUST be the same as the number of dimensions of the arrays corresponding to this image. -In addition, the "axes" values MUST be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). +It MUST contain the field "axes", see [[#axes-md]] and the length of the lists in "axes" must be equal to the number of dimensions in the array. +The "labels" list must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). It MUST contain the field "axes_types", which is a list that describes the semantic type of each axis. It MUST contain only the elements "space", "time" and "channel" and MUST have the same length as "axes". From 212b2dd09a9d9ec1178c9eb236d72d177476c23b Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Fri, 1 Oct 2021 10:32:14 +0200 Subject: [PATCH 03/18] Remove left over axes description from multiscales --- latest/index.bs | 5 ----- 1 file changed, 5 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index b495691f..316dd756 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -234,11 +234,6 @@ It MUST contain the field "axes", see [[#axes-md]] and the length of the lists i The "labels" list must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). -It MUST contain the field "axes_types", which is a list that describes the semantic type of each axis. It MUST contain only the elements "space", "time" and "channel" and MUST have the same length as "axes". -These types are hints TODO - -It MUST contain the field "units" TODO - It SHOULD contain the field "name". It SHOULD contain the field "version", which indicates the version of the From df835f1f84b3e2dbcaa82d0e11c61d4080c8f704 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Fri, 1 Oct 2021 10:59:17 +0200 Subject: [PATCH 04/18] Remove empty type; not necessary for SHOULD req --- latest/index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/latest/index.bs b/latest/index.bs index 316dd756..d68b3e97 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -210,7 +210,7 @@ keys as specified below for discovering certain types of data, especially images Describes axes of a physical coordinate space. It is a dictionary, which MUST contain the fields: - "labels": list of strings that specify the name per dimension. The values MUST be unique. -- "types": list of strings that specify the type per dimension. The values SHOULD be one of "space", "channel", "time" or "". Use "" if none of the other options apply. +- "types": list of strings that specify the type per dimension. The values SHOULD be one of "space", "channel", "time". - "units": list of strings that specify the unit per dimension. The three lists MUST have the same length. From b3523c9ae0678b6b7cef76c88f20c3e3f58a8fdb Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Fri, 1 Oct 2021 11:03:09 +0200 Subject: [PATCH 05/18] Make number of dim requirement more explicit --- latest/index.bs | 1 + 1 file changed, 1 insertion(+) diff --git a/latest/index.bs b/latest/index.bs index d68b3e97..621eb16b 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -229,6 +229,7 @@ Each dictionary contained in the list MUST contain the field "datasets", which i the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. +All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. It MUST contain the field "axes", see [[#axes-md]] and the length of the lists in "axes" must be equal to the number of dimensions in the array. The "labels" list must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). From 70ebede458dc74f34f5714ba304092ffb547cc40 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Mon, 18 Oct 2021 21:29:17 +0200 Subject: [PATCH 06/18] Change axes to list of dicts --- latest/index.bs | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index 621eb16b..bbd24d9d 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -208,13 +208,12 @@ keys as specified below for discovering certain types of data, especially images "axes" metadata {#axes-md} -------------------------- -Describes axes of a physical coordinate space. It is a dictionary, which MUST contain the fields: -- "labels": list of strings that specify the name per dimension. The values MUST be unique. -- "types": list of strings that specify the type per dimension. The values SHOULD be one of "space", "channel", "time". -- "units": list of strings that specify the unit per dimension. +"axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes an dimension (axis) and: +- MUST contain the field "name" that gives the name for this dimension. The values MUST be unique across all "name" fields. +- SHOULD contain the field "type" to specify the type of this dimension. The value SHOULD be one of "space", "channel" or "time". +- SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be a valid unit according to UDUNITS-2. -The three lists MUST have the same length. -If part of [[#multiscale-md]], the length MUST be equal to the array. +If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data. "multiscales" metadata {#multiscale-md} @@ -231,8 +230,8 @@ Each dictionary in "datasets" MUST contain the field "path", whose value contain to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. -It MUST contain the field "axes", see [[#axes-md]] and the length of the lists in "axes" must be equal to the number of dimensions in the array. -The "labels" list must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). +It MUST contain the field "axes", see [[#axes-md]]. +The values of the "name" field must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). It SHOULD contain the field "name". @@ -256,7 +255,11 @@ It SHOULD contain the field "metadata", which contains a dictionary with additio {"path": "2"} ], "axes": [ - "t", "c", "z", "y", "x" + {"name": "t", "type": "time", "unit": "millisecond"}, + {"name": "c", "type": "channel"}, + {"name": "z", "type": "space", "unit": "micrometer"}, + {"name": "y", "type": "space", "unit": "micrometer"}, + {"name": "x", "type": "space", "unit": "micrometer"} ], "type": "gaussian", "metadata": { # the fields in metadata depend on the downscaling implementation From a8597d7a018c79d7b155f0810d92e9949d2fe566 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Mon, 18 Oct 2021 22:01:23 +0200 Subject: [PATCH 07/18] Add transformation spec with only simple transformations from #63 --- latest/index.bs | 57 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 47 insertions(+), 10 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index bbd24d9d..e2d2fde6 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -216,6 +216,27 @@ keys as specified below for discovering certain types of data, especially images If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data. +"transformation" metadata {#trafo-md} +------------------------------------- + +Describes a transformation, e.g. to transform the discrete data space of an array to the physical space. +It is a dictionary, which MUST contain the field "type". +The value of "type" MUST be one of the elements of the `type` column in the table below. +Additional fields are defined by the column `fields`. + +| type | fields | description | +|- |- |- | +| `identity` | | identity transformation, is the default transformation and is typically not explicitly defined | +| `translation` | one of: `"translation":List[float]`, `"path":str` | translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | +| `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | +| `rotation` | one of: `"rotation": List[float], "path:str"` | rotation vector, stored either as a list of floats (`rotation`) pr as binary data at a location in this container (`path`). The length of the vector must either be one for a rotation in two dimensions or three for a rotation in three dimenions. In 3d, the rotation angles are given in the order TODO. | + +In addition, the field "axisIndices" MAY be given to specify the subset of axes that the transformation is applied to, leaving other axes unchanged. If not given, the transformation is applied to all axes. The length of "axisIndices" MUST be equal to the dimensionality of the transformation. If "axisIndices" are not given, the dimensionality of the transformation MUST be equal to the number of dimensions of the space that the transformation is applied to. +If given, "axisIndices" MUST be given in increasing order. + +If transformations are stored in a list, e.g. as part of [[#multiscale-md]] metadata, they are always applied sequentally and in order. + + "multiscales" metadata {#multiscale-md} --------------------------------------- @@ -224,15 +245,21 @@ found under the "multiscales" key in the group-level metadata. "multiscales" contains a list of dictionaries where each entry describes a multiscale image. -Each dictionary contained in the list MUST contain the field "datasets", which is a list of dictionaries describing +Each dictionary MUST contain the field "axes", see [[#axes-md]]. +The values of the "name" field must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). +This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). + +It MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. +Each dictionary MAY contain the field "transformations", which contains a list of [[#trafo-md]] that specifies the transfromation from the data coordinate space to the physical coordinate space (as specified by "axes") for this resolution level. +The transformations MUST only be of type `identity`, `translation`, `scale` or `rotation`. This restrictions ensures a simple mapping from data space to physical space. +The list MUST contain at most one "scale" transformation per dimensioon, which specifies the size of one physical unit for that dimension. +The transformations in the list are applied sequentially and in order. If not given, the identity transformation is assumed. -It MUST contain the field "axes", see [[#axes-md]]. -The values of the "name" field must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). -This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). +It MAY contain the field "transformations" with a list of [[#trafo-md]], describing transformations that are applied in the same manner to each resolution level, following the same rules as "transformations" in "datasets". It SHOULD contain the field "name". @@ -243,17 +270,12 @@ It SHOULD contain the field "type", which gives the type of downscaling method u It SHOULD contain the field "metadata", which contains a dictionary with additional information about the downscaling method. -```json +``` { "multiscales": [ { "version": "0.3", "name": "example", - "datasets": [ - {"path": "0"}, - {"path": "1"}, - {"path": "2"} - ], "axes": [ {"name": "t", "type": "time", "unit": "millisecond"}, {"name": "c", "type": "channel"}, @@ -261,6 +283,21 @@ It SHOULD contain the field "metadata", which contains a dictionary with additio {"name": "y", "type": "space", "unit": "micrometer"}, {"name": "x", "type": "space", "unit": "micrometer"} ], + "datasets": [ + { + "path": "0", + "transformations": [{"type": "scale", "scale": [0.5, 0.5, 0.5], "axisIndices": [2, 3, 4]}] # the voxel size for the first scale level (0.5 micrometer) + } + { + "path": "1", + "transformations": [{"type": "scale", "scale": [1.0, 1.0, 1.0], "axisIndices": [2, 3, 4]}] # the voxel size for the second scale level (downscaled by a factor of 2 -> 1 micrometer) + }, + { + "path": "2", + "transformations": [{"type": "scale", "scale": [2.0, 2.0, 2.0], "axisIndices": [2, 3, 4]}] # the voxel size for the second scale level (downscaled by a factor of 4 -> 2 micrometer) + } + ], + "transformations": [{"type": "scale", "scale": [0.1], "axisIndices": [0]], # the time unit (0.1 milliseconds), which is the same for each scale level "type": "gaussian", "metadata": { # the fields in metadata depend on the downscaling implementation "method": "skimage.transform.pyramid_gaussian", # here, the paramters passed to the skimage function are given From 8ca7976b8b106e6c3b2a7dbe414b85442aa60c4b Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Mon, 18 Oct 2021 22:20:15 +0200 Subject: [PATCH 08/18] Clarify transformation order --- latest/index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/latest/index.bs b/latest/index.bs index e2d2fde6..b9dc8f09 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -259,7 +259,7 @@ The transformations MUST only be of type `identity`, `translation`, `scale` or ` The list MUST contain at most one "scale" transformation per dimensioon, which specifies the size of one physical unit for that dimension. The transformations in the list are applied sequentially and in order. If not given, the identity transformation is assumed. -It MAY contain the field "transformations" with a list of [[#trafo-md]], describing transformations that are applied in the same manner to each resolution level, following the same rules as "transformations" in "datasets". +It MAY contain the field "transformations" with a list of [[#trafo-md]], describing transformations that are applied in to each resolution level, following the same rules as "transformations" in "datasets". These transformations are applied after the per resolution level transformations specified in "datasets". It SHOULD contain the field "name". From 77790b3e9b836f1fbd9ccbb89457b2795aebcca2 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Wed, 27 Oct 2021 10:47:08 +0200 Subject: [PATCH 09/18] Simplify transformations for multiscales --- latest/index.bs | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index b9dc8f09..b10af951 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -229,7 +229,6 @@ Additional fields are defined by the column `fields`. | `identity` | | identity transformation, is the default transformation and is typically not explicitly defined | | `translation` | one of: `"translation":List[float]`, `"path":str` | translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | | `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | -| `rotation` | one of: `"rotation": List[float], "path:str"` | rotation vector, stored either as a list of floats (`rotation`) pr as binary data at a location in this container (`path`). The length of the vector must either be one for a rotation in two dimensions or three for a rotation in three dimenions. In 3d, the rotation angles are given in the order TODO. | In addition, the field "axisIndices" MAY be given to specify the subset of axes that the transformation is applied to, leaving other axes unchanged. If not given, the transformation is applied to all axes. The length of "axisIndices" MUST be equal to the dimensionality of the transformation. If "axisIndices" are not given, the dimensionality of the transformation MUST be equal to the number of dimensions of the space that the transformation is applied to. If given, "axisIndices" MUST be given in increasing order. @@ -254,12 +253,18 @@ the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. -Each dictionary MAY contain the field "transformations", which contains a list of [[#trafo-md]] that specifies the transfromation from the data coordinate space to the physical coordinate space (as specified by "axes") for this resolution level. -The transformations MUST only be of type `identity`, `translation`, `scale` or `rotation`. This restrictions ensures a simple mapping from data space to physical space. -The list MUST contain at most one "scale" transformation per dimensioon, which specifies the size of one physical unit for that dimension. +Each dictionary MAY contain the field "transformations", which contains a list of [[#trafo-md]] that specifies the transformation from the data coordinates to the physical coordinates (as specified by "axes") for this resolution level. +The transformations MUST only be of type `identity`, `translation` or `scale`. +The list MUST contain at most one `scale` transformation per axis taht specifies the size in physical units. +It also MUST contain at most one `translation` per axis that specifies the offset in physical units. +If both `scale` and `translation` are given `translation` must be listed after `scale` to ensure that it is given in physical coordinates. The transformations in the list are applied sequentially and in order. If not given, the identity transformation is assumed. +The requirements (only `scale` and `translation`, restrictions on order) are in place to provide a simple mapping from data coordinates to physical coordinates while +being compatible with the general transformation spec. -It MAY contain the field "transformations" with a list of [[#trafo-md]], describing transformations that are applied in to each resolution level, following the same rules as "transformations" in "datasets". These transformations are applied after the per resolution level transformations specified in "datasets". +It MAY contain the field "transformations" containing a list of [[#trafo-md]], describing transformations that are applied to each resolution level. +The transformations MUST follow the same rules about allowed values and order of `scale` and `translation` as in "datasets". +These transformations are applied after the per resolution level transformations specified in "datasets". They can for example be used to specify the `scale` for a dimension that is the same for all resolutions. It SHOULD contain the field "name". From 755eaf10aa2b0f5b191ce84af0206b0ef6ba8705 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 28 Oct 2021 13:38:44 +0200 Subject: [PATCH 10/18] Define list of transformations instead of a single transformation --- latest/index.bs | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index b10af951..dfde7d42 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -216,13 +216,13 @@ keys as specified below for discovering certain types of data, especially images If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data. -"transformation" metadata {#trafo-md} +"transformations" metadata {#trafo-md} ------------------------------------- -Describes a transformation, e.g. to transform the discrete data space of an array to the physical space. -It is a dictionary, which MUST contain the field "type". +"transformations" describes a series of transformations, e.g. to map discrete data space of an array to the corresponding physical space. +It is a list of dictionaries. Each entry describes a single transformation and MUST contain the field "type". The value of "type" MUST be one of the elements of the `type` column in the table below. -Additional fields are defined by the column `fields`. +Additional fields for the entry depend on "type" and are defined by the column `fields`. | type | fields | description | |- |- |- | @@ -233,7 +233,7 @@ Additional fields are defined by the column `fields`. In addition, the field "axisIndices" MAY be given to specify the subset of axes that the transformation is applied to, leaving other axes unchanged. If not given, the transformation is applied to all axes. The length of "axisIndices" MUST be equal to the dimensionality of the transformation. If "axisIndices" are not given, the dimensionality of the transformation MUST be equal to the number of dimensions of the space that the transformation is applied to. If given, "axisIndices" MUST be given in increasing order. -If transformations are stored in a list, e.g. as part of [[#multiscale-md]] metadata, they are always applied sequentally and in order. +The transformations in the list are applied sequentally and in order. "multiscales" metadata {#multiscale-md} @@ -253,17 +253,16 @@ the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. -Each dictionary MAY contain the field "transformations", which contains a list of [[#trafo-md]] that specifies the transformation from the data coordinates to the physical coordinates (as specified by "axes") for this resolution level. -The transformations MUST only be of type `identity`, `translation` or `scale`. -The list MUST contain at most one `scale` transformation per axis taht specifies the size in physical units. -It also MUST contain at most one `translation` per axis that specifies the offset in physical units. -If both `scale` and `translation` are given `translation` must be listed after `scale` to ensure that it is given in physical coordinates. -The transformations in the list are applied sequentially and in order. If not given, the identity transformation is assumed. +Each dictionary MAY contain the field "transformations", which contains a list of transformations that map the data coordinates to the physical coordinates (as specified by "axes") for this resolution level. +The transformations are defined according to [[#trafo-md]]. In addition, the transformation types MUST only be `identity`, `translation` or `scale`. +They MUST contain at most one `scale` transformation per axis that specifies the pixel size in physical units. +It also MUST contain at most one `translation` per axis that specifies the offset from the origin in physical units. +If both `scale` and `translation` are given `translation` must be listed after `scale` to ensure that it is given in physical coordinates. If "transformations" is not given, the identity transformation is assumed. The requirements (only `scale` and `translation`, restrictions on order) are in place to provide a simple mapping from data coordinates to physical coordinates while being compatible with the general transformation spec. -It MAY contain the field "transformations" containing a list of [[#trafo-md]], describing transformations that are applied to each resolution level. -The transformations MUST follow the same rules about allowed values and order of `scale` and `translation` as in "datasets". +It MAY contain the field "transformations", describing transformations that are applied to each resolution level. +The transformations MUST follow the same rules about allowed types, order, etc. as in "datasets:transformations". These transformations are applied after the per resolution level transformations specified in "datasets". They can for example be used to specify the `scale` for a dimension that is the same for all resolutions. It SHOULD contain the field "name". From fa390f68e9e90f4ca602eb64469a71ea8e3bb23a Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 28 Oct 2021 13:47:56 +0200 Subject: [PATCH 11/18] Add null as default value for axes:type --- latest/index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/latest/index.bs b/latest/index.bs index dfde7d42..39c0f4e2 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -210,7 +210,7 @@ keys as specified below for discovering certain types of data, especially images "axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes an dimension (axis) and: - MUST contain the field "name" that gives the name for this dimension. The values MUST be unique across all "name" fields. -- SHOULD contain the field "type" to specify the type of this dimension. The value SHOULD be one of "space", "channel" or "time". +- SHOULD contain the field "type" to specify the type of this dimension. The value SHOULD be one of "space", "channel" or "time". If "type" is not given, it is assumed to be "null", i.e. unkown or not represented by the spec yet. - SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be a valid unit according to UDUNITS-2. If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data. From 47620eddae0d04a078505ae7189b7e0e94f5feed Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 28 Oct 2021 14:25:23 +0200 Subject: [PATCH 12/18] Add restrictions on number of axes and axis order to multiscale metadata; correct description of xarray metadata --- latest/index.bs | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index 39c0f4e2..114154ff 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -239,20 +239,22 @@ The transformations in the list are applied sequentally and in order. "multiscales" metadata {#multiscale-md} --------------------------------------- -Metadata about the multiple resolution representations of the image can be -found under the "multiscales" key in the group-level metadata. +Metadata about an image can be found under the "multiscales" key in the group-level metadata. Here, image refers to 2 to 5 dimensional data representing image or volumetric data with optional time or channel axes. It is stored in a multiple resolution representation. "multiscales" contains a list of dictionaries where each entry describes a multiscale image. Each dictionary MUST contain the field "axes", see [[#axes-md]]. -The values of the "name" field must be repeated in the field "_ARRAY_DIMENSIONS" of all scale groups (i.e. groups containing arrays with the multiscale data). +The length of "axes" must be between 2 and 5 and MUST be equal to the dimensionality of the zarr arrays storing the image data (see "datasets:path"). +The "axes" MUST contain 2 or 3 entries of "type:space" and MAY contain one additional entry of "type:time" and MAY contain one additional entry of "type:channel" or a null / custom type. +The order of the entries MUST correspond to the order of dimensions of the zarr arrays. In addition, the entries MUST be ordered by "type" where the "time" axis must come first (if present), followed by the "channel" or custom axis (if present) and the axes of type "space". +The values of the "name" fields must be given as a list in the field "_ARRAY_DIMENSIONS" in the attributes (.zattr) of the zarr arrays. This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). +E.g. for "axes: [{"name": "x"}, {"name": "y"}, {"name": z}]", the zarr arrays must contain "{"_ARRAY_DIMENSIONS": ["x", "y", "z"]}" in their attributes. -It MUST contain the field "datasets", which is a list of dictionaries describing -the arrays storing the individual resolution levels. +It MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. -All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. +All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. The number of dimensions and order MUST correspond to number and order of "axes". Each dictionary MAY contain the field "transformations", which contains a list of transformations that map the data coordinates to the physical coordinates (as specified by "axes") for this resolution level. The transformations are defined according to [[#trafo-md]]. In addition, the transformation types MUST only be `identity`, `translation` or `scale`. They MUST contain at most one `scale` transformation per axis that specifies the pixel size in physical units. From ffcb2a8dc3e7358e8d7acff83e1906b57d2dd242 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 4 Nov 2021 15:52:29 +0100 Subject: [PATCH 13/18] Clarify ordering of spatial axes --- latest/index.bs | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index 114154ff..51c106a5 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -107,6 +107,8 @@ Images {#image-layout} The following layout describes the expected Zarr hierarchy for images with multiple levels of resolutions and optionally associated labels. +Note that the number of dimensions is variable between 2 and 5 and that axis names are arbitrary, see [[#multiscale-md]] for details. +For this example we assume an image with 5 dimensions and axes called `t,c,z,y,x`. ``` . # Root folder, potentially in S3, @@ -127,7 +129,7 @@ multiple levels of resolutions and optionally associated labels. │ │ # by the "multiscales" metadata, but is often a sequence starting at 0. │ │ │ ├── .zarray # All image arrays must be up to 5-dimensional - │ │ # with dimension order (t, c, z, y, x). + │ │ # with the axis of type time before type channel, before spatial axes. │ │ │ └─ t # Chunks are stored with the nested directory layout. │ └─ c # All but the last chunk element are stored as directories. @@ -231,7 +233,7 @@ Additional fields for the entry depend on "type" and are defined by the column ` | `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | In addition, the field "axisIndices" MAY be given to specify the subset of axes that the transformation is applied to, leaving other axes unchanged. If not given, the transformation is applied to all axes. The length of "axisIndices" MUST be equal to the dimensionality of the transformation. If "axisIndices" are not given, the dimensionality of the transformation MUST be equal to the number of dimensions of the space that the transformation is applied to. -If given, "axisIndices" MUST be given in increasing order. +If given, "axisIndices" MUST be given in increasing order. It uses zero-based indexing. The transformations in the list are applied sequentally and in order. @@ -247,9 +249,10 @@ Each dictionary MUST contain the field "axes", see [[#axes-md]]. The length of "axes" must be between 2 and 5 and MUST be equal to the dimensionality of the zarr arrays storing the image data (see "datasets:path"). The "axes" MUST contain 2 or 3 entries of "type:space" and MAY contain one additional entry of "type:time" and MAY contain one additional entry of "type:channel" or a null / custom type. The order of the entries MUST correspond to the order of dimensions of the zarr arrays. In addition, the entries MUST be ordered by "type" where the "time" axis must come first (if present), followed by the "channel" or custom axis (if present) and the axes of type "space". +If there are three spatial axes where two correspond to the image plane ("yx") and images are stacked along the other (anisotropic) axis ("z"), the spatial axes SHOULD be ordered as "zyx". The values of the "name" fields must be given as a list in the field "_ARRAY_DIMENSIONS" in the attributes (.zattr) of the zarr arrays. This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). -E.g. for "axes: [{"name": "x"}, {"name": "y"}, {"name": z}]", the zarr arrays must contain "{"_ARRAY_DIMENSIONS": ["x", "y", "z"]}" in their attributes. +E.g. for "axes: [{"name": "z"}, {"name": "y"}, {"name": x}]", the zarr arrays must contain "{"_ARRAY_DIMENSIONS": ["z", "y", "x"]}" in their attributes. It MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative From e27bdecc06dedc71c6ec7e20a2bd7dd0755d43ea Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 4 Nov 2021 15:59:23 +0100 Subject: [PATCH 14/18] Try to fix rendering of table --- latest/index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/latest/index.bs b/latest/index.bs index 51c106a5..3a7dc076 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -227,7 +227,7 @@ The value of "type" MUST be one of the elements of the `type` column in the tabl Additional fields for the entry depend on "type" and are defined by the column `fields`. | type | fields | description | -|- |- |- | +| ------------- | ------ |------------ | | `identity` | | identity transformation, is the default transformation and is typically not explicitly defined | | `translation` | one of: `"translation":List[float]`, `"path":str` | translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | | `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of vector defines number of dimensions. | From 0661115b93026f197d3787d99b74ec4d01614c99 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 25 Nov 2021 14:34:28 +0100 Subject: [PATCH 15/18] Add suggested units (copied from comment by @will-moore) --- latest/index.bs | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/latest/index.bs b/latest/index.bs index 3a7dc076..57d794c4 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -212,8 +212,10 @@ keys as specified below for discovering certain types of data, especially images "axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes an dimension (axis) and: - MUST contain the field "name" that gives the name for this dimension. The values MUST be unique across all "name" fields. -- SHOULD contain the field "type" to specify the type of this dimension. The value SHOULD be one of "space", "channel" or "time". If "type" is not given, it is assumed to be "null", i.e. unkown or not represented by the spec yet. - SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be a valid unit according to UDUNITS-2. +SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be one of the following strings, which are valid units according to UDUNITS-2. + - Units for "space" axes: 'angstrom', 'attometer', 'centimeter', 'decimeter', 'exameter', 'femtometer', 'foot', 'gigameter', 'hectometer', 'inch', 'kilometer', 'megameter', 'meter', 'micrometer', 'mile', 'millimeter', 'nanometer', 'parsec', 'petameter', 'picometer', 'terameter', 'yard', 'yoctometer', 'yottameter', 'zeptometer', 'zettameter' + - Units for "time" axes: 'attosecond', 'centisecond', 'day', 'decisecond', 'exasecond', 'femtosecond', 'gigasecond', 'hectosecond', 'hour', 'kilosecond', 'megasecond', 'microsecond', 'millisecond', 'minute', 'nanosecond', 'petasecond', 'picosecond', 'second', 'terasecond', 'yoctosecond', 'yottasecond', 'zeptosecond', 'zettasecond' If part of [[#multiscale-md]], the length of "axes" MUST be equal to the number of dimensions of the arrays that contain the image data. From 23be956d6ec39e70bc36c03a1e0f9312a3291361 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Thu, 25 Nov 2021 14:41:52 +0100 Subject: [PATCH 16/18] Fix typo --- latest/index.bs | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index 57d794c4..0dd75425 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -212,8 +212,7 @@ keys as specified below for discovering certain types of data, especially images "axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes an dimension (axis) and: - MUST contain the field "name" that gives the name for this dimension. The values MUST be unique across all "name" fields. -- SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be a valid unit according to UDUNITS-2. -SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be one of the following strings, which are valid units according to UDUNITS-2. +- SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be one of the following strings, which are valid units according to UDUNITS-2. - Units for "space" axes: 'angstrom', 'attometer', 'centimeter', 'decimeter', 'exameter', 'femtometer', 'foot', 'gigameter', 'hectometer', 'inch', 'kilometer', 'megameter', 'meter', 'micrometer', 'mile', 'millimeter', 'nanometer', 'parsec', 'petameter', 'picometer', 'terameter', 'yard', 'yoctometer', 'yottameter', 'zeptometer', 'zettameter' - Units for "time" axes: 'attosecond', 'centisecond', 'day', 'decisecond', 'exasecond', 'femtosecond', 'gigasecond', 'hectosecond', 'hour', 'kilosecond', 'megasecond', 'microsecond', 'millisecond', 'minute', 'nanosecond', 'petasecond', 'picosecond', 'second', 'terasecond', 'yoctosecond', 'yottasecond', 'zeptosecond', 'zettasecond' From 0419ce174dc5b3fb5217e15f866fec5f1fb9f183 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Wed, 26 Jan 2022 20:33:17 +0100 Subject: [PATCH 17/18] Add 'type' to the axes definition --- latest/index.bs | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/latest/index.bs b/latest/index.bs index 0dd75425..9fdd57c4 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -210,8 +210,9 @@ keys as specified below for discovering certain types of data, especially images "axes" metadata {#axes-md} -------------------------- -"axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes an dimension (axis) and: +"axes" describes the dimensions of a physical coordinate space. It is a list of dictionaries, where each dictionary describes a dimension (axis) and: - MUST contain the field "name" that gives the name for this dimension. The values MUST be unique across all "name" fields. +- SHOULD contain the field "type". It SHOULD be one of "space", "time" or "channel", but MAY take other values for custom axis types that are not part of this specification yet. - SHOULD contain the field "unit" to specify the physical unit of this dimension. The value SHOULD be one of the following strings, which are valid units according to UDUNITS-2. - Units for "space" axes: 'angstrom', 'attometer', 'centimeter', 'decimeter', 'exameter', 'femtometer', 'foot', 'gigameter', 'hectometer', 'inch', 'kilometer', 'megameter', 'meter', 'micrometer', 'mile', 'millimeter', 'nanometer', 'parsec', 'petameter', 'picometer', 'terameter', 'yard', 'yoctometer', 'yottameter', 'zeptometer', 'zettameter' - Units for "time" axes: 'attosecond', 'centisecond', 'day', 'decisecond', 'exasecond', 'femtosecond', 'gigasecond', 'hectosecond', 'hour', 'kilosecond', 'megasecond', 'microsecond', 'millisecond', 'minute', 'nanosecond', 'petasecond', 'picosecond', 'second', 'terasecond', 'yoctosecond', 'yottasecond', 'zeptosecond', 'zettasecond' From e94a0c15746005129946ec78e3ccaaff077c1597 Mon Sep 17 00:00:00 2001 From: Constantin Pape Date: Wed, 26 Jan 2022 20:41:55 +0100 Subject: [PATCH 18/18] Clarify the reference object for each new paragraph in the multiscales definition --- latest/index.bs | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/latest/index.bs b/latest/index.bs index 9fdd57c4..5e8d07a0 100644 --- a/latest/index.bs +++ b/latest/index.bs @@ -247,7 +247,7 @@ Metadata about an image can be found under the "multiscales" key in the group-le "multiscales" contains a list of dictionaries where each entry describes a multiscale image. -Each dictionary MUST contain the field "axes", see [[#axes-md]]. +Each "multiscales" dictionary MUST contain the field "axes", see [[#axes-md]]. The length of "axes" must be between 2 and 5 and MUST be equal to the dimensionality of the zarr arrays storing the image data (see "datasets:path"). The "axes" MUST contain 2 or 3 entries of "type:space" and MAY contain one additional entry of "type:time" and MAY contain one additional entry of "type:channel" or a null / custom type. The order of the entries MUST correspond to the order of dimensions of the zarr arrays. In addition, the entries MUST be ordered by "type" where the "time" axis must come first (if present), followed by the "channel" or custom axis (if present) and the axes of type "space". @@ -256,10 +256,11 @@ The values of the "name" fields must be given as a list in the field "_ARRAY_DIM This ensures compatibility with the [xarray zarr encoding](http://xarray.pydata.org/en/stable/internals/zarr-encoding-spec.html#zarr-encoding). E.g. for "axes: [{"name": "z"}, {"name": "y"}, {"name": x}]", the zarr arrays must contain "{"_ARRAY_DIMENSIONS": ["z", "y", "x"]}" in their attributes. -It MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels. +Each "multiscales" dictionary MUST contain the field "datasets", which is a list of dictionaries describing the arrays storing the individual resolution levels. Each dictionary in "datasets" MUST contain the field "path", whose value contains the path to the array for this resolution relative to the current zarr group. The "path"s MUST be ordered from largest (i.e. highest resolution) to smallest. -All arrays MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. The number of dimensions and order MUST correspond to number and order of "axes". + +Each "datasets" dictionary MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. The number of dimensions and order MUST correspond to number and order of "axes". Each dictionary MAY contain the field "transformations", which contains a list of transformations that map the data coordinates to the physical coordinates (as specified by "axes") for this resolution level. The transformations are defined according to [[#trafo-md]]. In addition, the transformation types MUST only be `identity`, `translation` or `scale`. They MUST contain at most one `scale` transformation per axis that specifies the pixel size in physical units. @@ -268,24 +269,20 @@ If both `scale` and `translation` are given `translation` must be listed after ` The requirements (only `scale` and `translation`, restrictions on order) are in place to provide a simple mapping from data coordinates to physical coordinates while being compatible with the general transformation spec. -It MAY contain the field "transformations", describing transformations that are applied to each resolution level. +Each "multiscales" dictionary MAY contain the field "transformations", describing transformations that are applied to each resolution level. The transformations MUST follow the same rules about allowed types, order, etc. as in "datasets:transformations". These transformations are applied after the per resolution level transformations specified in "datasets". They can for example be used to specify the `scale` for a dimension that is the same for all resolutions. -It SHOULD contain the field "name". - -It SHOULD contain the field "version", which indicates the version of the -multiscale metadata of this image (current version is 0.3). - -It SHOULD contain the field "type", which gives the type of downscaling method used to generate the multiscale image pyramid. +Each "multiscales" dictionary SHOULD contain the field "name". It SHOULD contain the field "version", which indicates the version of the multiscale metadata of this image (current version is 0.4). +Each "multiscales" dictionary SHOULD contain the field "type", which gives the type of downscaling method used to generate the multiscale image pyramid. It SHOULD contain the field "metadata", which contains a dictionary with additional information about the downscaling method. ``` { "multiscales": [ { - "version": "0.3", + "version": "0.4", "name": "example", "axes": [ {"name": "t", "type": "time", "unit": "millisecond"},