-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix problems with missing groups/metadata and sparse plates #119
Fix problems with missing groups/metadata and sparse plates #119
Conversation
Test cases currently expecting to fail. There's a lot of sparse data handling with Plates that we are still failing to do and to keep up with our specification suggestions already made upstream (ome/ngff#24). |
int fieldCount = 1; | ||
|
||
// Only two rows are filled out with one column each (two Wells total) | ||
checkPlateGroupLayout(output, 2, 1, fieldCount, 2, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check won't be consistent with lines 907-924. The idea is to now have basically everything except arrays written for any wells that don't have data, right? Or have I misunderstood the problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed earlier today, and I did indeed misunderstand. This PR is meant to follow the example in https://github.com/ome/ngff/blob/0c286908a4fbaceb2c622b21e5a52cc041eea7c0/latest/index.bs#L468, with the added constraint that wells without pixel data cannot have a .zgroup.
@chris-allan : I think 293d03a covers the issues we discussed earlier today. In the end that's just updating the tests to make sure that they are checking the correct plate layouts including both the presence of existing rows/columns and absence of missing rows/columns. |
Updated description with the changes made so far. Need to discuss naming conventions and potential changes to the row/column group names with @melissalinkert. |
While reviewing the concepts described in ome/ngff#24, I have been reviewing this PR and testing it against a public IDR example of sparse plate - https://idr.openmicroscopy.org/webclient/?show=plate-9473. Comparing the output of (base) [sbesson@pilot-zarr2-dev sparse_hcs]$ diff -u bioformats2raw_0.3.0/9473.zarr/.zattrs bioformats2raw_hcs/9473.zarr/.zattrs
--- bioformats2raw_0.3.0/9473.zarr/.zattrs 2021-12-01 15:46:32.040089253 +0000
+++ bioformats2raw_hcs/9473.zarr/.zattrs 2021-12-01 16:17:21.231373373 +0000
@@ -3,55 +3,133 @@
"plate" : {
"columns" : [
{
+ "name" : "0"
+ },
+ {
+ "name" : "1"
+ },
+ {
+ "name" : "2"
+ },
+ {
+ "name" : "3"
+ },
+ {
+ "name" : "4"
+ },
+ {
+ "name" : "5"
+ },
+ {
+ "name" : "6"
+ },
+ {
+ "name" : "7"
+ },
+ {
+ "name" : "8"
+ },
+ {
+ "name" : "9"
+ },
+ {
+ "name" : "10"
+ },
+ {
"name" : "11"
+ },
+ {
+ "name" : "12"
+ },
+ {
+ "name" : "13"
+ },
+ {
+ "name" : "14"
+ },
+ {
+ "name" : "15"
+ },
+ {
+ "name" : "16"
+ },
+ {
+ "name" : "17"
+ },
+ {
+ "name" : "18"
+ },
+ {
+ "name" : "19"
+ },
+ {
+ "name" : "20"
+ },
+ {
+ "name" : "21"
+ },
+ {
+ "name" : "22"
}
],
"name" : "VAMP1_DEG_OLDPLATE_20120511 384_Evotec_CellCarrier",
"wells" : [
{
"path" : "4/11",
- "row_index" : 0,
- "column_index" : 0
+ "row_index" : 4,
+ "column_index" : 11
},
{
"path" : "5/11",
- "row_index" : 1,
- "column_index" : 0
+ "row_index" : 5,
+ "column_index" : 11
},
{
"path" : "6/11",
- "row_index" : 2,
- "column_index" : 0
+ "row_index" : 6,
+ "column_index" : 11
},
{
"path" : "7/11",
- "row_index" : 3,
- "column_index" : 0
+ "row_index" : 7,
+ "column_index" : 11
},
{
"path" : "12/11",
- "row_index" : 4,
- "column_index" : 0
+ "row_index" : 12,
+ "column_index" : 11
},
{
"path" : "13/11",
- "row_index" : 5,
- "column_index" : 0
+ "row_index" : 13,
+ "column_index" : 11
},
{
"path" : "14/11",
- "row_index" : 6,
- "column_index" : 0
+ "row_index" : 14,
+ "column_index" : 11
},
{
"path" : "15/11",
- "row_index" : 7,
- "column_index" : 0
+ "row_index" : 15,
+ "column_index" : 11
}
],
"field_count" : 30,
"rows" : [
{
+ "name" : "0"
+ },
+ {
+ "name" : "1"
+ },
+ {
+ "name" : "2"
+ },
+ {
+ "name" : "3"
+ },
+ {
"name" : "4"
},
{
@@ -64,6 +142,18 @@
"name" : "7"
},
{
+ "name" : "8"
+ },
+ {
+ "name" : "9"
+ },
+ {
+ "name" : "10"
+ },
+ {
+ "name" : "11"
+ },
+ {
"name" : "12"
},
{
And for the individual wells and images, (base) [sbesson@pilot-zarr2-dev sparse_hcs]$ diff -u bioformats2raw_0.3.0/9473.zarr/4/11/.zattrs bioformats2raw_hcs/9473.zarr/4/11/.zattrs
(base) [sbesson@pilot-zarr2-dev sparse_hcs]$ diff -u bioformats2raw_0.3.0/9473.zarr/4/11/0/.zattrs bioformats2raw_hcs/9473.zarr/4/11/0/.zattrs
(base) [sbesson@pilot-zarr2-dev sparse_hcs]$ From the indexing perspective, the outcome of this PR is as expected with all the rows and columns of the plate populated and the row/column index of the first well at (4,11) rather than (0,0). As a preamble of the upcoming discussion around the HCS NGFF spec, I also computed the diff with the metadata written by (base) [sbesson@pilot-zarr2-dev sparse_hcs]$ diff -uw omero-cli-zarr/9473.zarr/.zattrs bioformats2raw_hcs/9473.zarr/.zattrs
--- omero-cli-zarr/9473.zarr/.zattrs 2021-12-01 15:43:31.304371194 +0000
+++ bioformats2raw_hcs/9473.zarr/.zattrs 2021-12-01 16:17:21.231373373 +0000
@@ -1,18 +1,11 @@
{
- "_creator": {
- "name": "omero-zarr",
- "version": "0.2.1"
- },
+ "bioformats2raw.layout" : 3,
"plate": {
- "acquisitions": [
- {
- "id": 7323,
- "maximumfieldcount": 30,
- "name": "00-37-47.423"
- }
- ],
"columns": [
{
+ "name" : "0"
+ },
+ {
"name": "1"
},
{
@@ -47,85 +40,135 @@
},
{
"name": "12"
- }
- ],
- "field_count": 30,
- "name": "VAMP1_DEG_OLDPLATE_20120511/Meas_01",
- "rows": [
+ },
{
- "name": "A"
+ "name" : "13"
},
{
- "name": "B"
+ "name" : "14"
},
{
- "name": "C"
+ "name" : "15"
},
{
- "name": "D"
+ "name" : "16"
},
{
- "name": "E"
+ "name" : "17"
},
{
- "name": "F"
+ "name" : "18"
},
{
- "name": "G"
+ "name" : "19"
},
{
- "name": "H"
+ "name" : "20"
},
{
- "name": "I"
+ "name" : "21"
},
{
- "name": "J"
+ "name" : "22"
+ }
+ ],
+ "name" : "VAMP1_DEG_OLDPLATE_20120511 384_Evotec_CellCarrier",
+ "wells" : [
+ {
+ "path" : "4/11",
+ "row_index" : 4,
+ "column_index" : 11
+ },
+ {
+ "path" : "5/11",
+ "row_index" : 5,
+ "column_index" : 11
},
{
- "name": "K"
+ "path" : "6/11",
+ "row_index" : 6,
+ "column_index" : 11
},
{
- "name": "L"
+ "path" : "7/11",
+ "row_index" : 7,
+ "column_index" : 11
},
{
- "name": "M"
+ "path" : "12/11",
+ "row_index" : 12,
+ "column_index" : 11
},
{
- "name": "N"
+ "path" : "13/11",
+ "row_index" : 13,
+ "column_index" : 11
},
{
- "name": "O"
+ "path" : "14/11",
+ "row_index" : 14,
+ "column_index" : 11
},
{
- "name": "P"
+ "path" : "15/11",
+ "row_index" : 15,
+ "column_index" : 11
}
],
- "version": "0.3",
- "wells": [
+ "field_count" : 30,
+ "rows" : [
{
- "path": "P/12"
+ "name" : "0"
},
{
- "path": "H/12"
+ "name" : "1"
},
{
- "path": "G/12"
+ "name" : "2"
},
{
- "path": "N/12"
+ "name" : "3"
},
{
- "path": "O/12"
+ "name" : "4"
},
{
- "path": "E/12"
+ "name" : "5"
},
{
- "path": "F/12"
+ "name" : "6"
},
{
- "path": "M/12"
+ "name" : "7"
+ },
+ {
+ "name" : "8"
+ },
+ {
+ "name" : "9"
+ },
+ {
+ "name" : "10"
+ },
+ {
+ "name" : "11"
+ },
+ {
+ "name" : "12"
+ },
+ {
+ "name" : "13"
+ },
+ {
+ "name" : "14"
+ },
+ {
+ "name" : "15"
+ }
+ ],
+ "acquisitions" : [
+ {
+ "id" : "0"
}
]
}
(base) [sbesson@pilot-zarr2-dev sparse_hcs]$ diff -uw omero-cli-zarr/9473.zarr/E/12/.zattrs bioformats2raw_hcs/9473.zarr/4/11/.zattrs
--- omero-cli-zarr/9473.zarr/E/12/.zattrs 2021-12-01 15:42:52.103215449 +0000
+++ bioformats2raw_hcs/9473.zarr/4/11/.zattrs 2021-12-01 16:17:21.224373345 +0000
@@ -2,126 +2,125 @@
"well": {
"images": [
{
- "acquisition": 7323,
- "path": "0"
+ "path" : "0",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "1"
+ "path" : "1",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "2"
+ "path" : "2",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "3"
+ "path" : "3",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "4"
+ "path" : "4",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "5"
+ "path" : "5",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "6"
+ "path" : "6",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "7"
+ "path" : "7",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "8"
+ "path" : "8",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "9"
+ "path" : "9",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "10"
+ "path" : "10",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "11"
+ "path" : "11",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "12"
+ "path" : "12",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "13"
+ "path" : "13",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "14"
+ "path" : "14",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "15"
+ "path" : "15",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "16"
+ "path" : "16",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "17"
+ "path" : "17",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "18"
+ "path" : "18",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "19"
+ "path" : "19",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "20"
+ "path" : "20",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "21"
+ "path" : "21",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "22"
+ "path" : "22",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "23"
+ "path" : "23",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "24"
+ "path" : "24",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "25"
+ "path" : "25",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "26"
+ "path" : "26",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "27"
+ "path" : "27",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "28"
+ "path" : "28",
+ "acquisition" : 0
},
{
- "acquisition": 7323,
- "path": "29"
+ "path" : "29",
+ "acquisition" : 0
}
- ],
- "version": "0.3"
+ ]
}
}
\ No newline at end of file Part of the mismatch like the |
Latest 2 commits here should bring the well attributes in line with the latest changes on ome/ngff#24. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-using the sample IDR plate mentioned above, I generated three NGFF representations and uploaded them to a temporary bucket. This allows a client like vizarr with some awareness of the HCS spec to highlight differences:
- with bioformats2raw 0.3.0 (S3 link, Vizarr view)
- with omero-cli-zarr 0.2.1 (S3 source, Vizarr view)
- with bioformats2raw 0.3.1-SNAPSHOT (built from commit 4722e96) (S3 source, Vizarr view)
Comparing the difference in the .zattrs
of the plate and well groups between the two last implementations, the divergence is now fairly minimal and restricted to:
- plate name, acquisition ID and metadata. This heterogeneity is expected as explained above
- tool-specific metadata e.g. the
omero-zarr
version - extra empty plate columns (13-23) populated by
bioformats2raw
- the newly introduced
rowIndex/columnIndex
fields populated bybioformats2raw
Assuming we agree with the formal ome/ngff#24 proposal (which I will comment on separately), the last two items can be captured as issues on omero-cli-zarr
and/or ome-zarr-py
that would be required to implement the proposed specification change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had someway missed the fact this PR was not merged yet. Given #119 (review) and the fact the corresponding specification PR has now been merged and released as part of the official OME-NGFF 0.4 specification, I think it makes sense to get this in.
Once #121 and the coordinateTransformations
metadata is implemented, it should be possible to bump the metadata version to 0.4
across the board.
…to fix-missing-groups
Conflicts in the tests are now fixed, so 👍 to merging assuming @chris-allan agrees. |
Fix Plate Zarr group and metadata population issues with sparse or otherwise partially populated plates. Specifically:
-1
)Rows
orColumns
Plate attributes better and consistently with the upstream suggestions