GMT_IMAGE: Implement the GMT_IMAGE.to_dataarray method for 3-band images #3128
Minimal xarray.DataArray output with data and coordinates, no metadata yet.
Extra metadata from the _GMT_GRID_HEADER struct.
    name=header.name,
    attrs=header.data_attrs,
The image name is currently hardcoded to `z`, is that ok for an RGB image?
pygmt/pygmt/datatypes/header.py
Lines 193 to 198 in ac44706
    @property
    def name(self) -> str:
        """
        Name of the grid.
        """
        return "z"
The `attrs` fields might need some work. I'm getting `'actual_range': array([ 1.79769313e+308, -1.79769313e+308])}` when loading the `@earth_day_01d` image.
I agree they make no sense, but they're consistent with the behavior in GMT.
gmt grdinfo @earth_day_01d
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Title: Grid imported via GDAL
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Command:
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Remark:
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Pixel node registration used [Geographic grid]
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Grid file format: gd = Import/export through GDAL
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: x_min: -180 x_max: 180 x_inc: 1 name: x n_columns: 360
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: y_min: -90 y_max: 90 y_inc: 1 name: y n_rows: 180
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: v_min: 1.79769313486e+308 v_max: -1.79769313486e+308 name: z
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: scale_factor: 1 add_offset: 0
/Users/seisman/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Default CPT:
+proj=longlat +R=6378137 +no_defs
GMT's image support was likely added by Joaquim, so you may ping him for more information.
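The `v_min`/`v_max` pair in the `grdinfo` output above matches the largest finite double with the signs swapped, so the minimum is greater than the maximum and the pair cannot describe real data. A minimal sketch of how a caller could detect this, assuming a hypothetical helper named `is_valid_range` that is not part of PyGMT or GMT:

```python
import sys


def is_valid_range(z_min: float, z_max: float) -> bool:
    """Return True if (z_min, z_max) could be a real data range.

    Hypothetical helper, not a PyGMT function. A minimum larger than the
    maximum, as in the grdinfo output above, is treated as "not set".
    """
    return z_min <= z_max


# Values as reported for @earth_day_01d: +/- the largest finite double.
z_min, z_max = sys.float_info.max, -sys.float_info.max
print(is_valid_range(z_min, z_max))  # False
print(is_valid_range(0.0, 255.0))  # True
```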
Reorder the dimensions to follow Channel, Height, Width (CHW) convention. Also added doctest checking output DataArray object and the image's x and y coordinates.
Get the registration and gtype info from the grid header and apply it to the GMT accessor attributes.
Trying to match some of the doctests in _GMT_GRID.
Remove hardcoded attribute that was only meant for NetCDF files, so that GeoTIFF files won't have it.
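The reordering to the Channel, Height, Width (CHW) convention mentioned in the commits above can be illustrated with plain nested lists. This is only a sketch of the index shuffle; the function name `hwc_to_chw` is hypothetical, and it assumes the input is laid out as `[row][column][band]`, which may not be how GMT stores the image buffer.

```python
def hwc_to_chw(image):
    """Reorder a nested list from [row][column][band] to [band][row][column]."""
    n_rows = len(image)
    n_cols = len(image[0])
    n_bands = len(image[0][0])
    return [
        [[image[r][c][b] for c in range(n_cols)] for r in range(n_rows)]
        for b in range(n_bands)
    ]


# A 2x2 RGB image: each pixel is an (R, G, B) triple.
hwc = [
    [(10, 20, 30), (11, 21, 31)],
    [(12, 22, 32), (13, 23, 33)],
]
chw = hwc_to_chw(hwc)
print(chw[0])  # red band: [[10, 11], [12, 13]]
```

With a NumPy array of shape (height, width, bands), the same reordering is `array.transpose(2, 0, 1)`.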
I think this PR is almost ready for review. Here are the known limitations/issues:
pygmt/datatypes/image.py
Outdated
    title:
    history:
    description:
Maybe don't add these attributes if they're empty?
These come from the header, no? For GeoTIFF files, they don't make sense since GMT is not parsing the TIFF tags, but the title/history/description might show up for 3-band NetCDF files (if those can be passed into this `GMT_IMAGE` struct).
Right, `title`/`history`/`description` are recommended attributes of CF-1.7. So, for netCDF files, I guess we should always show them even if they're empty.
If those are only needed for CF-1.7, then maybe we can indent the 3 `attrs["..."]` lines here to be inside the `if self.type in {GridFormat.NX}` block?
pygmt/pygmt/datatypes/header.py
Lines 207 to 217 in cf48764
    if self.type in {
        GridFormat.NB,
        GridFormat.NS,
        GridFormat.NI,
        GridFormat.NF,
        GridFormat.ND,
    }:  # Only set the 'Conventions' attribute for netCDF.
        attrs["Conventions"] = "CF-1.7"
    attrs["title"] = self.title.decode()
    attrs["history"] = self.command.decode()
    attrs["description"] = self.remark.decode()
> (2) `actual_range` is the standard attribute defined by the CF-1.7 convention, but it's only for netCDF files, not for images;

Same here, we can put `attrs["actual_range"]` under the if-block above:
pygmt/pygmt/datatypes/header.py
Line 223 in cf48764
    attrs["actual_range"] = np.array([self.z_min, self.z_max])
Done at 62f0ce0.
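As a rough illustration of the suggestion above, the sketch below sets the CF-1.7 attributes only for netCDF grid formats. It is not the code from the referenced commit: the `GridFormat` enum here is a stand-in with illustrative members (the `GD` member is hypothetical), `data_attrs` is written as a free function rather than a header property, and `actual_range` is a plain list instead of a NumPy array to keep the example dependency-free.

```python
from enum import Enum, auto


class GridFormat(Enum):
    """Stand-in for PyGMT's GridFormat; members and values are illustrative."""

    NB = auto()
    NS = auto()
    NI = auto()
    NF = auto()
    ND = auto()
    GD = auto()  # hypothetical member for files imported through GDAL


NETCDF_FORMATS = {
    GridFormat.NB,
    GridFormat.NS,
    GridFormat.NI,
    GridFormat.NF,
    GridFormat.ND,
}


def data_attrs(grid_type, title, command, remark, z_min, z_max):
    """Build the attrs dict, adding CF-1.7 metadata only for netCDF formats."""
    attrs = {}
    if grid_type in NETCDF_FORMATS:
        attrs["Conventions"] = "CF-1.7"
        attrs["title"] = title.decode()
        attrs["history"] = command.decode()
        attrs["description"] = remark.decode()
        attrs["actual_range"] = [z_min, z_max]
    return attrs


print(data_attrs(GridFormat.GD, b"", b"", b"", 0.0, 0.0))  # {}
print(sorted(data_attrs(GridFormat.NF, b"t", b"c", b"r", 0.0, 1.0)))
```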
> - Currently, only 3-band images are supported. [We can work on 1-band and 4-band images in a separate PR to make it easier to review]
Yep, just do 3-band images here for now.
> - For the `y` coordinates, both the GMT API and `rioxarray` return a descending `y` array, but in the `_GMT_GRID.to_dataarray()` method, we flip the y coordinate to make it ascending. Do we want to do the same thing for images?
I'd prefer descending `y` to match the output of `rioxarray.open_rasterio`. In #2398 (comment), we kept the ascending y for `GMT_GRID` for pragmatic reasons, namely to match the output of `xr.load_dataarray` so that the unit tests won't break. But I think the convention for most `xarray.DataArray` objects is to have descending y and ascending x.
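To make the two orderings concrete, here is a small sketch that builds cell-centre coordinates for the pixel-registered 1-degree image described in the `grdinfo` output (x from -180 to 180, y from -90 to 90). The helper name `pixel_centers` is hypothetical and the code is independent of PyGMT.

```python
def pixel_centers(vmin, inc, n, descending=False):
    """Cell-centre coordinates for a pixel-registered axis.

    With pixel registration, each centre sits half an increment
    inside the axis limits.
    """
    coords = [vmin + (i + 0.5) * inc for i in range(n)]
    return coords[::-1] if descending else coords


x = pixel_centers(-180, 1, 360)  # ascending x
y = pixel_centers(-90, 1, 180, descending=True)  # descending y
print(x[0], x[-1])  # -179.5 179.5
print(y[0], y[-1])  # 89.5 -89.5
```

Flipping `y` to ascending, as done for grids, would simply reverse the list (and the corresponding rows of the data).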
pygmt/datatypes/image.py
Outdated
    >>> da.gmt.registration, da.gmt.gtype
    (1, 0)
The gtype for `earth_day_01d` should be Geographic (1), not Cartesian (0).
Suggested change:

    - >>> da.gmt.registration, da.gmt.gtype
    - (1, 0)
    + >>> da.gmt.registration, da.gmt.gtype
    + (1, 1)

according to:
$ gmt grdinfo @earth_day_01d
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Title: Grid imported via GDAL
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Command:
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Remark:
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Pixel node registration used [Geographic grid]
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Grid file format: gd = Import/export through GDAL
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: x_min: -180 x_max: 180 x_inc: 1 name: x n_columns: 360
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: y_min: -90 y_max: 90 y_inc: 1 name: y n_rows: 180
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: v_min: 1.79769313486e+308 v_max: -1.79769313486e+308 name: z
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: scale_factor: 1 add_offset: 0
~/.gmt/server/earth/earth_day/earth_day_01d_p.tif: Default CPT:
We either need to modify the logic here:
pygmt/pygmt/datatypes/header.py
Line 254 in cf48764
    gtype = 1 if dims[0] == "lat" and dims[1] == "lon" else 0
Or I can manually override the `gtype` in the `load_blue_marble` function which I'm trying to finish up at #2235.
This is the function (again, not a public API function) that GMT uses to determine the grid type. The logic here is not that complicated to duplicate in Python.
How easy would it be to make that C function public actually?
> How easy would it be to make that C function public actually?

It doesn't seem too difficult. But we still need to implement our own functions if we want to be compatible with GMT 6.4-6.5.
Yeah, it'll be some time before GMT 6.6 comes out. I just want the gtype determination to be consistent between GMT and PyGMT, and the best case is to bind to the GMT C function. But we can also reimplement it in PyGMT for now (in a separate PR), maybe in header.py?
Another possibility is to wrap the "hidden", private `GMT_GRID_HEADER_HIDDEN` structure https://github.com/GenericMappingTools/gmt/blob/7809736ba32d87a4a96b15444419eb176c6a35f3/src/gmt_hidden.h#L151. The structure has many members, but we would only need to wrap the first few. The members that are most useful to us are: `grdtype` (Cartesian or geographic), `BC` (boundary condition), `varname` (the actual variable name, so that we don't have to hardcode the grid name to `z`), and `cpt` (the default CPT for this grid).
In f715aee, I've updated the logic for determining the grid/image gtype based on `ProjRefPROJ4`.
The logic comes from: https://github.com/GenericMappingTools/gmt/blob/7809736ba32d87a4a96b15444419eb176c6a35f3/src/gmt_grdio.c#L3778
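A minimal sketch of what a PROJ.4-based gtype check could look like is given below. The function name `guess_gtype` is hypothetical, and the substring tests for `longlat`/`latlong` are an assumption about the rule rather than a copy of the referenced commit or of the GMT C code. The dimension-name check mirrors the one-line logic quoted earlier in this thread.

```python
def guess_gtype(dims, proj4):
    """Return 1 for a geographic grid/image, 0 for a Cartesian one.

    Hypothetical helper: `dims` are the dimension names (y first, x second)
    and `proj4` is the PROJ.4 string from the header, or None if absent.
    """
    # Grids following CF conventions name their dimensions lat/lon.
    if tuple(dims[:2]) == ("lat", "lon"):
        return 1
    # Assumed rule for images imported via GDAL: a long/lat projection
    # string marks the data as geographic.
    if proj4 is not None and ("longlat" in proj4 or "latlong" in proj4):
        return 1
    return 0


print(guess_gtype(("y", "x"), "+proj=longlat +R=6378137 +no_defs"))  # 1
print(guess_gtype(("y", "x"), None))  # 0
```

The first call uses the PROJ.4 string printed by `gmt grdinfo @earth_day_01d` earlier in the conversation, which is why the expected gtype for that image is 1.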
Wonderful, thanks again @seisman! Excited to have direct 3-band read support in PyGMT without rioxarray! There are still a few tiny details that could be improved (around the attrs/metadata), but this looks really good already as an initial implementation. 🚀
This PR implements the `to_dataarray` method for `GMT_IMAGE`. Similar to #2398 but for images.
Need to wait for #3446.