-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow non-regularly gridded and lat-lon coordinates #32
Conversation
This is now ready for review 🥳 |
I'm in the process of giving this a thorough read through and executing the notebooks etc as I go through. In general this is really excellent (as usual), and I fully agree with all of the following three changes:
The following I also really like, but I have a question/suggestion wrt it:
As I am reading it the intended use in your new implementation has two main use cases (in terms of how the coordinates are defined): # 1. Using the coordinate values as-is, measuring distance in units of
# coordinate values with the euclidean distance metric. Relevant when working
# with cartesian coordinates.
graph = wmg.create.archetype.create_keisler_graph(
coords,
mesh_node_distance=10, # distance in units of coordinate values, measured as euclidean distance
projection=None,
)
# 2. First projecting the coordinates to a different coordinate system,
# measuring distance in units of the projection with the euclidean distance
# metric. Relevant when working with lat/lon coordinates.
graph = wmg.create.archetype.create_keisler_graph(
coords,
mesh_node_distance=1.0e6, # distance in units of projected coordinates, measured as euclidean distance
projection=ae_projection
) with the current function signature being: def create_keisler_graph(
coords: np.ndarray,
mesh_node_distance: float,
projection: Optional[cartopy.crs.CRSj] = None,
) -> nx.Graph: Where it is assumed that distance is measured with the euclidean distance metric, with either the coordinate values as-is or first projecting. I think it would be cool to extend this in future by allowing for a different distance metric to be used. That would allow us to create graphs from lat/lon coordinates directly (and thereby have graphs that fully wrap the Earth). For example we could then do: # 1. Using the coordinate values as-is, measuring distance in units of
# coordinate values with the euclidean distance metric. This would be useful
# for creating graphs from cartesian coordinates.
graph = wmg.create.archetype.create_keisler_graph(
coords,
mesh_node_distance=10,
projection=None,
distance_metric="euclidean"
)
# 2. First projecting the coordinates to a different coordinate system,
# measuring distance in units of the projection with the euclidean distance
# metric. This would be useful for creating graphs from lat/lons for limited
# areas.
graph = wmg.create.archetype.create_keisler_graph(
coords,
mesh_node_distance=mesh_distance,
projection=ae_projection,
distance_metric="euclidean"
)
# 3. Using the coordinate values as-is, measuring distance in great circle
# distance with the haversine distance metric. This would be useful for
# creating graphs from lat/lon coordinates directly. We should default to the
# distance being in SI units (as in meters) and so use a fixed Earth radius in
# meters.
graph = wmg.create.archetype.create_keisler_graph(
coords,
mesh_node_distance=1.0e4, # distance of 10 km between coordinates given in lat/lon
projection=None,
distance_metric="haversine"
) With the function signature changed to: def create_keisler_graph(
coords: np.ndarray,
mesh_node_distance: float,
projection: Optional[cartopy.crs.CRS] = None,
distance_metric: str = "euclidean"
) -> nx.Graph: So that by default the function will use euclidean distance with the coordinate values, but the metric can be changed to haversine (which gives approximate great circle distances in meters between lat/lon coordinates, assuming a fixed Earth radius). All that is then required is to pass the What do you think of this idea? If it sounds good to you then maybe we could go ahead and add Of course one could argue that it should be possible to provide the Earth radius, but maybe that could become an optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really great to me. There isn't that you've written that I think is incorrect, I've only made some minor suggestions for small improvements. And then there's the suggestion to make the distance metric used explicit so that we can later introduce other distance metrics (for example haversine)
docs/creating_the_graph.ipynb
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe where we create the coordinate values here we could point the reader towards your notebook on working with lat/lons in case that's what they have? Something like
... (x/y cartesian, see [notebook link] on how to use lat/lon coordinate values in weather-model-graphs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested locally that this way of linking to another notebook works in jupyter-book.
About the idea of having a I think that this is a good idea in general, but it seems to be quite a big project to generalize beyond euclidean metrics. I also wonder if the different ways to build graphs that are interesting are as easily summarized as just switching out a distance metric in the same algorithm. I have not given that any deeper thought, it's just not obvious to me that you would want to build graphs using lat-lons with the same algorithm, just a different metric. And if the algorithm is completely different, then maybe it's better to just have a different type of switch, e.g. But for this PR, while I think this keyword argument could be nice, I don't think it belongs with this change. This does not generalize the metric in any way, so it would just be adding an option that only has one accepted value. I think that option should be added in the potential future PR that actually generalizes the metric option instead. |
Ok, thanks for the feedback :) I am glad you like the idea anyway. I have quickly hacked together what I meant here: #37, and it seems to be working :) So I will continue with that for now, but glad to hear that this idea seems to fit. |
Co-authored-by: Leif Denby <[email protected]>
This is ready for another look @leifdenby . All comments above should be fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only some very minor comments/suggestions left. Great!
mesh_node_distance: float | ||
Distance (in x- and y-direction) between created mesh nodes, | ||
in coordinate system of coords | ||
projection: cartopy.crs.CRS or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this docstring line needs updating to reflect that coords_crs
and graph_crs
are now provided separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, forgot to change those docstrings as well. Thanks!
level_refinement_factor: int | ||
Refinement factor between grid points and bottom level of mesh hierarchy | ||
NOTE: Must be an odd integer >1 to create proper multiscale graph | ||
max_num_levels: int | ||
The number of levels of longer-range connections in the mesh graph. | ||
projection: cartopy.crs.CRS or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this docstring line needs updating to reflect that coords_crs
and graph_crs
are now provided separately.
level_refinement_factor: float | ||
Refinement factor between grid points and bottom level of mesh hierarchy | ||
projection: cartopy.crs.CRS or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And there, I think this docstring line needs updating to reflect that coords_crs
and graph_crs
are now provided separately.
Co-authored-by: Leif Denby <[email protected]>
Should be all fixed now @leifdenby :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just minor tweak suggested to make the change to coordinate shape explicit in CHANGELOG
Co-authored-by: Leif Denby <[email protected]>
Describe your changes
This PR allows for more flexibility in the coordinates used to create graphs. They do no longer need to be on a regular$N_x \times N_y$ grid, but can be at arbitrary positions. This means that the main coordinate input no longer has the shape
[2, Ny, Nx]
, but instead[N_grid_points, 2]
with each column representing a coordinate for all grid points. This makes it easy to also allow for lat-lon input coordinates, and use a specified projection to map these to a euclidean space.To get an overview of the lat-lon functionality I recommend checking out the new page I added to the documentation.
The motivation for this change is to allow more flexibility in the regions graphs are built for. This is in particular useful when we consider grid points as the union of some LAM-region and a boundary of global grid-points, possibly on different gridding. With this change such cases can be handled without issues.
There are a few noteworthy changes that comes with this feature:
create_flat_singlescale_mesh_graph
, as it made more sense that also this type of mesh graph had its own function. This made sense to do here as the amount of related code was growing.tests/utils.py
file. This made sense as these functions were duplicated in multiple files before and I had to change them with this new coordinate setup.grid_refinement_factor
was earlier based on the number of grid nodes in x- and y-directions. As we do no longer assume that these grid nodes lay in regular rows and columns this definition will no longer work. Instead we will have to define the grid refinement in terms of "mesh nodes / distance", using the actual coordinate values. The argument is also being renamed tomesh_node_distance
to match this.xy
argument is renamed tocoords
, as this can be lat-lons or euclidean x- and y-coordinates. If a projection is given this is assumed to be lat-lons, otherwise euclidean. For functions further down (e.g.create_single_level_2d_mesh_graph
), that still take only euclidean coordinates, this argument is still calledxy
.New dependencies
This PR introduces a dependency to cartopy, as that is needed for the projection specification when working with lat-lons.
Merge and release planning
It would be good if this change went in after Fix wrong number of mesh levels when grid is multiple of refinement factor #26 as both deal with the refinement factor calculations.(Now merged)Issue Link
This PR closes #29
Type of change
Checklist before requesting a review
pull
with--rebase
option if possible).Checklist for reviewers
Each PR comes with its own improvements and flaws. The reviewer should check the following:
Author checklist after completed review
reflecting type of change (add section where missing):
Checklist for assignee