Replies: 5 comments
-
The problem is that there are variables (like temperature) that are computed the center of the cell, while other variables (such as the vertical velocity) that are computed at the top of the cell and still others (such as some associated with vertical mixing) that are computed at the bottom of the cell. So the vertical placement of Given the fact that we are using an Arakawa B-grid for horizontal operators, what do you propose we call These aren't "gratuitous non-interoperable differences", this is the reality of looking at output produced on the staggered computational grid determined to be the best choice for the numerical methods employed in the model. |
Beta Was this translation helpful? Give feedback.
-
@mnlevy1981 Thank you for replying. Yes, I understand that different variables are computed at grid cell centers vs edges, and I do not suggest we interpolate variables or double storage. What I am suggesting is that in the output files (or rather, in our subsequent Zarr stores), we make those differences explicit in a different way. Rather than different axes z_t, z_w_top, and z_w_bot, there should be one axis Z that has a Coordinate Reference System attribute Z_crs which, for each particular variable, says in effect "this variable uses the cell-center vertical CRS" (or the cell-top or cell-bottom CRS). Z_crs should actually be the URL of a machine-readable definition. It's not a change in how we compute, it's a change in the semantics of how we tell users what was computed and on what coordinate system. |
Beta Was this translation helpful? Give feedback.
-
Oh, gotcha -- sorry for the misunderstanding. So basically, you like the horizontal approach of a single
and are wondering why we can't do the same in z
That's a reasonable approach, though given the fact that most of the CESM ocean software engineering is focused on the transition from POP to MOM I don't think it would happen at the model level. So the big question (that I don't have an answer to) is "is it okay if the cloud data differs from the model-generated data available on glade in this manner." |
Beta Was this translation helpful? Give feedback.
-
Yes, except that (a) the units attribute would use a controlled vocabulary such as UDUNITS and (b) the coordinates attribute(s) would be URL(s) of machine-readable definitions for each coordinate instead of a human-readable string that lists multiple coordinates and requires prior knowledge or that users RTFM. |
Beta Was this translation helpful? Give feedback.
-
The CF conventions define the See, for example, their 2D lat lon coordinates example:
This is the right way to do it. This approach can easily be extended to the Z axis, with multiple different dimension variables (e.g. So POP output is not following CF conventions. I believe the best path forward for convincing modeling groups to modify their output is to cite the CF conventions document. If we think the CF conventions need some changes, there is also a process for doing that. |
Beta Was this translation helpful? Give feedback.
-
In another thread, it was pointed out that the different vertical coordinates signify different locations in the level:
z_t
is the center, whilez_w_top
is the top of the level andz_w_bot
is the bottom of the level. The response to the question "Should we somehow unify the dimension names for vertical levels, to simplify future user interaction with the data, or is it important to keep them distinct?" was "Keep them distinct, please."[Originally posted by @mnlevy1981 in https://github.com//issues/34#issuecomment-596026185]
In my opinion this should be Considered Harmful on the part of modelers! In terms of future reusability of data, this sort of boutique nomenclature is little better than oral tradition handed down through the generations.
Each coordinate axis should always have the same name (e.g., X, Y, Z, T). Each variable data file that contains data along those axes should specify which coordinate reference system (CRS) is used for each axis for that particular file, using an agreed-upon set of standard CRS names and definitions.
The correct approach has long (>25 years ago) been developed by the Open Geospatial Consortium and others, and enshrined in common formats such as GeoTIFF. For example, the east-west coordinate X might be in units of degrees based on the Universal Transverse Mercator (UTM) projection, or of feet in Colorado State Plane, but it's always called "X" as the fundamental name in data files and in web API calls.
By analogy, we define "temperature" as a variable and separately state what units the temperature is in, rather than having temp_kelvin and temp_celsius and temp_reaumur depending on someone's particular preference. (I will avoid including in this particular rant my objection to every model naming variables in non-standard ways).
In my opinion, we should fix these types of problems in the data we publish in Zarr in the cloud. We are already converting to a new analysis-optimized format; leaving in these gratuitous non-interoperable differences is a bad idea -- it perpetuates instead of solving the problem, and these are new derived datasets with new DOIs for a broader audience and therefore should not be limited to the traditional way of doing things.
-Jeff DLB
Beta Was this translation helpful? Give feedback.
All reactions