-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use PROJJSON instead of WKT2:2019 #96
Use PROJJSON instead of WKT2:2019 #96
Conversation
format-specs/geoparquet.md
Outdated
| ------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ||
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. | | ||
| geometry_type | string or \[string] | **REQUIRED** The geometry type(s) of all geometries, or 'Unknown' if they are not known. | | ||
| crs | JSON object | **OPTIONAL** [PROJJSON](https://proj.org/specifications/projjson.html) JSON object representing the Coordinate Reference System (CRS) of the geometry. If the crs field is not included then the data in this column must be stored in longitude, latitude, and CRS-aware implementations may assume a default value of [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) (longitude-latitude coordinates). | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason exactly to change this?
(I am not necessarily opposed to be clear, just don't directly remember that we discussed this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent was: "may" indicates reader may do so at their own peril (i.e., if you really don't care about ellipsoid); "should" indicates we recommend that the reader do so - which seems a bit strong when the ellipsoid of the long,lat coords is not defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"should" indicates we recommend that the reader do so - which seems a bit strong when the ellipsoid of the long,lat coords is not defined.
It seems the last discussion seems to lean towards assuming WGS84? (which is still not a well defined ellipsoid without an epoch, but OK ;))
Pending #93 (and #64, a discussion of whether to make the json schema definition a core part of the geoparquet metadata spec), we should decide whether we want the json schema to be updated along with or after associated spec PRs. In this case, we'd want the json schema to be updated so that the |
format-specs/geoparquet.md
Outdated
|
||
Note: EPSG:4326 and OGC:CRS84 are equivalent with respect to this specification because this specification specifically overrides the coordinate axis order in the `crs` to be longitude-latitude. | ||
|
||
The current PROJJSON JSON object for OGC:CRS84 is: | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: apply JSON syntax highlighting to the block
``` | |
```json |
I can update the schema here once #93 is merged; I was thinking that ideally both the spec and required changes to the schema would get updated within the same PR. It may be good for us to come up with a checklist of required / recommended updates to files here as part of making a PR to update the spec. I'm assuming that we'd want to update the script for building an example parquet file after such PRs are in place, since that corresponds more directly to a specific version of the spec rather than intermediate changes within a PR, right? |
👍 would be a great thing for a PR template.
That's true, there are really three parts:
I don't have a strong opinion on when to update the Parquet script, but that could also be related to #40. For example, I think STAC spec developed entirely on a |
format-specs/geoparquet.md
Outdated
|
||
If an implementation is CRS-aware and needs a CRS representation of the data it may assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84), which is equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude. However, there is no guarantee that coordinates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: The "However..." can be removed if we instead state above that
If CRS is not provided, then all coordinates in the geometry must use longitude, latitude based on the OGC:CRS84 CRS to store their data.
But without that, lacking a crs
means that coordinates are in an unspecified CRS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm for removing the however and stating what it should be. Though perhaps we just say 'must use longitude, latitude with the WGS84 datum' to cover both OGC:CRS84 and EPSG:4326 (with axis order flipped).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on removing the "however ..". So that if CRS is not available, readers can assume WGS84
Which form should we include here? The I think one of our goals in including it here is that at some point we can also make a recommendation for non-CRS aware tools that can safely assume they are producing OGC:CRS84 referenced data to copy / paste this json to their |
For non-CRS aware tools, perhaps we should recommend the simplest minimal reasonable form of OGC:CRS84 definition {
"type": "GeographicCRS",
"name": "WGS 84 longitude-latitude",
"datum": {
"type": "GeodeticReferenceFrame",
"name": "World Geodetic System 1984",
"ellipsoid": {
"name": "WGS 84",
"semi_major_axis": 6378137,
"inverse_flattening": 298.257223563
}
},
"coordinate_system": {
"subtype": "ellipsoidal",
"axis": [
{
"name": "Geodetic longitude",
"abbreviation": "Lon",
"direction": "east",
"unit": "degree"
},
{
"name": "Geodetic latitude",
"abbreviation": "Lat",
"direction": "north",
"unit": "degree"
}
]
},
"id": {
"authority": "OGC",
"code": "CRS84"
}
} |
format-specs/geoparquet.md
Outdated
|
||
#### crs | ||
|
||
The Coordinate Reference System (CRS) is an optional parameter for each geometry column defined in geoparquet format. | ||
|
||
The CRS must be provided in [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_coordinate_reference_systems) version 2, also known as **WKT2**. WKT2 has several revisions, this specification only supports [WKT2_2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html). | ||
The CRS must be provided in [PROJJSON](https://proj.org/specifications/projjson.html) format, which is a JSON encoding of [WKT2:2019 / ISO-19162:2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html), which itself implements the model of [OGC Topic 2: Referencing by coordinates abstract specification / ISO-19111:2019](http://docs.opengeospatial.org/as/18-005r4/18-005r4.html). Apart from the difference of encodings, the semantics is intended to be exactly the same as WKT2:2019, and PROJJSON can be morphed losslessly from/into WKT2:2019. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note there was a comment on this in #97 - #97 (comment) That's the 'require' PR, but this one should merge first so we can cut 0.4.0 release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either one could theoretically go out as 0.4.0; per the discussion I thought the intent was to compare the optional PROJJSON with OGC:CRS84 default vs required PROJJSON, rather than them being additive. But I can certainly update this PR to address comments in #97 if we'd like to see this particular PR go out sooner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I think the plan was to aim for 0.4.0 without it - I organized the 0.4 milestone around that - and then to be sure to solve this for 0.5. Just to be sure to make iterative progress as it seemed like that one could take longer.
I think we targeted next week for the 0.4.0 release. I'm also not opposed to get this into 0.4 release with everyone happy, and just put up a discussion in #98 to help push it forward. So maybe see where that discussion is at on monday of next week, and if it's not fully resolved then it'd be great if you could clean up this PR and we push out 0.4
format-specs/geoparquet.md
Outdated
the OGC:CRS84 CRS based on elements of the `crs` field. For simplicity, Javascript | ||
object dot notation is used to refer to nested elements. | ||
|
||
The CRS is likely equivalent to OGC:CRS84 for a GeoParquet file if the `id` element is present: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not that important for this PR (can be done in a follow-up as well), but what do people think about moving the details of what the CRS repr of OGC:CRS84 and how to recognize this to a separate section towards the bottom of this file? So here we focus on the basic explanation of this metadata fields
format-specs/geoparquet.md
Outdated
* `id.authority` = `"OGC"` and `id.code` = `"CRS84"` | ||
* `id.authority` = `"EPSG"` and `id.code` = `4326` (due to longitude, latitude ordering in this specification) | ||
|
||
The CRS is likely equivalent to OGR:CRS84 if all of the following are true: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would somehow stress here that what follows is really optional, as IMO it is perfectly fine for a non-CRS aware implementation to only handle files that have a "proper" CRS with EPSG:4326 or OGC:CRS84 id. Being able to detect other equivalent json CRS objects without this id seems rather like a corner case (and they can give a clear error that the CRS of the data is not recognized, as they will do for files with non-WGS84 crs anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 - I think it's ok to even just remove this section. Like perhaps add it to some 'best practices' type document, but not have it in the main spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I put in one small comment, but I'm also happy to merge this and then tackle that in a separate PR.
format-specs/geoparquet.md
Outdated
|
||
If an implementation is CRS-aware and needs a CRS representation of the data it may assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84), which is equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude. However, there is no guarantee that coordinates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm for removing the however and stating what it should be. Though perhaps we just say 'must use longitude, latitude with the WGS84 datum' to cover both OGC:CRS84 and EPSG:4326 (with axis order flipped).
Is this ready to merge? It's looking good to me. Thanks for all your awesome work on this @brendan-ward (and everyone else reviewing / suggesting). |
The only thing outstanding is what part (if any) of the "OGC:CRS84 details" section to cut out or move to a different file (if so, where?). |
Ah, right. I think the whole OGC:CRS84 details section could move out. I'd probably just do something simple like just make an 'additional-info.md' file that sits right next the spec, in the same folder, and puts the crs details in it, maybe also the example file above. It could expand out in time, could also evolve to live somewhere else, like become more of a guide to geospatial for non-geo people, help explain various aspects of the spec. I think it's also ok to just merge this and then we can do that one as a separate PR. I'd be happy to work on it. |
Given that it was moved to an additional section at the bottom, for me it is also fine to keep it there for now (we can move it another file later, if we prefer that), in light of getting this PR merged. EDIT: ah, so what you said in your last sentence ;)
|
Resolves #95 based on associated discussion in #90.
This assumes #94 is added separately, allowing
crs
to be explicitly set to null.This preserves the existing optional behavior and default if unspecified of
OGC:CRS84
. The default value and optional vs required status will be taken up in another PR.This includes a minor formatting change of the table including
crs
but only thecrs
row changed content.