Require PROJJSON instead of WKT2:2019 #97

Closed
Changes from 6 commits
147 changes: 103 additions & 44 deletions format-specs/geoparquet.md
@@ -50,57 +50,116 @@ Version of the geoparquet spec used, currently 0.3.0

Each geometry column in the dataset must be included in the columns field above with the following content, keyed by the column name:

| Field Name | Type | Description |
| ---------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |
| geometry_type | string or \[string] | **REQUIRED** The geometry type(s) of all geometries, or 'Unknown' if they are not known. |
| crs | string | **OPTIONAL** [WKT2](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html) string representing the Coordinate Reference System (CRS) of the geometry. If the crs field is not included then the data in this column must be stored in longitude, latitude. In the case where a crs is not provided, CRS-aware implementations should assume a default value of [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) (longitude-latitude coordinates). |
| orientation | string | **OPTIONAL** Winding order of exterior ring of polygons. If present must be 'counterclockwise'; interior rings are wound in opposite order. If absent, no assertions are made regarding the winding order.
| edges | string | **OPTIONAL** Name of the coordinate system for the edges. Must be one of 'planar' or 'spherical'. The default value is 'planar'. |
| bbox | \[number] | **OPTIONAL** Bounding Box of the geometries in the file, formatted according to [RFC 7946, section 5](https://tools.ietf.org/html/rfc7946#section-5). |
| epoch | double | **OPTIONAL** Coordinate epoch in case of a dynamic CRS, expressed as a decimal year. |

| Field Name | Type | Description |
| ------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| crs | JSON object | **REQUIRED** [PROJJSON](https://proj.org/specifications/projjson.html) JSON object representing the Coordinate Reference System (CRS) of the geometry. This field must be set to `null` if CRS is not known. |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |
| geometry_type | string or \[string] | **REQUIRED** The geometry type(s) of all geometries, or 'Unknown' if they are not known. |
| orientation | string | **OPTIONAL** Winding order of exterior ring of polygons. If present must be 'counterclockwise'; interior rings are wound in opposite order. If absent, no assertions are made regarding the winding order. |
| edges | string | **OPTIONAL** Name of the coordinate system for the edges. Must be one of 'planar' or 'spherical'. The default value is 'planar'. |
| bbox | \[number] | **OPTIONAL** Bounding Box of the geometries in the file, formatted according to [RFC 7946, section 5](https://tools.ietf.org/html/rfc7946#section-5). |
| epoch | double | **OPTIONAL** Coordinate epoch in case of a dynamic CRS, expressed as a decimal year. |
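
As a rough illustration of the revised table, the sketch below checks one geometry column's metadata against the listed constraints. The function name `validate_column_metadata` and the wording of the messages are hypothetical, and the sketch assumes the column metadata has already been parsed from JSON into a Python dict, so a JSON `null` appears as `None`.

```python
# Hypothetical helper, not part of the specification: a minimal check of the
# per-column fields described in the table above.
REQUIRED_FIELDS = ("crs", "encoding", "geometry_type")


def validate_column_metadata(column):
    """Return a list of problems found in one geometry column's metadata dict."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in column:
            problems.append(f"missing required field: {field}")
    # crs must be a PROJJSON object (a dict once parsed) or an explicit null.
    if "crs" in column and column["crs"] is not None and not isinstance(column["crs"], dict):
        problems.append("crs must be a JSON object or null")
    if "encoding" in column and column["encoding"] != "WKB":
        problems.append("encoding must be 'WKB'")
    if "orientation" in column and column["orientation"] != "counterclockwise":
        problems.append("orientation, if present, must be 'counterclockwise'")
    # edges defaults to 'planar' when absent.
    if column.get("edges", "planar") not in ("planar", "spherical"):
        problems.append("edges must be 'planar' or 'spherical'")
    return problems


example = {"crs": None, "encoding": "WKB", "geometry_type": "Polygon"}
print(validate_column_metadata(example))
```

A column that omits `crs` entirely is reported as a problem, while a column whose `crs` is `null` passes, which mirrors the change from an optional to a required field.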

#### crs

The Coordinate Reference System (CRS) is an optional parameter for each geometry column defined in geoparquet format.
The Coordinate Reference System (CRS) is a required parameter for each geometry
column defined in geoparquet format.

The CRS must be provided in
[PROJJSON](https://proj.org/specifications/projjson.html) format, which is a
JSON encoding of
[WKT2:2019 / ISO-19162:2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html),
which itself implements the model of
[OGC Topic 2: Referencing by coordinates abstract specification / ISO-19111:2019](http://docs.opengeospatial.org/as/18-005r4/18-005r4.html).
Apart from the difference of encodings, the semantics is intended to be exactly
the same as WKT2:2019, and PROJJSON can be morphed losslessly from/into
WKT2:2019.

Collaborator commented:

It looks like PROJJSON mentions a few deviations from WKT2:2019. Maybe "exactly" and "losslessly" are too strong here.

How about "Apart from the difference of encodings, the semantics of PROJJSON are intended to match WKT2:2019, and a CRS in one encoding can generally be represented in the other"?

Contributor commented:


Regarding this, it could be wise, for greater interoperability, to discourage the use of BoundCRS objects, softly for top-level (as this maps to standard WKT2), and strongly for BoundCRS embedded in CompoundCRS (as this is a PROJ extension), for the PROJJSON CRS of GeoParquet.
BoundCRS objects are mostly the port of PROJ.4 +nadgrids/+towgs84/+geoidgrids constructs that are less necessary nowadays since PROJ can use its database to infer transformations between datums.
So the list of top-level objects suggested for use in PROJJSON would be:

  • GeographicCRS 2D (or 3D if 3D data),
  • ProjectedCRS 2D (note: ProjectedCRS 3D with ellipsoidal height is supported by PROJ, but is still a bit esoteric)
  • CompoundCRS with horizontal part being a GeographicCRS 2D or a ProjectedCRS 2D and vertical part being a VerticalCRS
  • and perhaps, GeodeticCRS with cartesian geocentric coordinate system (X,Y,Z) (probably of marginal use)


For greater interoperability between implementations, data producers are
encouraged but not required to use
[OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) as the CRS of the data.
Data that are more appropriately represented in a particular projection may use
an alternate coordinate reference system. Data produced primarily for internal
and intermediate processing steps rather than final products are likely best
represented in the native projection of those data in order to avoid unnecessary
coordinate transformations.

The `crs` field may be explicitly set to `null` to indicate that there is no CRS
assigned to this column (CRS is undefined or unknown).

Each geometry column may use a different `crs` value, if appropriate.
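
To make the difference between a `null` CRS and a missing `crs` key concrete, here is a small sketch. The helper `crs_status` and the column names `geometry` and `centroid` are made up for illustration, and the metadata is assumed to be already parsed into Python dicts.

```python
# Hypothetical helper: distinguishes "crs is null" (unknown CRS, allowed)
# from "crs key is absent" (not allowed once crs is a required field).
_MISSING = object()


def crs_status(column):
    """Describe the CRS state of one geometry column's metadata dict."""
    crs = column.get("crs", _MISSING)
    if crs is _MISSING:
        return "invalid: crs key is missing"
    if crs is None:
        return "unknown"
    # A PROJJSON object normally carries a human-readable name.
    return crs.get("name", "unnamed CRS")


# Each geometry column carries its own crs value (column names are made up).
columns = {
    "geometry": {"crs": {"type": "GeographicCRS", "name": "WGS 84 longitude-latitude"}},
    "centroid": {"crs": None},
}
for name, column in columns.items():
    print(name, "->", crs_status(column))
```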

If an implementation is not CRS-aware and works exclusively with longitude,
latitude coordinates, the `crs` field should be set to the PROJJSON
representation of OGC:CRS84:

```json
{
    "type": "GeographicCRS",
    "name": "WGS 84 longitude-latitude",
    "datum": {
        "type": "GeodeticReferenceFrame",
        "name": "World Geodetic System 1984",
        "ellipsoid": {
            "name": "WGS 84",
            "semi_major_axis": 6378137,
            "inverse_flattening": 298.257223563
        }
    },
    "coordinate_system": {
        "subtype": "ellipsoidal",
        "axis": [
            {
                "name": "Geodetic longitude",
                "abbreviation": "Lon",
                "direction": "east",
                "unit": "degree"
            },
            {
                "name": "Geodetic latitude",
                "abbreviation": "Lat",
                "direction": "north",
                "unit": "degree"
            }
        ]
    },
    "id": {
        "authority": "OGC",
        "code": "CRS84"
    }
}
```

The CRS must be provided in [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_coordinate_reference_systems) version 2, also known as **WKT2**. WKT2 has several revisions, this specification only supports [WKT2_2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html).
OGC:CRS84 is equivalent to the well-known
[EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from
latitude-longitude to longitude-latitude. EPSG:4326 and OGC:CRS84 are
equivalent with respect to this specification because this specification
specifically overrides the coordinate axis order of the stored coordinates to be
longitude-latitude.

If CRS is not provided, then all coordinates in the geometry must use longitude, latitude to store their data.
If an implementation is CRS-aware and needs a CRS representation of the data it should assume a default value is [OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84). It's equivalent to the well-known [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) but changes the axis from latitude-longitude to longitude-latitude. The WKT2:2019 string for OGC:CRS84 is:
Implementations that are not CRS-aware and operate entirely with longitude,
latitude coordinates may be able to infer that coordinates conform to the
OGC:CRS84 CRS based on elements of the `crs` field. For simplicity, JavaScript
object dot notation is used to refer to nested elements.

```
GEOGCRS["WGS 84 (CRS84)",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic longitude (Lon)",east,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic latitude (Lat)",north,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Not known."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["OGC","CRS84"]]
```
The CRS is likely equivalent to OGC:CRS84 for a GeoParquet file if the `id`
element is present and matches one of the following:

* `id.authority` = `"OGC"` and `id.code` = `"CRS84"`
* `id.authority` = `"EPSG"` and `id.code` = `4326` (due to longitude, latitude ordering in this specification)

Member commented (on the sentence above):

Thanks for putting this together. I have some thoughts on it, but they seemed to be of bigger scope than just a line in a PR, getting into whether we might be able to tweak PROJJSON. See #98

The CRS is likely equivalent to OGC:CRS84 if all of the following are true:
* `"type"` = `"GeographicCRS"`
* `coordinate_system.axis[0].unit` = `"degree"`
* `coordinate_system.axis[1].unit` = `"degree"`
* the values for `coordinate_system.axis[n].direction` are `"east"` and `"north"` (in either order, for n in [0,1])

and at least one of the following is true:
* `datum.id.authority` = `"EPSG"` and `datum.id.code` = `6326`
* `datum_ensemble.id.authority` = `"EPSG"` and `datum_ensemble.id.code` = `6326`
* `datum_ensemble.ellipsoid.semi_major_axis` = `6378137` and `datum_ensemble.ellipsoid.inverse_flattening` = `298.257223563`
* `datum.ellipsoid.semi_major_axis` = `6378137` and `datum.ellipsoid.inverse_flattening` = `298.257223563`
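
The inference rules above could be sketched in Python roughly as follows. This is one illustrative reading of the lists, not a reference implementation: the names `is_likely_crs84`, `_has_id`, and `_is_wgs84_datum` are hypothetical, the `crs` value is assumed to be already parsed into a dict, only the first two axes are inspected, and `unit` is assumed to be the plain string `"degree"` rather than a nested unit object.

```python
WGS84_SEMI_MAJOR_AXIS = 6378137
WGS84_INVERSE_FLATTENING = 298.257223563


def _has_id(obj, authority, code):
    """True if obj has an id element with the given authority and code."""
    id_ = (obj or {}).get("id") or {}
    # Codes may be written as numbers or strings, so compare as strings.
    return id_.get("authority") == authority and str(id_.get("code")) == str(code)


def _is_wgs84_datum(datum):
    """True if a datum or datum_ensemble is EPSG:6326 or uses the WGS 84 ellipsoid."""
    if not datum:
        return False
    if _has_id(datum, "EPSG", 6326):
        return True
    ellipsoid = datum.get("ellipsoid") or {}
    return (ellipsoid.get("semi_major_axis") == WGS84_SEMI_MAJOR_AXIS
            and ellipsoid.get("inverse_flattening") == WGS84_INVERSE_FLATTENING)


def is_likely_crs84(crs):
    """Heuristic: does this parsed PROJJSON object look like OGC:CRS84?"""
    if not isinstance(crs, dict):
        return False  # null (unknown) CRS: nothing can be inferred
    # Rule 1: the top-level id identifies the CRS directly.
    if _has_id(crs, "OGC", "CRS84") or _has_id(crs, "EPSG", 4326):
        return True
    # Rule 2: geographic CRS in degrees with east/north axes ...
    axes = ((crs.get("coordinate_system") or {}).get("axis") or [])[:2]
    if crs.get("type") != "GeographicCRS" or len(axes) < 2:
        return False
    if any(axis.get("unit") != "degree" for axis in axes):
        return False
    if {axis.get("direction") for axis in axes} != {"east", "north"}:
        return False
    # ... and a WGS 84 datum or datum ensemble.
    return _is_wgs84_datum(crs.get("datum")) or _is_wgs84_datum(crs.get("datum_ensemble"))


# A CRS without any id element, recognised through the second rule.
no_id_crs = {
    "type": "GeographicCRS",
    "datum": {"ellipsoid": {"semi_major_axis": 6378137,
                            "inverse_flattening": 298.257223563}},
    "coordinate_system": {"axis": [
        {"direction": "east", "unit": "degree"},
        {"direction": "north", "unit": "degree"},
    ]},
}
print(is_likely_crs84(no_id_crs))
```

The result is deliberately a "likely" answer: a CRS that fails these checks may still be equivalent to OGC:CRS84, and a CRS-aware implementation would compare the full definitions instead.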

Due to the large number of CRSes available and the difficulty of implementing all of them, we expect that a number of implementations will start without support for the optional `crs` field.
Users are recommended to store their data in longitude, latitude (OGC:CRS84 or not including the `crs` field) for it to work with the widest number of tools. But data that is better served in particular projections can choose to use an alternate coordinate reference system. We expect many tools will support alternate CRSes, but encourage users to check to ensure their chosen tool supports their chosen crs.

#### epoch
