Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarified use of geometries in aggregate_spatial processes #253

Merged
merged 4 commits into from
May 26, 2021

Conversation

m-mohr
Copy link
Member

@m-mohr m-mohr commented May 18, 2021

Improvements for aggregate_spatial processes, see #252:

  • Clarified: One value will be computed per GeoJSON geometry, which means that, for example, a single value will be computed for a MultiPolgon, but two values will be computed for a FeatureCollection or GeometryCollection containing two polygons.
  • Moved the details about geometries from the process description to the description of the parameter geometries to have related descriptions in a single place.

This behavior of MultiPolygons / Polygons is aligned with the VITO implementation AFAIK. Not sure about GeometryCollections.

…* geometries are a single entity in computations. GeometryCollections are considered being multiple entities. #252
@m-mohr m-mohr added this to the 1.1.0 milestone May 18, 2021
@m-mohr m-mohr requested review from soxofaan and jonathom May 18, 2021 16:21
@m-mohr m-mohr linked an issue May 18, 2021 that may be closed by this pull request
@edzer
Copy link
Member

edzer commented May 18, 2021

Hmm. According to the simple feature access standard, a GeometryCollection is the geometry for a single feature, and it sounds more logical to me that you'd like the aggregated value over that collection. For a FeatureCollection (not a concept in sfa), it makes sense to return the aggregate value for each feature.

@mkadunc
Copy link
Member

mkadunc commented May 18, 2021

I agree with Edzer - GeomCollection is a single geometry in the same way as a MultiPolygon.

I propose we require the parameter to always be an array, e.g. a FeatureCollection, and the aggregation happens over each of elements of the array (FeatureCollection).

I would try to avoid API with branching behaviour based on (runtime) type of parameter objects - such things tend to be difficult to document and cause weird issues.

@m-mohr
Copy link
Member Author

m-mohr commented May 18, 2021

Indeed, thanks for sounding in. I was wondering about that, too, and was unsure about GeometryCollection. I should have had a look into the spec. So instead of making it "per Geometry", the wording should be about "per Feature" (with plain geometries being a single Feature). I'll change it accordingly.

We can't change the actual input data type right now to not introduce breaking changes, but I think the actual input data type is okay, it just needs a clarification. @mkadunc

@m-mohr m-mohr self-assigned this May 18, 2021
@mkadunc
Copy link
Member

mkadunc commented May 18, 2021

It’s OK indeed. I guess we can’t formally specify geojson FeatureCollection as param type, so we could just clarify in text that this is the kind of geojson that is expected.

@m-mohr
Copy link
Member Author

m-mohr commented May 19, 2021

Okay, I rewrote that part:

One value will be computed per GeoJSON Feature, Geometry or GeometryCollection. For a FeatureCollection multiple values will be computed, one value per contained Feature. For example, a single value will be computed for a MultiPolygon, but two values will be computed for a FeatureCollection containing two polygons.

@m-mohr m-mohr requested review from mkadunc and edzer May 19, 2021 08:39
@m-mohr m-mohr force-pushed the clarify-use-of-geometries branch from e26e59f to 91eee13 Compare May 19, 2021 08:55
Copy link
Member

@jonathom jonathom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, seems logically straightforward and understandable to me.

Co-authored-by: Jonathan Bahlmann <[email protected]>
@soxofaan
Copy link
Member

So this would make the current VITO GeometryCollection handling invalid ? That would be annoying as we have internal use cases depending on it, and there is also a lot of documentation/demo's/webinars showing this "wrong way".

@m-mohr
Copy link
Member Author

m-mohr commented May 19, 2021

How do you handle GeometryCollection? @soxofaan

@soxofaan
Copy link
Member

we return a separate aggregation for each geometry in the collection

@m-mohr
Copy link
Member Author

m-mohr commented May 19, 2021

we return a separate aggregation for each geometry in the collection

Hmm... the issue here is that GeometryCollection is defined to be a Geometry itself, which makes sense to me if you'd, for example, want to run a use case, which e.g. aggregates over multiple geometry types at once. There's no other way to achieve this except for using a geometry collection.

Unfortunately, this was not defined well enough, but looking at SFA there's probably no other way to interpret it in a consistent way with SFA. And for interoperability, we should define it now rather than leaving it open...

Edit: The only other implementation of aggregate_spatial right now is Sentinel Hub. Let me quickly confirm how they handle those cases.

@jdries
Copy link
Contributor

jdries commented May 19, 2021

Perhaps we can also repeat the warning that's included in the geojson spec:

To maximize interoperability, implementations SHOULD avoid nested
GeometryCollections. Furthermore, GeometryCollections composed of a
single part or a number of parts of a single type SHOULD be avoided
when that single part or a single object of multipart type
(MultiPoint, MultiLineString, or MultiPolygon) could be used instead.

And reflect that in our own examples by not using geometrycollection at all. It's anyway a type that you would use for features composed of a combination of polygons and lines or points, so rather exotic.
That way, our implementation can stay as-is for now, and we gradually move away from this annoying type.
I also believe that if the user sends a featurecollection, which uses geometrycollection to define feature geometry, we would still do the right thing.

@edzer
Copy link
Member

edzer commented May 19, 2021

I also believe that if the user sends a featurecollection, which uses geometrycollection to define feature geometry, we would still do the right thing.

@jdries according to https://datatracker.ietf.org/doc/html/rfc7946#page-12 , a featureCollection is an array of feature objects; each feature object needs to have a geometry (section 3.2). Sets of geometries for multiple features should never be put in a GeometryCollection, that is how I read this. If a single feature has a GeometryCollection as geometry type, so be it.

I agree that we can discourage use of GeometryCollection in general. They are part of a formalism, and hence "needed", but almost always a nuisance, I see them rarely used intentionally.

@m-mohr
Copy link
Member Author

m-mohr commented May 19, 2021

I've added the following:

To maximize interoperability, a nested GeometryCollection should be avoided. Furthermore, a GeometryCollection composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. MultiPolygon).

GEE: has two processes, reduceRegion (takes a Geometry [including a GeometryCollection, in GEE terminology a 'MultiGeometry'], returns one value) and reduceRegions (takes a FeatureCollectin, returns multiple results).

Sentinel Hub: Seems to require passing an array. Have not looked into details, maybe Miha can sound in how they implement it?

@jonathom
Copy link
Member

GEE: has two processes, reduceRegion ...

@m-mohr I can't seem to find that process (in the web editor), what am I missing here?

@soxofaan I've been failing in passing multiple geometries to VITO back-end. Could you kindly explain how to do this? Passing geojson FeatureCollection (js client) doesn't work, passing shapely MultiPolygon results in only one output (which is expected I think, but shapely talks about Multi~ Collections as geometry collections, which completes my confusion). Already mentioned here.

@m-mohr
Copy link
Member Author

m-mohr commented May 19, 2021

@jonathom I was referring to the native GEE functions, not openEO processes.

@mkadunc
Copy link
Member

mkadunc commented May 19, 2021

Sentinel Hub: Seems to require passing an array. Have not looked into details, maybe Miha can sound in how they implement it?

Sentinel Hub service natively handles either a single bbox, a single Polygon or a single MultiPolygon; all these produce just one aggregation (i.e. MultiPolygon is treated as a single Geometry for the purpose of calculating statistics).

The sentinel-hub-python-driver does indeed expect the geometries argument to be an array of geometry objects. This is the relevant part of the code:

geometries = self.validate_parameter(arguments, "geometries", required=True, allowed_types=[list])

@soxofaan
Copy link
Member

I've been failing in passing multiple geometries to VITO back-end. Could you kindly explain how to do this?

@jonathom it is illustrated in this notebook: https://github.com/Open-EO/openeo-python-client/blob/master/examples/notebooks/openeo-terrascope-webinar.ipynb (scroll down half way, to the section about "time series")
In the Python client you pass a GeometryCollection object, and the VITO backend will calculate a separate aggregation for each geometry in that collection

@m-mohr
Copy link
Member Author

m-mohr commented May 20, 2021

Is anyone up for a final review?

PS: The openEO Sentinel Hub implementation is a bit confusing as it says in the process description of the back-end that it expects an object, but in reality expects an array.

@m-mohr
Copy link
Member Author

m-mohr commented May 26, 2021

Merging for now as no new feedback came in. If there's anything to discuss, please open a new issue.

@m-mohr m-mohr merged commit 7517219 into draft May 26, 2021
@m-mohr m-mohr deleted the clarify-use-of-geometries branch May 26, 2021 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify that Multi* geometries are a single entities in computations
6 participants