Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decide about supporting geometry spatial filter #32

Closed
brendan-ward opened this issue Jan 28, 2022 · 5 comments
Closed

Decide about supporting geometry spatial filter #32

brendan-ward opened this issue Jan 28, 2022 · 5 comments

Comments

@brendan-ward
Copy link
Member

Fiona allows users to provide a spatial filter geometry for filtering records on read. This is currently used by GeoPandas when reading features with Fiona.

If unimplemented here, GeoPandas could instead use a bounding box filter and then use internal spatial predicate methods to filter those further to features with geometries that actually intersect, and do so in ways that are consistent with how those predicates are used elsewhere in GeoPandas.

From the description of the spatial filter in GDAL/OGR, it sounds like the worst case is that GDAL uses a bounding box filter (bounds of filter geometry), and in the best case may do a bit better.

Given that there is some complexity involved in transporting the filter geometry from the Python tier to GDAL (what format/representation?), I'm not sure we want to implement this in pyogrio anytime soon. In order to best guide use of pyogrio in GeoPandas, we should make a decision here about this feature.

crossref: geopandas/geopandas#2225

@martinfleis
Copy link
Member

I am not able to assess the complexity of an implementation of this. But I think that for the time being, we'll be fine with bbox. If user specifies mask in geopandas together with engine="pyogrio" we should simply raise an error, rather than trying to do the filtering under the hood (I think).

The main question is what to do in a long-term after the switch of the default to pyogrio, if we decide not to implement a spatial filter here. Deprecate the keyword? Do some magic under the hood (I'd rather not as it is not very transparent)?

From the perspective of geopandas, it would obviously be optimal to have it but as I said, I am not able to tell how hard that would be so we can try to figure out another solution if needs be.

@jorisvandenbossche
Copy link
Member

Given that the mask filter might be the same as bbox or a bit stricter (without the user really knowing when which is the case), I personally also think it is fine for now to only expose the bbox filter.

The main question is what to do in a long-term after the switch of the default to pyogrio, if we decide not to implement a spatial filter here. Deprecate the keyword? Do some magic under the hood (I'd rather not as it is not very transparent)?

To be clear, there is a spatial filter, just a bbox filter and not geometry filter. I think that for many use cases this will be sufficient.

If we want to keep the mask keyword in geopandas, I think it would actually be better to do it under the hood with a basic intersects filter, as then we can at least ensure consistent and predictable behaviour across formats.

Given that there is some complexity involved in transporting the filter geometry from the Python tier to GDAL (what format/representation?),

For the writing code, we use WKB to create the OGR Geometry? I suppose the same could be done here if we wanted to add it.

@brendan-ward
Copy link
Member Author

use WKB to create the OGR Geometry

I was thinking this too. User would provide this as a shapely geometry object, we'd convert that to WKB and pass into the Cython tier to use with GDAL.

It sounds like for now the general consensus here is that there is no immediate need to add in a mask filter in pyogrio separate from using bbox and applying predicates downstream. I'm going to close this this issue as record of that decision.

@Jaapel
Copy link

Jaapel commented Oct 12, 2023

I know this discussion happened 2 years. Looking at https://pyogrio.readthedocs.io/en/latest/introduction.html#filter-records-by-a-geometry I see that it is documented to be available, however, when looking at the code and trying it out myself, the mask keyword does not seem to do anything. Anything I missed?

@martinfleis
Copy link
Member

@Jaapel that is a recent addition available only in the dev version as of now. See #285 for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants