-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parameterize types with CRS? #441
Comments
Great find haha! Yes and no.. I added basic support for parameterizing types in order to support specifying the CRS for the GEOMETRY in the spatial extension, but im not sure the current implementation is ready to be used for that just yet. The casting semantics between parameterized/non-parameterized types is hard-coded and its not very elegant to propagate type mods across function calls (you have to implement a custom bind hook and inspect the argument types and set the output type manually). Additionally you can only pass numbers or strings (not identifiers) as type mods, their position matters (e.g. A(B,C) is not equal to A(C,B)) and they cannot be "named". So feel free to give it a shot, but just know this is very new code and it's not really used anywhere else yet and we may make more changes to it in the future. As a side note, why parameterized the GEOGRAPHY type? I thought s2 only handled pure sphere calculations... I guess for the metadata? Another side note, we'll probably adda a GEOGRAPHY type to spatial at some point in the future as well, would be cool if the extensions could interoperate nicely! |
Now, I think the bigger problem with tagging the type with the CRS is, what do you actually use as the modifier? In PostGIS there is a And honestly, Im not sure it makes a lot of sense to ship a
There's also some technical limitations, DuckDB can't really query a table to do a SRID lookup from inside a function call effectively so the actual backing storage would have to be something else... but we still want users to be able to insert their own CRS definitions so it can't just be a VIEW either. I suppose it would be possible to tag the GEOMETRY type not with an integer SRID but instead a more elaborate CRS definition inline... but I don't think that's great UX either if users have to provide e.g. a whole PROJJSON string in the DDL, like I'd be happy to get your thoughts on this! |
Cool! No pressure to do this if it's not ready yet, although I do hope you go for the "CRS is a property of the type" rather than the "item level SRID" model. This is not a critique in any way because I know this is an infinite amount of work and that you know this already, but silently dropping the CRS everywhere is a common source of silently erroneous answers (and causes general distrust from the hardcore spatial folks).
The engine I'm using (s2geometry) doesn't care, but others (geographiclib) do because they use the elipsoid for distance calculation and (sometimes) edge intersection. There's also more than one lon/lat (e.g., NAD83/NAD27/WGS84), which applies equally to both. I'll probably punt this problem for later but I thought maybe eating that complexity early would prevent a massive refactor. It sounds like I might need a refactor either way until this gets sorted from DuckDB's end so I'll probably punt 🙂.
A string! People get into some pretty heavy flame wars about exactly what to use here but a string is what PROJ needs to construct a CRS and is what your (external) data sources will provide. PROJJSON is nice if you (or another extension) want(s) to circumvent PROJ to accelerate some common transformations but really all you need is something that works from GDAL/PROJ's "make me a CRS from a string".
I agree you don't want this here...this is one of the best arguments against item level CRS-as-srid for me: you can't very well store a full CRS definition at the element level, and if you don't do that you need a place to look them up. The only reason to do this is if you absolutely cannot parameterize your type (e.g., PostgreSQL) or if you need to integrate tightly with PostGIS as a source.
Typing that is obviously garbage but in terms of ensuring people reading from your table aren't going to get silently incorrect answers, you really do want a full definition of some kind there. Not sure if function calls are allowed there but
I'm not sure if my thing will amount to any more than a fun project to learn about DuckDB, but it's worth considering where this extension ends and another begins. There is more than one instance in the spatial universe of everything getting lumped into one big kitchen sink of spatial and I hope that can be avoided here. |
Apologies if I've already opened this! I have a vague recollection of discussing this at some point and having the answer be that it wasn't possible to register a parameterized type from an extension (at least then, which was probably quite a while ago). I'm having a go at porting s2geography as a DuckDB extension and ran across:
https://github.com/duckdb/duckdb/blob/e791508e9bc2eb84bc87eb794074f4893093b743/test/extension/loadable_extension_demo.cpp#L255-L261
...which seems to be a parameterized "bounded" type registered from an extension. So maybe it's now possible?
(Also loads of precedent for item-level CRS specification with PostGIS and alikes, but also quite a bit for type-level CRS with GeoParquet/GeoPandas/GeoArrow/R+sf!)
The text was updated successfully, but these errors were encountered: