-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make calculating spatial/datetime extents optional #143
Conversation
Approach sounds 👌 Let's add some tests to validate 🙏 |
The one thing I was wondering is if we should skip spatial/temporal columns altogether instead of passing those variables through to that function, but it seems like we need other attributes from the spatial columns besides
I added a few tests! |
I think it's fine to have them separate. Most people will have spatial index (hopefully) so them might want to avoid only the temporal calculation |
Let's wait for @bitner 👍 👎 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give my approval here, but I do have some concerns with going down this road.
- Having the spatial / temporal extents will be required if (when) we implement tooling to create PgSTAC items from the catalog of TiPG layers.
- This is basically kicking the can down the road. If appropriate indexes are not available on spatial/temporal columns, while the creation of the catalog may be faster, any usage of tables without those indexes is going to bog down and be incredibly slow anyway.
Currently, TiPG is stateless. It makes no changes (outside of the temporary schema) to the database at all and other than the CatalogUpdateMiddleware does not provide any caching or other state management within the application level.
In my opinion TiPG is meant to expose a Postgres/PostGIS database as-is and I don't think that TiPG should ever modify the structure of tables/functions it is exposing (ie renaming columns or adding indexes etc). I do think that in order to actually be scalable, we do need to enable options to manage/store state beyond the scope of a single instance. I do want to continue to have a version of TiPG that (similar to pg_featureserv) that does not touch the database at all - but, of course, has the caveat that there will be limitations to how well it can scale. I do think that it would be good to enable broadened ability to store cached state beyond the single instance by adding additional backends that can store state. Currently, the catalog creation on startup and the CatalogControlMiddleware just use the App State. In order to scale for something like a Serverless environment, we should allow for different ways to store this metadata - this could be storing it in a tipg schema within the database (or even possibly a different database), storing the catalog in an S3 bucket as json, and/or storing using something like Redis/Memcache.
I will note that when deploying TiPG in a serverless environment, you will most certainly want to make sure that you are using some type of connection pooling (ie pgbouncer) as without, it can be very very easy to hit the max_connections of the database, which if you do not have your database configured with appropriate statement_timeout or idle_in_transaction_timeout could fully lock your database up from any further use.
@bitner I think
|
As discussed in #139, Calculating the extent of spatial/datetime columns can be slow enough that serverless deployments of
tipg
are not possible.Skipping the extent calculations is one way to make application startup almost instantaneous! With this change, the user can set
TIPG_DB_SPATIAL_EXTENT=False
andTIPG_DB_DATETIME_EXTENT=False
to skip these calculations.I am not sure if passing these arguments all the way through to
pg_temp.tipg_properties
is the best way to accomplish this, but I got it working. If there is a cleaner way that you can think of please let me know!