Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfcs: geospatial feature #47762

Merged
merged 1 commit into from
May 1, 2020
Merged

rfcs: geospatial feature #47762

merged 1 commit into from
May 1, 2020

Conversation

sumeerbhola
Copy link
Collaborator

Release note: None

@sumeerbhola sumeerbhola requested review from rytaft, otan, petermattis, awoods187 and a team April 21, 2020 16:29
@sumeerbhola sumeerbhola requested a review from a team as a code owner April 21, 2020 16:29
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@blathers-crl
Copy link

blathers-crl bot commented Apr 21, 2020

❌ The GitHub CI (Cockroach) build has failed on f44e1ba2.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, @petermattis, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 132 at r1 (raw file):

  the GEOS/SFCGAL libraries that have the builtins we require (see the
  builtins section), whilst also allowing some simple operations to be
  done without converting to GEOS/SFCGAL. In future, we will gradually

[nit] In the future


docs/RFCS/20200421_geospatial.md, line 170 at r1 (raw file):

  * MULTILINESTRING
  * MULTIPOLYGON
  * GEOMETRY (can be any shape except GEOMETRY) 

This feels like it needs more explanation...


docs/RFCS/20200421_geospatial.md, line 173 at r1 (raw file):

  * GEOMETRYCOLLECTION (can be any of the MULTI- shapes)
  
* In future, we will look to support the following shapes for the

[nit] In the future


docs/RFCS/20200421_geospatial.md, line 214 at r1 (raw file):

any.

Casts between other supported types (bbox, text and bytea) will also

what's bbox?


docs/RFCS/20200421_geospatial.md, line 255 at r1 (raw file):

CREATE INDEX [indexname] ON [tablename] USING GIST ( [geospatial_column] );

Will this create an inverted index?


docs/RFCS/20200421_geospatial.md, line 361 at r1 (raw file):

  locations. This is a potential source of confusion, which should be
  well documented.
  * We should include GEOS the docker containers we ship.

[nit] in the docker containers


docs/RFCS/20200421_geospatial.md, line 397 at r1 (raw file):

In PostGIS, spherical calculations are done within PostGIS, whilst
hard values (i.e. is metres, as opposed to "whether they intersect")

is meters? not sure I understand what this is saying


docs/RFCS/20200421_geospatial.md, line 412 at r1 (raw file):

Most geography functions that can optionally operate on a spheroid
(e.g. [ST_Area](https://postgis.net/docs/ST_Area.html)) have an

[nit] a default


docs/RFCS/20200421_geospatial.md, line 498 at r1 (raw file):

shape as keys. This is especially useful for ST_Distance/ST_DWithin
for repeated use of these operations. We may similarly look into doing
caching certain operations, but this is out of scope for v20.2.

caching of


docs/RFCS/20200421_geospatial.md, line 540 at r1 (raw file):

### Spatial Reference Identifiers (SRIDs) and Projections

SRIDs are a number which map to a projection of points on a

maps


docs/RFCS/20200421_geospatial.md, line 565 at r1 (raw file):

We need to be able to support some notion of a spatial_ref_sys table,
which allows management of user-defined insertable spatial_ref_sys
data. We can stick this as a `system` table, which will come preloaded

"stick this as" sounds a bit off


docs/RFCS/20200421_geospatial.md, line 567 at r1 (raw file):

data. We can stick this as a `system` table, which will come preloaded
with the same set of SRIDs. However, we will have additional
restrictions as we are no longer on the public schema:

I think you might mean something else here... system tables are in the public schema, but in the system catalog/database. I think the Postgres concepts are slightly different here...


docs/RFCS/20200421_geospatial.md, line 582 at r1 (raw file):

##### Availability on the Public Schema

For cross compatibility with PostGIS and it's nature as an extension

it's -> its


docs/RFCS/20200421_geospatial.md, line 588 at r1 (raw file):

names in single object resolution (e.g. `SELECT * FROM
spatial_ref_sys`) as well as double object resolution for the public
schema (e.g. `SELECT * FROM public.spatial_ref_sys`).

Does public.spatial_ref_sys really need to work? If not, we could store a view of spatial_ref_sys in the pg_extension schema you mention below, and have the search_path include that schema.


docs/RFCS/20200421_geospatial.md, line 613 at r1 (raw file):

around the [PROJ](https://proj.org/) library by transforming from one
proj4text to another. The library has a lot of extra additions that
may be considered extra for our use cases (e.g. requiring the

by "considered extra" you mean unnecessary?


docs/RFCS/20200421_geospatial.md, line 894 at r1 (raw file):

are numbered c[4*i+1]...c[4*i+4]. The following depicts this covering
as a tree with the leaf cells being the covering cells. Note that the
paths to the leaves to the root are not the same length.

from the leaves (or from the root)


docs/RFCS/20200421_geospatial.md, line 1364 at r1 (raw file):

With the S2 encoding mechanism, we can tell which are the items
intersect with the single cell covering of the Upper East Side by

intersecting


docs/RFCS/20200421_geospatial.md, line 1384 at r1 (raw file):

same, 129. So the optimizer can predict that the name filter will
match 2 rows. This filter will be pushed below the join, so the input
to the join from `nyc_neighborhoods` is 2.

2 rows


docs/RFCS/20200421_geospatial.md, line 1425 at r1 (raw file):

QUERY PLAN
-------------------------------------------------------------------------------------- project

project should be on a new line


docs/RFCS/20200421_geospatial.md, line 1534 at r1 (raw file):

  roachtests

Due to certain restrictions, we will be unable to utilize

what kind of restrictions?

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quite make it all the way through, but better to send out the comments I have.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 87 at r1 (raw file):

Geospatial is exposed through two new SQL types: GEOMETRY (planar) and
GEOGRAPHY (spheroid, which only works on latitude/longitude

The mention of latitude/longitude here is a bit confusing. Latitude/longitude is a specific spatial reference system. We'll be supporting other spatial reference systems in the future.


docs/RFCS/20200421_geospatial.md, line 102 at r1 (raw file):

* MULTIPOLYGON
* GEOMETRYCOLLECTION
* GEOMETRY

There is a "GEOMETRY" shape that is a "GEOMETRY" type? Or is this a reference to the GEOMETRY type? Something seems confused here, possibly me.


docs/RFCS/20200421_geospatial.md, line 125 at r1 (raw file):

* Geometry - for planar functionality compliant with the OGC spec. For
  the 20.2, we will look at only supporting 2D geometries, but still

Nit: s/the 20.2/20.2/g


docs/RFCS/20200421_geospatial.md, line 129 at r1 (raw file):

  a wrapper around the
  [twpayne/go-geom](https://github.com/twpayne/go-geom) library (BSD
  license), which allows us to easily transform the datum for use in

I don't think we need to call out the license for each of these dependencies. We're already doing the legwork to make sure the licenses are kosher.


docs/RFCS/20200421_geospatial.md, line 170 at r1 (raw file):

  * MULTILINESTRING
  * MULTIPOLYGON
  * GEOMETRY (can be any shape except GEOMETRY) 

This is super confusing. Is this some sort of wildcard type? I think some more words here, or a pointer to PostGIS docs, would be helpful.


docs/RFCS/20200421_geospatial.md, line 251 at r1 (raw file):

To maintain compatibility with PostGIS and existing tools, the syntax
to create indexes will involve using GiST, which will initialize with

Perhaps reinforce that while we're coopting the GIST syntax, we are not using an S2-based index underneath the covers.


docs/RFCS/20200421_geospatial.md, line 279 at r1 (raw file):

The default bounding box for 2D geometry will be decided after
consultation with customers.

Nit: s/customers/users/g


docs/RFCS/20200421_geospatial.md, line 285 at r1 (raw file):

For Geometry and Geography types, we will serialize the data as
EWKB. This allows us flexibility in changing libraries later, as well

How complex is the EKWB format? I believe we're currently using a C library for encoding/decoding. Are there any Go libraries? Would creating a Go library be easy or hard? I think the C library is fine in the near term, but I imagine we'll want to be able to perform serialization/deserialization in Go in the medium to long term.


docs/RFCS/20200421_geospatial.md, line 333 at r1 (raw file):

PostGIS mainly defers difficult 2D geometry functionality to the
[GEOS](https://trac.osgeo.org/geos/) library - however, some operations
are done in PostGIS itself if taking time to port from the internal

Nit: s/port/convert/g


docs/RFCS/20200421_geospatial.md, line 361 at r1 (raw file):

  locations. This is a potential source of confusion, which should be
  well documented.
  * We should include GEOS the docker containers we ship.

Nit: s/GEOS the/GEOS in the/g


docs/RFCS/20200421_geospatial.md, line 365 at r1 (raw file):

    steps for including the necessary geospatial libraries if the user
    wishes to play with geospatial data types (i.e. either run `sudo
    apt-get install libgeos-dev` / `brew install geos` before copying

I've been imagining we'd just ship libgeos in our tarballs. I'd really really really like to avoid using system installed dependencies.

@rytaft
Copy link
Collaborator

rytaft commented Apr 23, 2020

cc @cockroachdb/sql-opt-prs in case you have feedback on the optimizer sections here.

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the adjustments and squashed them in [is that how we do it?] (cc @sumeerbhola, you may want to force pull before making more changes)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @petermattis, @rytaft, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 87 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

The mention of latitude/longitude here is a bit confusing. Latitude/longitude is a specific spatial reference system. We'll be supporting other spatial reference systems in the future.

right, but the GEOGRAPHY type ONLY works for lat/lng. helped clear it up a bit.


docs/RFCS/20200421_geospatial.md, line 102 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

There is a "GEOMETRY" shape that is a "GEOMETRY" type? Or is this a reference to the GEOMETRY type? Something seems confused here, possibly me.

Don't blame you. There is a GEOMETRY shape and a GEOMETRY type.


docs/RFCS/20200421_geospatial.md, line 125 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: s/the 20.2/20.2/g

Done.


docs/RFCS/20200421_geospatial.md, line 129 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I don't think we need to call out the license for each of these dependencies. We're already doing the legwork to make sure the licenses are kosher.

Done.


docs/RFCS/20200421_geospatial.md, line 132 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] In the future

Done.


docs/RFCS/20200421_geospatial.md, line 170 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This is super confusing. Is this some sort of wildcard type? I think some more words here, or a pointer to PostGIS docs, would be helpful.

Yes! fixed.


docs/RFCS/20200421_geospatial.md, line 170 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This feels like it needs more explanation...

Done.


docs/RFCS/20200421_geospatial.md, line 173 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] In the future

huh, does "in future" not work grammatically?


docs/RFCS/20200421_geospatial.md, line 214 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

what's bbox?

bounding box type. we're not supporting it anymore.


docs/RFCS/20200421_geospatial.md, line 251 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Perhaps reinforce that while we're coopting the GIST syntax, we are not using an S2-based index underneath the covers.

You mean we are right?


docs/RFCS/20200421_geospatial.md, line 255 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Will this create an inverted index?

Yes. mentioned.


docs/RFCS/20200421_geospatial.md, line 285 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

How complex is the EKWB format? I believe we're currently using a C library for encoding/decoding. Are there any Go libraries? Would creating a Go library be easy or hard? I think the C library is fine in the near term, but I imagine we'll want to be able to perform serialization/deserialization in Go in the medium to long term.

It's already got a Go-backed serdes in go-geom. It's WKT that we haven't got.
It's also very easy to roll our own.


docs/RFCS/20200421_geospatial.md, line 333 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: s/port/convert/g

Done.


docs/RFCS/20200421_geospatial.md, line 361 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] in the docker containers

Done.


docs/RFCS/20200421_geospatial.md, line 361 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: s/GEOS the/GEOS in the/g

Done.


docs/RFCS/20200421_geospatial.md, line 365 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I've been imagining we'd just ship libgeos in our tarballs. I'd really really really like to avoid using system installed dependencies.

Done.


docs/RFCS/20200421_geospatial.md, line 397 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

is meters? not sure I understand what this is saying

Fixed it up.


docs/RFCS/20200421_geospatial.md, line 412 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] a default

Done.


docs/RFCS/20200421_geospatial.md, line 498 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

caching of

hmm, think "caching of" makes it make less sense. reworded it more.


docs/RFCS/20200421_geospatial.md, line 588 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Does public.spatial_ref_sys really need to work? If not, we could store a view of spatial_ref_sys in the pg_extension schema you mention below, and have the search_path include that schema.

I believe it does, but I'm being paranoid of third party tools.


docs/RFCS/20200421_geospatial.md, line 1364 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

intersecting

reworded

@blathers-crl
Copy link

blathers-crl bot commented Apr 23, 2020

❌ The GitHub CI (Cockroach) build has failed on be5b607a.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, @petermattis, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 173 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

huh, does "in future" not work grammatically?

Don't think so if you're using "future" as a noun.


docs/RFCS/20200421_geospatial.md, line 498 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

hmm, think "caching of" makes it make less sense. reworded it more.

Then maybe remove the word "doing" before "caching"


docs/RFCS/20200421_geospatial.md, line 1091 at r2 (raw file):

    the selectivity calculation for lookup joins in cases where the
    index is inverted.

I'd add a point here saying something like:

"Since the set operations required for each row make selectivity calculation on geospatial lookup joins especially difficult, we may consider adding a new type of histogram for geospatial data. The histogram buckets would be ranges of S2 cell IDs, and the counts would represent the number of objects in the geospatial column that overlap that cell. By joining histograms on two geospatial columns, we could estimate the selectivity of a real join on those columns."

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 251 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

You mean we are right?

Oops. Yes, "we are".


docs/RFCS/20200421_geospatial.md, line 256 at r2 (raw file):

To maintain compatibility with PostGIS and existing tools, the syntax
to create indexes will involve the same syntax as using GiST, which will
initialize with the indexes with the S2-backed inverted index undernearth.

Nit: s/with the indexes/the indexes/g

@petermattis
Copy link
Collaborator

This is a very detailed RFC! Let's get geospatial built!

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @petermattis, @rytaft, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 251 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Oops. Yes, "we are".

Done.


docs/RFCS/20200421_geospatial.md, line 279 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: s/customers/users/g

Done.


docs/RFCS/20200421_geospatial.md, line 588 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

I believe it does, but I'm being paranoid of third party tools.

Actually, let's go with your solution for now! Makes more sense.


docs/RFCS/20200421_geospatial.md, line 1534 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

what kind of restrictions?

licensing, but I'm told to avoid mentioning it in the RFC.


docs/RFCS/20200421_geospatial.md, line 256 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: s/with the indexes/the indexes/g

Done.

@blathers-crl
Copy link

blathers-crl bot commented Apr 25, 2020

❌ The GitHub CI (Cockroach) build has failed on 7b97256b.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, @petermattis, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 588 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

Actually, let's go with your solution for now! Makes more sense.

Sounds good.

I might be wrong, but I'm pretty sure the search path only includes schemas, not catalogs. It's a bit confusing since we have two very similar concepts:

  1. We have a system catalog, which is a database that contains real tables, like the jobs table, users table, etc.
  2. Separately, there is a pg_catalog schema, which is implicitly included in the search path (in addition to public) and includes virtual tables that allow us to be compatible with Postgres. Those virtual tables are basically views over the real tables in the system catalog.

So I think the sentence "...will look at the system catalog after looking at the public one" isn't quite right. Instead, you might want to say "...will look at the pg_extension schema after looking at public and pg_catalog".


docs/RFCS/20200421_geospatial.md, line 1534 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

licensing, but I'm told to avoid mentioning it in the RFC.

got it - thanks!

Copy link
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm making a pass now -- not yet done (the r4 commit was me but shows up as otan in reviewable which is confusing)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, @petermattis, @rytaft, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 214 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

bounding box type. we're not supporting it anymore.

fixed typo s/bytea/bytes/


docs/RFCS/20200421_geospatial.md, line 540 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

maps

I've updated the text here because not all SRIDs are projections on a plane.


docs/RFCS/20200421_geospatial.md, line 585 at r3 (raw file):

For cross compatibility with PostGIS and its nature as an extension
which injects tables into the public schema, we need this table to be
available resolvable from anywhere by simply requesting "geometry_columns",

updated
s/available resolvable/resolvable/

@blathers-crl
Copy link

blathers-crl bot commented Apr 26, 2020

❌ The GitHub CI (Cockroach) build has failed on da2ae3e3.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, @petermattis, and @rytaft)


docs/RFCS/20200421_geospatial.md, line 894 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

from the leaves (or from the root)

fixed


docs/RFCS/20200421_geospatial.md, line 1384 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

2 rows

Done.


docs/RFCS/20200421_geospatial.md, line 1425 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

project should be on a new line

Done.


docs/RFCS/20200421_geospatial.md, line 1091 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'd add a point here saying something like:

"Since the set operations required for each row make selectivity calculation on geospatial lookup joins especially difficult, we may consider adding a new type of histogram for geospatial data. The histogram buckets would be ranges of S2 cell IDs, and the counts would represent the number of objects in the geospatial column that overlap that cell. By joining histograms on two geospatial columns, we could estimate the selectivity of a real join on those columns."

Done.

@blathers-crl
Copy link

blathers-crl bot commented Apr 26, 2020

❌ The GitHub CI (Cockroach) build has failed on 0f322c51.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@otan
Copy link
Contributor

otan commented Apr 26, 2020 via email

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, 1 of 1 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, and @petermattis)

Copy link
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is bytea though

undid it and formatted as code

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @otan, @petermattis, and @rytaft)

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @awoods187, @knz, @otan, and @rytaft)


docs/RFCS/20200421_geospatial.md, line 103 at r7 (raw file):

Previously, knz (kena) wrote…

Something I don't understand. You've commented on issue #21286 that "it's different from PostGIS" but the list of types here is largely the same as that on issue 21286.

How much of the work documented here will help solve issue #21286?

The underlying builtin machinery may be able to be re-used; however disk serialization and key storage as well as the inverted index ll_to_earth works a little differently and deserves more treatment and thought than this RFC alright provides.


docs/RFCS/20200421_geospatial.md, line 336 at r7 (raw file):

Previously, knz (kena) wrote…

PostgreSQL does allows using GEOMETRY and other geometry/geography types as primary keys.

What is the limitation here exactly? How common is this in user apps? How much are we giving up by not supporting it?
I think this compatibility aspect should be called out explicitly.

The limitation here is that we will have completely different ordering.
I don't see any tutorial code or tools that require using these columns as primary keys.
Furthermore, the primary key layout will have a completely different ordering without copying licensed code.
I think it's safer to avoid this for 20.2. We can look at consulting with customers and seeing if this is an issue and can reprioritize before 20.2 if that is the case.


docs/RFCS/20200421_geospatial.md, line 403 at r7 (raw file):

Previously, knz (kena) wrote…

Can you outline what CockroachDB looks like when you run it without the geos module:

  • what kind of error messages are reproted if you try to use the feature?
  • what happens if you initialize some geometry/geography data in a cluster with the geos module loaded, then try to access it later without the module loaded/

Done.


docs/RFCS/20200421_geospatial.md, line 531 at r7 (raw file):

Previously, knz (kena) wrote…

What's the maximum size of this cache? How do entries get invalidated? What's the replacement method?

I've deliberately avoided detailing that here as we're not looking at implementing it.
It's almost certainly worth it's own RFC when it happens, and would prefer to avoid going to such detail in this RFC.
I've added a comment about a further RFC coming if it happens.


docs/RFCS/20200421_geospatial.md, line 609 at r7 (raw file):

Previously, knz (kena) wrote…

This sounds like a compatibility issue waiting to happen. Can you not make the table entry visible in every table's public catalog, and populate it on-demand the first time it gets a KV scan?

This is especially important in multi-tenant: we don't want to pay the price of populating a system table upfront if the feature ends up never being used by a tenant.

Fixed to something akin to your suggestion.


docs/RFCS/20200421_geospatial.md, line 627 at r7 (raw file):

Previously, knz (kena) wrote…

This does not seem right.
If you look in detail you will discover that the reason why "it works" is that the extension populates its stuff in a pg schema that is part of the search_path.
It's either pg_catalog or some other schema that automatically gets added to the search_path.

I'd recommend investigating the same here. A special case to look up things in system is also pain waiting to happen.

I've changed this to match your idea mentioned above.

@otan
Copy link
Contributor

otan commented Apr 28, 2020


docs/RFCS/20200421_geospatial.md, line 336 at r7 (raw file):

Previously, otan (Oliver Tan) wrote…

The limitation here is that we will have completely different ordering.
I don't see any tutorial code or tools that require using these columns as primary keys.
Furthermore, the primary key layout will have a completely different ordering without copying licensed code.
I think it's safer to avoid this for 20.2. We can look at consulting with customers and seeing if this is an issue and can reprioritize before 20.2 if that is the case.

(or we can just bite the bullet and have our own primary key encoding system, given the use cases. but uniqueness is a funny case because you can have a differently ordered POLYGON that represents the same geometry figure underneath.)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r9.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @awoods187, @knz, @otan, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 336 at r7 (raw file):

Previously, otan (Oliver Tan) wrote…

(or we can just bite the bullet and have our own primary key encoding system, given the use cases. but uniqueness is a funny case because you can have a differently ordered POLYGON that represents the same geometry figure underneath.)

I'm ok with your explanation and your idea but I think they should be spelled out in the RFC. Also mentioned in the in-scope/out-of-scope discussion.


docs/RFCS/20200421_geospatial.md, line 649 at r9 (raw file):

This will also be available on the `pg_extension` schema
mentioned below.

nit: I think you got yourself a stray triple backquote here

@blathers-crl
Copy link

blathers-crl bot commented Apr 28, 2020

❌ The GitHub CI (Cockroach) build has failed on a087b903.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@otan
Copy link
Contributor

otan commented Apr 28, 2020

(remember to squash these before merging!)

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @awoods187, @knz, @otan, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 330 at r10 (raw file):

however it is not yet ported to Go.

We will not support indexing geospatial types in default primary keys

in default primary keys and unique secondary indexes


docs/RFCS/20200421_geospatial.md, line 333 at r10 (raw file):

and indexes (as we do not for JSONB and ARRAYs today). This is because
we will not be able to match the PostGIS definition as it's based
on a hash of it's internal data structure, which means we will

nit: its internal ds

@blathers-crl
Copy link

blathers-crl bot commented Apr 28, 2020

❌ The GitHub CI (Cockroach) build has failed on c1a38fc6.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 28, 2020

❌ The GitHub CI (Cockroach) build has failed on f6769724.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r11.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @awoods187, @knz, @otan, and @sumeerbhola)

Release note: None
Copy link
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @awoods187, @knz, @otan, @rytaft, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 330 at r10 (raw file):

Previously, knz (kena) wrote…

in default primary keys and unique secondary indexes

Done.

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @awoods187, @knz, @otan, @rytaft, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 649 at r9 (raw file):

Previously, knz (kena) wrote…

nit: I think you got yourself a stray triple backquote here

Oops. fixed.

@otan
Copy link
Contributor

otan commented May 1, 2020

let's goooo

bors r+

@craig
Copy link
Contributor

craig bot commented May 1, 2020

Build failed

@otan
Copy link
Contributor

otan commented May 1, 2020

bors r+

Copy link
Contributor

@awoods187 awoods187 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @knz, @otan, @rytaft, and @sumeerbhola)


docs/RFCS/20200421_geospatial.md, line 177 at r12 (raw file):

* In the future, we will look to support the following shapes for the
  Geometry data type only, in line with PostGIS support:
  * 2.5D (all 2D with Z- and M- coordinates and 3D datatypes)

this is a super small nit but it kinda makes sense to show the 2D before the 2.5D

@craig
Copy link
Contributor

craig bot commented May 1, 2020

Build succeeded

@craig craig bot merged commit 6c68987 into cockroachdb:master May 1, 2020
@sumeerbhola sumeerbhola deleted the geo_rfc branch May 4, 2020 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants