Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL Backend #11048

Merged
merged 1 commit into from
Mar 29, 2022
Merged

SQL Backend #11048

merged 1 commit into from
Mar 29, 2022

Conversation

jimbishopp
Copy link
Contributor

This PR implements a new backend supporting PostgreSQL and CockroachDB SQL database platforms.

Implements #10253 without cloud connectivity.

@jimbishopp jimbishopp requested a review from r0mant March 10, 2022 22:39
@github-actions github-actions bot requested review from smallinsky and Tener March 10, 2022 22:40
@jimbishopp jimbishopp force-pushed the jim/sqlbk branch 2 times, most recently from 9781645 to f4750d4 Compare March 11, 2022 04:10
Copy link
Contributor

@Tener Tener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary review, I need to dig deeper into backend stuff to properly review that.

lib/backend/postgres/backend.go Outdated Show resolved Hide resolved
lib/backend/postgres/tx.go Outdated Show resolved Hide resolved
lib/backend/sqlbk/config.go Outdated Show resolved Hide resolved
lib/backend/sqlbk/driver.go Outdated Show resolved Hide resolved
lib/backend/sqlbk/driver.go Outdated Show resolved Hide resolved
@jimbishopp jimbishopp force-pushed the jim/sqlbk branch 2 times, most recently from 581e651 to 9bc6013 Compare March 15, 2022 00:09
Copy link
Contributor

@fspmarshall fspmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: I have had no prior experience with SQL. I did a small amount of background reading on SQL concurrency in order to review this PR, but take everything with a grain of salt.

lib/backend/sqlbk/backend.go Show resolved Hide resolved
lib/backend/sqlbk/background.go Outdated Show resolved Hide resolved
lib/backend/sqlbk/background.go Outdated Show resolved Hide resolved
lib/backend/sqlbk/background.go Outdated Show resolved Hide resolved
lib/backend/postgres/driver.go Outdated Show resolved Hide resolved
jimbishopp added a commit that referenced this pull request Mar 18, 2022
jimbishopp added a commit that referenced this pull request Mar 18, 2022
@jimbishopp
Copy link
Contributor Author

jimbishopp commented Mar 22, 2022

Latest changes include an optimization for the DeleteItems query.

Previous query:

DELETE FROM item WHERE (key, id) NOT IN (
SELECT key, id FROM lease WHERE (expires IS NULL OR expires > $1)
UNION
SELECT key, id FROM event
)`

New query:

DELETE FROM item WHERE (key, id) IN (
SELECT key, id
FROM item
LEFT JOIN lease USING (key, id)
LEFT JOIN event USING (key, id)
WHERE event.key IS NULL
AND (lease.key IS NULL OR lease.expires < $1)
)`

Performance improved from 7 minutes to 76 ms on a 300K record dataset (100K per table).

OLD QUERY
teleport=# explain analyze select key,id from item where (key,id) not in (
        select key,id from lease where (expires is null or expires > now())
        union
        select key,id from event
);
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on item  (cost=14489.54..114117475.54 rows=50000 width=19) (actual time=422813.252..422813.254 rows=0 loops=1)
   Filter: (NOT (SubPlan 1))
   Rows Removed by Filter: 100000
   SubPlan 1
     ->  Materialize  (cost=14489.54..16521.56 rows=100001 width=40) (actual time=0.002..1.936 rows=50000 loops=100000)
           ->  Unique  (cost=14489.54..15239.55 rows=100001 width=40) (actual time=90.174..105.293 rows=100000 loops=1)
                 ->  Sort  (cost=14489.54..14739.55 rows=100001 width=40) (actual time=90.173..96.160 rows=100000 loops=1)
                       Sort Key: lease.key, lease.id
                       Sort Method: external merge  Disk: 3336kB
                       ->  Append  (cost=8.60..3447.63 rows=100001 width=40) (actual time=0.038..27.681 rows=100000 loops=1)
                             ->  Bitmap Heap Scan on lease  (cost=8.60..12.62 rows=1 width=19) (actual time=0.027..0.028 rows=0 loops=1)
                                   Recheck Cond: ((expires IS NULL) OR (expires > now()))
                                   ->  BitmapOr  (cost=8.60..8.60 rows=1 width=0) (actual time=0.025..0.025 rows=0 loops=1)
                                         ->  Bitmap Index Scan on lease_expires  (cost=0.00..4.30 rows=1 width=0) (actual time=0.020..0.020 rows=0 loops=1)
                                               Index Cond: (expires IS NULL)
                                         ->  Bitmap Index Scan on lease_expires  (cost=0.00..4.30 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1)
                                               Index Cond: (expires > now())
                             ->  Seq Scan on event  (cost=0.00..1935.00 rows=100000 width=19) (actual time=0.009..19.800 rows=100000 loops=1)
 Planning Time: 0.413 ms
 Execution Time: 422814.115 ms

NEW QUERY
teleport=# explain analyze select key,id from item i left join lease l using (key,id) left join event e using (key,id) where e.key is null and (l.key is null or l.expires < now());
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=4021.42..8265.51 rows=1 width=19) (actual time=76.165..76.167 rows=0 loops=1)
   Filter: ((l.key IS NULL) OR (l.expires < now()))
   ->  Hash Anti Join  (cost=4021.00..8265.00 rows=1 width=19) (actual time=76.165..76.166 rows=0 loops=1)
         Hash Cond: ((i.key = e.key) AND (i.id = e.id))
         ->  Seq Scan on item i  (cost=0.00..1736.00 rows=100000 width=19) (actual time=0.025..8.708 rows=100000 loops=1)
         ->  Hash  (cost=1935.00..1935.00 rows=100000 width=19) (actual time=38.984..38.984 rows=100000 loops=1)
               Buckets: 65536  Batches: 2  Memory Usage: 3234kB
               ->  Seq Scan on event e  (cost=0.00..1935.00 rows=100000 width=19) (actual time=0.009..14.667 rows=100000 loops=1)
   ->  Index Scan using lease_pk on lease l  (cost=0.42..0.49 rows=1 width=27) (never executed)
         Index Cond: (key = i.key)
         Filter: (i.id = id)
 Planning Time: 0.875 ms
 Execution Time: 76.285 ms

@r0mant
Copy link
Collaborator

r0mant commented Mar 22, 2022

@smallinsky @Tener Could you please give this another look?

lib/backend/postgres/driver.go Show resolved Hide resolved
lib/backend/postgres/backend.go Show resolved Hide resolved
lib/backend/postgres/tx.go Show resolved Hide resolved
Copy link
Contributor

@smallinsky smallinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have little prior experience with SQL but the code LGTM.

Copy link
Contributor

@Tener Tener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell it looks good.

One thought that I have is that given the multitude of backends that we have we could benefit from having a comprehensive backend test suite testing backend correctness and performance. Preferably an automated or semi-automated one.

@jimbishopp
Copy link
Contributor Author

jimbishopp commented Mar 25, 2022

@Tener

we could benefit from having a comprehensive backend test suite testing backend correctness and performance

We do! The test suite was very helpful when putting this together, especially by validating the correct error is returned from specific backend functions. I was able to iterate very quickly in the end because of this.

test.RunBackendComplianceSuite(t, newBackend)

// RunBackendComplianceSuite runs the entire backend compliance suite,
// creating a collection of named subtests under the context provided
// by `t`.
//
// As each test requires a new backend instance it will invoke the supplied
// `newBackend` function, which callers will use inject instances of the
// backend under test.
func RunBackendComplianceSuite(t *testing.T, newBackend Constructor) {

@Tener
Copy link
Contributor

Tener commented Mar 25, 2022

@jimbishopp

We do! The test suite was very helpful when putting this together, especially by validating the correct error is returned from specific backend functions. I was able to iterate very quickly in the end because of this.

This is good stuff, thanks for pointing that out. We could go bigger still with the tests to get a sense of the relative speed of individual backends under production-like loads... but this is probably just my inner benchmark junkie speaking 😉

@jimbishopp jimbishopp enabled auto-merge (squash) March 28, 2022 22:10
@jimbishopp jimbishopp merged commit 06fef2a into master Mar 29, 2022
@jimbishopp jimbishopp deleted the jim/sqlbk branch March 29, 2022 00:18
jimbishopp added a commit that referenced this pull request Apr 1, 2022
Original PR: #11048

Add a new backend supporting PostgreSQL and CockroachDB.

Implements #10253 without cloud connectivity.
jimbishopp added a commit that referenced this pull request Apr 6, 2022
Original PR: #11048

Add a new backend supporting PostgreSQL and CockroachDB.

Implements #10253 without cloud connectivity.
jimbishopp added a commit that referenced this pull request Apr 6, 2022
Remove migration from backend API (#10835)

The Migrate method on the Backend interface was not implemented by any
backends.

Migration should be implemented in the New method of backends so they
can be sure migration happens before any background processes are
started.


Backport SQL Backend to v9

Original PR: #11048

Add a new backend supporting PostgreSQL and CockroachDB.

Implements #10253 without cloud connectivity.
jimbishopp added a commit that referenced this pull request Apr 11, 2022
Documentation for SQL Backend #11048.
jimbishopp added a commit that referenced this pull request Apr 12, 2022
Documentation for SQL Backend #11048.
jimbishopp added a commit that referenced this pull request Apr 12, 2022
Backport #11882
Documentation for SQL Backend #11048.
jimbishopp added a commit that referenced this pull request Apr 13, 2022
Backport #11882
Documentation for SQL Backend #11048.
espadolini added a commit that referenced this pull request Oct 5, 2022
espadolini added a commit that referenced this pull request Oct 7, 2022
* Revert "Azure AD authentication for the Postgres backend (#15757)"

This reverts commit 33c6d82.

* Revert "SQL Backend (#11048)"

This reverts commit 06fef2a.

* Remove Postgres backend from the docs

* Remove the Postgres backend from the testplan
github-actions bot pushed a commit that referenced this pull request Oct 7, 2022
espadolini added a commit that referenced this pull request Oct 7, 2022
* Revert "Azure AD authentication for the Postgres backend (#15757)"

This reverts commit 33c6d82.

* Revert "SQL Backend (#11048)"

This reverts commit 06fef2a.

* Remove Postgres backend from the docs

* Remove the Postgres backend from the testplan
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants