-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: store database code as code [DET-9180] #9302
Conversation
✅ Deploy Preview for determined-ui canceled.
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9302 +/- ##
==========================================
+ Coverage 45.29% 51.64% +6.34%
==========================================
Files 1227 839 -388
Lines 154095 90398 -63697
Branches 2404 0 -2404
==========================================
- Hits 69805 46685 -23120
+ Misses 84098 43713 -40385
+ Partials 192 0 -192
Flags with carried forward coverage won't be shown. Click here to find out more.
|
@jgongd tagged you wanted get thoughts if this design works for streaming updates since yall have a ton of database code |
This is a nice way to update views, triggers and functions. Streaming updates is working as expected. cc: @gt2345 @corban-beaird |
} | ||
|
||
// MustResolveTestPostgres is the same as ResolveTestPostgres but with panics on errors. | ||
func MustResolveTestPostgres(t *testing.T) *PgDB { | ||
pgDB, err := ResolveTestPostgres() | ||
func MustResolveTestPostgres(t *testing.T) (*PgDB, func()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an emerging style in go land to return an object + closing/cancellation function? I thought something like
db := MustResolveTestPostgres(...)
defer db.Close()
would be more idiomatic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I have seen it in some places. I know Ryan used it in func ResolveNewPostgresDatabase() (*PgDB, func(), error) {
. I think I have seen it in other established Go APIs but I'm not recalling where. The one advantage I see is it more intentional that you need to close something. Its hard for someone to know if they need to or don't need to call the Close()
.
Doing that, the linter complains that we aren't checking the error. We could add another helper or something. I wanna do another pass on this code as a follow on later to reduce duplication and split tests into isolated db tests soon (so each package would be running in a seperate database instance).
internal/db/postgres_rbac_intg_test.go:354:18: Error return value of `pgDB.Close` is not checked (errcheck)
// ResolveTestPostgres resolves a connection to a postgres database. To debug tests that use this | ||
// (or otherwise run the tests outside of the Makefile), make sure to set | ||
// DET_INTEGRATION_POSTGRES_URL. | ||
func ResolveTestPostgres() (*PgDB, error) { | ||
func ResolveTestPostgres() (*PgDB, func(), error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we didn't close these tests meaning even before this change we have a ton of connections that get left over the course of a test run
I think adding the extra few queries to migrate the test pushed this over the edge. The db handling is kinda of a mess right now. We probably should move away from TestMain
everywhere and add per package db isolation.
} | ||
|
||
// MustResolveTestPostgres is the same as ResolveTestPostgres but with panics on errors. | ||
func MustResolveTestPostgres(t *testing.T) *PgDB { | ||
pgDB, err := ResolveTestPostgres() | ||
func MustResolveTestPostgres(t *testing.T) (*PgDB, func()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I have seen it in some places. I know Ryan used it in func ResolveNewPostgresDatabase() (*PgDB, func(), error) {
. I think I have seen it in other established Go APIs but I'm not recalling where. The one advantage I see is it more intentional that you need to close something. Its hard for someone to know if they need to or don't need to call the Close()
.
Doing that, the linter complains that we aren't checking the error. We could add another helper or something. I wanna do another pass on this code as a follow on later to reduce duplication and split tests into isolated db tests soon (so each package would be running in a seperate database instance).
internal/db/postgres_rbac_intg_test.go:354:18: Error return value of `pgDB.Close` is not checked (errcheck)
@@ -46,7 +46,7 @@ func runMigrate(cmd *cobra.Command, args []string) error { | |||
} | |||
}() | |||
|
|||
if _, err = database.Migrate(config.DB.Migrations, args); err != nil { | |||
if _, err = database.Migrate(config.DB.Migrations, config.DB.DatabaseCode, args); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't mean to bikeshed, but I'd love a more specific term than "database code". I hear database code and I picture all the files containing queries, migrations, etc. Took me a minute to remember what this ticket referred to. Can't think of a great suggestion myself. In my head these are the "stateless" entities but that's probably not the right term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "ViewsAndTriggers", just to make the intended contents very clear. Calling it "stateless" or "transient" also suggests that you should be putting anything in there that can't get deleted and recreated routinely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to views_and_triggers, its not 100% accurate but it probably is more clear the intention of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple of suggestions / comments / questions, but I wouldn't block merging once you've considered them.
master/internal/db/migrations.go
Outdated
} | ||
|
||
// Update our hash and return we need to create db code. | ||
if _, err := db.sql.Exec("UPDATE db_code_hash SET hash = $1", ourHash); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reapplying DB code if the hash has changed makes sense, especially to ensure during development it's current. My first thought was that something human-readable here in another column would be great for troubleshooting. It should always match the current running master process anyway, so maybe that's not valuable.. Just thinking out loud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like overall kill to store something like all the view definitions in the column. Any other ideas?
|
||
// Update our hash and return we need to create db code. | ||
if _, err := db.sql.Exec("UPDATE db_code_hash SET hash = $1", ourHash); err != nil { | ||
return nil, false, fmt.Errorf("updating our database hash: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we be returning true
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no just returning the zero value.
If an error is not nil the other return values should not be looked at. it is really hard to work with code that returns non nil errors and 'meaningful' values.
return nil | ||
} | ||
|
||
var testOnlyDBLock func(sql *sqlx.DB) (unlock func()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This took me a minute to figure out - a comment indicating it gets initialized and acquired in init() in a test module would be a clarity improvement IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the way it was factored out, though - much cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment
@@ -396,8 +399,8 @@ var ( | |||
} | |||
testGroups = []model.Group{testGroup, testGroupStatic} | |||
testUser = model.User{ | |||
ID: 1217651234, | |||
Username: fmt.Sprintf("IntegrationTest%d", 1217651234), | |||
ID: 1217651235, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious - why is this incrementing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was the same value as tests in rbac tests. technically these tests run concurrently as the other test so there is a race condition where one test is in progress running and other test tries to create the user but fails due to duplicate user.
i think this was intended to be globally unique but is a copypasta error.
master/static/db_code/README.md
Outdated
- The Postgres schema ```determined_code``` will be dropped if it exists. | ||
- Migrations run as they did before ```determined_code``` was added. | ||
- The Postgres schema ```determined_code``` will be recreated. | ||
- All SQL files in the ``db_code`` will be run in lexicographically order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-ally
master/static/db_code/README.md
Outdated
|
||
Migrations can't see or use views because migrations happen before ```determined_code``` is created. | ||
|
||
In the unlikely event this is an issue, you can track views in regular migrations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the most likely scenario for an interaction would be a view or trigger referencing a table in the same commit that the table gets created in. For that reason, should we drop code, run migrations, and then recreate the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the order it will run in?
On everytime the Determined master starts up
- The Postgres schema
determined_code
will be dropped if it exists. - Migrations run as they did before
determined_code
was added. - The Postgres schema
determined_code
will be recreated. - All SQL files in the
db_code
will be run in lexicographical order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one non blocking clarification. I really like the change!
master/internal/db/migrations.go
Outdated
} | ||
|
||
func (db *PgDB) dropDBCode() error { | ||
// SET search_path since the ALTER DATABASE ... SET SEARCH_PATH won't apply to this connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im confused by this comment - can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3f6fe3e
to
1848947
Compare
Ticket
Description
Store database code as code. See the added readme for details.
Test Plan
CI passes
NOTE TO REVIEWERS
To review the code, we can test that the code is the same schema with pg_dump
on main on a fresh database
on this branch on a fresh database
check the diff (or go to this link)
https://www.diffchecker.com/8X1MK9aw/
Checklist
docs/release-notes/
.See Release Note for details.