-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
schemadiff: validate views' referenced columns #12147
schemadiff: validate views' referenced columns #12147
Conversation
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
In a followup PR: optimize such that we only analyze columns tables/views that are referenced by some view in the first place; there is no need to analyze e.g. a table that is never read by a view. This will allow us to save some memory & cpu. |
go/vt/schemadiff/diff_test.go
Outdated
@@ -667,6 +667,7 @@ func TestDiffSchemas(t *testing.T) { | |||
// validate schema1 unaffected by Apply | |||
assert.Equal(t, schema1SQL, schema1.ToSQL()) | |||
|
|||
require.NotNil(t, schema2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check useful? We already check earlier that there's no error returned when schema2
is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
go/vt/schemadiff/schema.go
Outdated
@@ -750,3 +759,258 @@ func (s *Schema) Apply(diffs []EntityDiff) (*Schema, error) { | |||
} | |||
return dup, nil | |||
} | |||
|
|||
// TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this TODO
reference to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehe, it references the fact that is is already done... removing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
go/vt/schemadiff/schema.go
Outdated
// TODO | ||
|
||
func (s *Schema) ValidateViewReferences() error { | ||
var allerrors error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually use errs
as a the name for this kinda thing to collect errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
go/vt/schemadiff/schema.go
Outdated
for _, e := range s.Entities() { | ||
entityColumns, err := s.getEntityColumnNames(e.Name(), availableColumns) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this add to the error list instead? And then we continue through the rest of the entities? If this is a totally unexpected error, maybe good to add a comment about that and why it's safe to return immediately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
go.mod
Outdated
@@ -113,6 +113,7 @@ require ( | |||
github.com/kr/text v0.2.0 | |||
github.com/nsf/jsondiff v0.0.0-20210926074059-1e845ec5d249 | |||
github.com/openark/golib v0.0.0-20210531070646-355f37940af8 | |||
go.uber.org/multierr v1.8.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also update to 1.9.0
here since that's the latest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
} | ||
return true, nil | ||
}, view.Select) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI - sqlparser.Walk
will only return the error you return from inside the visitor function, and since you don't return any errors from that, no errors will ever make it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@systay ah, great! Thank you. I'll keep the check as it is, for safety, but good to know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@systay Right, but the errors are gathered here in errs
? And those are at the end returned if the walker itself doesn't error (which is not expected).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dbussink yeah, that was my point. Not really necessary to catch the returned error from sqlparser.Walk
since that is not how we are dealing with the errors. When I know it will never return an error, I usually write.
_ = sqlparser.Walk(...)
OTOH - it's probably good defensive programming to do as @shlomi-noach is doing here and catching and checking the error anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@systay Right, guess as someone now knowing the details too much about how Walk
is implemented, I wouldn't know that it can't return an error and would still write it defensively 😄.
Looking for a 2nd review/approval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can/should avoid importing the external multierr
package but it otherwise LGTM.
Not adding a formal approval as we should, IMO, get @systay or @harshit-gangal to review as they are in the query serving team and working on the overall Vitess View support work. This way we can ensure that we're moving forward together and all the pieces will fit together nicely.
@systay or @harshit-gangal , could you please review? Please note if you cannot and I can add a formal approval here if needed. Thanks!
go/vt/schemadiff/schema.go
Outdated
for _, e := range s.Entities() { | ||
entityColumns, err := s.getEntityColumnNames(e.Name(), availableColumns) | ||
if err != nil { | ||
errs = multierr.Append(errs, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the Vitess concurrency package here along with vterrors.Aggregate
? You can see that used throughout the code base if you look for .AggrError(vterrors.Aggregate)
. Between the concurrency
and vterrors
packages we should have whatever related functionality you need here for error handling. For example:
vitess/go/vt/wrangler/traffic_switcher.go
Lines 146 to 161 in 390ddad
func (ts *trafficSwitcher) ForAllSources(f func(source *workflow.MigrationSource) error) error { | |
var wg sync.WaitGroup | |
allErrors := &concurrency.AllErrorRecorder{} | |
for _, source := range ts.sources { | |
wg.Add(1) | |
go func(source *workflow.MigrationSource) { | |
defer wg.Done() | |
if err := f(source); err != nil { | |
allErrors.RecordError(err) | |
} | |
}(source) | |
} | |
wg.Wait() | |
return allErrors.AggrError(vterrors.Aggregate) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it does use locking which is unnecessary here. Dunno if we care about the overhead of that here? That logic seems designed for concurrent error gathering which isn't what we're doing here.
Using multierr
seems simpler here and it's already an indirect dependency? Not a really strong opinion though, we can also use this but it seems a bit off for what it was designed for.
I think this logic here is temporary anyway, since once golang/go#53435 is available with Go 1.20 we probably want to switch to that anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @dbussink
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, race condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How strongly do people feel about this? Looking at the two implementation multierr
does seem to be more fitting to our purpose, but I don't feel strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you only need multiple errors you can use vterrors.Aggregate()
, no need for the concurrency piece (as noted, geared toward N goroutines) like we do here e.g.:
vitess/go/vt/vttablet/tabletmanager/vreplication/vcopier.go
Lines 597 to 621 in 3f55e40
var terrs []error | |
for !empty { | |
select { | |
case result := <-resultCh: | |
switch result.state { | |
case vcopierCopyTaskCancel: | |
// A task cancelation probably indicates an expired context due | |
// to a PlannedReparentShard or elapsed copy phase duration, | |
// neither of which are error conditions. | |
case vcopierCopyTaskComplete: | |
// Get the latest lastpk, purely for logging purposes. | |
lastpk = result.args.lastpk | |
case vcopierCopyTaskFail: | |
// Aggregate non-nil errors. | |
terrs = append(terrs, result.err) | |
} | |
default: | |
empty = true | |
} | |
} | |
if len(terrs) > 0 { | |
terr := vterrors.Aggregate(terrs) | |
log.Warningf("task error in workflow %s: %v", vc.vr.WorkflowName, terr) | |
return fmt.Errorf("task error: %v", terr) | |
} |
I don't feel strongly about it though. Up to you.
Signed-off-by: Shlomi Noach <[email protected]>
…t views reading from DUAL Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Incorporated #12189 into this PR, as it deals with view analysis |
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
I've switched away from |
Looking for another review/approval 🙏 |
ping @vitessio/query-serving for review |
ping https://github.com/orgs/vitessio/teams/query-serving for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything else LGTM!
go.mod
Outdated
@@ -187,7 +187,7 @@ require ( | |||
github.com/tidwall/pretty v1.2.0 // indirect | |||
go.opencensus.io v0.24.0 // indirect | |||
go.uber.org/atomic v1.10.0 // indirect | |||
go.uber.org/multierr v1.9.0 // indirect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason for this downgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was upgraded in a previous PR, that was eventually not using multierr
. I just ran go mod tidy
and it auto-downgraded, so... I don't know?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shlomi-noach might be because the direct dependency was dropped? A go get -u go.uber.org/multierr
probably would add this back correctly (it should be there, so good catch @GuptaManan100).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fixed.
Signed-off-by: Shlomi Noach <[email protected]>
Stalling this PR a bit as I'm to look into the |
Signed-off-by: Shlomi Noach <[email protected]>
Replaced by #12565 ; will close shortly. |
…12551) * WIP: schemadiff, analyzing diff dependencies Signed-off-by: Shlomi Noach <[email protected]> * ongoing work; many tests added Signed-off-by: Shlomi Noach <[email protected]> * partitioning tests Signed-off-by: Shlomi Noach <[email protected]> * function refactor Signed-off-by: Shlomi Noach <[email protected]> * minor reordering of variables Signed-off-by: Shlomi Noach <[email protected]> * copyright year Signed-off-by: Shlomi Noach <[email protected]> * stages to be commited Signed-off-by: Shlomi Noach <[email protected]> * math: permutations Signed-off-by: Shlomi Noach <[email protected]> * mathutil Signed-off-by: Shlomi Noach <[email protected]> * sort table alter options on diff Signed-off-by: Shlomi Noach <[email protected]> * alter reordering: first drop foreign key, then drop key Signed-off-by: Shlomi Noach <[email protected]> * remove tags logic from EquivalenceRelation Signed-off-by: Shlomi Noach <[email protected]> * removed go/mathutil/permutations.go Signed-off-by: Shlomi Noach <[email protected]> * refactor/rename Signed-off-by: Shlomi Noach <[email protected]> * cleanup unused code Signed-off-by: Shlomi Noach <[email protected]> * OrderedDiffs Signed-off-by: Shlomi Noach <[email protected]> * added an impossible-order test Signed-off-by: Shlomi Noach <[email protected]> * clarify comment Signed-off-by: Shlomi Noach <[email protected]> * AllSequentialExecutionDeps() Signed-off-by: Shlomi Noach <[email protected]> * test: expected order of entities in ordered diffs Signed-off-by: Shlomi Noach <[email protected]> * function comments Signed-off-by: Shlomi Noach <[email protected]> * remove unused code Signed-off-by: Shlomi Noach <[email protected]> * code comments; privatize functions; remove unused code Signed-off-by: Shlomi Noach <[email protected]> * code comments. Privatize functoin Signed-off-by: Shlomi Noach <[email protected]> * test deferred till #12147 is merged Signed-off-by: Shlomi Noach <[email protected]> * code comments Signed-off-by: Shlomi Noach <[email protected]> * fix permutation logic, add tests Signed-off-by: Shlomi Noach <[email protected]> * unindent using early returns Signed-off-by: Shlomi Noach <[email protected]> * code comment Signed-off-by: Shlomi Noach <[email protected]> * reposition some functions in file (no actualy changes) Signed-off-by: Shlomi Noach <[email protected]> * relocated the functions back because the diff became really confusing Signed-off-by: Shlomi Noach <[email protected]> * grammar Signed-off-by: Shlomi Noach <[email protected]> * code comments, minor optimization Signed-off-by: Shlomi Noach <[email protected]> * typo Signed-off-by: Shlomi Noach <[email protected]> * Added tests, minor edits per review. changed language Signed-off-by: Shlomi Noach <[email protected]> * handling empty diff slice; improved comments Signed-off-by: Shlomi Noach <[email protected]> * Clarify test reasoning Signed-off-by: Shlomi Noach <[email protected]> * grammar Signed-off-by: Shlomi Noach <[email protected]> --------- Signed-off-by: Shlomi Noach <[email protected]>
Description
Up till now,
schemadiff
views validation and analysis was limited to:As of this PR,
schemadiff
also validates columns used and returned by views:SELECT
and inWHERE
clause do in fact exist in tables/views referenced by the view. Error when an unknown column is encountered.create view v as select t.* from t, t2 ...
) or unqualified (create view v as select * from t1,t2
) ; infer list of columns ; support cascaded views based on star expression views.Plenty tests added to validate behavior.
Joint with @dbussink and based on his preliminary work.
Related Issue(s)
Checklist
Deployment Notes