ApplySchema: deprecate '--skip_preflight' flag #10716

Merged

shlomi-noach
Contributor

Description

This PR deprecates the --skip_preflight flag in vtctlclient ApplySchema. The new behavior is as if --skip_preflight=true, that is, preflight is always skipped. In fact, this PR removes the preflight code entirely.

The Vitess team discussed deprecating --skip_preflight long ago, with the advent of Online DDL. Moreover, we have noticed that in production environments we always set the flag. Furthermore, the preflight logic that the flag bypasses is not on par with the newer schema change logic (Online DDL and schemadiff), and we've encountered scenarios where that logic was flawed.

As Vitess now recommends running migrations via Online DDL, we should move away from preflight checks.

Related Issue(s)

#6926

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@vitess-bot
Contributor

vitess-bot bot commented Jul 17, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a test is added or modified, there should be documentation on top of the test explaining the expected behavior and what the test does.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@mattlord
Contributor

I feel that deprecating the flag would mean setting the default to true and adding a deprecation warning in the CLI and docs, while removal would mean removing the code, as done here, along with the preflight tablet manager client code. The PR seems to be in the middle. Do we need to do a deprecation cycle? What's the value of keeping the flag if you can't set it?

Please keep in mind that I am glad we're getting rid of this and appreciate you working on that. 🙂

@shlomi-noach
Contributor Author

shlomi-noach commented Jul 20, 2022

> Do we need to do a deprecation cycle? What's the value of keeping the flag if you can't set it?

@mattlord we have to have a deprecation cycle. So in v15 the user should get a warning (you're right! I did not add a warning log -- will fix), and in v16 we can remove the flag altogether.

You're also right that the easiest way forward would be to just force the flag to true and keep the rest of the internal logic. However, we're refactoring tablet_executor.go to reduce its footprint, and removing this chunk of code simplifies that effort.

UPDATE: v16 is already released, so this work applies to v17 and v18.

@shlomi-noach
Contributor Author

Added deprecation warning
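
To illustrate the deprecation step discussed above, here is a minimal sketch of a flag that is still accepted but ignored, with a warning when the caller sets it explicitly. It is not the Vitess implementation: the helper name parseApplySchemaFlags and the warning text are hypothetical, and it uses Go's standard flag package to stay self-contained.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// parseApplySchemaFlags is a hypothetical helper, not Vitess code.
// It keeps accepting --skip_preflight so old invocations still parse,
// returns a warning when the flag was set explicitly, and always
// reports that preflight is skipped.
func parseApplySchemaFlags(args []string) (bool, string, error) {
	fs := flag.NewFlagSet("ApplySchema", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch

	// The parsed value is deliberately discarded.
	_ = fs.Bool("skip_preflight", false, "Deprecated. Preflight is always skipped.")
	if err := fs.Parse(args); err != nil {
		return false, "", err
	}

	warning := ""
	// Visit only walks flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "skip_preflight" {
			warning = "--skip_preflight is deprecated and ignored; preflight is always skipped"
		}
	})
	return true, warning, nil
}

func main() {
	skip, warning, err := parseApplySchemaFlags([]string{"--skip_preflight=false"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if warning != "" {
		fmt.Fprintln(os.Stderr, "WARNING:", warning)
	}
	fmt.Println("SkipPreflight:", skip)
}
```

Detecting explicit use with Visit, rather than comparing against the default, means the warning fires even when the caller passes the default value.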

@@ -141,7 +141,7 @@ func commandApplySchema(cmd *cobra.Command, args []string) error {
 		AllowLongUnavailability: applySchemaOptions.AllowLongUnavailability,
 		DdlStrategy:             applySchemaOptions.DDLStrategy,
 		Sql:                     parts,
-		SkipPreflight:           applySchemaOptions.SkipPreflight,
+		SkipPreflight:           true,
Contributor

same comments I made in #10717 (comment), but for this flag

Contributor Author

👍

Contributor Author

Comments addressed

deepthi
deepthi previously approved these changes Jul 25, 2022
@deepthi deepthi dismissed their stale review July 25, 2022 21:58

Some feedback still needs to be addressed.

deepthi
deepthi previously approved these changes Jul 27, 2022
@@ -3110,7 +3112,7 @@ func commandApplySchema(ctx context.Context, wr *wrangler.Wrangler, subFlags *fl
 		AllowLongUnavailability: *allowLongUnavailability,
 		DdlStrategy:             *ddlStrategy,
 		Sql:                     parts,
-		SkipPreflight:           *skipPreflight,
+		SkipPreflight:           true,
Member

Perhaps we should be deprecating the protobuf field for this as well? I'm not certain about that though - whether we do this now and deprecate the protobuf field in the next release.
@ajm188 what do you suggest?

Contributor

i know i just said "yes" on the other PR, but i think we need to wait one more cycle, since old tablets will be consuming this field, which will take the zero value (false) instead of what we unconditionally pass here (true).

so deprecate flag + always pass true (v15) => remove flag + deprecate protobuf field + tablets ignore value and always skip-preflight (v16)
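
The zero-value concern above can be sketched as follows. This is only a model under stated assumptions: applySchemaRequest is a hypothetical plain struct standing in for the real protobuf message, and the two functions are hypothetical stand-ins for tablets from the previous and the next release. An unset proto3 bool reads as false, which a plain Go bool field mimics.

```go
package main

import "fmt"

// applySchemaRequest is a hypothetical stand-in for the protobuf message.
type applySchemaRequest struct {
	SkipPreflight bool // zero value is false, like an unset proto3 bool
}

// oldTabletRunsPreflight models a tablet from the previous release,
// which still honors the field.
func oldTabletRunsPreflight(req applySchemaRequest) bool {
	return !req.SkipPreflight
}

// newTabletRunsPreflight models a tablet that ignores the field
// and never runs preflight.
func newTabletRunsPreflight(_ applySchemaRequest) bool {
	return false
}

func main() {
	// A client that stops setting the field one release too early.
	fieldDropped := applySchemaRequest{}
	// A client that unconditionally passes true, as in this PR.
	alwaysTrue := applySchemaRequest{SkipPreflight: true}

	fmt.Println("old tablet, field dropped:", oldTabletRunsPreflight(fieldDropped))
	fmt.Println("old tablet, always true:", oldTabletRunsPreflight(alwaysTrue))
	fmt.Println("new tablet, field dropped:", newTabletRunsPreflight(fieldDropped))
}
```

In this model, dropping the field while old tablets are still deployed would bring preflight back on those tablets, which is why the field has to keep carrying true for one more cycle.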

@shlomi-noach
Contributor Author

Cluster (vtgate_schema) seems to be consistently failing. Unsure yet how this PR affects it -- but it looks like it does.

@deepthi deepthi dismissed their stale review August 16, 2022 23:10

Some feedback still to be addressed

@github-actions
Contributor

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@shlomi-noach shlomi-noach removed the Stale label Sep 18, 2022
@shlomi-noach
Contributor Author

This is not stale. I'm just stuck. Can't solve the CI issue.

@shlomi-noach
Contributor Author

Still stuck here.

@rohit-nayak-ps rohit-nayak-ps self-requested a review as a code owner November 10, 2022 22:08
@github-actions
Contributor

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale label Dec 13, 2022
@shlomi-noach shlomi-noach removed the Stale label Dec 13, 2022
@github-actions
Contributor

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale label Jan 13, 2023
@github-actions
Contributor

This PR was closed because it has been stale for 7 days with no activity.

@github-actions github-actions bot closed this Jan 21, 2023
@shlomi-noach shlomi-noach reopened this Feb 27, 2023
@shlomi-noach shlomi-noach requested a review from a team February 27, 2023 08:47
@shlomi-noach shlomi-noach removed the Stale label Feb 27, 2023
@shlomi-noach
Contributor Author

Fashionably late, this PR is ready for review!

Contributor

@dbussink dbussink left a comment

Yay code cleanup!

@shlomi-noach
Contributor Author

Current error in Upgrade Downgrade Testing Query Serving (Schema):

2023-02-27T15:36:44.2867788Z ERROR: Go version reported: go version go1.19.4 linux/amd64. Version 1.20.1+ required. See https://vitess.io/contributing/build-from-source for install instructions.
2023-02-27T15:36:44.2873075Z ##[error]Process completed with exit code 1.

@shlomi-noach
Contributor Author

Upgrade Downgrade Testing Query Serving (Schema) failure is a known issue to be fixed imminently; meanwhile merging this PR.

@shlomi-noach shlomi-noach merged commit 607a9a4 into vitessio:main Feb 28, 2023
@shlomi-noach shlomi-noach deleted the apply-schema-deprecate-skip-preflight branch February 28, 2023 07:09
@shlomi-noach shlomi-noach mentioned this pull request Aug 23, 2023