Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Tablet schema initialization refactoring #10533

Closed
wants to merge 63 commits into from

Conversation

rsajwani
Copy link
Contributor

@rsajwani rsajwani commented Jun 16, 2022

Signed-off-by: Rameez Sajwani [email protected]

Description

This is an attempt to consolidated all schema initialization and DB creation at the tablet start-up. The idea is if we can do all schema changes in one place and then seal the db with super-read-only, it will help us avoid errant GTIDs.

Details

We have created Schema_Initialization class which contains list of initialization modules needed to be run during tablet setup. Every module needs to register themselves with schema_Initializer e.g metadata_tables.go, online DDL/executor.go, rpc_replication.go etc.

During tablet initialization at various places in code-base whether we are bootstrapping a completely new tablet or restoring from backup we executes these modules registered as part of their init() routine. Once all the modules are executed we set 'super_read_only' false. This means we won't allow any mutation outside primary tablet. This will help us avoid errand GTIDS.

During tablet promotion we set super-read-only to false and during tablet demotion we set super-read-only to true. This way apart from primary tablet no other replica will be able to produce schema changes from outside.

For Reference

Some related work items done in the past

#10094
#9312
#10448

Unresolved Issued during code changes:

  • I have to create a separate init_db.sql for maria db since maria db does not have super_read_only. I could either have an if-then-else condition inside maria db but then I choose instead of creating a separate init_db file.
  • Adding # add custom sql here inside init-db.sql file. This is because in our test we modify the file and add custom slq like (change password of dba etc). Now that we expect super_read_only to execute at the end, we need these custom sql to be added before that last line. See example go/test/endtoend/backup/vtbackup/main_test.go
  • In one of our end to end test we have to set super-read-only test to false explicitly else our upgrade-downgrad test failed since they run with N-1 vttablet binary, which does not have code to disable super-read-only during InitializeShard. e.g
    go/test/endtoend/cluster/cluster_process.go
  • In tests like backup_only_test.go, I have to induce sleep else the tests are failing. I have a hunch why it is failing but I am not able to find a good solution for that.
  • Was not able to initialize the schema for health_streamer. Because if we create schema-copy-table beforehand then a test case like vitess.io/vitess/go/test/endtoend/vtgate/schematracker/loadkeyspace -run TestBlockedLoadKeyspace fails. The reason is they expect schema-copy-table to present only if we have enabled schema tracking.
  • Should we return false / error if any of the schema initialization fails … Right now I just log them but I believe we should return fail tablet initialization if any of the schema initialization fails e.g createReparentJournal.

Related Issue(s)

#10363

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Rameez Sajwani <[email protected]>
@deepthi deepthi changed the title Draft: Non-Production: Experimenting with super-read-only user Draft: Experimenting with super-read-only user Jun 22, 2022
@rsajwani rsajwani changed the title Draft: Experimenting with super-read-only user Draft: Tablet schema initialization refactoring Jul 8, 2022
rsajwani added 17 commits July 14, 2022 17:43
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Comment on lines 1 to 2
# This file is executed immediately after mysql_install_db,
# to initialize a fresh data directory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this separate file created for the super_read_only changes which are only present in the MySQL file?

config/mycnf/mysql57.cnf Show resolved Hide resolved
Comment on lines +272 to +276
// We need to switch off super-read-only before we create database.
_ = mysqld.SetSuperReadOnly(false)
defer func() {
_ = mysqld.SetSuperReadOnly(true)
}()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not read the super_readonly value here too before we do a defer to true? What if super read-only was turned off in the beginning, do we want to enable it in the end?

Comment on lines +361 to +372
if err := c.WriteComQuery("SELECT @@global.super_read_only"); err != nil {
return nil, err
}
res, _, _, err := c.ReadQueryResult(1, false)
if err == nil && len(res.Rows) == 1 {
sro := res.Rows[0][0].ToString()
if sro == "1" || sro == "ON" {
if err = c.WriteComQuery("SET GLOBAL super_read_only='OFF'"); err != nil {
return nil, err
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, isn't it easier to turn off super read-only regardless of what its value is before, since we'll only end up running one query then?

Comment on lines +379 to +397
func (c *Conn) ExecuteSetSuperReadOnly() (result *sqltypes.Result, err error) {
// Note: MariaDB does not have super_read_only but support for it is EOL in v14.0+
if !c.IsMariaDB() {
if err := c.WriteComQuery("SELECT @@global.super_read_only"); err != nil {
return nil, err
}
res, _, _, err := c.ReadQueryResult(1, false)
if err == nil && len(res.Rows) == 1 {
sro := res.Rows[0][0].ToString()
if sro == "0" || sro == "OFF" {
if err = c.WriteComQuery("SET GLOBAL super_read_only='ON'"); err != nil {
return nil, err
}
}
}
}

return result, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Comment on lines +401 to +414
if err := c.WriteComQuery("SELECT @@global.read_only"); err != nil {
return nil, err
}
res, _, _, err := c.ReadQueryResult(1, false)
if err == nil && len(res.Rows) == 1 {
sro := res.Rows[0][0].ToString()
if sro == "1" || sro == "ON" {
if err = c.WriteComQuery("SET GLOBAL read_only='OFF'"); err != nil {
return nil, err
}
}
}

return result, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@@ -51,7 +51,7 @@ func TestSysNumericPrecisionScale(t *testing.T) {

func TestCreateAndDropDatabase(t *testing.T) {
// note that this is testing vttest and not vtgate
conn, err := mysql.Connect(ctx, &vtParams)
conn, err := mysql.Connect(ctx, &mysqlParams)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem correct. We should be connecting to vtgate and not mysql

Comment on lines +151 to +154
var initFile = path.Join(os.Getenv("VTROOT"), "config/init_db.sql")
if !env.IsSQLFlavor() {
initFile = path.Join(os.Getenv("VTROOT"), "config/init_maria_db.sql")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of asking the environment, we could query the MySQL instance directly to know if it is mariadb distribution or not. This would prevent users from running into configuration issues.

Comment on lines +131 to +134
{{if (eq .Platform "mariadb103")}}
# mariadb103
export MYSQL_FLAVOR="MariaDB103"
{{end}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to go with reading the MySQL version from the instance itself, this won't be required.

Comment on lines 65 to 74
if *srvTopoNoCacheForGet {
entry := &queryEntry{
key: wkey,
}
result, err := q.query(ctx, entry)
if err != nil {
return nil, err
}
return result, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change related to schema initialization?

@@ -103,6 +103,7 @@ Usage of vtctld:
--hot_row_protection_concurrent_transactions int Number of concurrent transactions let through to the txpool/MySQL for the same hot row. Should be > 1 to have enough 'ready' transactions in MySQL and benefit from a pipelining effect. (default 5)
--hot_row_protection_max_global_queue_size int Global queue limit across all row (ranges). Useful to prevent that the queue can grow unbounded. (default 1000)
--hot_row_protection_max_queue_size int Maximum number of BeginExecute RPCs which will be queued for the same row (range). (default 20)
--init_populate_metadata (init parameter) populate metadata tables even if restore_from_backup is disabled. If restore_from_backup is enabled, metadata tables are always populated regardless of this flag.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag shouldn't be in vtctld

Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
Signed-off-by: Rameez Sajwani <[email protected]>
@github-actions
Copy link
Contributor

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period. label Oct 22, 2022
@github-actions
Copy link
Contributor

This PR was closed because it has been stale for 7 days with no activity.

@github-actions github-actions bot closed this Oct 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Cluster management Stale Marks PRs as stale after a period of inactivity, which are then closed after a grace period. Type: Feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants