-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change default for super_read_only to true #9312
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -22,6 +22,7 @@ import ( | |||||||||||||||||||||||||||||||||||||||
"fmt" | ||||||||||||||||||||||||||||||||||||||||
"os" | ||||||||||||||||||||||||||||||||||||||||
"path/filepath" | ||||||||||||||||||||||||||||||||||||||||
"strconv" | ||||||||||||||||||||||||||||||||||||||||
"strings" | ||||||||||||||||||||||||||||||||||||||||
"time" | ||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||
|
@@ -58,8 +59,6 @@ const ( | |||||||||||||||||||||||||||||||||||||||
const ( | ||||||||||||||||||||||||||||||||||||||||
// replicationStartDeadline is the deadline for starting replication | ||||||||||||||||||||||||||||||||||||||||
replicationStartDeadline = 30 | ||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||
Error1193 = "Unknown system variable" | ||||||||||||||||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||
var ( | ||||||||||||||||||||||||||||||||||||||||
|
@@ -343,8 +342,8 @@ func Restore(ctx context.Context, params RestoreParams) (*BackupManifest, error) | |||||||||||||||||||||||||||||||||||||||
// This is safe, since we're restarting MySQL after the restore anyway | ||||||||||||||||||||||||||||||||||||||||
params.Logger.Infof("Restore: disabling super_read_only") | ||||||||||||||||||||||||||||||||||||||||
if err := params.Mysqld.SetSuperReadOnly(false); err != nil { | ||||||||||||||||||||||||||||||||||||||||
if strings.Contains(err.Error(), Error1193) { | ||||||||||||||||||||||||||||||||||||||||
params.Logger.Warningf("Restore: server does not know about super_read_only; maybe MariaDB? Continuing anyway.") | ||||||||||||||||||||||||||||||||||||||||
if strings.Contains(err.Error(), strconv.Itoa(mysql.ERUnknownSystemVariable)) { | ||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This feels a little brittle to me, but it's not new in this PR. This should be an INVALID_ARGUMENT vitess error right? vitess/go/vt/vttablet/tabletserver/tabletserver.go Lines 1575 to 1592 in f238c0b
So can we leverage this here? https://pkg.go.dev/errors#Is
Or if that doesn't work for some reason:
The other option would be to do this I think:
This is non-blocking though, as we were already doing the substring search. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a good idea. We should combine the Code + substring. I'll make the change before I merge. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I forgot that we can't do this yet. See #9240 (comment) vitess/go/vt/mysqlctl/query.go Line 84 in d74de0c
I opened #9311 for this. Once that is done, we can clean up error handling in these files. |
||||||||||||||||||||||||||||||||||||||||
params.Logger.Warningf("Restore: server does not know about super_read_only, continuing anyway...") | ||||||||||||||||||||||||||||||||||||||||
} else { | ||||||||||||||||||||||||||||||||||||||||
params.Logger.Errorf("Restore: unexpected error while trying to set super_read_only: %v", err) | ||||||||||||||||||||||||||||||||||||||||
return nil, err | ||||||||||||||||||||||||||||||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,6 +19,7 @@ package tabletmanager | |
import ( | ||
"flag" | ||
"fmt" | ||
"strconv" | ||
"strings" | ||
"time" | ||
|
||
|
@@ -39,7 +40,7 @@ import ( | |
|
||
var ( | ||
enableSemiSync = flag.Bool("enable_semi_sync", false, "Enable semi-sync when configuring replication, on primary and replica tablets only (rdonly tablets will not ack).") | ||
setSuperReadOnly = flag.Bool("use_super_read_only", false, "Set super_read_only flag when performing planned failover.") | ||
setSuperReadOnly = flag.Bool("use_super_read_only", true, "Set super_read_only flag when performing planned failover.") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do we also want to deprecate this flag? We can even keep it around so that anyone using Vitess with mysql 5.6 will just have to pass this as a false parameter There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should keep it for one release so that users can fallback if this causes any unforeseen problems. |
||
) | ||
|
||
// ReplicationStatus returns the replication status | ||
|
@@ -431,7 +432,11 @@ func (tm *TabletManager) demotePrimary(ctx context.Context, revertPartialFailure | |
if *setSuperReadOnly { | ||
// Setting super_read_only also sets read_only | ||
if err := tm.MysqlDaemon.SetSuperReadOnly(true); err != nil { | ||
return nil, err | ||
if strings.Contains(err.Error(), strconv.Itoa(mysql.ERUnknownSystemVariable)) { | ||
log.Warningf("server does not know about super_read_only, continuing anyway...") | ||
} else { | ||
return nil, err | ||
} | ||
} | ||
} else { | ||
if err := tm.MysqlDaemon.SetReadOnly(true); err != nil { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think this is a good idea — always starting mysqld in
super-read-only
mode — it does mean that it's something we'd need to turn off on the primary tablet during the tablet repair. I vaguely recall that we don't do that today?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we don't touch it in repair:
vitess/go/vt/vttablet/tabletmanager/rpc_replication.go
Lines 554 to 669 in 93c811b
We only touch it in demote primary:
vitess/go/vt/vttablet/tabletmanager/rpc_replication.go
Lines 378 to 469 in 93c811b
And undo demote primary:
vitess/go/vt/vttablet/tabletmanager/rpc_replication.go
Lines 479 to 513 in 93c811b
When this would cause problems is when the mysqld instance for a primary tablet crashes or is otherwise restarted. Ironically, I happen to be doing that in a test I just added here: ef39eb8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure what the historical reasons were for not fixing read-only in tablet repair, but IMO it would be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean.
Tablet restarts and demotes itself to replica through a path that doesn't go throughdemotePrimary
. I don't know if there was an actual reason for not doing it. Probably a miss from whenuse_super_read_only
was implemented.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the code some more. When the tablet demotes itself, it first calls
demotePrimary
which takes care of read_only (and super_read_only) beforeSetReplicationSource
is called. So that path is fine.The other case where tablet is up but mysqld is down or restarted is something that vtorc will handle. It has never been handled properly prior to vtorc (and replication_reporter was not designed to do that either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as the primary goes, we call
SetReadOnly(false)
which also turns off super_read_only (per MySQL docs).