-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change default for super_read_only to true #9312
Conversation
…n cnf files Signed-off-by: deepthi <[email protected]>
@@ -12,6 +12,9 @@ relay_log_info_repository = TABLE | |||
relay_log_purge = 1 | |||
relay_log_recovery = 1 | |||
|
|||
# we should never need super privileges | |||
super-read-only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think this is a good idea — always starting mysqld in super-read-only
mode — it does mean that it's something we'd need to turn off on the primary tablet during the tablet repair. I vaguely recall that we don't do that today?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we don't touch it in repair:
vitess/go/vt/vttablet/tabletmanager/rpc_replication.go
Lines 554 to 669 in 93c811b
func (tm *TabletManager) setReplicationSourceLocked(ctx context.Context, parentAlias *topodatapb.TabletAlias, timeCreatedNS int64, waitPosition string, forceStartReplication bool) (err error) { | |
// End orchestrator maintenance at the end of fixing replication. | |
// This is a best effort operation, so it should happen in a goroutine | |
defer func() { | |
go func() { | |
if tm.orc == nil { | |
return | |
} | |
if err := tm.orc.EndMaintenance(tm.Tablet()); err != nil { | |
log.Warningf("Orchestrator EndMaintenance failed: %v", err) | |
} | |
}() | |
}() | |
// Change our type to REPLICA if we used to be PRIMARY. | |
// Being sent SetReplicationSource means another PRIMARY has been successfully promoted, | |
// so we convert to REPLICA first, since we want to do it even if other | |
// steps fail below. | |
// Note it is important to check for PRIMARY here so that we don't | |
// unintentionally change the type of RDONLY tablets | |
tablet := tm.Tablet() | |
if tablet.Type == topodatapb.TabletType_PRIMARY { | |
if err := tm.tmState.ChangeTabletType(ctx, topodatapb.TabletType_REPLICA, DBActionNone); err != nil { | |
return err | |
} | |
} | |
// See if we were replicating at all, and should be replicating. | |
wasReplicating := false | |
shouldbeReplicating := false | |
status, err := tm.MysqlDaemon.ReplicationStatus() | |
if err == mysql.ErrNotReplica { | |
// This is a special error that means we actually succeeded in reading | |
// the status, but the status is empty because replication is not | |
// configured. We assume this means we used to be a primary, so we always | |
// try to start replicating once we are told who the new primary is. | |
shouldbeReplicating = true | |
// Since we continue in the case of this error, make sure 'status' is | |
// in a known, empty state. | |
status = mysql.ReplicationStatus{} | |
} else if err != nil { | |
// Abort on any other non-nil error. | |
return err | |
} | |
if status.IOThreadRunning || status.SQLThreadRunning { | |
wasReplicating = true | |
shouldbeReplicating = true | |
} | |
if forceStartReplication { | |
shouldbeReplicating = true | |
} | |
// If using semi-sync, we need to enable it before connecting to primary. | |
// If we are currently PRIMARY, assume we are about to become REPLICA. | |
tabletType := tm.Tablet().Type | |
if tabletType == topodatapb.TabletType_PRIMARY { | |
tabletType = topodatapb.TabletType_REPLICA | |
} | |
if err := tm.fixSemiSync(tabletType); err != nil { | |
return err | |
} | |
// Update the primary/source address only if needed. | |
// We don't want to interrupt replication for no reason. | |
if parentAlias == nil { | |
// if there is no primary in the shard, return an error so that we can retry | |
return vterrors.New(vtrpc.Code_FAILED_PRECONDITION, "Shard primaryAlias is nil") | |
} | |
parent, err := tm.TopoServer.GetTablet(ctx, parentAlias) | |
if err != nil { | |
return err | |
} | |
host := parent.Tablet.MysqlHostname | |
port := int(parent.Tablet.MysqlPort) | |
if status.SourceHost != host || status.SourcePort != port { | |
// This handles both changing the address and starting replication. | |
if err := tm.MysqlDaemon.SetReplicationSource(ctx, host, port, wasReplicating, shouldbeReplicating); err != nil { | |
if err := tm.handleRelayLogError(err); err != nil { | |
return err | |
} | |
} | |
} else if shouldbeReplicating { | |
// The address is correct. Just start replication if needed. | |
if !status.ReplicationRunning() { | |
if err := tm.MysqlDaemon.StartReplication(tm.hookExtraEnv()); err != nil { | |
if err := tm.handleRelayLogError(err); err != nil { | |
return err | |
} | |
} | |
} | |
} | |
// If needed, wait until we replicate to the specified point, or our context | |
// times out. Callers can specify the point to wait for as either a | |
// GTID-based replication position or a Vitess reparent journal entry, | |
// or both. | |
if shouldbeReplicating { | |
if waitPosition != "" { | |
pos, err := mysql.DecodePosition(waitPosition) | |
if err != nil { | |
return err | |
} | |
if err := tm.MysqlDaemon.WaitSourcePos(ctx, pos); err != nil { | |
return err | |
} | |
} | |
if timeCreatedNS != 0 { | |
if err := tm.MysqlDaemon.WaitForReparentJournal(ctx, timeCreatedNS); err != nil { | |
return err | |
} | |
} | |
// Clear replication sentinel flag for this replica | |
tm.replManager.setReplicationStopped(false) | |
} | |
return nil | |
} |
We only touch it in demote primary:
vitess/go/vt/vttablet/tabletmanager/rpc_replication.go
Lines 378 to 469 in 93c811b
func (tm *TabletManager) demotePrimary(ctx context.Context, revertPartialFailure bool) (primaryStatus *replicationdatapb.PrimaryStatus, finalErr error) { | |
if err := tm.lock(ctx); err != nil { | |
return nil, err | |
} | |
defer tm.unlock() | |
tablet := tm.Tablet() | |
wasPrimary := tablet.Type == topodatapb.TabletType_PRIMARY | |
wasServing := tm.QueryServiceControl.IsServing() | |
wasReadOnly, err := tm.MysqlDaemon.IsReadOnly() | |
if err != nil { | |
return nil, err | |
} | |
// If we are a primary tablet and not yet read-only, stop accepting new | |
// queries and wait for in-flight queries to complete. If we are not primary, | |
// or if we are already read-only, there's no need to stop the queryservice | |
// in order to ensure the guarantee we are being asked to provide, which is | |
// that no writes are occurring. | |
if wasPrimary && !wasReadOnly { | |
// Tell Orchestrator we're stopped on purpose for demotion. | |
// This is a best effort task, so run it in a goroutine. | |
go func() { | |
if tm.orc == nil { | |
return | |
} | |
if err := tm.orc.BeginMaintenance(tm.Tablet(), "vttablet has been told to DemotePrimary"); err != nil { | |
log.Warningf("Orchestrator BeginMaintenance failed: %v", err) | |
} | |
}() | |
// Note that this may block until the transaction timeout if clients | |
// don't finish their transactions in time. Even if some transactions | |
// have to be killed at the end of their timeout, this will be | |
// considered successful. If we are already not serving, this will be | |
// idempotent. | |
log.Infof("DemotePrimary disabling query service") | |
if err := tm.QueryServiceControl.SetServingType(tablet.Type, logutil.ProtoToTime(tablet.PrimaryTermStartTime), false, "demotion in progress"); err != nil { | |
return nil, vterrors.Wrap(err, "SetServingType(serving=false) failed") | |
} | |
defer func() { | |
if finalErr != nil && revertPartialFailure && wasServing { | |
if err := tm.QueryServiceControl.SetServingType(tablet.Type, logutil.ProtoToTime(tablet.PrimaryTermStartTime), true, ""); err != nil { | |
log.Warningf("SetServingType(serving=true) failed during revert: %v", err) | |
} | |
} | |
}() | |
} | |
// Now that we know no writes are in-flight and no new writes can occur, | |
// set MySQL to read-only mode. If we are already read-only because of a | |
// previous demotion, or because we are not primary anyway, this should be | |
// idempotent. | |
if *setSuperReadOnly { | |
// Setting super_read_only also sets read_only | |
if err := tm.MysqlDaemon.SetSuperReadOnly(true); err != nil { | |
return nil, err | |
} | |
} else { | |
if err := tm.MysqlDaemon.SetReadOnly(true); err != nil { | |
return nil, err | |
} | |
} | |
defer func() { | |
if finalErr != nil && revertPartialFailure && !wasReadOnly { | |
// setting read_only OFF will also set super_read_only OFF if it was set | |
if err := tm.MysqlDaemon.SetReadOnly(false); err != nil { | |
log.Warningf("SetReadOnly(false) failed during revert: %v", err) | |
} | |
} | |
}() | |
// If using semi-sync, we need to disable primary-side. | |
if err := tm.fixSemiSync(topodatapb.TabletType_REPLICA); err != nil { | |
return nil, err | |
} | |
defer func() { | |
if finalErr != nil && revertPartialFailure && wasPrimary { | |
// enable primary-side semi-sync again | |
if err := tm.fixSemiSync(topodatapb.TabletType_PRIMARY); err != nil { | |
log.Warningf("fixSemiSync(PRIMARY) failed during revert: %v", err) | |
} | |
} | |
}() | |
// Return the current replication position. | |
status, err := tm.MysqlDaemon.PrimaryStatus(ctx) | |
if err != nil { | |
return nil, err | |
} | |
return mysql.PrimaryStatusToProto(status), nil | |
} |
And undo demote primary:
vitess/go/vt/vttablet/tabletmanager/rpc_replication.go
Lines 479 to 513 in 93c811b
func (tm *TabletManager) UndoDemotePrimary(ctx context.Context) error { | |
log.Infof("UndoDemotePrimary") | |
if err := tm.lock(ctx); err != nil { | |
return err | |
} | |
defer tm.unlock() | |
// If using semi-sync, we need to enable source-side. | |
if err := tm.fixSemiSync(topodatapb.TabletType_PRIMARY); err != nil { | |
return err | |
} | |
// Now, set the server read-only false. | |
if err := tm.MysqlDaemon.SetReadOnly(false); err != nil { | |
return err | |
} | |
// Update serving graph | |
tablet := tm.Tablet() | |
log.Infof("UndoDemotePrimary re-enabling query service") | |
if err := tm.QueryServiceControl.SetServingType(tablet.Type, logutil.ProtoToTime(tablet.PrimaryTermStartTime), true, ""); err != nil { | |
return vterrors.Wrap(err, "SetServingType(serving=true) failed") | |
} | |
// Tell Orchestrator we're no longer stopped on purpose. | |
// Do this in the background, as it's best-effort. | |
go func() { | |
if tm.orc == nil { | |
return | |
} | |
if err := tm.orc.EndMaintenance(tm.Tablet()); err != nil { | |
log.Warningf("Orchestrator EndMaintenance failed: %v", err) | |
} | |
}() | |
return nil | |
} |
When this would cause problems is when the mysqld instance for a primary tablet crashes or is otherwise restarted. Ironically, I happen to be doing that in a test I just added here: ef39eb8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure what the historical reasons were for not fixing read-only in tablet repair, but IMO it would be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. Tablet restarts and demotes itself to replica through a path that doesn't go through demotePrimary
. I don't know if there was an actual reason for not doing it. Probably a miss from when use_super_read_only
was implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the code some more. When the tablet demotes itself, it first calls demotePrimary
which takes care of read_only (and super_read_only) before SetReplicationSource
is called. So that path is fine.
The other case where tablet is up but mysqld is down or restarted is something that vtorc will handle. It has never been handled properly prior to vtorc (and replication_reporter was not designed to do that either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as the primary goes, we call SetReadOnly(false)
which also turns off super_read_only (per MySQL docs).
@@ -39,7 +40,7 @@ import ( | |||
|
|||
var ( | |||
enableSemiSync = flag.Bool("enable_semi_sync", false, "Enable semi-sync when configuring replication, on primary and replica tablets only (rdonly tablets will not ack).") | |||
setSuperReadOnly = flag.Bool("use_super_read_only", false, "Set super_read_only flag when performing planned failover.") | |||
setSuperReadOnly = flag.Bool("use_super_read_only", true, "Set super_read_only flag when performing planned failover.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also want to deprecate this flag? We can even keep it around so that anyone using Vitess with mysql 5.6 will just have to pass this as a false parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep it for one release so that users can fallback if this causes any unforeseen problems.
Signed-off-by: deepthi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Signed-off-by: deepthi <[email protected]>
Signed-off-by: deepthi <[email protected]>
Slack discussion: https://vitess.slack.com/archives/C0PQY0PTK/p1638915929375700 |
@@ -343,8 +342,8 @@ func Restore(ctx context.Context, params RestoreParams) (*BackupManifest, error) | |||
// This is safe, since we're restarting MySQL after the restore anyway | |||
params.Logger.Infof("Restore: disabling super_read_only") | |||
if err := params.Mysqld.SetSuperReadOnly(false); err != nil { | |||
if strings.Contains(err.Error(), Error1193) { | |||
params.Logger.Warningf("Restore: server does not know about super_read_only; maybe MariaDB? Continuing anyway.") | |||
if strings.Contains(err.Error(), strconv.Itoa(mysql.ERUnknownSystemVariable)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a little brittle to me, but it's not new in this PR. This should be an INVALID_ARGUMENT vitess error right?
vitess/go/vt/vttablet/tabletserver/tabletserver.go
Lines 1575 to 1592 in f238c0b
case mysql.ERUnknownComError, mysql.ERBadNullError, mysql.ERBadDb, mysql.ERBadTable, mysql.ERNonUniq, mysql.ERWrongFieldWithGroup, mysql.ERWrongGroupField, | |
mysql.ERWrongSumSelect, mysql.ERWrongValueCount, mysql.ERTooLongIdent, mysql.ERDupFieldName, mysql.ERDupKeyName, mysql.ERWrongFieldSpec, mysql.ERParseError, | |
mysql.EREmptyQuery, mysql.ERNonUniqTable, mysql.ERInvalidDefault, mysql.ERMultiplePriKey, mysql.ERTooManyKeys, mysql.ERTooManyKeyParts, mysql.ERTooLongKey, | |
mysql.ERKeyColumnDoesNotExist, mysql.ERBlobUsedAsKey, mysql.ERTooBigFieldLength, mysql.ERWrongAutoKey, mysql.ERWrongFieldTerminators, mysql.ERBlobsAndNoTerminated, | |
mysql.ERTextFileNotReadable, mysql.ERWrongSubKey, mysql.ERCantRemoveAllFields, mysql.ERUpdateTableUsed, mysql.ERNoTablesUsed, mysql.ERTooBigSet, | |
mysql.ERBlobCantHaveDefault, mysql.ERWrongDbName, mysql.ERWrongTableName, mysql.ERUnknownProcedure, mysql.ERWrongParamCountToProcedure, | |
mysql.ERWrongParametersToProcedure, mysql.ERFieldSpecifiedTwice, mysql.ERInvalidGroupFuncUse, mysql.ERTableMustHaveColumns, mysql.ERUnknownCharacterSet, | |
mysql.ERTooManyTables, mysql.ERTooManyFields, mysql.ERTooBigRowSize, mysql.ERWrongOuterJoin, mysql.ERNullColumnInIndex, mysql.ERFunctionNotDefined, | |
mysql.ERWrongValueCountOnRow, mysql.ERInvalidUseOfNull, mysql.ERRegexpError, mysql.ERMixOfGroupFuncAndFields, mysql.ERIllegalGrantForTable, mysql.ERSyntaxError, | |
mysql.ERWrongColumnName, mysql.ERWrongKeyColumn, mysql.ERBlobKeyWithoutLength, mysql.ERPrimaryCantHaveNull, mysql.ERTooManyRows, mysql.ERUnknownSystemVariable, | |
mysql.ERSetConstantsOnly, mysql.ERWrongArguments, mysql.ERWrongUsage, mysql.ERWrongNumberOfColumnsInSelect, mysql.ERDupArgument, mysql.ERLocalVariable, | |
mysql.ERGlobalVariable, mysql.ERWrongValueForVar, mysql.ERWrongTypeForVar, mysql.ERVarCantBeRead, mysql.ERCantUseOptionHere, mysql.ERIncorrectGlobalLocalVar, | |
mysql.ERWrongFKDef, mysql.ERKeyRefDoNotMatchTableRef, mysql.ERCyclicReference, mysql.ERCollationCharsetMismatch, mysql.ERCantAggregate2Collations, | |
mysql.ERCantAggregate3Collations, mysql.ERCantAggregateNCollations, mysql.ERVariableIsNotStruct, mysql.ERUnknownCollation, mysql.ERWrongNameForIndex, | |
mysql.ERWrongNameForCatalog, mysql.ERBadFTColumn, mysql.ERTruncatedWrongValue, mysql.ERTooMuchAutoTimestampCols, mysql.ERInvalidOnUpdate, mysql.ERUnknownTimeZone, | |
mysql.ERInvalidCharacterString, mysql.ERIllegalReference, mysql.ERDerivedMustHaveAlias, mysql.ERTableNameNotAllowedHere, mysql.ERDataTooLong, mysql.ERDataOutOfRange, | |
mysql.ERTruncatedWrongValueForField: | |
errCode = vtrpcpb.Code_INVALID_ARGUMENT |
So can we leverage this here? https://pkg.go.dev/errors#Is
if err.Is(err, mysql.ERUnknownSystemVariable) {
Or if that doesn't work for some reason:
if err.Is(err, vtrpcpb.Code_INVALID_ARGUMENT) {
The other option would be to do this I think:
- if strings.Contains(err.Error(), Error1193) {
+ vterr := err.(*vtrpcpb.Error)
+ if vterr.Code == vtrpcpb.Code_INVALID_ARGUMENT {
This is non-blocking though, as we were already doing the substring search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea. We should combine the Code + substring. I'll make the change before I merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot that we can't do this yet. See #9240 (comment)
Here's where we lose the type information
vitess/go/vt/mysqlctl/query.go
Line 84 in d74de0c
return fmt.Errorf("ExecuteFetch(%v) failed: %v", redactPassword(query), redactPassword(err.Error())) |
I opened #9311 for this. Once that is done, we can clean up error handling in these files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
While looking into #9290 we discovered that the heartbeat writer was creating errant transactions during a failover (PlannedReparent). While we need to fix that properly, it has been suggested that we should run with
super_read_only
on MySQL versions that support it.This PR does two things.
Related Issue(s)
#9290
Checklist
Deployment Notes
It is expected that this change is safe and backwards-compatible. Anyone who is relying on the current behavior should pass
-use_super_read_only=false
on the vttablet command line, and make sure they are using a custom my.cnf instead of the one provided as the default by Vitess.