Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change default for super_read_only to true #9312

Merged
merged 4 commits into from
Dec 13, 2021

Conversation

deepthi
Copy link
Member

@deepthi deepthi commented Dec 1, 2021

Description

While looking into #9290 we discovered that the heartbeat writer was creating errant transactions during a failover (PlannedReparent). While we need to fix that properly, it has been suggested that we should run with super_read_only on MySQL versions that support it.
This PR does two things.

  • change default for use_super_read_only to true
  • set super-read-only in cnf files

Related Issue(s)

#9290

Checklist

  • Should this PR be backported? NO
  • Tests were added or are not required
  • Documentation was added or is not required - needs to be added in website docs

Deployment Notes

It is expected that this change is safe and backwards-compatible. Anyone who is relying on the current behavior should pass -use_super_read_only=false on the vttablet command line, and make sure they are using a custom my.cnf instead of the one provided as the default by Vitess.

@deepthi deepthi added Component: TabletManager release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) Type: Enhancement Logical improvement (somewhere between a bug and feature) labels Dec 1, 2021
@@ -12,6 +12,9 @@ relay_log_info_repository = TABLE
relay_log_purge = 1
relay_log_recovery = 1

# we should never need super privileges
super-read-only
Copy link
Contributor

@mattlord mattlord Dec 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I think this is a good idea — always starting mysqld in super-read-only mode — it does mean that it's something we'd need to turn off on the primary tablet during the tablet repair. I vaguely recall that we don't do that today?

Copy link
Contributor

@mattlord mattlord Dec 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we don't touch it in repair:

func (tm *TabletManager) setReplicationSourceLocked(ctx context.Context, parentAlias *topodatapb.TabletAlias, timeCreatedNS int64, waitPosition string, forceStartReplication bool) (err error) {
// End orchestrator maintenance at the end of fixing replication.
// This is a best effort operation, so it should happen in a goroutine
defer func() {
go func() {
if tm.orc == nil {
return
}
if err := tm.orc.EndMaintenance(tm.Tablet()); err != nil {
log.Warningf("Orchestrator EndMaintenance failed: %v", err)
}
}()
}()
// Change our type to REPLICA if we used to be PRIMARY.
// Being sent SetReplicationSource means another PRIMARY has been successfully promoted,
// so we convert to REPLICA first, since we want to do it even if other
// steps fail below.
// Note it is important to check for PRIMARY here so that we don't
// unintentionally change the type of RDONLY tablets
tablet := tm.Tablet()
if tablet.Type == topodatapb.TabletType_PRIMARY {
if err := tm.tmState.ChangeTabletType(ctx, topodatapb.TabletType_REPLICA, DBActionNone); err != nil {
return err
}
}
// See if we were replicating at all, and should be replicating.
wasReplicating := false
shouldbeReplicating := false
status, err := tm.MysqlDaemon.ReplicationStatus()
if err == mysql.ErrNotReplica {
// This is a special error that means we actually succeeded in reading
// the status, but the status is empty because replication is not
// configured. We assume this means we used to be a primary, so we always
// try to start replicating once we are told who the new primary is.
shouldbeReplicating = true
// Since we continue in the case of this error, make sure 'status' is
// in a known, empty state.
status = mysql.ReplicationStatus{}
} else if err != nil {
// Abort on any other non-nil error.
return err
}
if status.IOThreadRunning || status.SQLThreadRunning {
wasReplicating = true
shouldbeReplicating = true
}
if forceStartReplication {
shouldbeReplicating = true
}
// If using semi-sync, we need to enable it before connecting to primary.
// If we are currently PRIMARY, assume we are about to become REPLICA.
tabletType := tm.Tablet().Type
if tabletType == topodatapb.TabletType_PRIMARY {
tabletType = topodatapb.TabletType_REPLICA
}
if err := tm.fixSemiSync(tabletType); err != nil {
return err
}
// Update the primary/source address only if needed.
// We don't want to interrupt replication for no reason.
if parentAlias == nil {
// if there is no primary in the shard, return an error so that we can retry
return vterrors.New(vtrpc.Code_FAILED_PRECONDITION, "Shard primaryAlias is nil")
}
parent, err := tm.TopoServer.GetTablet(ctx, parentAlias)
if err != nil {
return err
}
host := parent.Tablet.MysqlHostname
port := int(parent.Tablet.MysqlPort)
if status.SourceHost != host || status.SourcePort != port {
// This handles both changing the address and starting replication.
if err := tm.MysqlDaemon.SetReplicationSource(ctx, host, port, wasReplicating, shouldbeReplicating); err != nil {
if err := tm.handleRelayLogError(err); err != nil {
return err
}
}
} else if shouldbeReplicating {
// The address is correct. Just start replication if needed.
if !status.ReplicationRunning() {
if err := tm.MysqlDaemon.StartReplication(tm.hookExtraEnv()); err != nil {
if err := tm.handleRelayLogError(err); err != nil {
return err
}
}
}
}
// If needed, wait until we replicate to the specified point, or our context
// times out. Callers can specify the point to wait for as either a
// GTID-based replication position or a Vitess reparent journal entry,
// or both.
if shouldbeReplicating {
if waitPosition != "" {
pos, err := mysql.DecodePosition(waitPosition)
if err != nil {
return err
}
if err := tm.MysqlDaemon.WaitSourcePos(ctx, pos); err != nil {
return err
}
}
if timeCreatedNS != 0 {
if err := tm.MysqlDaemon.WaitForReparentJournal(ctx, timeCreatedNS); err != nil {
return err
}
}
// Clear replication sentinel flag for this replica
tm.replManager.setReplicationStopped(false)
}
return nil
}

We only touch it in demote primary:

func (tm *TabletManager) demotePrimary(ctx context.Context, revertPartialFailure bool) (primaryStatus *replicationdatapb.PrimaryStatus, finalErr error) {
if err := tm.lock(ctx); err != nil {
return nil, err
}
defer tm.unlock()
tablet := tm.Tablet()
wasPrimary := tablet.Type == topodatapb.TabletType_PRIMARY
wasServing := tm.QueryServiceControl.IsServing()
wasReadOnly, err := tm.MysqlDaemon.IsReadOnly()
if err != nil {
return nil, err
}
// If we are a primary tablet and not yet read-only, stop accepting new
// queries and wait for in-flight queries to complete. If we are not primary,
// or if we are already read-only, there's no need to stop the queryservice
// in order to ensure the guarantee we are being asked to provide, which is
// that no writes are occurring.
if wasPrimary && !wasReadOnly {
// Tell Orchestrator we're stopped on purpose for demotion.
// This is a best effort task, so run it in a goroutine.
go func() {
if tm.orc == nil {
return
}
if err := tm.orc.BeginMaintenance(tm.Tablet(), "vttablet has been told to DemotePrimary"); err != nil {
log.Warningf("Orchestrator BeginMaintenance failed: %v", err)
}
}()
// Note that this may block until the transaction timeout if clients
// don't finish their transactions in time. Even if some transactions
// have to be killed at the end of their timeout, this will be
// considered successful. If we are already not serving, this will be
// idempotent.
log.Infof("DemotePrimary disabling query service")
if err := tm.QueryServiceControl.SetServingType(tablet.Type, logutil.ProtoToTime(tablet.PrimaryTermStartTime), false, "demotion in progress"); err != nil {
return nil, vterrors.Wrap(err, "SetServingType(serving=false) failed")
}
defer func() {
if finalErr != nil && revertPartialFailure && wasServing {
if err := tm.QueryServiceControl.SetServingType(tablet.Type, logutil.ProtoToTime(tablet.PrimaryTermStartTime), true, ""); err != nil {
log.Warningf("SetServingType(serving=true) failed during revert: %v", err)
}
}
}()
}
// Now that we know no writes are in-flight and no new writes can occur,
// set MySQL to read-only mode. If we are already read-only because of a
// previous demotion, or because we are not primary anyway, this should be
// idempotent.
if *setSuperReadOnly {
// Setting super_read_only also sets read_only
if err := tm.MysqlDaemon.SetSuperReadOnly(true); err != nil {
return nil, err
}
} else {
if err := tm.MysqlDaemon.SetReadOnly(true); err != nil {
return nil, err
}
}
defer func() {
if finalErr != nil && revertPartialFailure && !wasReadOnly {
// setting read_only OFF will also set super_read_only OFF if it was set
if err := tm.MysqlDaemon.SetReadOnly(false); err != nil {
log.Warningf("SetReadOnly(false) failed during revert: %v", err)
}
}
}()
// If using semi-sync, we need to disable primary-side.
if err := tm.fixSemiSync(topodatapb.TabletType_REPLICA); err != nil {
return nil, err
}
defer func() {
if finalErr != nil && revertPartialFailure && wasPrimary {
// enable primary-side semi-sync again
if err := tm.fixSemiSync(topodatapb.TabletType_PRIMARY); err != nil {
log.Warningf("fixSemiSync(PRIMARY) failed during revert: %v", err)
}
}
}()
// Return the current replication position.
status, err := tm.MysqlDaemon.PrimaryStatus(ctx)
if err != nil {
return nil, err
}
return mysql.PrimaryStatusToProto(status), nil
}

And undo demote primary:

func (tm *TabletManager) UndoDemotePrimary(ctx context.Context) error {
log.Infof("UndoDemotePrimary")
if err := tm.lock(ctx); err != nil {
return err
}
defer tm.unlock()
// If using semi-sync, we need to enable source-side.
if err := tm.fixSemiSync(topodatapb.TabletType_PRIMARY); err != nil {
return err
}
// Now, set the server read-only false.
if err := tm.MysqlDaemon.SetReadOnly(false); err != nil {
return err
}
// Update serving graph
tablet := tm.Tablet()
log.Infof("UndoDemotePrimary re-enabling query service")
if err := tm.QueryServiceControl.SetServingType(tablet.Type, logutil.ProtoToTime(tablet.PrimaryTermStartTime), true, ""); err != nil {
return vterrors.Wrap(err, "SetServingType(serving=true) failed")
}
// Tell Orchestrator we're no longer stopped on purpose.
// Do this in the background, as it's best-effort.
go func() {
if tm.orc == nil {
return
}
if err := tm.orc.EndMaintenance(tm.Tablet()); err != nil {
log.Warningf("Orchestrator EndMaintenance failed: %v", err)
}
}()
return nil
}

When this would cause problems is when the mysqld instance for a primary tablet crashes or is otherwise restarted. Ironically, I happen to be doing that in a test I just added here: ef39eb8

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure what the historical reasons were for not fixing read-only in tablet repair, but IMO it would be a good idea.

Copy link
Member Author

@deepthi deepthi Dec 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. Tablet restarts and demotes itself to replica through a path that doesn't go through demotePrimary. I don't know if there was an actual reason for not doing it. Probably a miss from when use_super_read_only was implemented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the code some more. When the tablet demotes itself, it first calls demotePrimary which takes care of read_only (and super_read_only) before SetReplicationSource is called. So that path is fine.
The other case where tablet is up but mysqld is down or restarted is something that vtorc will handle. It has never been handled properly prior to vtorc (and replication_reporter was not designed to do that either).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as the primary goes, we call SetReadOnly(false) which also turns off super_read_only (per MySQL docs).

@deepthi deepthi requested a review from sougou December 1, 2021 23:21
@@ -39,7 +40,7 @@ import (

var (
enableSemiSync = flag.Bool("enable_semi_sync", false, "Enable semi-sync when configuring replication, on primary and replica tablets only (rdonly tablets will not ack).")
setSuperReadOnly = flag.Bool("use_super_read_only", false, "Set super_read_only flag when performing planned failover.")
setSuperReadOnly = flag.Bool("use_super_read_only", true, "Set super_read_only flag when performing planned failover.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also want to deprecate this flag? We can even keep it around so that anyone using Vitess with mysql 5.6 will just have to pass this as a false parameter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep it for one release so that users can fallback if this causes any unforeseen problems.

@deepthi deepthi changed the title Attempt to use super_read_only everywhere Change default for super_read_only to true Dec 7, 2021
@deepthi deepthi marked this pull request as ready for review December 7, 2021 22:16
Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Signed-off-by: deepthi <[email protected]>
Signed-off-by: deepthi <[email protected]>
@deepthi
Copy link
Member Author

deepthi commented Dec 10, 2021

Slack discussion: https://vitess.slack.com/archives/C0PQY0PTK/p1638915929375700
There is a path for that user, so it should be fine to merge this now.

@@ -343,8 +342,8 @@ func Restore(ctx context.Context, params RestoreParams) (*BackupManifest, error)
// This is safe, since we're restarting MySQL after the restore anyway
params.Logger.Infof("Restore: disabling super_read_only")
if err := params.Mysqld.SetSuperReadOnly(false); err != nil {
if strings.Contains(err.Error(), Error1193) {
params.Logger.Warningf("Restore: server does not know about super_read_only; maybe MariaDB? Continuing anyway.")
if strings.Contains(err.Error(), strconv.Itoa(mysql.ERUnknownSystemVariable)) {
Copy link
Contributor

@mattlord mattlord Dec 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little brittle to me, but it's not new in this PR. This should be an INVALID_ARGUMENT vitess error right?

case mysql.ERUnknownComError, mysql.ERBadNullError, mysql.ERBadDb, mysql.ERBadTable, mysql.ERNonUniq, mysql.ERWrongFieldWithGroup, mysql.ERWrongGroupField,
mysql.ERWrongSumSelect, mysql.ERWrongValueCount, mysql.ERTooLongIdent, mysql.ERDupFieldName, mysql.ERDupKeyName, mysql.ERWrongFieldSpec, mysql.ERParseError,
mysql.EREmptyQuery, mysql.ERNonUniqTable, mysql.ERInvalidDefault, mysql.ERMultiplePriKey, mysql.ERTooManyKeys, mysql.ERTooManyKeyParts, mysql.ERTooLongKey,
mysql.ERKeyColumnDoesNotExist, mysql.ERBlobUsedAsKey, mysql.ERTooBigFieldLength, mysql.ERWrongAutoKey, mysql.ERWrongFieldTerminators, mysql.ERBlobsAndNoTerminated,
mysql.ERTextFileNotReadable, mysql.ERWrongSubKey, mysql.ERCantRemoveAllFields, mysql.ERUpdateTableUsed, mysql.ERNoTablesUsed, mysql.ERTooBigSet,
mysql.ERBlobCantHaveDefault, mysql.ERWrongDbName, mysql.ERWrongTableName, mysql.ERUnknownProcedure, mysql.ERWrongParamCountToProcedure,
mysql.ERWrongParametersToProcedure, mysql.ERFieldSpecifiedTwice, mysql.ERInvalidGroupFuncUse, mysql.ERTableMustHaveColumns, mysql.ERUnknownCharacterSet,
mysql.ERTooManyTables, mysql.ERTooManyFields, mysql.ERTooBigRowSize, mysql.ERWrongOuterJoin, mysql.ERNullColumnInIndex, mysql.ERFunctionNotDefined,
mysql.ERWrongValueCountOnRow, mysql.ERInvalidUseOfNull, mysql.ERRegexpError, mysql.ERMixOfGroupFuncAndFields, mysql.ERIllegalGrantForTable, mysql.ERSyntaxError,
mysql.ERWrongColumnName, mysql.ERWrongKeyColumn, mysql.ERBlobKeyWithoutLength, mysql.ERPrimaryCantHaveNull, mysql.ERTooManyRows, mysql.ERUnknownSystemVariable,
mysql.ERSetConstantsOnly, mysql.ERWrongArguments, mysql.ERWrongUsage, mysql.ERWrongNumberOfColumnsInSelect, mysql.ERDupArgument, mysql.ERLocalVariable,
mysql.ERGlobalVariable, mysql.ERWrongValueForVar, mysql.ERWrongTypeForVar, mysql.ERVarCantBeRead, mysql.ERCantUseOptionHere, mysql.ERIncorrectGlobalLocalVar,
mysql.ERWrongFKDef, mysql.ERKeyRefDoNotMatchTableRef, mysql.ERCyclicReference, mysql.ERCollationCharsetMismatch, mysql.ERCantAggregate2Collations,
mysql.ERCantAggregate3Collations, mysql.ERCantAggregateNCollations, mysql.ERVariableIsNotStruct, mysql.ERUnknownCollation, mysql.ERWrongNameForIndex,
mysql.ERWrongNameForCatalog, mysql.ERBadFTColumn, mysql.ERTruncatedWrongValue, mysql.ERTooMuchAutoTimestampCols, mysql.ERInvalidOnUpdate, mysql.ERUnknownTimeZone,
mysql.ERInvalidCharacterString, mysql.ERIllegalReference, mysql.ERDerivedMustHaveAlias, mysql.ERTableNameNotAllowedHere, mysql.ERDataTooLong, mysql.ERDataOutOfRange,
mysql.ERTruncatedWrongValueForField:
errCode = vtrpcpb.Code_INVALID_ARGUMENT

So can we leverage this here? https://pkg.go.dev/errors#Is

if err.Is(err, mysql.ERUnknownSystemVariable) {

Or if that doesn't work for some reason:

if err.Is(err, vtrpcpb.Code_INVALID_ARGUMENT) {

The other option would be to do this I think:

-               if strings.Contains(err.Error(), Error1193) {
+               vterr := err.(*vtrpcpb.Error)
+               if vterr.Code == vtrpcpb.Code_INVALID_ARGUMENT {

This is non-blocking though, as we were already doing the substring search.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea. We should combine the Code + substring. I'll make the change before I merge.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot that we can't do this yet. See #9240 (comment)
Here's where we lose the type information

return fmt.Errorf("ExecuteFetch(%v) failed: %v", redactPassword(query), redactPassword(err.Error()))

I opened #9311 for this. Once that is done, we can clean up error handling in these files.

Copy link
Contributor

@shlomi-noach shlomi-noach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: TabletManager release notes (needs details) This PR needs to be listed in the release notes in a dedicated section (deprecation notice, etc...) Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants