orc: vitess mode #6613

Merged (15 commits) on Sep 3, 2020

Conversation

@sougou sougou commented Aug 23, 2020

This is a POC/MVP of some sort where Orchestrator runs as a vitess component. Features of this PR are described in #6612.

This is not production ready yet, but people can start playing with it. The code mostly works, but there are lots of TODOs, and tests have to be written.

This version does not handle semi-sync settings correctly.

High level

  • Accept topo command line flags so Orc can directly read and write.
  • Orc polls the topo to discover tablet records, saves them in a new vitess_tablet table, and uses them as the authoritative source of mysql instances. For example, if a tablet record is deleted, the mysql instance is forgotten.
  • Analysis code is driven from the vitess_tablet table and left joins on database_instance.
  • Additional tablet specific rules have been added in the analysis phase to improve decision-making.
  • Recovery code has new actions for initializing a brand new cluster (ISM) and wiring up new vttablets.
  • The code is written using a "desired state" approach: If a previous action failed and was incomplete, the newer analysis will detect the differences and converge the system to a consistent state.
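
The "desired state" idea in the list above can be sketched as a small reconcile step: the tablet records read from the topo are the desired set, the locally tracked instances are the current set, and each poll computes what to start tracking and what to forget. This is only an illustration, not the code in this PR; `TabletRecord`, `Reconcile`, and the alias and address values are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// TabletRecord is a hypothetical, simplified stand-in for a topo tablet record.
type TabletRecord struct {
	Alias     string // tablet alias, e.g. "zone1-100"
	MySQLAddr string // host:port of the MySQL instance behind the tablet
}

// Reconcile compares the tablets currently in the topo (desired) with the
// instances already tracked locally (known: alias -> MySQL address).
// It returns the addresses to start tracking and the addresses to forget.
// Calling it again after applying the result yields two empty slices.
func Reconcile(desired []TabletRecord, known map[string]string) (toAdd, toForget []string) {
	seen := make(map[string]bool, len(desired))
	for _, t := range desired {
		seen[t.Alias] = true
		old, tracked := known[t.Alias]
		if tracked && old == t.MySQLAddr {
			continue // already converged for this tablet
		}
		if tracked {
			toForget = append(toForget, old) // tablet moved to a new address
		}
		toAdd = append(toAdd, t.MySQLAddr)
	}
	for alias, addr := range known {
		if !seen[alias] {
			toForget = append(toForget, addr) // tablet record was deleted
		}
	}
	sort.Strings(toAdd)
	sort.Strings(toForget)
	return toAdd, toForget
}

func main() {
	known := map[string]string{"zone1-100": "db1:3306", "zone1-101": "db2:3306"}
	desired := []TabletRecord{
		{Alias: "zone1-100", MySQLAddr: "db1:3306"},
		{Alias: "zone1-102", MySQLAddr: "db3:3306"},
	}
	toAdd, toForget := Reconcile(desired, known)
	fmt.Println("add:", toAdd, "forget:", toForget)
}
```

Because the function only looks at the difference between the two sets, a poll that was interrupted halfway needs no special handling: the next poll sees whatever is still different and converges it.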

Details

  • Replica info is currently provided in the config file. This will eventually be fetched from vttablets.
  • Improved logging to print file and line. We'll need to eventually merge this with the cloud-friendly vitess logging.
  • Removed the additional filtering from analysis. We want to monitor all instances.
  • cluster_alias_override is not used because vitess is the authority here.
  • For now, we use SuggestedClusterAlias as the authoritative source. We'll need to eventually change this to ClusterName. But there are places where the cluster name is inferred from the instance key.
  • ChangeMasterTo always sets the credentials, which deprecates ChangeMasterCredentials.
  • There is a logic/tablet_discover.go and an inst/tablet_dao.go. The line between the two is not well drawn. We'll need to fix this forward.
  • In topology_recovery, I had to comment out a call to inst.BeginDowntime of the dead master. In the case of vitess, we should wire it back to the cluster when it comes back up. But this downtime was preventing it.
  • LockShard is sprayed in a few places, but it's not watertight yet.
  • During recovery, once the new candidate is identified, we first call ChangeType(MASTER) on the tablet. If anything fails after this, this master status is used to authoritatively finish anything that was incomplete.
  • Comaster situation becomes obsolete with the new approach: A new analysis called MasterHasMaster has been created, which is evaluated based on whether the tablet record is the master.
  • An electNewMaster function has been added to elect a brand new master if no previous master existed. This replaces the vitess InitShardMaster.
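
To illustrate how marking the tablet record as master first lets a later analysis finish an interrupted recovery, here is a minimal sketch. It assumes a reduced state with only two inputs: the type stored in the tablet record and the replication source MySQL reports. Only the name MasterHasMaster comes from the description above; `TabletState`, `Analyze`, and the other analysis codes are hypothetical.

```go
package main

import "fmt"

// TabletType is a hypothetical, reduced version of the tablet types in the topo.
type TabletType int

const (
	Replica TabletType = iota
	Master
)

// TabletState pairs what the tablet record says with what MySQL reports.
type TabletState struct {
	RecordType      TabletType // type stored in the tablet record (authoritative)
	ReplicatingFrom string     // MySQL's replication source, "" if none
}

// Analyze returns an analysis code for one tablet. Only MasterHasMaster is
// named in the PR description; the other codes are made up for this sketch.
func Analyze(s TabletState) string {
	switch {
	case s.RecordType == Master && s.ReplicatingFrom != "":
		// The record says master, but MySQL still replicates from someone:
		// a promotion was started and not finished.
		return "MasterHasMaster"
	case s.RecordType == Replica && s.ReplicatingFrom == "":
		return "ReplicaHasNoSource"
	default:
		return "NoProblem"
	}
}

func main() {
	// Recovery called ChangeType(MASTER) and then failed before MySQL was
	// reconfigured; the next analysis round reports the leftover work.
	interrupted := TabletState{RecordType: Master, ReplicatingFrom: "db1:3306"}
	fmt.Println(Analyze(interrupted))
}
```

The design choice this shows is that the tablet record, not the MySQL replication graph, decides which side of a mismatch is wrong.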

sougou added 12 commits August 21, 2020 20:02
Signed-off-by: Sugu Sougoumarane <[email protected]>

@shlomi-noach shlomi-noach left a comment

Please see comments/questions inline.

Overall everything here makes sense and seems like a pretty smooth integration; that is, it feels like the code changes belong here and play well with existing logic. Good work!

go/vt/orchestrator/db/generate_base.go (outdated, resolved)
go/vt/orchestrator/external/golib/log/log.go (outdated, resolved)
go/vt/orchestrator/external/golib/log/log.go (resolved)
go/vt/orchestrator/inst/analysis.go (outdated, resolved)
@@ -116,12 +125,16 @@ const (
 type ReplicationAnalysis struct {
 	AnalyzedInstanceKey       InstanceKey
 	AnalyzedInstanceMasterKey InstanceKey
 	TabletType                topodatapb.TabletType
 	MasterTimeStamp           time.Time
Contributor

What does this timestamp stand for?

Contributor Author

This was our first step towards a consensus protocol. Every newly elected master saves its timestamp. If the previous master did not get properly demoted (maybe it wasn't reachable), then we use the timestamp to resolve who the real master is.

I'm not happy that we currently rely on a timestamp because rogue clocks can mess things up. We should eventually move towards a system that doesn't depend on the clock.
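
As an illustration of the timestamp rule described in this reply, the sketch below picks the real master among several tablets that all claim the role by taking the most recent election timestamp. `MasterClaim`, `claimAt`, and `ResolveMaster` are hypothetical names, not identifiers from this PR, and the approach inherits the clock dependency noted above.

```go
package main

import (
	"fmt"
	"time"
)

// MasterClaim is a hypothetical record of a tablet that claims to be master,
// together with the timestamp saved when it was elected.
type MasterClaim struct {
	Alias           string
	MasterTimestamp time.Time
}

// claimAt is a small helper that builds a claim from a Unix timestamp.
func claimAt(alias string, unixSeconds int64) MasterClaim {
	return MasterClaim{Alias: alias, MasterTimestamp: time.Unix(unixSeconds, 0)}
}

// ResolveMaster returns the claim with the latest timestamp.
// ok is false when there are no claims; on a tie the earlier entry wins.
func ResolveMaster(claims []MasterClaim) (winner MasterClaim, ok bool) {
	for _, c := range claims {
		if !ok || c.MasterTimestamp.After(winner.MasterTimestamp) {
			winner, ok = c, true
		}
	}
	return winner, ok
}

func main() {
	// The old master was unreachable and never demoted, so two tablets
	// claim the role; the later election wins.
	claims := []MasterClaim{
		claimAt("zone1-100", 1598200000),
		claimAt("zone1-101", 1598203600),
	}
	if m, ok := ResolveMaster(claims); ok {
		fmt.Println("real master:", m.Alias)
	}
}
```

A skewed clock on the older master could make its timestamp look newer, which is the weakness the comment above points out.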

go/vt/orchestrator/logic/topology_recovery.go (resolved)
go/vt/orchestrator/logic/topology_recovery.go (resolved)
go/vt/orchestrator/logic/topology_recovery.go (resolved)
go/vt/orchestrator/logic/topology_recovery.go (resolved)
		if !inst.IsBannedFromBeingCandidateReplica(replica) {
			candidate = replica
		}
	}
Contributor

A bit too much if-else-if-else. Can you please explain the logic in words?

Contributor Author

I've refactored this. LMK if the new code reads better.

sougou added 2 commits August 24, 2020 15:14
Signed-off-by: Sugu Sougoumarane <[email protected]>
	}
	refreshTabletsUsing(func(instanceKey *inst.InstanceKey) {
		_ = inst.InjectSeed(instanceKey)
	})
Contributor

This should work well upon startup

	}
	refreshTabletsUsing(func(instanceKey *inst.InstanceKey) {
		_ = inst.InjectSeed(instanceKey)
	})
	// TODO(sougou): parameterize poll interval.
	return time.Tick(15 * time.Second) //nolint SA1015: using time.Tick leaks the underlying ticker
}

// RefreshTablets reloads the tablets from topo.
func RefreshTablets() {
Contributor

What's the purpose/usage of this function?

Contributor Author

This is called from the main discovery loop. It polls for new/deleted tablets, and updates Orc accordingly.

Contributor

Got it, makes sense. One thing to note is that this causes a synchronous topology read on all tablets, which may not be what you wanted. Consider calling DiscoverInstance() instead, which queues the instance for discovery and avoids double-probing it within a preconfigured timeslice.
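
The queueing behaviour suggested in this comment can be pictured with the sketch below: a request to probe an instance is accepted only if the same instance was not already queued within the current timeslice. This is not how orchestrator's DiscoverInstance is implemented; `DiscoveryQueue`, `Enqueue`, `Drain`, `at`, and the five-second `defaultTimeslice` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTimeslice is an assumed value, chosen only for this example.
const defaultTimeslice = 5 * time.Second

// DiscoveryQueue is a hypothetical queue that refuses to queue the same
// instance twice within one timeslice.
type DiscoveryQueue struct {
	mu         sync.Mutex
	timeslice  time.Duration
	lastQueued map[string]time.Time
	pending    []string
}

// NewDiscoveryQueue returns an empty queue with the given timeslice.
func NewDiscoveryQueue(timeslice time.Duration) *DiscoveryQueue {
	return &DiscoveryQueue{timeslice: timeslice, lastQueued: make(map[string]time.Time)}
}

// Enqueue queues key for probing and reports whether it was accepted.
// A key queued less than one timeslice before now is rejected.
func (q *DiscoveryQueue) Enqueue(key string, now time.Time) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if last, ok := q.lastQueued[key]; ok && now.Sub(last) < q.timeslice {
		return false
	}
	q.lastQueued[key] = now
	q.pending = append(q.pending, key)
	return true
}

// Drain returns the queued keys and empties the queue; a worker would
// probe each returned instance asynchronously.
func (q *DiscoveryQueue) Drain() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	keys := q.pending
	q.pending = nil
	return keys
}

// at converts a number of seconds into a time.Time for the example.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	q := NewDiscoveryQueue(defaultTimeslice)
	fmt.Println(q.Enqueue("db1:3306", at(0))) // accepted
	fmt.Println(q.Enqueue("db1:3306", at(2))) // rejected: same timeslice
	fmt.Println(q.Enqueue("db1:3306", at(6))) // accepted: timeslice elapsed
}
```

The caller returns immediately in every case, so a poll over many tablets does not block on one slow MySQL instance.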

@@ -59,7 +59,7 @@ func OpenTabletDiscovery() <-chan time.Time {
 // RefreshTablets reloads the tablets from topo.
 func RefreshTablets() {
 	refreshTabletsUsing(func(instanceKey *inst.InstanceKey) {
-		_, _ = inst.ReadTopologyInstance(instanceKey)
+		DiscoverInstance(*instanceKey)
Contributor

🤞 let's see how this goes!

@harshit-gangal harshit-gangal left a comment

Code walkthrough and getting all the TODOs makes code read better.

@shlomi-noach shlomi-noach left a comment

Also LGTM

@sougou sougou merged commit 187be0a into vitessio:master Sep 3, 2020
@shlomi-noach shlomi-noach deleted the ss-oc2-vt-mode branch September 3, 2020 03:58
@askdba askdba added this to the v8.0 milestone Oct 6, 2020