vtctlclient Validate claims that slaves aren't replicating although they are.
Running this command:
$ vtctlclient -server :15999 Validate -ping-tablets
The output includes pairs of error lines like this for each slave:
E0117 13:36:01.812848 3926 main.go:60] E0117 13:36:01.812707 validator.go:52] slave hostname.example.com not in replication graph for shard ks1/80- (mysql instance without vttablet?)
E0117 13:36:01.814426 3926 main.go:60] E0117 13:36:01.814245 validator.go:52] slave cell1-0000001234 not replicating: 192.168.0.1 slave list: ["hostname.example.com" (others elided...)]
As far as I can tell, the code around https://github.com/vitessio/vitess/blob/850574e0f70/go/vt/wrangler/validator.go#L203 assumes that GetSlaves returns a list of IP addresses. Those come from FindSlaves in https://github.com/vitessio/vitess/blob/850574e0f70/go/vt/mysqlctl/replication.go#L274
Although that's documented as "FindSlaves gets IP addresses for all currently connected slaves", what it actually does is run show processlist on the master, find rows whose Command is like 'Binlog Dump%', and strip the port off the Host column. Poking around various databases here, the Host column generally contains hostnames rather than IP addresses, so GetSlaves returns hostnames.
Back in validator.go, tabletIPMap is keyed by IP address, so the check if tabletIPMap[normalizeIP(slaveAddr)] == nil always succeeds: normalizeIP(slaveAddr) yields a hostname, which is never a key in the map, so the lookup always returns nil. I guess FindSlaves should be fixed to return IP addresses, as documented.
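To make the suggestion concrete, here is a minimal sketch of resolving the processlist Host values to IP addresses before they are compared. This is not the Vitess code: hostFromProcesslist, resolveSlaveAddrs and lookupFunc are hypothetical names made up for illustration, and the sketch assumes the Host column has the form host:port. Only the Go standard library is used.

```go
package main

import (
	"fmt"
	"net"
)

// lookupFunc has the same signature as net.LookupHost, so a fake
// resolver can be substituted in tests. (Hypothetical helper type.)
type lookupFunc func(host string) ([]string, error)

// hostFromProcesslist strips the port from a SHOW PROCESSLIST Host
// value such as "hostname.example.com:51234". (Hypothetical helper.)
func hostFromProcesslist(hostPort string) string {
	host, _, err := net.SplitHostPort(hostPort)
	if err != nil {
		// No port present (or unparseable): use the value as is.
		return hostPort
	}
	return host
}

// resolveSlaveAddrs converts processlist Host values into IP addresses.
// Values that are already IPs are passed through; hostnames are resolved
// with the supplied lookup function. (Hypothetical helper.)
func resolveSlaveAddrs(hostPorts []string, lookup lookupFunc) ([]string, error) {
	var addrs []string
	for _, hp := range hostPorts {
		host := hostFromProcesslist(hp)
		if net.ParseIP(host) != nil {
			addrs = append(addrs, host)
			continue
		}
		ips, err := lookup(host)
		if err != nil {
			return nil, fmt.Errorf("cannot resolve slave host %q: %v", host, err)
		}
		// A hostname may resolve to several addresses; keep them all.
		addrs = append(addrs, ips...)
	}
	return addrs, nil
}

func main() {
	// The inputs here are already IPs, so no DNS lookup is triggered.
	addrs, err := resolveSlaveAddrs([]string{"192.168.0.1:51234"}, net.LookupHost)
	fmt.Println(addrs, err)
}
```

One open question with this approach is what to do when a hostname resolves to several addresses. The sketch keeps all of them, on the assumption that the caller only needs one of them to match a tablet's IP.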
slanning pushed a commit to slanning/vitess that referenced this issue on Feb 6, 2019