(wr *Wrangler) ShardReplicationStatuses hangs forever #4572

Closed
derekperkins opened this issue Jan 30, 2019 · 1 comment

Comments

@derekperkins
Member

I'm indirectly using ShardReplicationStatuses via BackupShard, and I found a case where it hangs forever. I spun up some rdonly tablets temporarily, but when I deleted them, their tablet records weren't removed. That left references with bad host names. I would expect the call to return an error saying that it couldn't contact one of the tablets, but instead it hangs until the context timeout hits, which is a couple of hours by default for the backup call.

		wg.Add(1)
		go func(i int, ti *topo.TabletInfo) {
			defer wg.Done()
			status, err := wr.tmc.SlaveStatus(ctx, ti.Tablet)
			if err != nil {
				rec.RecordError(fmt.Errorf("SlaveStatus(%v) failed: %v", ti.AliasString(), err))
				return
			}
			result[i] = status
		}(i, ti)
	}
}
wg.Wait()

It looked like the actual backup was timing out, which led me to raise action_timeout to 6 hours, and that made the problem harder to diagnose. It would have been much nicer to see an error about not being able to connect to a tablet to get the replication status. I haven't yet dug into the SlaveStatus interface call to see where that is set.

I'm not sure if tweaking the behavior of that call is appropriate or if it will have more cascading effects elsewhere that expect it to wait forever.

@ajm188
Contributor

ajm188 commented Jun 24, 2022

This is a dupe of #4073, and I fixed this in #7690, so going to close!

@ajm188 ajm188 closed this as completed Jun 24, 2022