Bug Report: VReplicationStreamState falls out of sync with --workflow status on resume #15337
Overview of the Issue
In a VReplication workflow (e.g., MoveTables), the --workflow status and VReplicationStreamState statuses fall out of sync when resuming the workflow after an interruption:

| | --workflow status | VReplicationStreamState |
|---|---|---|
| --workflow create | Copying | Copying |
| --workflow stop | Stopped | Stopped |
| --workflow start | Copying | Running |
| (after the copy finishes) | Running | Running |
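
The observations above can be written down as data to make the divergent row explicit. The following is only a minimal sketch: the `Observation` type and `find_mismatches` helper are hypothetical names invented for illustration (they are not part of Vitess), the state values are transcribed by hand from this report rather than queried from a cluster, and the label on the final row is an assumption based on the optional last reproduction step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    command: str          # workflow action that preceded the observation
    workflow_status: str  # state reported by --workflow status
    stream_state: str     # state reported by VReplicationStreamState


# Values transcribed from this report; the last row's label is descriptive only.
OBSERVATIONS = [
    Observation("--workflow create", "Copying", "Copying"),
    Observation("--workflow stop", "Stopped", "Stopped"),
    Observation("--workflow start", "Copying", "Running"),
    Observation("(copy finished)", "Running", "Running"),
]


def find_mismatches(observations):
    """Return the observations whose two reported states differ."""
    return [o for o in observations if o.workflow_status != o.stream_state]


if __name__ == "__main__":
    for o in find_mismatches(OBSERVATIONS):
        print(
            f"{o.command}: --workflow status={o.workflow_status}, "
            f"VReplicationStreamState={o.stream_state}"
        )
```

Run against the values in this report, the only row flagged is the one following --workflow start, where --workflow status reports Copying and VReplicationStreamState reports Running.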
This Copying vs. Running mismatch in VReplicationStreamState when restarting a workflow can lead to faulty assumptions in monitoring and reporting (e.g., thinking copying is done when it really isn't).
Reproduction Steps
1. Spin up a new cluster.
2. Insert enough data that VReplication takes long enough to capture stats, e.g., with an insert that doubles the rows on every run.[1] 20,971,520 rows is enough for our purposes.
3. Spin up additional tablets in preparation for VReplication.
4. Begin the VReplication.
5. Observe the following statuses:
   a. --workflow status
   b. VReplicationStreamState
   Expected: The states match (Copying and Copying).
6. Stop the workflow.
7. Resume the workflow.
8. Observe the statuses again:
   a. --workflow status
   b. VReplicationStreamState
   VReplicationStreamState says Running when --workflow status is still Copying.
9. (Optional) Allow the copy to finish (i.e., wait a few minutes while the workflow is running). Observe that both statuses are Running, as expected.
Binary Version
vtgate version Version: 20.0.0-SNAPSHOT (Git revision 27be9166e1ace2708a158e9faf220cf156569e50 branch 'main') built on Thu Feb 22 17:00:55 PST 2024 by tyler@local using go1.22.0 darwin/amd64
Operating System and Environment details
Log Fragments
Footnotes
1. Thanks @maxenglander!