sql: inconsistent scan can return error cannot specify timestamp older than 5m0s for this operation if cluster is overloaded #100304

Open
renatolabs opened this issue Mar 31, 2023 · 3 comments
Labels
A-sql-table-stats Table statistics (and their automatic refresh). C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. E-quick-win Likely to be a quick win for someone experienced. O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs P-3 Issues/test failures with no fix SLA T-sql-queries SQL Queries Team

Comments

@renatolabs
Contributor

renatolabs commented Mar 31, 2023

The following error can be observed in a cluster that is under a lot of load:

jobs/registry.go:1520 ⋮ [T1,n1] 761  AUTO CREATE STATS job 852625652397473796: stepping through state reverting with error: AS OF SYSTEM TIME: cannot specify timestamp older than 5m0s for this operation
jobs/registry.go:1520 ⋮ [T1,n1] 761 +(1) attached stack trace
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  -- stack trace:
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | github.com/cockroachdb/cockroach/pkg/sql/row.(*Fetcher).StartInconsistentScan
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         github.com/cockroachdb/cockroach/pkg/sql/row/fetcher.go:580
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*tableReader).startScan
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         github.com/cockroachdb/cockroach/pkg/sql/rowexec/tablereader.go:219
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*tableReader).Next
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         github.com/cockroachdb/cockroach/pkg/sql/rowexec/tablereader.go:257
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*samplerProcessor).mainLoop
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         github.com/cockroachdb/cockroach/pkg/sql/rowexec/sampler.go:254
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*samplerProcessor).Run
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         github.com/cockroachdb/cockroach/pkg/sql/rowexec/sampler.go:228
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | github.com/cockroachdb/cockroach/pkg/sql/flowinfra.(*FlowBase).StartInternal.func1
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         github.com/cockroachdb/cockroach/pkg/sql/flowinfra/flow.go:510
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  | runtime.goexit
jobs/registry.go:1520 ⋮ [T1,n1] 761 +  |         GOROOT/src/runtime/asm_amd64.s:1594

This seems easy to reproduce by setting up a cluster and running tpcc with more warehouses than the cluster can comfortably handle. For reference, I saw a couple of occurrences of this error in a 4-node (n1-standard-4) cluster with 200 warehouses.

Likely not relevant, but worth pointing out: the error above happened while the cluster was in a mixed-version state. It has also been seen in roachtests before (coincidentally or not, in another mixed-version test): #93623 (comment)

Jira issue: CRDB-26361

@renatolabs renatolabs added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team labels Mar 31, 2023
@RaduBerinde
Member

I think the check in StartInconsistentScan that generates this error can simply be removed. The txn will be recreated as needed by sendFn.

@michae2 michae2 added the A-sql-table-stats Table statistics (and their automatic refresh). label Mar 31, 2023
@mgartner mgartner added the E-quick-win Likely to be a quick win for someone experienced. label Apr 4, 2023
@mgartner mgartner moved this to Bugs to Fix in SQL Queries Jul 24, 2023
@inata4 inata4 added the O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs label May 31, 2024
@mgartner
Collaborator

mgartner commented Jun 5, 2024

@inata4 I see you added the support label. Do you remember what ticket this issue was related to?

@mgartner mgartner added the P-3 Issues/test failures with no fix SLA label Jun 5, 2024
@inata4
Collaborator

inata4 commented Jun 6, 2024

I think we saw it in https://cockroachdb.zendesk.com/agent/tickets/21535

Projects
Status: Bugs to Fix
Development

No branches or pull requests

5 participants