Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracing: panic due to "getting first from empty ring buffer" #75045

Closed
cockroach-teamcity opened this issue Jan 18, 2022 · 6 comments · Fixed by #75099
Closed

tracing: panic due to "getting first from empty ring buffer" #75045

cockroach-teamcity opened this issue Jan 18, 2022 · 6 comments · Fixed by #75099
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. T-kv KV Team T-sql-queries SQL Queries Team

Comments

@cockroach-teamcity
Copy link
Member

sql/tests.TestCreateRandomSchema failed with artifacts on master @ 365b4da8bd02c06ee59d2130a56dec74ffc9ce21:

Fatal error:

panic: getting first from empty ring buffer

Stack:

goroutine 135566 [running]:
github.com/cockroachdb/cockroach/pkg/util/ring.(*Buffer).GetFirst(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/ring/ring_buffer.go:69
github.com/cockroachdb/cockroach/pkg/util/tracing.(*crdbSpan).recordInternalLocked(0xc003c48950, {0x5f7d9a0, 0xc007a92340}, 0xc003c48a28)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/tracing/crdbspan.go:478 +0x205
github.com/cockroachdb/cockroach/pkg/util/tracing.(*crdbSpan).recordFinishedChildrenLocked(0xc003c48950, {0xc00438ab40, 0x1, 0x44e5ad})
	/go/src/github.com/cockroachdb/cockroach/pkg/util/tracing/crdbspan.go:380 +0x29e
github.com/cockroachdb/cockroach/pkg/util/tracing.(*crdbSpan).recordFinishedChildren(0xb6cad5, {0xc00438ab40, 0x45e09a0, 0x8ed1798})
	/go/src/github.com/cockroachdb/cockroach/pkg/util/tracing/crdbspan.go:360 +0xbe
github.com/cockroachdb/cockroach/pkg/util/tracing.(*spanInner).ImportRemoteSpans(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/tracing/span_inner.go:80
github.com/cockroachdb/cockroach/pkg/util/tracing.(*Span).ImportRemoteSpans(0xc003c48900, {0xc00438ab40, 0x1, 0x3})
	/go/src/github.com/cockroachdb/cockroach/pkg/util/tracing/span.go:180 +0x4e
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLReceiver).pushMeta(0xc001faf500, 0xc005e5f1f0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:967 +0x33e
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLReceiver).Push(0xc001faf500, {0x0, 0x0, 0x0}, 0x2dc5cc0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1047 +0x29c
github.com/cockroachdb/cockroach/pkg/sql/execinfra.Run({0x601b540, 0xc0024c9500}, {0x6044858, 0xc005d7ed00}, {0x5f9d4b0, 0xc001faf500})
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/base.go:189 +0x86
github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBaseNoHelper).Run(0xc005d7ed00, {0x601b540, 0xc0024c9500})
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:733 +0x5b
github.com/cockroachdb/cockroach/pkg/sql/flowinfra.(*FlowBase).Run(0xc003b0c1a0, {0x601b540, 0xc0024c9500}, 0xc005d7ed00)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/flowinfra/flow.go:453 +0x258
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedFlow).Run(0xc00879fc80, {0x601b540, 0xc0024c9500}, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:246 +0x205
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).Run(0xc008982c00, 0xc0082260f0, 0xc004ea9ce0, 0xc005943100, 0xc001faf500, 0xc003c48c00, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:563 +0xb56
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).planAndRunSubquery(0xc008982c00, {0x6037620, 0xc003c48aa0}, 0x0, {{0x5f7ce20, 0xc0080997d0}, 0x2, 0x1, 0x0, {{0x601db88, ...}, ...}, ...}, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1269 +0x7b7
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRunSubqueries(0x6037620, {0x6037620, 0xc003c48aa0}, 0x1, 0xc00bfac7c0, {0xc00c25ecd0, 0x3, 0x1}, 0xc004ea9ce0, 0xc00bfac788)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1174 +0x17b
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithDistSQLEngine(0xc001c0ec00, {0x6037620, 0xc003c48aa0}, 0xc001c0f248, 0xc003c48aa0, {0x60bdd48, 0xc0087412f0}, 0x0, 0xc0083f2058)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1398 +0x566
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine(0xc001c0ec00, {0x6037620, 0xc003c48aa0}, 0xc001c0f248, {0x60bdd48, 0xc0087412f0})
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1084 +0x9c7
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState(0xc001c0ec00, {0x6037620, 0xc003c48aa0}, {{0x604b418, 0xc00384b180}, {0xc002ed0000, 0xbc30}, 0x100, 0x100}, 0xc0079b3200, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:721 +0x2526
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt(0xc001c0ec00, {0x6037620, 0xc003c487a0}, {{0x604b418, 0xc00384b180}, {0xc002ed0000, 0xbc30}, 0x100, 0x100}, 0xc0079b3200, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:135 +0x572
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execPortal(0xc001c0ec00, {0x6037620, 0xc003c487a0}, {0xc0079b3200, {0xc003bdd000, 0x100, 0x100}, {0x0, 0x0, 0x0}, ...}, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:222 +0x158
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd.func2({{0x0, 0x0}, 0x0, {0x24fdf08c, 0xed9787d27, 0x0}}, 0xc001c0ec00, 0xc00bfadbb8, 0xc00bfadba8, 0xc00a0e3ac8, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1820 +0x511
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc001c0ec00, {0x601b540, 0xc007346040})
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1822 +0x43a
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc001c0ec00, {0x6037620, 0xc0010650a0}, 0x70700c000074108, {0x1066160, 0xc001b5db00, 0x0}, 0xc005a2a900)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1676 +0x285
github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx.func1()
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:205 +0x8d
created by github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:204 +0x5f1
Log preceding fatal error

=== RUN   TestCreateRandomSchema
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCreateRandomSchema71574895
    test_log_scope.go:80: use -show-logs to present logs inline

Help

See also: How To Investigate a Go Test Failure (internal)
Parameters in this failure:

  • GOFLAGS=-parallel=4

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Jan 18, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Jan 18, 2022
@cockroach-teamcity
Copy link
Member Author

sql/tests.TestCreateRandomSchema failed with artifacts on master @ 365b4da8bd02c06ee59d2130a56dec74ffc9ce21:

      /go/src/github.com/cockroachdb/cockroach/pkg/server/node.go:446 +0x1219
  github.com/cockroachdb/cockroach/pkg/server.(*Server).PreStart()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/server.go:1669 +0x310c
  github.com/cockroachdb/cockroach/pkg/server.(*Server).Start()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/server.go:1251 +0x44
  github.com/cockroachdb/cockroach/pkg/server.(*TestServer).Start()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/testserver.go:500 +0x64
  github.com/cockroachdb/cockroach/pkg/testutils/serverutils.StartServer()
      /go/src/github.com/cockroachdb/cockroach/pkg/testutils/serverutils/test_server_shim.go:241 +0x17b
  github.com/cockroachdb/cockroach/pkg/sql/tests_test.TestCreateRandomSchema()
      /go/src/github.com/cockroachdb/cockroach/pkg/sql/tests/random_schema_test.go:43 +0x1ac
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /usr/local/go/src/testing/testing.go:1306 +0x47

Goroutine 1046 (running) created at:
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
      /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:478 +0x615
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
      /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:354 +0x13d
  github.com/cockroachdb/cockroach/pkg/spanconfig/spanconfigkvsubscriber.(*KVSubscriber).Start()
      /go/src/github.com/cockroachdb/cockroach/pkg/spanconfig/spanconfigkvsubscriber/kvsubscriber.go:177 +0x26
  github.com/cockroachdb/cockroach/pkg/server.(*Server).PreStart()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/server.go:1801 +0x41fa
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTask()
      /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:327 +0x161
  github.com/cockroachdb/cockroach/pkg/server.(*Node).writeNodeStatus()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/node.go:834 +0x104
  github.com/cockroachdb/cockroach/pkg/server.(*Node).startWriteNodeStatus()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/node.go:794 +0xae
  github.com/cockroachdb/cockroach/pkg/server.(*Server).PreStart()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/server.go:1796 +0x416f
  github.com/cockroachdb/cockroach/pkg/server.(*Server).Start()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/server.go:1251 +0x44
  github.com/cockroachdb/cockroach/pkg/server.(*TestServer).Start()
      /go/src/github.com/cockroachdb/cockroach/pkg/server/testserver.go:500 +0x64
  github.com/cockroachdb/cockroach/pkg/testutils/serverutils.StartServer()
      /go/src/github.com/cockroachdb/cockroach/pkg/testutils/serverutils/test_server_shim.go:241 +0x17b
  github.com/cockroachdb/cockroach/pkg/sql/tests_test.TestCreateRandomSchema()
      /go/src/github.com/cockroachdb/cockroach/pkg/sql/tests/random_schema_test.go:43 +0x1ac
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /usr/local/go/src/testing/testing.go:1306 +0x47
==================
    random_schema_test.go:110: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCreateRandomSchema1421317792
    testing.go:1152: race detected during execution of test
--- FAIL: TestCreateRandomSchema (43.95s)
Help

See also: How To Investigate a Go Test Failure (internal)
Parameters in this failure:

  • GOFLAGS=-race -parallel=4

This test on roachdash | Improve this report!

@irfansharif
Copy link
Contributor

Actually, was only interested in the last failure, the first looks unrelated ("panic: getting first from empty ring buffer").

@irfansharif irfansharif reopened this Jan 18, 2022
@irfansharif
Copy link
Contributor

+cc @andreimatei, might have something to do with tracing.

@andreimatei
Copy link
Contributor

Also seen in
#74994
#68395 (comment)
#73610 (comment)
all from the past few hours.

I can repro with this test under stress. Somehow the maintained size of the structured recording buffer seems to get desynchronized from the buffer's contents. Nothing in this area seems to have changed recently, so I'm not sure why we're seeing these failures all of a sudden.

@rytaft rytaft changed the title sql/tests: TestCreateRandomSchema failed tracing: panic due to "getting first from empty ring buffer" Jan 18, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Jan 18, 2022
@andreimatei
Copy link
Contributor

Surprisingly or not, this bisected to #73876.
I'm not sure yet what's going on. It looks like a proto that's passed Span.RecordStructured is reused somehow.

andreimatei added a commit to andreimatei/cockroach that referenced this issue Jan 18, 2022
This patch fixes a bug leading to corruption of one of the Span's
recording buffers.
We had broken code like the following:
```
for _, e := range c.StructuredRecords {
        s.recordInternalLocked(&e, &s.mu.recording.structured)
}
```
`recordInternalLocked(ptr)` takes ownership of ptr, so the `&e` is
broken because `e` keep being reassigned.

The bug was causing buffer elements to essentially be overwritten. In
turn, this was leading to inconsistencies in the accounting of the size
of the buffer's elements - we were maintaining a size sum that was
diverging from the overwritten reality.

Fixes cockroachdb#74994
Fixes cockroachdb#75045

Release note: None
@andreimatei
Copy link
Contributor

Fixing in #75099

craig bot pushed a commit that referenced this issue Jan 20, 2022
75099: tracing: fix corruption of structured recording buffer r=andreimatei a=andreimatei

This patch fixes a bug leading to corruption of one of the Span's
recording buffers.
We had broken code like the following:
```
for _, e := range c.StructuredRecords {
        s.recordInternalLocked(&e, &s.mu.recording.structured)
}
```
`recordInternalLocked(ptr)` takes ownership of ptr, so the `&e` is
broken because `e` keep being reassigned.

The bug was causing buffer elements to essentially be overwritten. In
turn, this was leading to inconsistencies in the accounting of the size
of the buffer's elements - we were maintaining a size sum that was
diverging from the overwritten reality.

Fixes #74994
Fixes #75045

Release note: None

Co-authored-by: Andrei Matei <[email protected]>
@craig craig bot closed this as completed in 5a1adf7 Jan 20, 2022
gtr pushed a commit to gtr/cockroach that referenced this issue Jan 24, 2022
This patch fixes a bug leading to corruption of one of the Span's
recording buffers.
We had broken code like the following:
```
for _, e := range c.StructuredRecords {
        s.recordInternalLocked(&e, &s.mu.recording.structured)
}
```
`recordInternalLocked(ptr)` takes ownership of ptr, so the `&e` is
broken because `e` keep being reassigned.

The bug was causing buffer elements to essentially be overwritten. In
turn, this was leading to inconsistencies in the accounting of the size
of the buffer's elements - we were maintaining a size sum that was
diverging from the overwritten reality.

Fixes cockroachdb#74994
Fixes cockroachdb#75045

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. T-kv KV Team T-sql-queries SQL Queries Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants