roachtest: ycsb/B/nodes=3/cpu=32 failed #37380

Closed
cockroach-teamcity opened this issue May 8, 2019 · 3 comments · Fixed by #37401
Labels
C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot.

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/d554884a4e474cc06213230d5ba7d757a88e9e46

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/B/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279552&tab=buildLog

The test failed on branch=release-2.1, cloud=aws:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279552-ycsb-b-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		local/go/src/runtime/proc.go:302 +0xeb fp=0xc0021195c0 sp=0xc0021195a0 pc=0x438f8b
		runtime.selectgo(0xc002119768, 0xc002119748, 0x2, 0x7517000000ffffff, 0x3435393931726573)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc002119720 sp=0xc0021195c0 pc=0x448e66
		database/sql.(*DB).connectionResetter(0xc00057c0c0, 0x19949e0, 0xc001579c40)
			/usr/local/go/src/database/sql/sql.go:1014 +0xfb fp=0xc0021197c8 sp=0xc002119720 pc=0x4f89db
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc0021197d0 sp=0xc0021197c8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:672 +0x193
		bash: line 1:  4492 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

@cockroach-teamcity cockroach-teamcity added this to the 19.2 milestone May 8, 2019
@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. labels May 8, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/d554884a4e474cc06213230d5ba7d757a88e9e46

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/B/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279548&tab=buildLog

The test failed on branch=release-2.1, cloud=gce:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279548-ycsb-b-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		1e8, 0x0, 0x1809, 0x1)
			/usr/local/go/src/runtime/proc.go:302 +0xeb fp=0xc00176bdc0 sp=0xc00176bda0 pc=0x438f8b
		runtime.selectgo(0xc00176bf68, 0xc00176bf48, 0x2, 0x0, 0x0)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc00176bf20 sp=0xc00176bdc0 pc=0x448e66
		database/sql.(*DB).connectionResetter(0xc0005c0180, 0x19949e0, 0xc000d04080)
			/usr/local/go/src/database/sql/sql.go:1014 +0xfb fp=0xc00176bfc8 sp=0xc00176bf20 pc=0x4f89db
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc00176bfd0 sp=0xc00176bfc8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:672 +0x193
		bash: line 1:  4410 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/8abb47a1c9795c1463183bc44e776b054bece682

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/B/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279687&tab=buildLog

The test failed on branch=master, cloud=aws:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279687-ycsb-b-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		sr/local/go/src/runtime/proc.go:302 +0xeb fp=0xc00240add8 sp=0xc00240adb8 pc=0x438f8b
		runtime.selectgo(0xc00240af68, 0xc00240af60, 0x2, 0x37303031242c3337, 0x3537303031242c34)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc00240af38 sp=0xc00240add8 pc=0x448e66
		database/sql.(*DB).connectionOpener(0xc0005c60c0, 0x19949e0, 0xc000a47840)
			/usr/local/go/src/database/sql/sql.go:1001 +0xe8 fp=0xc00240afc8 sp=0xc00240af38 pc=0x4f88a8
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc00240afd0 sp=0xc00240afc8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:671 +0x15d
		bash: line 1:  3855 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/8abb47a1c9795c1463183bc44e776b054bece682

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/B/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279683&tab=buildLog

The test failed on branch=master, cloud=gce:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279683-ycsb-b-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		09, 0x1)
			/usr/local/go/src/runtime/proc.go:302 +0xeb fp=0xc001f565c0 sp=0xc001f565a0 pc=0x438f8b
		runtime.selectgo(0xc001f56768, 0xc001f56748, 0x2, 0xc001fa65c0, 0xc001fa6960)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc001f56720 sp=0xc001f565c0 pc=0x448e66
		database/sql.(*DB).connectionResetter(0xc0018020c0, 0x19949e0, 0xc000ed0c00)
			/usr/local/go/src/database/sql/sql.go:1014 +0xfb fp=0xc001f567c8 sp=0xc001f56720 pc=0x4f89db
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc001f567d0 sp=0xc001f567c8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:672 +0x193
		bash: line 1:  4717 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=B --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

danhhz added a commit to danhhz/cockroach that referenced this issue May 8, 2019
workload's Table schemas are SQL schemas, but cockroachdb#35349 switched the
initial data to be returned as a coldata.Batch, which has a more limited
set of types. (Or, in the case of simple workloads that return a
[]interface{}, it's roundtripped through coldata.Batch by the `Tuples`
helper.) Notably, this means a SQL STRING column is represented the same
as a BYTES column (ditto UUID, etc).

This caused a regression in splits, which received some []byte data for
a column and tried to hand it to SPLIT as a SQL BYTES datum. This didn't
work for the UUID column in tpcc's history table nor the VARCHAR in
ycsb's usertable. Happily, a STRING works for both of these. It also
seems to work for BYTES columns, so it seems like the ambiguity is fine
in this case. When/if someone wants to add a workload that splits a
BYTES primary key column containing non-utf8 data, we may need to
revisit.

A more principled fix would be to get the fidelity back by parsing the
SQL schema, which in fact we do in `importccl.makeDatumFromColOffset`.
However, at the moment, this hack works and avoids the complexity and
the undesirable pkg/sql/parser dep.

Closes cockroachdb#37383
Closes cockroachdb#37382
Closes cockroachdb#37381
Closes cockroachdb#37380
Closes cockroachdb#37379
Closes cockroachdb#37378
Closes cockroachdb#37377
Closes cockroachdb#37393

Release note: None
craig bot pushed a commit that referenced this issue May 8, 2019
37401: workload: fix --splits regression introduced in #35349 r=tbg a=danhhz

workload's Table schemas are SQL schemas, but #35349 switched the
initial data to be returned as a coldata.Batch, which has a more limited
set of types. (Or, in the case of simple workloads that return a
[]interface{}, it's roundtripped through coldata.Batch by the `Tuples`
helper.) Notably, this means a SQL STRING column is represented the same
as a BYTES column (ditto UUID, etc).

This caused a regression in splits, which received some []byte data for
a column and tried to hand it to SPLIT as a SQL BYTES datum. This didn't
work for the UUID column in tpcc's history table nor the VARCHAR in
ycsb's usertable. Happily, a STRING works for both of these. It also
seems to work for BYTES columns, so it seems like the ambiguity is fine
in this case. When/if someone wants to add a workload that splits a
BYTES primary key column containing non-utf8 data, we may need to
revisit.

A more principled fix would be to get the fidelity back by parsing the
SQL schema, which in fact we do in `importccl.makeDatumFromColOffset`.
However, at the moment, this hack works and avoids the complexity and
the undesirable pkg/sql/parser dep.

Closes #37383
Closes #37382
Closes #37381
Closes #37380
Closes #37379
Closes #37378
Closes #37377
Closes #37393

Release note: None

Co-authored-by: Daniel Harrison <[email protected]>
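
For readers skimming the fix description above, here is a rough Go sketch of the idea it describes: a split point that arrives as raw []byte is rendered as a SQL STRING literal rather than as a BYTES datum. This is only an illustration under stated assumptions, not the code from #37401. The helper names `sqlStringLiteral` and `splitStmt` are hypothetical, and the sketch assumes the key bytes are valid UTF-8, which is the same caveat the commit message raises for BYTES primary keys.

```go
package main

import (
	"fmt"
	"strings"
)

// sqlStringLiteral renders raw bytes as a single-quoted SQL STRING literal,
// doubling any embedded single quotes. Hypothetical helper for illustration;
// it assumes b holds valid UTF-8.
func sqlStringLiteral(b []byte) string {
	return "'" + strings.ReplaceAll(string(b), "'", "''") + "'"
}

// splitStmt builds a SPLIT AT statement for a single-column key.
// Hypothetical helper; the table name is interpolated as-is, so it must
// come from trusted code, not user input.
func splitStmt(table string, key []byte) string {
	return fmt.Sprintf("ALTER TABLE %s SPLIT AT VALUES (%s)", table, sqlStringLiteral(key))
}

func main() {
	// The same []byte could have come from a STRING, VARCHAR, UUID, or BYTES
	// column; the sketch always emits it as a STRING literal.
	fmt.Println(splitStmt("usertable", []byte("user1000")))
}
```

Treating everything as a STRING keeps the split code free of any schema parsing, at the cost of not handling non-UTF-8 BYTES keys.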
@craig craig bot closed this as completed in #37401 May 8, 2019