roachtest: ycsb/A/nodes=3/cpu=32 failed #37378

Closed
cockroach-teamcity opened this issue May 8, 2019 · 3 comments · Fixed by #37401
Labels
C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot.
Milestone

Comments

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/d554884a4e474cc06213230d5ba7d757a88e9e46

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/A/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279552&tab=buildLog

The test failed on branch=release-2.1, cloud=aws:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279552-ycsb-a-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		15ed1e8, 0x0, 0x1809, 0x1)
			/usr/local/go/src/runtime/proc.go:302 +0xeb fp=0xc0007875d8 sp=0xc0007875b8 pc=0x438f8b
		runtime.selectgo(0xc000787768, 0xc000787760, 0x2, 0x0, 0x0)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc000787738 sp=0xc0007875d8 pc=0x448e66
		database/sql.(*DB).connectionOpener(0xc001ce8180, 0x19949e0, 0xc000e23000)
			/usr/local/go/src/database/sql/sql.go:1001 +0xe8 fp=0xc0007877c8 sp=0xc000787738 pc=0x4f88a8
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc0007877d0 sp=0xc0007877c8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:671 +0x15d
		bash: line 1:  4488 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

@cockroach-teamcity cockroach-teamcity added this to the 19.2 milestone May 8, 2019
@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. labels May 8, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/d554884a4e474cc06213230d5ba7d757a88e9e46

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/A/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279548&tab=buildLog

The test failed on branch=release-2.1, cloud=gce:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279548-ycsb-a-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		r/local/go/src/runtime/proc.go:302 +0xeb fp=0xc0028705c0 sp=0xc0028705a0 pc=0x438f8b
		runtime.selectgo(0xc002870768, 0xc002870748, 0x2, 0xffffffffff353632, 0xffffffffffffffff)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc002870720 sp=0xc0028705c0 pc=0x448e66
		database/sql.(*DB).connectionResetter(0xc000b580c0, 0x19949e0, 0xc00095c340)
			/usr/local/go/src/database/sql/sql.go:1014 +0xfb fp=0xc0028707c8 sp=0xc002870720 pc=0x4f89db
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc0028707d0 sp=0xc0028707c8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:672 +0x193
		bash: line 1:  3933 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/8abb47a1c9795c1463183bc44e776b054bece682

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/A/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279687&tab=buildLog

The test failed on branch=master, cloud=aws:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279687-ycsb-a-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		sr/local/go/src/runtime/proc.go:302 +0xeb fp=0xc000f9cdd8 sp=0xc000f9cdb8 pc=0x438f8b
		runtime.selectgo(0xc000f9cf68, 0xc000f9cf60, 0x2, 0xffffffffffffffff, 0xffffffffffffffff)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc000f9cf38 sp=0xc000f9cdd8 pc=0x448e66
		database/sql.(*DB).connectionOpener(0xc001b7e180, 0x19949e0, 0xc0024f2040)
			/usr/local/go/src/database/sql/sql.go:1001 +0xe8 fp=0xc000f9cfc8 sp=0xc000f9cf38 pc=0x4f88a8
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc000f9cfd0 sp=0xc000f9cfc8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:671 +0x15d
		bash: line 1:  4504 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/8abb47a1c9795c1463183bc44e776b054bece682

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=ycsb/A/nodes=3/cpu=32 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1279683&tab=buildLog

The test failed on branch=master, cloud=gce:
	cluster.go:1474,ycsb.go:41,cluster.go:1812,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1279683-ycsb-a-nodes-3-cpu-32:4 -- ./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m {pgurl:1-3} returned:
		stderr:
		
		stdout:
		09, 0x1)
			/usr/local/go/src/runtime/proc.go:302 +0xeb fp=0xc00048d5c0 sp=0xc00048d5a0 pc=0x438f8b
		runtime.selectgo(0xc00048d768, 0xc00048d748, 0x2, 0xc00048d768, 0xc00048d760)
			/usr/local/go/src/runtime/select.go:313 +0xcc6 fp=0xc00048d720 sp=0xc00048d5c0 pc=0x448e66
		database/sql.(*DB).connectionResetter(0xc0005be240, 0x19949e0, 0xc000fb7b80)
			/usr/local/go/src/database/sql/sql.go:1014 +0xfb fp=0xc00048d7c8 sp=0xc00048d720 pc=0x4f89db
		runtime.goexit()
			/usr/local/go/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc00048d7d0 sp=0xc00048d7c8 pc=0x466f61
		created by database/sql.OpenDB
			/usr/local/go/src/database/sql/sql.go:672 +0x193
		bash: line 1:  4717 Aborted                 (core dumped) bash -c "./workload run ycsb --init --initial-rows=1000000 --splits=100 --workload=A --concurrency=64 --histograms=logs/stats.json --ramp=1m --duration=10m 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable' 'postgres://[email protected]:26257?sslmode=disable'"
		Error:  exit status 134
		: exit status 1
	cluster.go:1833,ycsb.go:44,ycsb.go:65,test.go:1251: Goexit() was called

danhhz added a commit to danhhz/cockroach that referenced this issue May 8, 2019
workload's Table schemas are SQL schemas, but cockroachdb#35349 switched the
initial data to be returned as a coldata.Batch, which has a more limited
set of types. (Or, in the case of simple workloads that return a
[]interface{}, it's roundtripped through coldata.Batch by the `Tuples`
helper.) Notably, this means a SQL STRING column is represented the same
as a BYTES column (ditto UUID, etc).
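The loss of type fidelity described above can be sketched roughly as follows. This is only an illustration: `colType`, `colBytes`, `colInt64`, and `columnarType` are hypothetical names invented here, not the actual `coldata` or `workload` API, and the mapping covers only a handful of SQL types.

```go
package main

import "fmt"

// colType is a hypothetical stand-in for the narrower set of in-memory
// column types; it is not the real coldata type enum.
type colType int

const (
	colBytes colType = iota
	colInt64
)

// columnarType maps a SQL type name onto the narrower in-memory type.
// Several distinct SQL types collapse onto colBytes, so the original SQL
// type cannot be recovered from the in-memory representation alone.
func columnarType(sqlType string) (colType, error) {
	switch sqlType {
	case "STRING", "VARCHAR", "BYTES", "UUID":
		return colBytes, nil
	case "INT":
		return colInt64, nil
	}
	return 0, fmt.Errorf("unsupported SQL type %q", sqlType)
}

func main() {
	for _, t := range []string{"STRING", "BYTES", "UUID", "INT"} {
		ct, err := columnarType(t)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s -> %d\n", t, ct)
	}
}
```

Once a value has passed through a mapping like this, code that only sees a `[]byte` has to guess which SQL type it came from.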

This caused a regression in splits, which received some []byte data for
a column and tried to hand it to SPLIT as a SQL BYTES datum. This didn't
work for the UUID column in tpcc's history table nor the VARCHAR in
ycsb's usertable. Happily, a STRING works for both of these. It also
seems to work for BYTES columns, so it seems like the ambiguity is fine
in this case. When/if someone wants to add a workload that splits a
BYTES primary key column containing non-utf8 data, we may need to
revisit.
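As a minimal sketch of the workaround, assuming a single-column primary key: the raw `[]byte` split point is rendered as a SQL string literal rather than a BYTES datum. The helpers `quoteSQLString` and `splitStmt` are hypothetical and written only for illustration; they are not the code from the fix, which may build the statement differently.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteSQLString renders raw bytes as a SQL string literal, doubling any
// embedded single quotes. Hypothetical helper, assumes the bytes are
// valid UTF-8 text.
func quoteSQLString(b []byte) string {
	return "'" + strings.ReplaceAll(string(b), "'", "''") + "'"
}

// splitStmt builds an ALTER TABLE ... SPLIT AT statement for one split
// point, treating the value as a STRING instead of a BYTES datum.
func splitStmt(table string, splitPoint []byte) string {
	return fmt.Sprintf("ALTER TABLE %s SPLIT AT VALUES (%s)",
		table, quoteSQLString(splitPoint))
}

func main() {
	fmt.Println(splitStmt("usertable", []byte("user500000")))
}
```

The sketch shares the limitation the message calls out: a split point containing non-UTF-8 bytes would not round-trip through a string literal.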

A more principled fix would be to get the fidelity back by parsing the
SQL schema, which in fact we do in `importccl.makeDatumFromColOffset`.
However, at the moment, this hack works and avoids the complexity and
the undesirable pkg/sql/parser dep.

Closes cockroachdb#37383
Closes cockroachdb#37382
Closes cockroachdb#37381
Closes cockroachdb#37380
Closes cockroachdb#37379
Closes cockroachdb#37378
Closes cockroachdb#37377
Closes cockroachdb#37393

Release note: None
craig bot pushed a commit that referenced this issue May 8, 2019
37401: workload: fix --splits regression introduced in #35349 r=tbg a=danhhz

workload's Table schemas are SQL schemas, but #35349 switched the
initial data to be returned as a coldata.Batch, which has a more limited
set of types. (Or, in the case of simple workloads that return a
[]interface{}, it's roundtripped through coldata.Batch by the `Tuples`
helper.) Notably, this means a SQL STRING column is represented the same
as a BYTES column (ditto UUID, etc).

This caused a regression in splits, which received some []byte data for
a column and tried to hand it to SPLIT as a SQL BYTES datum. This didn't
work for the UUID column in tpcc's history table nor the VARCHAR in
ycsb's usertable. Happily, a STRING works for both of these. It also
seems to work for BYTES columns, so it seems like the ambiguity is fine
in this case. When/if someone wants to add a workload that splits a
BYTES primary key column containing non-utf8 data, we may need to
revisit.

A more principled fix would be to get the fidelity back by parsing the
SQL schema, which in fact we do in `importccl.makeDatumFromColOffset`.
However, at the moment, this hack works and avoids the complexity and
the undesirable pkg/sql/parser dep.

Closes #37383
Closes #37382
Closes #37381
Closes #37380
Closes #37379
Closes #37378
Closes #37377
Closes #37393

Release note: None

Co-authored-by: Daniel Harrison <[email protected]>
@craig craig bot closed this as completed in #37401 May 8, 2019