roachtest: tpch_concurrency failed #79870

cockroach-teamcity · 2022-04-12T23:43:26Z

roachtest.tpch_concurrency failed with artifacts on master @ 9b80166bc5f3d3d1212c44a9e1ac219d458af9e0:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1006,test_runner.go:905: test timed out (18h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/sql-queries _{This test on roachdash | Improve this report!

Jira issue: CRDB-15885}

The text was updated successfully, but these errors were encountered:

yuzefovich · 2022-04-13T01:58:58Z

Seems like #79716 didn't fix everything, taking a look, but I don't think it's a release blocker since we reverted the original change on all branches.

yuzefovich · 2022-04-15T02:47:37Z

Unfortunately, so far I have been unable to reproduce this, neither on a roachprod cluster nor locally. For now I'll make a slight change to the test to print out the DistSQL diagrams so that at least we now the distribution of the relevant ranges which should make reproducing this easier later.

79935: streamingccl: fix nil linter check error and stream_ingestion_test error r=gh-casper a=gh-casper Previously, the stream ingestion test has no error since no duplicate event is generated. This PR fixes this and nil linter check error at the same time. Release note: None Closes: #79926 Related to #79929 which disables random_stream_client nil linter check. Will re-enable it after it's merged. 79980: roachtests: print DistSQL diagram in tpch_concurrency r=yuzefovich a=yuzefovich This will help with reproducing failures. Informs: #79870. Release note: None 80007: kv/kvprober: use single batch in write probe r=nvanbenschoten a=nvanbenschoten This commit adjusts `proberOpsImpl.Write` to issue its `Put`, `Del`, and `EndTxn` in a single `Batch`. This has a few positive effects: 1. it avoids writing intents 2. it requires a single pass through Raft 3. it avoids a pipeline stall after the Put goes through async consensus 4. it avoids async cleanup work after the txn commits It's not strictly necessary that we Put before we Del the key, because a Del is blind and leaves a tombstone even when the key is not live, but this is not guaranteed by the API, so we avoid depending on a subtlety of the implementation. 80116: ui: update statement key on ui r=maryliag a=maryliag Previously, we were displaying statement separated by the aggreation timestamp. Now that we don't show the aggregation timestamp column, we want to group all statements with same fingerprint id into a single row, no longer using the aggregation as part of a key. Aligning with the results we show for Statement Details page. Release note (ui change): Statements are not longer separated by aggregation interval on Statement Page. Now all statements with same fingerprint show as a single row. Co-authored-by: Casper <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]>

79292: randgen: add support for PARTITION BY to table generation r=rharding6373 a=rharding6373 This testing-only change adds support for `PARTITION BY LIST` on the prefix of a secondary index for randomly generated tables used for testing. We apply partitions to about ~10% of secondary indexes when there is more than one column in the index prefix, and add up to 10 partitions using random expressions. Fixes: #73920 Release note: None 80172: rowexec: use opaque keys to identify looked-up rows in lookup joins r=yuzefovich a=yuzefovich **sql: add span IDs to fetchers** This commit implements the idea of Jordan's 78223b9 to add opaque keys to fetchers. This allows the fetchers' users to identify the input `roachpb.Span` that a particular kv was fetched for. I decided to call these keys as "span IDs" since it seems less ambiguous in general, and especially so in the streamer world where too many "keys" are flying around already. Some additional improvements: - extract the return arguments of a function call into a helper struct - remove an allocation for `GetResponse`s in the non-streamer code path. Release note: None **rowexec: use opaque keys to identify looked-up rows in lookup joins** This commit refactors the span generators of the lookup joins to use the span IDs ("opaque keys") for matching between the looked-up rows and the input rows that correspond to them. Previously, we were maintaining a map from the span key (bytes value) and had to construct a partial key for each looked up row to get the matching input row indices. This commit makes it so that we now assign each span to lookup a unique span ID that is later used as an index into a two-dimensional slice of ints to get the matching indices. We still keep a map from the span key for the de-duplication purposes. This change is beneficial in several ways: - removes the need to construct the partial key for each looked up row - replaces a map lookup based on the partial key into a slice access - removes the need for the binary search among lookup spans in range-based lookup joins. This results in noticeable performance improvements in [microbenchmarks](https://gist.github.com/yuzefovich/3e5316e0a5ff32fd1b591b0583d78ced). Here is the difference on all TPCH queries (average over 20 runs of `tpchvec/perf` roachtest, negative number indicates reduction in latency): ``` Q2: 0.20s 0.20s 0.49% Q3: 1.96s 1.66s -14.86% Q4: 1.67s 1.62s -3.14% Q5: 2.82s 2.55s -9.55% Q7: 6.48s 6.93s 6.82% Q8: 1.08s 0.99s -8.34% Q9: 7.08s 6.36s -10.13% Q10: 2.16s 1.94s -9.99% Q11: 0.55s 0.51s -6.81% Q12: 8.78s 8.59s -2.13% Q16: 0.84s 0.82s -2.68% Q17: 0.49s 0.46s -6.80% Q20: 14.34s 13.43s -6.36% Q21: 9.95s 9.28s -6.76% Q22: 0.58s 0.54s -6.93% ``` (Queries that don't use lookup joins were ignored.) However, the main motivation behind this change was unblocking (or at least making it easier) to support lookup joins by the streamer. There, we want the streamer to operate with these "opaque keys" as well as take over the span de-duplication logic. Release note: None 80332: security: move logic in `security/password.go` to `security/password` r=ZhouXing19 a=ZhouXing19 This commit is to host the password logic in a separate package, `security/password/`. Release note: None 80416: evalctx: remove kv.DB from EvalContext r=RichardJCai a=RichardJCai Release note: None 80505: ccl/sqlproxyccl: update load balancing algorithm to use leastconns r=JeffSwenson a=jaylim-crl This commit replaces the existing load balancing algorithm within sqlproxy (i.e. weighted CPU load) with a leastconns approach as discussed. The idea behind this replacement is that CPU metrics are often stale, and we should be able to reasonably make routing decisions based on local data. Release note: None 80562: roachtest: fix a typo in a cluster setting name in tpch_concurrency r=yuzefovich a=yuzefovich Addresses: #79870 (comment). Release note: None 80591: rangefeed,storage: use time-bound iterator for with-diff catchup scan r=stevendanna,miretskiy a=sumeerbhola The MVCCIncrementalIterator already has support for selectively seeing versions beyond the time bounds, via the NextIgnoringTime method. This method is sufficient for with-diff catchup scans -- the changes to MVCCIncrementalIterator here are limited to comment improvements. To utilize NextIgnoringTime in the CatchUpIterator we need to distinguish between the reasons that a catchup scan wants to iterate to an older version of a key. These are now handled by different methods via the simpleCatchupIter interface. There is a simpleCatchupIterAdapter to provide a trivial implementation when the underlying iterator is a SimpleMVCCIterator. There is more testing of MVCCIncrementalIterator's NextIgnoringTime, and there is now a TestCatchupScan that tests with and without diff. BenchmarkCatchUpScan now also benchmarks with-diff. Results for linear-keys, where time-bound iteration shows an improvement: ``` BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=true/perc=0.00-16 3 397238811 ns/op 253609173 B/op 3006262 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=true/perc=50.00-16 6 192314659 ns/op 126847284 B/op 1503167 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=true/perc=75.00-16 12 93243340 ns/op 63433929 B/op 751606 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=true/perc=95.00-16 66 17636059 ns/op 12691501 B/op 150356 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=true/perc=99.00-16 296 3566355 ns/op 2560979 B/op 30126 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=false/perc=0.00-16 2 587807376 ns/op 253689672 B/op 3006770 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=false/perc=50.00-16 4 286110763 ns/op 126893106 B/op 1503413 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=false/perc=75.00-16 8 140012384 ns/op 63458643 B/op 751740 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=false/perc=95.00-16 43 26210520 ns/op 12695466 B/op 150380 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=true/withDiff=false/perc=99.00-16 200 5381206 ns/op 2561444 B/op 30133 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=true/perc=0.00-16 3 379387200 ns/op 253577896 B/op 3006228 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=true/perc=50.00-16 4 294533788 ns/op 130799366 B/op 1503677 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=true/perc=75.00-16 4 254943713 ns/op 69418826 B/op 752487 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=true/perc=95.00-16 5 220269848 ns/op 20291392 B/op 151411 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=true/perc=99.00-16 5 211813333 ns/op 10473760 B/op 31231 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=false/perc=0.00-16 3 379575618 ns/op 253591349 B/op 3006275 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=false/perc=50.00-16 4 282490494 ns/op 126852270 B/op 1503388 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=false/perc=75.00-16 5 241938049 ns/op 63463176 B/op 751985 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=false/perc=95.00-16 5 210848534 ns/op 12741640 B/op 150944 allocs/op BenchmarkCatchUpScan/linear-keys/useTBI=false/withDiff=false/perc=99.00-16 6 196175587 ns/op 2602912 B/op 30652 allocs/op ``` Fixes #78974 Release note: None 80650: testutils/buildutil: add disallowed_prefixes to disallowed_imports_test r=ajwerner a=ajwerner This is useful if you want to make sure that some part of the tree doesn't depend (transitively) on any other part of the tree. Release note: None 80656: bazel: in `bazel-generate.sh`, prune obviously irrelevant files r=irfansharif a=rickystewart Here we adapt a trick from the `Makefile`: we can use the `-prune` argument to `find` to filter out obviously irrelevant files from `node_modules` when traversing `pkg`. On my machine this saves about 3 seconds on no-op `dev generate bazel` invocations. Release note: None Co-authored-by: rharding6373 <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Jane Xing <[email protected]> Co-authored-by: richardjcai <[email protected]> Co-authored-by: Jay <[email protected]> Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>

cockroach-teamcity · 2022-04-30T13:35:07Z

roachtest.tpch_concurrency failed with artifacts on master @ a2e1910f51593bd2ef72e1d7c615e08f95791186:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	tpch_concurrency.go:114,monitor.go:105,errgroup.go:57: dial tcp 34.148.161.250:26257: connect: connection refused

	cluster.go:1855,tpch_concurrency.go:68,tpch_concurrency.go:79,tpch_concurrency.go:199,tpch_concurrency.go:224,test_runner.go:876: cluster.StartE: context canceled
		(1) attached stack trace
		  -- stack trace:
		  | main.(*clusterImpl).StartE
		  | 	main/pkg/cmd/roachtest/cluster.go:1787
		  | main.(*clusterImpl).Start
		  | 	main/pkg/cmd/roachtest/cluster.go:1854
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func2
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:68
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func3
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:79
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func4
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:199
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func5
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:224
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:876
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1581
		Wraps: (2) cluster.StartE
		Wraps: (3) context canceled
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-01T12:49:01Z

roachtest.tpch_concurrency failed with artifacts on master @ c9e0194b19a03d55c6be92572aad3fbafc256334:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	tpch_concurrency.go:114,monitor.go:105,errgroup.go:57: dial tcp 34.74.127.69:26257: connect: connection refused

	cluster.go:1855,tpch_concurrency.go:68,tpch_concurrency.go:79,tpch_concurrency.go:199,tpch_concurrency.go:224,test_runner.go:876: cluster.StartE: context canceled
		(1) attached stack trace
		  -- stack trace:
		  | main.(*clusterImpl).StartE
		  | 	main/pkg/cmd/roachtest/cluster.go:1787
		  | main.(*clusterImpl).Start
		  | 	main/pkg/cmd/roachtest/cluster.go:1854
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func2
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:68
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func3
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:79
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func4
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:199
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func5
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:224
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:876
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1581
		Wraps: (2) cluster.StartE
		Wraps: (3) context canceled
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-02T17:41:25Z

roachtest.tpch_concurrency failed with artifacts on master @ 7a9eb906ce86e2f75db637e29d46cd6604fca7b4:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1007,test_runner.go:906: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

yuzefovich · 2022-05-03T02:57:56Z

Hm, the timeout seems to be just Q15 taking really long time. Q15 creates and drops a view, so it could be schema-related.

Here is a condensed timeline:

concurrency: 104, all Q15 took about 12 minutes, crash on Q18
concurrency: 76, all Q15 took about 11 minutes, crash on Q20
concurrency: 62, all Q15 took about 12 minutes, success
concurrency: 69, all Q15 took about 27 minutes, success
concurrency: 72, all Q15 took about 65 minutes, crash on Q20
concurrency: 70, all Q15 took about 76 minutes, success
concurrency: 71, the test timed out on Q15 after 17 minutes.

Note that a "crash" means that at least one node in the cluster crashed (probably due to an OOM) and a "success" means that all queries ran and no nodes crashed. The cluster is restarted for each concurrency run.

80967: sql: disable streamer for lookup joins for now r=yuzefovich a=yuzefovich There is some fallout in the roachtests from using the streamer for the lookup join without ordering, so let's disable it for now while we're figuring things out. Informs: #80823. Informs: #79870 (comment). Release note: None Co-authored-by: Yahor Yuzefovich <[email protected]>

cockroach-teamcity · 2022-05-21T10:26:38Z

roachtest.tpch_concurrency failed with artifacts on master @ 7dd1c51f6b5802e32bafd82e46747f349836592f:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	tpch_concurrency.go:114,monitor.go:105,errgroup.go:57: dial tcp 35.231.212.140:26257: connect: connection refused

	cluster.go:1840,tpch_concurrency.go:68,tpch_concurrency.go:79,tpch_concurrency.go:199,tpch_concurrency.go:224,test_runner.go:884: cluster.StartE: context canceled
		(1) attached stack trace
		  -- stack trace:
		  | main.(*clusterImpl).StartE
		  | 	main/pkg/cmd/roachtest/cluster.go:1772
		  | main.(*clusterImpl).Start
		  | 	main/pkg/cmd/roachtest/cluster.go:1839
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func2
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:68
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func3
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:79
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func4
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:199
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCHConcurrency.func5
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpch_concurrency.go:224
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:884
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1581
		Wraps: (2) cluster.StartE
		Wraps: (3) context canceled
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-27T17:46:31Z

roachtest.tpch_concurrency failed with artifacts on master @ d9fe5f67c75b1b59fc297bf4509a139c640b835b:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1015,test_runner.go:914: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-28T17:59:13Z

roachtest.tpch_concurrency failed with artifacts on master @ 1e2cc61b58dc14386bb68dca59814874648931c2:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1015,test_runner.go:914: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-29T17:58:19Z

roachtest.tpch_concurrency failed with artifacts on master @ e6815947a050e32f21e983aa30dc74ab2a247af3:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1015,test_runner.go:914: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-30T17:55:51Z

roachtest.tpch_concurrency failed with artifacts on master @ e6815947a050e32f21e983aa30dc74ab2a247af3:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1015,test_runner.go:914: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-05-31T17:57:28Z

roachtest.tpch_concurrency failed with artifacts on master @ 662cc5c3070e6d64d155a9cc9f33253ee5d99ee9:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1015,test_runner.go:914: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

cockroach-teamcity · 2022-06-01T17:50:12Z

roachtest.tpch_concurrency failed with artifacts on master @ 1cea73c8a18623949b81705eb5f75179e6cd8d86:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /artifacts/tpch_concurrency/run_1
	test_runner.go:1015,test_runner.go:914: test timed out (12h0m0s)

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

roachtest: tpch_concurrency failed #80290 roachtest: tpch_concurrency failed [C-test-failure O-roachtest O-robot T-sql-queries branch-release-22.1]

_{This test on roachdash | Improve this report!}

yuzefovich · 2022-06-02T01:56:37Z

I think I have explanation for these timeouts. They are due to automatic retries that are encountered when executing Q15. That query consists of three statements: creating a view, a SELECT stmt, dropping that view. On master we have enable_implicit_transaction_for_batch_statements session variable set to true, so all three statements are executed as part of a single txn (on 22.1 branch the default is false, so we're not hitting the timeouts). All concurrent txns are contending with each other (on creation and dropping of the view), and usually one connection is making progress while all others "blocked". With the streamer enabled it seems that the performance of creating the view and the SELECT query is worse, so each connection takes longer to process the query, meaning that the "queue" of connections is stalled for longer periods of time. I think the fix is to disable the implicit txn for multi-statement queries since it's not really the goal of this test to exercise DDL with high concurrency.

ajwerner · 2022-06-02T03:15:21Z

The TCP-H spec seems to indicate that the views should be assigned a StreamID suffix so that different threads don't interfere. i don't think it'd be a huge refactor to give each worker its own copy of the queries with a StreamID templated in. That'd also fix it and allow us to get rid of the IF NOT EXISTS clauses we've added.

yuzefovich · 2022-06-02T03:18:37Z

Indeed, nice find Andrew, thanks! I'll implement that.

blathers-crl bot added the T-sql-queries SQL Queries Team label Apr 12, 2022

yuzefovich self-assigned this Apr 13, 2022

yuzefovich removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Apr 13, 2022

yuzefovich mentioned this issue Apr 15, 2022

roachtests: print DistSQL diagram in tpch_concurrency #79980

Merged

cockroach-teamcity mentioned this issue Apr 20, 2022

roachtest: tpch_concurrency failed #80290

Closed

blathers-crl bot mentioned this issue Apr 22, 2022

release-22.1: roachtests: print DistSQL diagram in tpch_concurrency #80394

Merged

This comment was marked as resolved.

Sign in to view

yuzefovich mentioned this issue Apr 26, 2022

roachtest: fix a typo in a cluster setting name in tpch_concurrency #80562

Merged

This comment was marked as resolved.

Sign in to view

This comment was marked as duplicate.

Sign in to view

yuzefovich mentioned this issue May 4, 2022

sql: disable streamer for lookup joins for now #80967

Merged

This comment was marked as duplicate.

Sign in to view

jlinder added sync-me and removed sync-me labels May 20, 2022

yuzefovich mentioned this issue Jun 2, 2022

roachtest: disable implicit txns in tpch_concurrency #82326

Closed

yuzefovich mentioned this issue Jun 2, 2022

workload/tpch: augment Q15 with streamID for concurrency #82330

Merged

craig bot closed this as completed in #82330 Jun 2, 2022

craig bot closed this as completed in daf3b48 Jun 2, 2022

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Done in SQL Queries Jul 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

roachtest: tpch_concurrency failed #79870

roachtest: tpch_concurrency failed #79870

cockroach-teamcity commented Apr 12, 2022 •

edited by cockroach-jira-scripts

Loading

yuzefovich commented Apr 13, 2022

yuzefovich commented Apr 15, 2022

This comment was marked as resolved.

This comment was marked as resolved.

cockroach-teamcity commented Apr 30, 2022

cockroach-teamcity commented May 1, 2022

cockroach-teamcity commented May 2, 2022

yuzefovich commented May 3, 2022

This comment was marked as duplicate.

This comment was marked as duplicate.

cockroach-teamcity commented May 21, 2022

cockroach-teamcity commented May 27, 2022

cockroach-teamcity commented May 28, 2022

cockroach-teamcity commented May 29, 2022

cockroach-teamcity commented May 30, 2022

cockroach-teamcity commented May 31, 2022

cockroach-teamcity commented Jun 1, 2022

yuzefovich commented Jun 2, 2022

ajwerner commented Jun 2, 2022

yuzefovich commented Jun 2, 2022

roachtest: tpch_concurrency failed #79870

roachtest: tpch_concurrency failed #79870

Comments

cockroach-teamcity commented Apr 12, 2022 • edited by cockroach-jira-scripts Loading

yuzefovich commented Apr 13, 2022

yuzefovich commented Apr 15, 2022

This comment was marked as resolved.

This comment was marked as resolved.

cockroach-teamcity commented Apr 30, 2022

cockroach-teamcity commented May 1, 2022

cockroach-teamcity commented May 2, 2022

yuzefovich commented May 3, 2022

This comment was marked as duplicate.

This comment was marked as duplicate.

cockroach-teamcity commented May 21, 2022

cockroach-teamcity commented May 27, 2022

cockroach-teamcity commented May 28, 2022

cockroach-teamcity commented May 29, 2022

cockroach-teamcity commented May 30, 2022

cockroach-teamcity commented May 31, 2022

cockroach-teamcity commented Jun 1, 2022

yuzefovich commented Jun 2, 2022

ajwerner commented Jun 2, 2022

yuzefovich commented Jun 2, 2022

cockroach-teamcity commented Apr 12, 2022 •

edited by cockroach-jira-scripts

Loading