sql/opt/exec/execbuilder: TestExecBuild failed #72802

Closed
cockroach-teamcity opened this issue Nov 16, 2021 · 11 comments · Fixed by #72967
Labels: branch-master (Failures and bugs on the master branch), C-test-failure (Broken test, automatically or manually discovered), O-robot (Originated from a bot), T-sql-queries (SQL Queries Team)

Comments

@cockroach-teamcity
Member

sql/opt/exec/execbuilder.TestExecBuild failed with artifacts on master @ 6e288820935624d0839406a0a8bbacaff8fe960d:

=== RUN   TestExecBuild
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild770660939
    test_log_scope.go:80: use -show-logs to present logs inline
=== CONT  TestExecBuild
    logic.go:3514: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild770660939
--- FAIL: TestExecBuild (99.51s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2441: let $rangeid = 44
--- done: testdata/schema_change_in_txn_nonmetamorphic with config local: 9 tests, 0 failures
--- total progress: 400 statements/queries
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2400: 
         
        testdata/show_trace_nonmetamorphic:366: SELECT operation, message FROM [SHOW KV TRACE FOR SESSION]
        WHERE message     LIKE '%r44: sending batch%'
          AND message NOT LIKE '%PushTxn%'
          AND message NOT LIKE '%QueryTxn%'
        expected:
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 1 EndTxn to (n1,s1):1
        but found (query options: "") :
            dist sender send  r44: sending batch 24 CPut, 1 EndTxn to (n1,s1):1
--- done: testdata/show_trace_nonmetamorphic with config local: 48 tests, 1 failures
            --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup (4.05s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic
=== PAUSE TestExecBuild/local/show_trace_nonmetamorphic
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic
        --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic (12.99s)
=== RUN   TestExecBuild/local
    --- FAIL: TestExecBuild/local (0.02s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

Parameters in this failure:

  • GOFLAGS=-json

/cc @cockroachdb/sql-queries


@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Nov 16, 2021
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Nov 16, 2021
@cockroach-teamcity
Member Author

sql/opt/exec/execbuilder.TestExecBuild failed with artifacts on master @ 52a7a6f10a3ed02d7d4d0b7dc09406c86ba937dd:

=== RUN   TestExecBuild
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild1715850967
    test_log_scope.go:80: use -show-logs to present logs inline
=== CONT  TestExecBuild
    logic.go:3514: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild1715850967
--- FAIL: TestExecBuild (92.80s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2441: let $rangeid = 44
--- done: testdata/update_from with config local: 11 tests, 0 failures
--- total progress: 175 statements/queries
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2400: 
         
        testdata/show_trace_nonmetamorphic:366: SELECT operation, message FROM [SHOW KV TRACE FOR SESSION]
        WHERE message     LIKE '%r44: sending batch%'
          AND message NOT LIKE '%PushTxn%'
          AND message NOT LIKE '%QueryTxn%'
        expected:
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 1 EndTxn to (n1,s1):1
        but found (query options: "") :
            dist sender send  r44: sending batch 10 CPut to (n1,s1):1
            dist sender send  r44: sending batch 10 CPut to (n1,s1):1
            dist sender send  r44: sending batch 4 CPut, 1 EndTxn to (n1,s1):1
--- done: testdata/show_trace_nonmetamorphic with config local: 48 tests, 1 failures
--- done: testdata/window with config local: 22 tests, 0 failures
            --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup (1.10s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic
=== PAUSE TestExecBuild/local/show_trace_nonmetamorphic
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic
        --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic (3.61s)
=== RUN   TestExecBuild/local
    --- FAIL: TestExecBuild/local (0.00s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

Parameters in this failure:

  • GOFLAGS=-parallel=4


@cucaroach cucaroach self-assigned this Nov 17, 2021
@cockroach-teamcity
Member Author

sql/opt/exec/execbuilder.TestExecBuild failed with artifacts on master @ 1a078671083946342ecf610a6fb899df2d1783d3:

=== RUN   TestExecBuild
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild445443747
    test_log_scope.go:80: use -show-logs to present logs inline
=== CONT  TestExecBuild
    logic.go:3514: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild445443747
--- FAIL: TestExecBuild (79.08s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2441: let $rangeid = 44
--- done: testdata/window with config local: 22 tests, 0 failures
--- done: testdata/virtual with config local: 10 tests, 0 failures
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2400: 
         
        testdata/show_trace_nonmetamorphic:366: SELECT operation, message FROM [SHOW KV TRACE FOR SESSION]
        WHERE message     LIKE '%r44: sending batch%'
          AND message NOT LIKE '%PushTxn%'
          AND message NOT LIKE '%QueryTxn%'
        expected:
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 1 EndTxn to (n1,s1):1
        but found (query options: "") :
            dist sender send  r44: sending batch 24 CPut, 1 EndTxn to (n1,s1):1
--- done: testdata/show_trace_nonmetamorphic with config local: 48 tests, 1 failures
--- total progress: 319 statements/queries
            --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup (1.58s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic
=== PAUSE TestExecBuild/local/show_trace_nonmetamorphic
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic
        --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic (5.46s)
=== RUN   TestExecBuild/local
    --- FAIL: TestExecBuild/local (0.00s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

Parameters in this failure:

  • GOFLAGS=-parallel=4


@cockroach-teamcity
Member Author

sql/opt/exec/execbuilder.TestExecBuild failed with artifacts on master @ 0b9208b8e293a475d5a4c940b493c8414dfa95ea:

=== RUN   TestExecBuild
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild3407763170
    test_log_scope.go:80: use -show-logs to present logs inline
=== CONT  TestExecBuild
    logic.go:3514: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild3407763170
--- FAIL: TestExecBuild (78.20s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2441: let $rangeid = 44
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2400: 
         
        testdata/show_trace_nonmetamorphic:366: SELECT operation, message FROM [SHOW KV TRACE FOR SESSION]
        WHERE message     LIKE '%r44: sending batch%'
          AND message NOT LIKE '%PushTxn%'
          AND message NOT LIKE '%QueryTxn%'
        expected:
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 1 EndTxn to (n1,s1):1
        but found (query options: "") :
            dist sender send  r44: sending batch 11 CPut to (n1,s1):1
            dist sender send  r44: sending batch 11 CPut to (n1,s1):1
            dist sender send  r44: sending batch 2 CPut, 1 EndTxn to (n1,s1):1
--- done: testdata/show_trace_nonmetamorphic with config local: 48 tests, 1 failures
--- progress: testdata/partial_index_nonmetamorphic: 38 statements/queries
--- done: testdata/inverted_join_multi_column with config local: 7 tests, 0 failures
            --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup (3.15s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic
=== PAUSE TestExecBuild/local/show_trace_nonmetamorphic
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic
        --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic (12.02s)
=== RUN   TestExecBuild/local
    --- FAIL: TestExecBuild/local (0.00s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

Parameters in this failure:

  • GOFLAGS=-json


@cucaroach
Contributor

This has happened before: #70108

@cucaroach
Contributor

Based on the variation across the failures, it seems most likely that we're running these tests in metamorphic mode even though the test specifies !metamorphic.

@cucaroach
Contributor

As near as I can tell, this was the command from TeamCity:

GOFLAGS= go test -json  -mod=vendor -tags ' gss make x86_64_pc_linux_gnu crdb_test' -ldflags '-X github.com/cockroachdb/cockroach/pkg/build.typ=development -extldflags "" -X "github.com/cockroachdb/cockroach/pkg/build.tag=v22.1.0-alpha.00000000-1266-g6e28882093" -X "github.com/cockroachdb/cockroach/pkg/build.rev=6e288820935624d0839406a0a8bbacaff8fe960d" -X "github.com/cockroachdb/cockroach/pkg/build.cgoTargetTriple=x86_64-pc-linux-gnu"  ' -run "."  -timeout 45m ./pkg/... -v

That should engage metamorphic mode, so the show_trace_nonmetamorphic test should be skipped, I think.

@cucaroach
Contributor

Okay, the max batch size metamorphism should be hijacked by forceProductionBatchSizes, which should be true for these tests, so again I don't see why the batch size limit is seemingly being randomized.

@cockroach-teamcity
Member Author

sql/opt/exec/execbuilder.TestExecBuild failed with artifacts on master @ 28bb1ea049da5bfb6e15a7003cd7b678cbc4b67f:

=== RUN   TestExecBuild
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild2951121496
    test_log_scope.go:80: use -show-logs to present logs inline
=== CONT  TestExecBuild
    logic.go:3514: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestExecBuild2951121496
--- FAIL: TestExecBuild (75.76s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2441: let $rangeid = 44
--- done: testdata/select_index with config local: 170 tests, 0 failures
--- progress: testdata/secondary_index_column_families_nonmetamorphic: 34 statements/queries
--- done: testdata/values with config local: 5 tests, 0 failures
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup
    logic.go:2400: 
         
        testdata/show_trace_nonmetamorphic:366: SELECT operation, message FROM [SHOW KV TRACE FOR SESSION]
        WHERE message     LIKE '%r44: sending batch%'
          AND message NOT LIKE '%PushTxn%'
          AND message NOT LIKE '%QueryTxn%'
        expected:
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 6 CPut to (n1,s1):1
            dist sender send  r44: sending batch 1 EndTxn to (n1,s1):1
        but found (query options: "") :
            dist sender send  r44: sending batch 24 CPut, 1 EndTxn to (n1,s1):1
--- done: testdata/show_trace_nonmetamorphic with config local: 48 tests, 1 failures
--- done: testdata/topk with config local: 7 tests, 0 failures
--- progress: testdata/partial_index_nonmetamorphic: 94 statements/queries
--- progress: testdata/upsert: 35 statements/queries
--- done: testdata/update with config local: 52 tests, 0 failures
            --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic/system_table_lookup (4.07s)
=== RUN   TestExecBuild/local/show_trace_nonmetamorphic
=== PAUSE TestExecBuild/local/show_trace_nonmetamorphic
=== CONT  TestExecBuild/local/show_trace_nonmetamorphic
        --- FAIL: TestExecBuild/local/show_trace_nonmetamorphic (14.33s)
=== RUN   TestExecBuild/local
    --- FAIL: TestExecBuild/local (0.01s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

Parameters in this failure:

  • GOFLAGS=-json


cucaroach added a commit to cucaroach/cockroach that referenced this issue Nov 19, 2021
Previously, because of Parallel usage, we would be writing
serverArgs.forceProductionBatchSizes from multiple goroutines. Avoid
that with a separate copy for the nonMetamorphic tests.

Fixes cockroachdb#72802

Release note: None
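
The idea in this commit message (give each parallel test its own copy of the arguments instead of writing to a shared struct) can be sketched as follows. This is an illustration under assumptions, not the code from #72967: `testServerArgs`, `runAll`, and the file-name suffix check are hypothetical, and only the `forceProductionBatchSizes` field name comes from the message.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// testServerArgs is a hypothetical stand-in for the logic test's server
// arguments; only the field named in the commit message is modeled.
type testServerArgs struct {
	forceProductionBatchSizes bool
}

// runAll simulates parallel subtests. Each goroutine receives base by value,
// so it works on a private copy: setting the flag for a nonmetamorphic file
// never writes to memory that another goroutine reads.
func runAll(base testServerArgs, files []string) map[string]bool {
	var mu sync.Mutex
	seen := make(map[string]bool, len(files))
	var wg sync.WaitGroup
	for _, f := range files {
		wg.Add(1)
		go func(name string, args testServerArgs) { // args is a copy
			defer wg.Done()
			if strings.HasSuffix(name, "_nonmetamorphic") {
				args.forceProductionBatchSizes = true
			}
			mu.Lock() // the result map is shared, so guard it
			seen[name] = args.forceProductionBatchSizes
			mu.Unlock()
		}(f, base)
	}
	wg.Wait()
	return seen
}

func main() {
	got := runAll(testServerArgs{}, []string{"show_trace_nonmetamorphic", "window", "update"})
	fmt.Println(got["show_trace_nonmetamorphic"], got["window"], got["update"])
}
```

If the goroutines instead wrote to a single shared `testServerArgs`, a test could observe the flag as set by a different test, and running with `go test -race` would typically report the conflicting writes.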
craig bot pushed a commit that referenced this issue Nov 19, 2021
72957: roachprod: introduce Node type r=RaduBerinde a=RaduBerinde

There is a lot of confusion in the code around node "indexes".
Sometimes these are a (0-based) index in the target list of nodes,
sometimes they are a (1-based) node ID. This makes the code harder to
follow and more error-prone (in fact, there are a couple of places
where we convert from one to the other incorrectly).

This change introduces the types `Node` and `Nodes` and switches code
to use `Node` (rather than the 0-based index) whenever possible. This
enlists the compiler's help, as it is no longer legal to implicitly
convert from one to the other.

Release note: None

72967: logic: fix racy forceProductionBatchSizes bool r=cucaroach a=cucaroach

Previously, because of Parallel usage, we would be writing
serverArgs.forceProductionBatchSizes from multiple goroutines. Avoid
that with a separate copy for the nonMetamorphic tests.

Fixes #72802

Release note: None



72994: bazel: have bazelisk pull bazel binaries from our fork r=rail a=rickystewart

This will result in Bazel binaries being pulled from
https://github.com/cockroachdb/bazel/releases/tag/4.2.1 rather than
upstream.

Release note: None

Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@craig craig bot closed this as completed in 6ca7fe5 Nov 19, 2021
cucaroach added a commit to cucaroach/cockroach that referenced this issue Jan 21, 2022
Temporary fix for cockroachdb#72802

Due to cockroachdb#73876 these tests have become flakey. With the
disable-span-configs option 70 runs of make stress on the opt logic
tests pass.

Release note: None
@cucaroach
Contributor

cucaroach commented Jan 21, 2022

I didn't mean for this to be closed. It should only be closed after #75282 is resolved.

@cucaroach cucaroach reopened this Jan 21, 2022
craig bot pushed a commit that referenced this issue Jan 21, 2022
75261: row: fetcher cleanup and improvements r=RaduBerinde a=RaduBerinde

#### rowenc: remove deprecated return from DecodeIndexKey

Release note: None

#### row: minor cleanup around foundNull

This moves around some code to make it clear that it's only relevant in
a specific case.

Release note: None

#### row: clean up ReadIndexKey

Renaming to DecodeIndexKey and removing return value which is no
longer useful.

Release note: None

#### row: more Fetcher cleanup

 - improve comment for `indexKey`;
 - unexport NextKey;
 - slightly change the return value of nextKey to simplify the logic
   (the semantic difference is what the first call returns, which is
   not used);
 - use numKeysPerRow instead of counting the total families; this
   enables the faster paths for more cases.

Release note: None


75272: kvserver: de-flake TestStoreSplitRangeLookupRace r=irfansharif a=irfansharif

Fixes #75198. This test was a bit brittle in expecting only one kind of
range lookup request in a testing filter -- it was always possible to
intercept a ReverseScanRequest, and after enabling span configs
(#73876), we now have an internal query ("validate-span-cfgs") that
makes use of it. See #75198 for more details.

Release note: None

75281: sql: disable span-config on flakey 5node tests r=cucaroach a=cucaroach

Temporary fix for #72802 and 5node/distsql_enum CI failures.

Due to #73876 these tests have become flakey.
With the disable-span-configs option 70 runs of make stress on the opt logic tests pass.

Release note: None


Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
gtr pushed a commit to gtr/cockroach that referenced this issue Jan 24, 2022
Temporary fix for cockroachdb#72802

Due to cockroachdb#73876 these tests have become flakey. With the
disable-span-configs option 70 runs of make stress on the opt logic
tests pass.

Release note: None
@irfansharif
Contributor

@cucaroach BTW, I think you meant to reference #74933 earlier. The failure in this issue looks like something else (though still TestExecBuild), and it predates #73876.

@cucaroach
Contributor

Yes, I got my signals crossed. This issue was fixed by #72967.
