roachtest: scaledata/filesystem_simulator/nodes=3 failed #50687
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@8f768ad14cfb3f514db6d40465b2dd60ee1f2890:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@c627e3490d30e8ba88f6c7136717a392a054da4e:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@17c8048e80935f8a01477416980d18bf39cba1bb:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@3a03f3843a8cdf04f82c52753c61cf01b0d2ddcd:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@456a07cfc1e53b87abc7709052e54efb1450e758:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@3e0de239121813ea4d47873388a2828a66d9edf7:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@9304ecd70e9f3ba4cb16b5443a10b4e17d7baee0:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
@irfansharif would you mind taking a look at whether this is the same issue as #50175, but on |
Hm, these don't look like benign setup errors:
|
Not the same issue, but looks like failure mode was introduced ~11 days ago. I'll try repro-ing now (and bisecting if not immediately obvious). |
Immediately reproducible. As a note to my future self to clean up how scaledata tests are run, here's how to run them locally.
# In your cockroachdb/rksql checkout
cd $GOPATH/src/github.com/cockroachdb/rksql
src/go/BUILD.py
cp src/go/bin/filesystem_simulator $GOPATH/src/github.com/cockroachdb/cockroach
cd $GOPATH/src/github.com/cockroachdb/cockroach
make bin/roachprod; make bin/roachtest
roachprod wipe local; roachprod destroy local
bin/roachtest run scaledata/filesystem_simulator/nodes=3 --wipe=false --cockroach ./cockroach --roachprod bin/roachprod --local
With the following diff applied to your cockroachdb checkout:
diff --git i/pkg/cmd/roachtest/scaledata.go w/pkg/cmd/roachtest/scaledata.go
index 9fd5d9abf9..4d6a27b11c 100644
--- i/pkg/cmd/roachtest/scaledata.go
+++ w/pkg/cmd/roachtest/scaledata.go
@@ -13,11 +13,8 @@ package main
import (
"context"
"fmt"
- "runtime"
"strings"
"time"
-
- "github.com/cockroachdb/cockroach/pkg/util/binfetcher"
)
func registerScaleData(r *testRegistry) {
@@ -58,21 +55,8 @@ func runSqlapp(ctx context.Context, t *test, c *cluster, app, flags string, dur
roachNodes := c.Range(1, roachNodeCount)
appNode := c.Node(c.spec.NodeCount)
- if local && runtime.GOOS != "linux" {
- t.Fatalf("must run on linux os, found %s", runtime.GOOS)
- }
- b, err := binfetcher.Download(ctx, binfetcher.Options{
- Component: "rubrik",
- Binary: app,
- Version: "LATEST",
- GOOS: "linux",
- GOARCH: "amd64",
- })
- if err != nil {
- t.Fatal(err)
- }
-
- c.Put(ctx, b, app, appNode)
+ // Expects to find the named binary in repo root.
+ c.Put(ctx, app, app, appNode)
c.Put(ctx, cockroach, "./cockroach", roachNodes)
c.Start(ctx, t, roachNodes) |
The bisect script just terminated, definitively pointing to @asubiotto, mind taking a look? I'm not sure about much of the area touched in #50388, or why it would cause this failure.
To understand what this file simulator test is doing, take a look at https://github.com/cockroachdb/rksql/blob/master/src/go/src/rubrik/sqlapp/filesystem_simulator/main.go
After building the right
make buildshort
roachprod wipe local; roachprod destroy local
bin/roachtest run scaledata/filesystem_simulator/nodes=3 --wipe=false --cockroach ./cockroach --roachprod bin/roachprod --local |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@e3fb5aa18d0f5064f7ba5d4df3864e94b3abb96d:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
I'll take a look |
Looks like the test passes with the following diff:
diff --git a/pkg/sql/colexec/materializer.go b/pkg/sql/colexec/materializer.go
index c2a8c701d9..d13fc157cd 100644
--- a/pkg/sql/colexec/materializer.go
+++ b/pkg/sql/colexec/materializer.go
@@ -168,10 +168,21 @@ func NewMaterializer(
output,
nil, /* memMonitor */
execinfra.ProcStateOpts{
- InputsToDrain: []execinfra.RowSource{m.drainHelper},
+ //InputsToDrain: []execinfra.RowSource{m.drainHelper},
TrailingMetaCallback: func(ctx context.Context) []execinfrapb.ProducerMetadata {
- m.InternalClose()
- return nil
+ var resultMeta []execinfrapb.ProducerMetadata
+ for {
+ row, meta := m.drainHelper.Next()
+ if meta != nil {
+ resultMeta = append(resultMeta, *meta)
+ }
+ if row == nil && meta == nil {
+ break
+ }
+ }
+ defer m.InternalClose()
+ return resultMeta
},
},
); err != nil {
The difference is that this patch doesn't swallow |
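The loop in the patch above follows a common drain pattern: keep calling `Next` until both the row and the metadata come back nil, and keep every piece of metadata seen along the way. Here is a minimal standalone sketch of that pattern. The names `rowSource`, `sliceSource` and `drainAll` are hypothetical stand-ins for illustration, not the real `execinfra` types or API.

```go
package main

import "fmt"

// row and metadata are hypothetical stand-ins for the real row and
// ProducerMetadata types; only their nil-ness matters to the loop.
type row []string

type metadata struct{ msg string }

// rowSource is an assumed, simplified interface: Next returns a row, a piece
// of metadata, or (nil, nil) once the source is exhausted.
type rowSource interface {
	Next() (row, *metadata)
}

// sliceSource replays a fixed sequence of rows, then metadata.
type sliceSource struct {
	rows  []row
	metas []*metadata
}

func (s *sliceSource) Next() (row, *metadata) {
	if len(s.rows) > 0 {
		r := s.rows[0]
		s.rows = s.rows[1:]
		return r, nil
	}
	if len(s.metas) > 0 {
		m := s.metas[0]
		s.metas = s.metas[1:]
		return nil, m
	}
	return nil, nil
}

// drainAll discards any remaining rows and collects every piece of metadata,
// stopping only when the source reports (nil, nil).
func drainAll(src rowSource) []metadata {
	var out []metadata
	for {
		r, m := src.Next()
		if m != nil {
			out = append(out, *m)
		}
		if r == nil && m == nil {
			return out
		}
	}
}

func main() {
	src := &sliceSource{
		rows:  []row{{"a"}, {"b"}},
		metas: []*metadata{&metadata{msg: "some error"}},
	}
	fmt.Println(drainAll(src))
}
```

The point of collecting into a slice and returning it, rather than returning nil, is that whatever the source reports while draining still reaches the caller.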
I think the culprit is the vectorized inbox (looking at the plans of the queries dropping this error). If it encounters any metadata during execution, it buffers it and returns it later. This means that the flow first transitions to draining and only then observes the error, which is swallowed because it's a |
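The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `event`, `source`, `run` and the `eager` flag are hypothetical names, not CockroachDB types. The comment above is cut off before naming the kind of error, so the model simply drops every error that first shows up while draining.

```go
package main

import (
	"errors"
	"fmt"
)

// event is a hypothetical stream element: a row, or metadata carrying an error.
type event struct {
	row int
	err error // non-nil means this event is error metadata, not a row
}

// source replays events. With eager=false it mimics the buffering behaviour
// described above: errors are stashed and only handed back on drain.
type source struct {
	events   []event
	buffered []error
	eager    bool
}

// next returns the next row; ok is false once the stream is exhausted.
func (s *source) next() (row int, ok bool, err error) {
	for len(s.events) > 0 {
		ev := s.events[0]
		s.events = s.events[1:]
		if ev.err == nil {
			return ev.row, true, nil
		}
		if s.eager {
			return 0, false, ev.err // surface the error immediately
		}
		s.buffered = append(s.buffered, ev.err) // defer it until drain
	}
	return 0, false, nil
}

// drain hands back whatever errors were buffered during execution.
func (s *source) drain() []error {
	out := s.buffered
	s.buffered = nil
	return out
}

// run fails on errors seen while running but, as a deliberate simplification,
// ignores errors that only appear once it has moved to draining.
func run(s *source) ([]int, error) {
	var rows []int
	for {
		r, ok, err := s.next()
		if err != nil {
			return rows, err
		}
		if !ok {
			break
		}
		rows = append(rows, r)
	}
	for range s.drain() {
		// Draining state: the error arrives too late and is ignored.
	}
	return rows, nil
}

func newSource(eager bool) *source {
	return &source{
		eager:  eager,
		events: []event{{row: 1}, {err: errors.New("remote failure")}, {row: 2}},
	}
}

func main() {
	for _, eager := range []bool{false, true} {
		rows, err := run(newSource(eager))
		fmt.Printf("eager=%t rows=%v err=%v\n", eager, rows, err)
	}
}
```

In this model the buffering source lets the consumer finish with a nil error even though a failure occurred mid-stream, while the eager variant stops at the failure and reports it.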
That was it. PR incoming. Unfortunately I think the commit that exposed this issue is part of the alpha so that will need to be restarted. |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@1b5d070c93375d3e14c146241e8bafde349529bd:
Artifacts: /scaledata/filesystem_simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem-simulator/nodes=3 failed on master@e9a4f83e3eee59510f97db2c6e0df9b57cf6b944:
Artifacts: /scaledata/filesystem-simulator/nodes=3
See this test on roachdash |
(roachtest).scaledata/filesystem_simulator/nodes=3 failed on master@d3791a81c0716478de08d44459d3fcf5b4f3ea1e:
Artifacts: /scaledata/filesystem_simulator/nodes=3
Related:
roachtest: scaledata/filesystem_simulator/nodes=3 failed #50175 (C-test-failure, O-roachtest, O-robot, branch-release-20.1, release-blocker)
roachtest: scaledata/filesystem_simulator/nodes=3 failed #48328 (C-test-failure, O-roachtest, O-robot, branch-release-19.2, release-blocker)
See this test on roachdash
powered by pkg/cmd/internal/issues