Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: distsql_automatic_stats logic test fails locally #37505

Closed
RaduBerinde opened this issue May 14, 2019 · 5 comments
Closed

sql: distsql_automatic_stats logic test fails locally #37505

RaduBerinde opened this issue May 14, 2019 · 5 comments
Assignees

Comments

@RaduBerinde
Copy link
Member

I am consistently hitting this error:

    --- FAIL: TestLogic/fakedist (0.00s)
        --- FAIL: TestLogic/fakedist/distsql_automatic_stats (46.01s)
            logic.go:1500: condition failed to evaluate within 45s: 
                testdata/logic_test/distsql_automatic_stats:17: SELECT statistics_name, column_names, row_count, distinct_count, null_count
                FROM [SHOW STATISTICS FOR TABLE data] ORDER BY column_names::STRING, created
                expected:
                    statistics_name  column_names  row_count  distinct_count  null_count
                    __auto__         {a}           1000       10              0
                    __auto__         {b}           1000       10              0
                    __auto__         {c}           1000       10              0
                    __auto__         {d}           1000       0               1000
                    
                but found (query options: "colnames,rowsort,retry" -> ignore the following ordering of rows) :
                    statistics_name  column_names  row_count  distinct_count  null_count

The other two configs (fakedist-opt, fakedist-metadata) work just fine (and complete quickly):

    --- FAIL: TestLogic/fakedist (46.01s)
        --- FAIL: TestLogic/fakedist/distsql_automatic_stats (45.99s)
    --- PASS: TestLogic/fakedist-opt (1.20s)
        --- PASS: TestLogic/fakedist-opt/distsql_automatic_stats (1.18s)
    --- PASS: TestLogic/fakedist-metadata (1.39s)
        --- PASS: TestLogic/fakedist-metadata/distsql_automatic_stats (1.39s)

But, I noticed that if I restrict it to these variants, the first one which runs has the same problem:

    --- FAIL: TestLogic/fakedist-opt (45.81s)
        --- FAIL: TestLogic/fakedist-opt/distsql_automatic_stats (45.80s)
    --- PASS: TestLogic/fakedist-metadata (1.09s)
        --- PASS: TestLogic/fakedist-metadata/distsql_automatic_stats (1.08s)

If I run other tests before, it works:

    --- PASS: TestLogic/fakedist-opt (20.58s)
        --- PASS: TestLogic/fakedist-opt/aggregate (2.10s)
            --- PASS: TestLogic/fakedist-opt/aggregate/other (1.62s)
            --- PASS: TestLogic/fakedist-opt/aggregate/string_agg (0.25s)
        --- PASS: TestLogic/fakedist-opt/alias_types (0.21s)
        --- PASS: TestLogic/fakedist-opt/alter_column_type (1.12s)
            --- PASS: TestLogic/fakedist-opt/alter_column_type/SanityCheck (0.13s)
            --- PASS: TestLogic/fakedist-opt/alter_column_type/ManualGeneralChange (0.64s)
            --- PASS: TestLogic/fakedist-opt/alter_column_type/ChangeVisibleColumnType (0.05s)
            --- PASS: TestLogic/fakedist-opt/alter_column_type/BadStringLocale (0.04s)
            --- PASS: TestLogic/fakedist-opt/alter_column_type/BadCollateOnNotString (0.04s)
            --- PASS: TestLogic/fakedist-opt/alter_column_type/NoOpColumnChange (0.04s)
        --- PASS: TestLogic/fakedist-opt/alter_sequence (0.23s)
        --- PASS: TestLogic/fakedist-opt/alter_table (9.40s)
            --- PASS: TestLogic/fakedist-opt/alter_table/26422 (1.20s)
            --- PASS: TestLogic/fakedist-opt/alter_table/27402 (0.20s)
        --- PASS: TestLogic/fakedist-opt/apply_join (0.27s)
        --- PASS: TestLogic/fakedist-opt/array (1.83s)
            --- PASS: TestLogic/fakedist-opt/array/36477 (0.09s)
        --- PASS: TestLogic/fakedist-opt/dangerous_statements (0.17s)
        --- PASS: TestLogic/fakedist-opt/datetime (0.57s)
            --- PASS: TestLogic/fakedist-opt/datetime/regression_35255 (0.00s)
            --- PASS: TestLogic/fakedist-opt/datetime/regression_36146 (0.00s)
            --- PASS: TestLogic/fakedist-opt/datetime/regression_36557 (0.00s)
        --- PASS: TestLogic/fakedist-opt/decimal (0.46s)
        --- PASS: TestLogic/fakedist-opt/deep_interleaving (0.26s)
        --- PASS: TestLogic/fakedist-opt/default (0.38s)
        --- PASS: TestLogic/fakedist-opt/delete (0.67s)
            --- PASS: TestLogic/fakedist-opt/delete/regression_29494 (0.12s)
            --- PASS: TestLogic/fakedist-opt/delete/regression_33361 (0.28s)
        --- PASS: TestLogic/fakedist-opt/dependencies (0.94s)
        --- PASS: TestLogic/fakedist-opt/discard (0.20s)
        --- PASS: TestLogic/fakedist-opt/distinct (0.23s)
        --- PASS: TestLogic/fakedist-opt/distinct_on (0.28s)
        --- PASS: TestLogic/fakedist-opt/distsql_automatic_stats (0.64s)

It feels like the binary must have been running for a while before we can run the test, though I have no idea why (we create a new test cluster for every testfile).

@rytaft
Copy link
Collaborator

rytaft commented May 21, 2019

I can't reproduce this problem -- I've tried running
make testlogic PKG=./pkg/sql/logictest/... TESTS='TestLogic/fakedist$$/distsql_automatic_stats',
make testlogic PKG=./pkg/sql/logictest/... TESTS='TestLogic/fakedist-opt$$/distsql_automatic_stats',
make stress PKG=./pkg/sql/logictest/... TESTS='TestLogic//distsql_automatic_stats', and
make stress PKG=./pkg/sql/logictest/... TESTS='TestLogic/fakedist/distsql_automatic_stats' on my laptop and hit no error. Are you still seeing this issue? What commands exactly did you run to produce the error?

@RaduBerinde
Copy link
Member Author

I am using make test PKG=./pkg/sql/logictest 'TESTS=Test(Planner|)Logic$$//distsql_automatic_stats' TESTFLAGS= TESTTIMEOUT=2m.

I tried it now on master and it didn't fail. Then I picked a commit from around the day I filed the bug (881c6e4) and it does fail consistently on that one.. So maybe it was fixed already? I'd still be curious what it was, I'll try bisecting if I find some time.

@rytaft
Copy link
Collaborator

rytaft commented May 21, 2019

Ok great - I think it's probably the change I just made in #37611 -- I added SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false at the top of the file to avoid creating stats in the CREATE TABLE statement. Since that CREATE TABLE statement includes an index, it's possible that stats creation was getting triggered in the schema change code and that was throwing things off....

I'll close this now since it seems like it's been resolved, but feel free to reopen if you see the issue again on master.

@rytaft rytaft closed this as completed May 21, 2019
@RaduBerinde
Copy link
Member Author

Haha, bisect showed that it was actually fixed by the change I was working on when I ran into this (#37588). I moved up the tweaking of DefaultRefreshInterval before StartTestCluster (in logic.go); we were doing it too late and it wasn't working for the first test. I probably forgot that I had found the problem (or I wasn't sure).

@rytaft
Copy link
Collaborator

rytaft commented May 21, 2019

Ah nice find!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants