sqlsmith: make order-dependent aggregation functions deterministic #84324
Thanks for making the new `smith` command! The fix looks good.
Reviewed 6 of 6 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
pkg/cmd/smith/main.go
line 35 at r2 (raw file):
for { fmt.Print("\n", smither.Generate(), ";\n")
Maybe in the future we can add a command-line option to print a set number of statements.
d1afa9a to a17fc21
TFYR!
bors r=msirek
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @msirek)
pkg/cmd/smith/main.go
line 35 at r2 (raw file):
Previously, msirek (Mark Sirek) wrote…
Maybe in the future we can add a command-line option to print a set number of statements.
Done. Also added some help text.
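A minimal sketch of what such an option could look like, assuming a hypothetical `-num` flag and a stub generator in place of the real smither (only the `Generate()` call and the `"\n"` / `";\n"` framing come from the quoted line; the flag name, its default, and `fixedGenerator` are illustrative assumptions):

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// generator stands in for the sqlsmith smither. Only the Generate() method
// is taken from the code quoted in the review; the interface is hypothetical.
type generator interface {
	Generate() string
}

// fixedGenerator is a hypothetical stub so the sketch runs without a
// CockroachDB checkout. It always returns the same statement.
type fixedGenerator struct{}

func (fixedGenerator) Generate() string { return "SELECT 1" }

// formatStatements renders n generated statements, each framed the same way
// as in the quoted loop: a leading newline and a trailing ";\n".
func formatStatements(g generator, n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		fmt.Fprint(&sb, "\n", g.Generate(), ";\n")
	}
	return sb.String()
}

func main() {
	// "-num" and its default of 10 are assumptions made for illustration.
	num := flag.Int("num", 10, "number of statements to print (hypothetical flag)")
	flag.Parse()
	fmt.Print(formatStatements(fixedGenerator{}, *num))
}
```

Bounding the loop this way lets the output be piped into other tools without having to interrupt the process.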
Build failed:
IIRC sqlsmith looks at the existing table schema to generate queries. What schema is used in this case?
The new `smith` command will be very useful!
I am slightly worried that the second commit (along with other recent changes) is reducing sqlsmith's efficacy in achieving its original goal: finding test cases that cause panics. The more we constrain the output of sqlsmith to better support randomized correctness testing, the less effective it will become at randomized panic testing. Maybe we should consider implementing two SQLSmith modes: one with almost no restrictions that can be used to catch panics, and one that is as restrictive as necessary to get deterministic results in randomized correctness tests.
Reviewed 4 of 6 files at r1, 3 of 3 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @msirek)
Maybe we can set a flag when running any of the tests which compare results, and only use the `ORDER BY` if that flag is true.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @msirek)
Good point!
bors r-
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @msirek)
pkg/cmd/smith/main.go
line 65 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
IIRC sqlsmith looks at the existing table schema to generate queries. What schema is used in this case?
public
Sorry, my question was not specific. I'm curious what tables SQLSmith is generating statements for when running this command. It doesn't seem like any tables would exist, unless they are being created somewhere here that I'm missing.
Sounds good to me. Maybe rename it to reflect its increased influence?
+1
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @msirek)
Ok, I went hog-wild and added to `smith`:
- passthrough of simple sqlsmith options
- option to connect to a database for schema info
- new `MutatingMode` for sqlsmith based on mutating smither from TLP / costfuzz / unoptimized-query-oracle
- ability to generate expressions instead of statements
To answer your question, @mgartner, without a db connection sqlsmith can only generate expressions, `SELECT` statements (over `VALUES` tables), `CREATE TABLE` statements, and probably a few other things. But no mutations.
As for the original issue, I extended `DisableImpureFns` to cover adding the `ORDER BY` and renamed it to `DisableNondeterministicFns`.
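As a rough illustration of gating the `ORDER BY` on such an option, here is a sketch that is not the actual sqlsmith code. The `smitherOpts` and `arg` types and the `orderDependentAgg` helper are hypothetical; only the option name mirrors `DisableNondeterministicFns`. Ordering by the argument itself is also an assumption, since the thread does not say which ordering columns are chosen.

```go
package main

import "fmt"

// smitherOpts is a hypothetical stand-in for sqlsmith's configuration.
// Only the option name mirrors DisableNondeterministicFns from the thread.
type smitherOpts struct {
	disableNondeterministicFns bool
}

// arg is a hypothetical aggregate argument: a column reference or a constant.
type arg struct {
	sql      string
	isColumn bool
}

// orderDependentAgg renders a two-argument, order-dependent aggregate such as
// string_agg. When the option is set and the argument is a column reference,
// it appends an ORDER BY on that argument (an assumed choice of ordering).
// Constant arguments are left alone because every input row is identical.
func orderDependentAgg(opts smitherOpts, fn string, a arg, delim string) string {
	if opts.disableNondeterministicFns && a.isColumn {
		return fmt.Sprintf("%s(%s, %s ORDER BY %s)", fn, a.sql, delim, a.sql)
	}
	return fmt.Sprintf("%s(%s, %s)", fn, a.sql, delim)
}

func main() {
	col := arg{sql: "name", isColumn: true}
	lit := arg{sql: "'x'", isColumn: false}

	// Unrestricted generation, e.g. for panic hunting.
	fmt.Println(orderDependentAgg(smitherOpts{}, "string_agg", col, "','"))

	// Restricted generation, e.g. for tests that compare results.
	strict := smitherOpts{disableNondeterministicFns: true}
	fmt.Println(orderDependentAgg(strict, "string_agg", col, "','"))
	fmt.Println(orderDependentAgg(strict, "string_agg", lit, "','"))
}
```

Keeping the behavior behind one option means tests that only look for panics can leave it unset and keep the less constrained output.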
Since I changed so much, I'll wait for another look whenever you have time (no rush).
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @msirek)
Updates look good
Thanks!
bors r=msirek,mgartner
Build succeeded:
cmd: add smith command
Add a new top-level command `smith` which dumps randomly-generated sqlsmith queries. This is useful for testing modifications to sqlsmith.
Assists: #83024
Release note: None
sqlsmith: make order-dependent aggregation functions deterministic
Some aggregation functions (e.g. string_agg) have results that depend
on the order of input rows. To make sqlsmith more deterministic, add
ORDER BY clauses to these aggregation functions whenever their argument
is a column reference. (When their argument is a constant, ordering
doesn't matter.)
Fixes: #83024
Release note: None
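The order dependence described in this commit message can be reproduced outside the database. The sketch below models `string_agg` as a plain join over rows in arrival order; `stringAgg` and `orderedStringAgg` are hypothetical helpers written for illustration, not CockroachDB functions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// stringAgg mimics an aggregate without ORDER BY: it concatenates rows in
// whatever order the scan happens to deliver them.
func stringAgg(rows []string, delim string) string {
	return strings.Join(rows, delim)
}

// orderedStringAgg mimics the same aggregate with an ORDER BY on its
// argument: it sorts a copy of the input before concatenating.
func orderedStringAgg(rows []string, delim string) string {
	sorted := append([]string(nil), rows...)
	sort.Strings(sorted)
	return strings.Join(sorted, delim)
}

func main() {
	// The same three rows, delivered in two different scan orders.
	scanA := []string{"b", "a", "c"}
	scanB := []string{"c", "b", "a"}

	// Without ordering, the two scans disagree.
	fmt.Println(stringAgg(scanA, ",") == stringAgg(scanB, ",")) // false

	// With ordering, both scans produce the same string.
	fmt.Println(orderedStringAgg(scanA, ",") == orderedStringAgg(scanB, ",")) // true

	// A constant argument yields identical rows, so order cannot matter.
	constant := []string{"x", "x", "x"}
	fmt.Println(stringAgg(constant, ",")) // x,x,x
}
```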