Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add random syntax generator tests #7970

Merged
merged 1 commit into from
Aug 27, 2016
Merged

sql: add random syntax generator tests #7970

merged 1 commit into from
Aug 27, 2016

Conversation

maddyblue
Copy link
Contributor

@maddyblue maddyblue commented Jul 21, 2016

This test is designed to be run from a nightly job for some longish time
(~1h). Its goal is to find panics introduced in our implementation. Since
the executor is not the same go routine as the test, the executor has
been changed to include a panic handler that prepends the failed SQL
statement in a panic.

This uses the yacc package from the docs repo that generates the SQL
diagrams. It and the rsg package are included in internal because they
are not needed for the operation of the database, only for a test.


This change is Reviewable

@knz knz self-assigned this Jul 27, 2016
@knz
Copy link
Contributor

knz commented Aug 4, 2016

Reviewed 7 of 7 files at r1.
Review status: all files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


internal/rsg/rsg.go, line 93 [r1] (raw file):

          switch item.Value {
          case "IDENT":
              v = []string{"ident"}

Let's have a random selection here between (at least) two strings with mixed casing. I want to check casing does not throw things out of the window.


internal/rsg/yacc/lex.go, line 3 [r1] (raw file):

// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

Are you copying (part of) the go yacc code here? If so I'd suggest adding to the header comment 1) where you found this code and how it could be updated, if needed 2) highlighting that this is not part of CockroachDB, 3) should not be linked in a CockroachDB binary, and 4) the code is merely bundled in the same repo but we're not responsible for it.


internal/rsg/yacc/node.go, line 3 [r1] (raw file):

// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

Ditto


internal/rsg/yacc/parse.go, line 3 [r1] (raw file):

// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

Ditto


sql/executor.go, line 362 [r1] (raw file):

  session.planner.semaCtx.Placeholders.Assign(pinfo)

  defer func() {

Please extract this to a separate commit or PR. It's a generally useful addition independently of the RSG.


sql/rsg_test.go, line 70 [r1] (raw file):

          default:
          }
          s := r.Generate("stmt", 20)

Pull the magic value into a named constant.


sql/rsg_test.go, line 79 [r1] (raw file):

      }
  }
  for i := 0; i < 200; i++ {

Make this configurable via env var, then modify top level makefile to add this env var and a comment to explain what it does.


sql/rsg_test.go, line 107 [r1] (raw file):

      t.Fatal(err)
  }
  if _, err := db.Exec(`CREATE DATABASE IF NOT EXISTS ident; CREATE TABLE IF NOT EXISTS ident.ident ();`); err != nil {

I think you can add some "ident" column in here.
Actually I'd recommend:

create database ident1
create database ident2
create table if not exists ident1.ident1(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))
create table if not exists ident2.ident1(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))
create table if not exists ident1.ident2(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))
create table if not exists ident2.ident2(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))

Then generate a random selection for identifiers out of []string{"ident1","ident2","IdEnT1", "iDeNt2"}

Then also generate random CREATE TABLE always with 2 columns and also separate CREATE INDEX statements, all using the same identifiers (so they work on the same tables) but with mixed case, and check that we get an error every time (I am 70% suspicious we are missing something with case normalization).

Then also generate some INSERT and DELETE statements for these automatically.


sql/rsg_test.go, line 121 [r1] (raw file):

          default:
          }
          targets := r.Generate("target_list", 30)

ditto re. constant


sql/rsg_test.go, line 136 [r1] (raw file):

      }
  }
  for i := 0; i < 200; i++ {

dito re. loop bound


sql/rsg_test.go, line 214 [r1] (raw file):

      }
  }
  for i := 0; i < 200; i++ {

ditto


sql/rsg_test.go, line 253 [r1] (raw file):

          default:
          }
          expr := r.Generate("func_expr_common_subexpr", 30)

ditto


sql/rsg_test.go, line 259 [r1] (raw file):

      }
  }
  for i := 0; i < 200; i++ {

ditto


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Aug 16, 2016

Since #8152 is in this can probably merge soon now.

@maddyblue
Copy link
Contributor Author

Review status: 5 of 63 files reviewed at latest revision, 13 unresolved discussions, some commit checks pending.


internal/rsg/rsg.go, line 93 [r1] (raw file):

Previously, knz (kena) wrote…

Let's have a random selection here between (at least) two strings with mixed casing. I want to check casing does not throw things out of the window.

Similar to my objection to this above, if you are worried about casing causing errors, let's write a test for that. But it doesn't seem like it belongs here.

internal/rsg/yacc/lex.go, line 3 [r1] (raw file):

Previously, knz (kena) wrote…

Are you copying (part of) the go yacc code here? If so I'd suggest adding to the header comment 1) where you found this code and how it could be updated, if needed 2) highlighting that this is not part of CockroachDB, 3) should not be linked in a CockroachDB binary, and 4) the code is merely bundled in the same repo but we're not responsible for it.

This parser was written by copying the code from the text/template/parse package and heavily modifying it for yacc. That is, I'm using the same framework and setup, but different lex/parse logic. We already have a file like this: https://github.com/cockroachdb/cockroach/blob/develop/util/leaktest/leaktest.go, and that file doesn't have any of the notices you list. I'd prefer to leave this header as is until we have some guidance about a better way to handle this situation.

sql/executor.go, line 362 [r1] (raw file):

Previously, knz (kena) wrote…

Please extract this to a separate commit or PR. It's a generally useful addition independently of the RSG.

Done.

sql/rsg_test.go, line 70 [r1] (raw file):

Previously, knz (kena) wrote…

Pull the magic value into a named constant.

Done.

sql/rsg_test.go, line 79 [r1] (raw file):

Previously, knz (kena) wrote…

Make this configurable via env var, then modify top level makefile to add this env var and a comment to explain what it does.

How about a flag since that's where the rest of the stuff that enables RSG is also?

sql/rsg_test.go, line 107 [r1] (raw file):

Previously, knz (kena) wrote…

I think you can add some "ident" column in here.
Actually I'd recommend:

create database ident1
create database ident2
create table if not exists ident1.ident1(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))
create table if not exists ident2.ident1(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))
create table if not exists ident1.ident2(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))
create table if not exists ident2.ident2(ident1 decimal, ident2 decimal, index ident1(ident1), index ident2(ident2))

Then generate a random selection for identifiers out of []string{"ident1","ident2","IdEnT1", "iDeNt2"}

Then also generate random CREATE TABLE always with 2 columns and also separate CREATE INDEX statements, all using the same identifiers (so they work on the same tables) but with mixed case, and check that we get an error every time (I am 70% suspicious we are missing something with case normalization).

Then also generate some INSERT and DELETE statements for these automatically.

These new tests you describe may be helpful, but they aren't random syntax tests like these, so I think they should be somewhere else in another PR. RSG tests are only trying to find panics, not errors.

I've added an ident column to the table because I think that has value as it may be specified in some queries.


sql/rsg_test.go, line 121 [r1] (raw file):

Previously, knz (kena) wrote…

ditto re. constant

Done.

sql/rsg_test.go, line 136 [r1] (raw file):

Previously, knz (kena) wrote…

dito re. loop bound

Done.

sql/rsg_test.go, line 214 [r1] (raw file):

Previously, knz (kena) wrote…

ditto

Done.

sql/rsg_test.go, line 253 [r1] (raw file):

Previously, knz (kena) wrote…

ditto

Done.

sql/rsg_test.go, line 259 [r1] (raw file):

Previously, knz (kena) wrote…

ditto

Done.

Comments from Reviewable

@maddyblue maddyblue changed the base branch from master to develop August 18, 2016 08:02
@knz
Copy link
Contributor

knz commented Aug 25, 2016

Reviewed 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


internal/rsg/rsg.go, line 93 [r1] (raw file):

Previously, mjibson (Matt Jibson) wrote…

Similar to my objection to this above, if you are worried about casing causing errors, let's write a test for that. But it doesn't seem like it belongs here.

Again it will ensure that the NormalizeName method is called everywhere it should. Of course we can also write tests but due to the sheer number of places where NormalizeName is used we are likely to forget a few tests and that's what fuzz testing can help us detect.

internal/rsg/yacc/lex.go, line 3 [r1] (raw file):

Previously, mjibson (Matt Jibson) wrote…

This parser was written by copying the code from the text/template/parse package and heavily modifying it for yacc. That is, I'm using the same framework and setup, but different lex/parse logic. We already have a file like this: https://github.com/cockroachdb/cockroach/blob/develop/util/leaktest/leaktest.go, and that file doesn't have any of the notices you list. I'd prefer to leave this header as is until we have some guidance about a better way to handle this situation.

I think you can still _append_ a comment that states exactly what you explain above "this code was produced by copying the code from .... "

sql/rsg_test.go, line 79 [r1] (raw file):

Previously, mjibson (Matt Jibson) wrote…

How about a flag since that's where the rest of the stuff that enables RSG is also?

Yep, perfect.

sql/rsg_test.go, line 107 [r1] (raw file):

Previously, mjibson (Matt Jibson) wrote…

These new tests you describe may be helpful, but they aren't random syntax tests like these, so I think they should be somewhere else in another PR. RSG tests are only trying to find panics, not errors.

I've added an ident column to the table because I think that has value as it may be specified in some queries.

So what I am proposing is to create the schema described above at the start of the test then let the random generator produce random combinations of ident1 and ident2. This way it will exercise all combinations of name lookups in different parts of expressions (db.table.col, tb.table, table.col etc).

Comments from Reviewable

@maddyblue
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


internal/rsg/rsg.go, line 93 [r1] (raw file):

Previously, knz (kena) wrote…

Again it will ensure that the NormalizeName method is called everywhere it should. Of course we can also write tests but due to the sheer number of places where NormalizeName is used we are likely to forget a few tests and that's what fuzz testing can help us detect.

What happens if that method was not called where it should be? A panic or an error?

internal/rsg/yacc/lex.go, line 3 [r1] (raw file):

Previously, knz (kena) wrote…

I think you can still append a comment that states exactly what you explain above "this code was produced by copying the code from .... "

Done.

internal/rsg/yacc/node.go, line 3 [r1] (raw file):

Previously, knz (kena) wrote…

Ditto

Done.

internal/rsg/yacc/parse.go, line 3 [r1] (raw file):

Previously, knz (kena) wrote…

Ditto

Done.

sql/rsg_test.go, line 107 [r1] (raw file):

Previously, knz (kena) wrote…

So what I am proposing is to create the schema described above at the start of the test then let the random generator produce random combinations of ident1 and ident2. This way it will exercise all combinations of name lookups in different parts of expressions (db.table.col, tb.table, table.col etc).

If I add this, do you expect panics or errors?

Comments from Reviewable

This test is designed to be run from a nightly job for some longish time
(~1h). Its goal is to find panics introduced in our implementation. Since
the executor is not the same go routine as the test, the executor has
been changed to include a panic handler that prepends the failed SQL
statement in a panic.

This uses the yacc package from the docs repo that generates the SQL
diagrams. It and the rsg package are included in internal because they
are not needed for the operation of the database, only for a test.
@maddyblue
Copy link
Contributor Author

I've asked about panic vs error because it will inform where the NormalizeName testing goes. If these things produce panics, then they can go into the existing places. If you expect errors, then they will have to go into a new test that tests for lack of errors, since the current ones can produce valid, error-free statements.


Review status: 3 of 6 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Aug 26, 2016

Reviewed 3 of 3 files at r3.
Review status: 5 of 6 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


internal/rsg/rsg.go, line 93 [r1] (raw file):

Previously, mjibson (Matt Jibson) wrote…

What happens if that method was not called where it should be? A panic or an error?

This in combination with the other thing would guarantee that all generated names can be resolved (if there's a failure, it shouldn't be related to name lookup any more). So if we observe a name lookup error after this, there would be a normalization bug or a lookup bug.

sql/rsg_test.go, line 107 [r1] (raw file):

Previously, mjibson (Matt Jibson) wrote…

If I add this, do you expect panics or errors?

Both actually. (truth be told, I'd rather have neither but oh well). The most important check is that once this schema goes in, there should not be any error related to name look up any more (name ... not found). If we find any, that means a lookup was not implemented properly.

Comments from Reviewable

@maddyblue
Copy link
Contributor Author

Review status: 5 of 6 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


internal/rsg/rsg.go, line 93 [r1] (raw file):

Previously, knz (kena) wrote…

This in combination with the other thing would guarantee that all generated names can be resolved (if there's a failure, it shouldn't be related to name lookup any more). So if we observe a name lookup error after this, there would be a normalization bug or a lookup bug.

Ok. I'll add this stuff in a followup PR. There's enough new stuff here that I'd need to figure out and test that I'd prefer to do it separately since this PR is already like a month old.

Comments from Reviewable

@knz
Copy link
Contributor

knz commented Aug 27, 2016

LGTM! Very nice work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants