Misc. datadriven improvements #2

knz · 2019-11-12T14:55:11Z

See individual commits for details.

This change is

tbg

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @andy-kimball and @knz)

datadriven.go, line 211 at r1 (raw file):

// Ignore files named .XXXX, XXX~ or #XXX#.
var tempFileRe = regexp.MustCompile(`(^\..*)|(.*~$)|(^#.*#$)`)

I don't know if I like that, but I guess it's fine.

datadriven.go, line 107 at r3 (raw file):

					}
				}()
				d.ContinueAfterError = false

Why this?

datadriven.go, line 268 at r3 (raw file):

	//   ContinueAfterError determines what the datadriven harness does
	//   after the error occurs.
	ContinueAfterError bool

This looks like an unfortunate addition of complexity. It seems to me that t.Error() should unconditionally fail the test, including its rewrite. If this isn't sufficient, I'd like to know why. (And it'd need testing).

line_parser.go, line 61 at r2 (raw file):

// splits a directive line into tokens, where each token is
// either:
//  - a,list,of,things

So there's no special handling here, right? You just get "a,list,of,things" and not multiple args, at least that's what the commit message suggests. Please add some of that newly introduced variety into the testdata in this repo.

knz

Reviewable status: 0 of 4 files reviewed, 4 unresolved discussions (waiting on @andy-kimball, @knz, and @tbg)

datadriven.go, line 107 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Why this?

Explained in comment.

datadriven.go, line 268 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This looks like an unfortunate addition of complexity. It seems to me that t.Error() should unconditionally fail the test, including its rewrite. If this isn't sufficient, I'd like to know why. (And it'd need testing).

The reasoning is the same as to why t.Fatal and t.Error exist side-by-side: to shorten development iteration loops, we want each iteration to report as many errors as possible (so that the dev can fix as many as possible within the iteration).

FWIW the same behavior is possible in SQL logic tests when running with -max-errors N

Of course this only makes sense for directives that do not change the state of the test environment. By default the datadriven package cannot assume this to be true, that's why the default is false.
I have enhanced the comments and commit message accordingly.

line_parser.go, line 61 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

So there's no special handling here, right? You just get "a,list,of,things" and not multiple args, at least that's what the commit message suggests. Please add some of that newly introduced variety into the testdata in this repo.

This has not changed from prior.
Added a test anyway.

tbg

Reviewed 3 of 3 files at r4, 4 of 4 files at r5, 1 of 1 files at r6.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @andy-kimball, @knz, and @tbg)

datadriven.go, line 268 at r3 (raw file):

Previously, knz (kena) wrote…

The reasoning is the same as to why t.Fatal and t.Error exist side-by-side: to shorten development iteration loops, we want each iteration to report as many errors as possible (so that the dev can fix as many as possible within the iteration).

FWIW the same behavior is possible in SQL logic tests when running with -max-errors N

Of course this only makes sense for directives that do not change the state of the test environment. By default the datadriven package cannot assume this to be true, that's why the default is false.
I have enhanced the comments and commit message accordingly.

I'm still not convinced that this complexity is a good idea, but I won't stop you.

knz · 2019-11-14T14:24:31Z

I'd be glad to receive a second opinion on this, e.g. @RaduBerinde who was making changes recently in this area.

RaduBerinde

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @andy-kimball, @knz, and @tbg)

datadriven.go, line 268 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I'm still not convinced that this complexity is a good idea, but I won't stop you.

Not convinced it's worth it either. If you have multiple issues with a file, usually you'll want to rerun that one specific test anyway so it shouldn't be a big deal to rerun multiple times. But I won't stop you either :)

The code changes LGTM

datadriven_test.go, line 50 at r6 (raw file):

	RunTestFromString(t, `
# NB: we allow duplicate args.
# ScanArgs simply picks the first occurrence.

Maybe we should use HasArg to fix this. The behavior is unintuitive, you'd expect the last occurrence to matter. Best thing would be to error out.

When designing sub-languages (sub-directives) it is useful to be able to reuse the same syntax as the main datadriven DSL. This commit exports the parser so code reuse can be simplified. Additionally, this commit extends the syntax to allow more argument strings: - `arg=` means `arg` key with empty value - `arg=1,2,3` (allow commas) - `arg=a.b.c` (allow periods) - `arg=a@b` (allow ampersands)

A reminder about error in tests: - a test that expects an error should not call t.Error()/t.Fatal() and instead print out the text of the expected error in the returned actual string. - t.Error()/t.Fatal() should be called if a test encounters an unexpected test failure. As to this commit: - previously, if a test failed with `t.Error` (not `t.Fatal`) and thus relinquishes control to the datadriven code, the datadriven code would still perform the rewrite of the expected output. With this change, the rewrite is skipped if the test is marked to fail by the inner test function. (Note that, as previously, a test for an expected error should simply print out the text of the error in the returned actual string) - previously, if a test failed with `t.Error` (not `t.Fatal`) the datadriven code would still attempt to process further directives in the same input file. With this change, a failed directive will stop further processing of the input file. This behavior can be overridden by a test by writing to the new `ContinueAfterError` field of `TestData`.

knz

Reviewable status: 2 of 4 files reviewed, 4 unresolved discussions (waiting on @andy-kimball, @knz, @RaduBerinde, and @tbg)

datadriven.go, line 268 at r3 (raw file):

Previously, RaduBerinde wrote…

Not convinced it's worth it either. If you have multiple issues with a file, usually you'll want to rerun that one specific test anyway so it shouldn't be a big deal to rerun multiple times. But I won't stop you either :)

The code changes LGTM

Ok with two dissenting opinions, I've removed this code.

datadriven_test.go, line 50 at r6 (raw file):

Previously, RaduBerinde wrote…

Maybe we should use HasArg to fix this. The behavior is unintuitive, you'd expect the last occurrence to matter. Best thing would be to error out.

I strongly disagree - I am relying on the current behavior elsewhere. Also I'm not keen on introducing code to de-duplicate when it's not going to be useful in 80% of the cases.

I'll document the behavior as part of the API.

knz · 2019-11-14T16:18:21Z

Thanks to both for the review!

RaduBerinde · 2019-11-14T16:43:15Z

datadriven_test.go, line 50 at r6 (raw file):

Previously, knz (kena) wrote…

I strongly disagree - I am relying on the current behavior elsewhere. Also I'm not keen on introducing code to de-duplicate when it's not going to be useful in 80% of the cases.

I'll document the behavior as part of the API.

Out of curiosity, what's your use case? I can't think of why you would want to do that (short of auto-generated test files or something), but maybe I lack imagination.

knz · 2019-11-14T16:53:34Z

Out of curiosity, what's your use case?

Here: cockroachdb/cockroach#42250

RaduBerinde · 2019-11-14T17:06:36Z

Interesting, though I still don't see where/why you would specify the same argument multiple times.

knz · 2019-11-14T21:01:59Z

Interesting, though I still don't see where/why you would specify the same argument multiple times.

Consider this:

with k=mykey
   put v=1
   put v=abc k=other # interleave a write on another key
   put v=2
   get

We want to omit/share the k=mykey as much as possible, but enable a temporary override where relevant.

This is implemented by appending the []CmdArg array specified by with to the CmdArgs of each subsequent directive, and relying on the "first match wins" rule to guarantee proper scoping/prioritization.

RaduBerinde · 2019-11-14T21:44:33Z

This is not an argument for the correct semantics, it's an implementation detail. You could easily have a with-specific function that removes duplicates instead of a simple append. You'd still want to disallow duplicate args within each directive, eg put v=1 k=other v=2.

RaduBerinde · 2019-11-14T21:45:31Z

Or, you could leave the "prefer first entry" logic in place but still error out if you see the same value in a single directive.

knz · 2019-11-14T21:51:53Z

Yes ok that's true. But do you think it's worth to add the extra logic to detect duplicates, where in the common case this will not occur?

knz · 2019-11-15T09:14:03Z

Actually I like it, it's not even super expensive to do.

knz requested review from tbg and andy-kimball November 12, 2019 14:55

tbg reviewed Nov 12, 2019

View reviewed changes

Skip over hidden and temporary files.

7bcb16a

knz force-pushed the 20191112-dd branch 2 times, most recently from a20b195 to 8e2cf0e Compare November 14, 2019 14:13

knz commented Nov 14, 2019

View reviewed changes

tbg approved these changes Nov 14, 2019

View reviewed changes

tbg self-requested a review November 14, 2019 14:20

RaduBerinde approved these changes Nov 14, 2019

View reviewed changes

knz added 2 commits November 14, 2019 17:14

knz force-pushed the 20191112-dd branch from 8e2cf0e to 547772e Compare November 14, 2019 16:15

knz commented Nov 14, 2019

View reviewed changes

knz merged commit 67fe190 into cockroachdb:master Nov 14, 2019

knz deleted the 20191112-dd branch November 14, 2019 16:18

knz mentioned this pull request Nov 15, 2019

Fix syntax error checks #4

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Misc. datadriven improvements #2

Misc. datadriven improvements #2

knz commented Nov 12, 2019 •

edited

Loading

tbg left a comment

knz left a comment

tbg left a comment

knz commented Nov 14, 2019

RaduBerinde left a comment

knz left a comment

knz commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

knz commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

knz commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

knz commented Nov 14, 2019

knz commented Nov 15, 2019

Misc. datadriven improvements #2

Misc. datadriven improvements #2

Conversation

knz commented Nov 12, 2019 • edited Loading

tbg left a comment

Choose a reason for hiding this comment

knz left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

knz commented Nov 14, 2019

RaduBerinde left a comment

Choose a reason for hiding this comment

knz left a comment

Choose a reason for hiding this comment

knz commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

knz commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

knz commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

RaduBerinde commented Nov 14, 2019

knz commented Nov 14, 2019

knz commented Nov 15, 2019

knz commented Nov 12, 2019 •

edited

Loading