Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc. datadriven improvements #2

Merged
merged 3 commits into from
Nov 14, 2019
Merged

Misc. datadriven improvements #2

merged 3 commits into from
Nov 14, 2019

Conversation

knz
Copy link
Contributor

@knz knz commented Nov 12, 2019

See individual commits for details.


This change is Reviewable

@knz knz requested review from tbg and andy-kimball November 12, 2019 14:55
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @andy-kimball and @knz)


datadriven.go, line 211 at r1 (raw file):

// Ignore files named .XXXX, XXX~ or #XXX#.
var tempFileRe = regexp.MustCompile(`(^\..*)|(.*~$)|(^#.*#$)`)

I don't know if I like that, but I guess it's fine.


datadriven.go, line 107 at r3 (raw file):

					}
				}()
				d.ContinueAfterError = false

Why this?


datadriven.go, line 268 at r3 (raw file):

	//   ContinueAfterError determines what the datadriven harness does
	//   after the error occurs.
	ContinueAfterError bool

This looks like an unfortunate addition of complexity. It seems to me that t.Error() should unconditionally fail the test, including its rewrite. If this isn't sufficient, I'd like to know why. (And it'd need testing).


line_parser.go, line 61 at r2 (raw file):

// splits a directive line into tokens, where each token is
// either:
//  - a,list,of,things

So there's no special handling here, right? You just get "a,list,of,things" and not multiple args, at least that's what the commit message suggests. Please add some of that newly introduced variety into the testdata in this repo.

@knz knz force-pushed the 20191112-dd branch 2 times, most recently from a20b195 to 8e2cf0e Compare November 14, 2019 14:13
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 4 unresolved discussions (waiting on @andy-kimball, @knz, and @tbg)


datadriven.go, line 107 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Why this?

Explained in comment.


datadriven.go, line 268 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This looks like an unfortunate addition of complexity. It seems to me that t.Error() should unconditionally fail the test, including its rewrite. If this isn't sufficient, I'd like to know why. (And it'd need testing).

The reasoning is the same as to why t.Fatal and t.Error exist side-by-side: to shorten development iteration loops, we want each iteration to report as many errors as possible (so that the dev can fix as many as possible within the iteration).

FWIW the same behavior is possible in SQL logic tests when running with -max-errors N

Of course this only makes sense for directives that do not change the state of the test environment. By default the datadriven package cannot assume this to be true, that's why the default is false.
I have enhanced the comments and commit message accordingly.


line_parser.go, line 61 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

So there's no special handling here, right? You just get "a,list,of,things" and not multiple args, at least that's what the commit message suggests. Please add some of that newly introduced variety into the testdata in this repo.

This has not changed from prior.
Added a test anyway.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r4, 4 of 4 files at r5, 1 of 1 files at r6.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @andy-kimball, @knz, and @tbg)


datadriven.go, line 268 at r3 (raw file):

Previously, knz (kena) wrote…

The reasoning is the same as to why t.Fatal and t.Error exist side-by-side: to shorten development iteration loops, we want each iteration to report as many errors as possible (so that the dev can fix as many as possible within the iteration).

FWIW the same behavior is possible in SQL logic tests when running with -max-errors N

Of course this only makes sense for directives that do not change the state of the test environment. By default the datadriven package cannot assume this to be true, that's why the default is false.
I have enhanced the comments and commit message accordingly.

I'm still not convinced that this complexity is a good idea, but I won't stop you.

@tbg tbg self-requested a review November 14, 2019 14:20
@knz
Copy link
Contributor Author

knz commented Nov 14, 2019

I'd be glad to receive a second opinion on this, e.g. @RaduBerinde who was making changes recently in this area.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @andy-kimball, @knz, and @tbg)


datadriven.go, line 268 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I'm still not convinced that this complexity is a good idea, but I won't stop you.

Not convinced it's worth it either. If you have multiple issues with a file, usually you'll want to rerun that one specific test anyway so it shouldn't be a big deal to rerun multiple times. But I won't stop you either :)

The code changes LGTM


datadriven_test.go, line 50 at r6 (raw file):

	RunTestFromString(t, `
# NB: we allow duplicate args.
# ScanArgs simply picks the first occurrence.

Maybe we should use HasArg to fix this. The behavior is unintuitive, you'd expect the last occurrence to matter. Best thing would be to error out.

knz added 2 commits November 14, 2019 17:14
When designing sub-languages (sub-directives) it is useful to be able
to reuse the same syntax as the main datadriven DSL. This commit
exports the parser so code reuse can be simplified.

Additionally, this commit extends the syntax to allow more argument
strings:

- `arg=` means `arg` key with empty value
- `arg=1,2,3` (allow commas)
- `arg=a.b.c` (allow periods)
- `arg=a@b` (allow ampersands)
A reminder about error in tests:

- a test that expects an error should not call t.Error()/t.Fatal() and
  instead print out the text of the expected error in the returned
  actual string.
- t.Error()/t.Fatal() should be called if a test encounters an unexpected
  test failure.

As to this commit:

- previously, if a test failed with `t.Error` (not `t.Fatal`) and thus
  relinquishes control to the datadriven code, the datadriven code would
  still perform the rewrite of the expected output.
  With this change, the rewrite is skipped if the test is marked
  to fail by the inner test function.
  (Note that, as previously, a test for an expected error should simply
  print out the text of the error in the returned actual string)

- previously, if a test failed with `t.Error` (not `t.Fatal`) the
  datadriven code would still attempt to process further directives in
  the same input file.
  With this change, a failed directive will stop further processing
  of the input file. This behavior can be overridden by a test
  by writing to the new `ContinueAfterError` field of `TestData`.
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 4 files reviewed, 4 unresolved discussions (waiting on @andy-kimball, @knz, @RaduBerinde, and @tbg)


datadriven.go, line 268 at r3 (raw file):

Previously, RaduBerinde wrote…

Not convinced it's worth it either. If you have multiple issues with a file, usually you'll want to rerun that one specific test anyway so it shouldn't be a big deal to rerun multiple times. But I won't stop you either :)

The code changes LGTM

Ok with two dissenting opinions, I've removed this code.


datadriven_test.go, line 50 at r6 (raw file):

Previously, RaduBerinde wrote…

Maybe we should use HasArg to fix this. The behavior is unintuitive, you'd expect the last occurrence to matter. Best thing would be to error out.

I strongly disagree - I am relying on the current behavior elsewhere. Also I'm not keen on introducing code to de-duplicate when it's not going to be useful in 80% of the cases.

I'll document the behavior as part of the API.

@knz
Copy link
Contributor Author

knz commented Nov 14, 2019

Thanks to both for the review!

@knz knz merged commit 67fe190 into cockroachdb:master Nov 14, 2019
@knz knz deleted the 20191112-dd branch November 14, 2019 16:18
@RaduBerinde
Copy link
Member


datadriven_test.go, line 50 at r6 (raw file):

Previously, knz (kena) wrote…

I strongly disagree - I am relying on the current behavior elsewhere. Also I'm not keen on introducing code to de-duplicate when it's not going to be useful in 80% of the cases.

I'll document the behavior as part of the API.

Out of curiosity, what's your use case? I can't think of why you would want to do that (short of auto-generated test files or something), but maybe I lack imagination.

@knz
Copy link
Contributor Author

knz commented Nov 14, 2019

Out of curiosity, what's your use case?

Here: cockroachdb/cockroach#42250

@RaduBerinde
Copy link
Member

Interesting, though I still don't see where/why you would specify the same argument multiple times.

@knz
Copy link
Contributor Author

knz commented Nov 14, 2019

Interesting, though I still don't see where/why you would specify the same argument multiple times.

Consider this:

with k=mykey
   put v=1
   put v=abc k=other # interleave a write on another key
   put v=2
   get

We want to omit/share the k=mykey as much as possible, but enable a temporary override where relevant.

This is implemented by appending the []CmdArg array specified by with to the CmdArgs of each subsequent directive, and relying on the "first match wins" rule to guarantee proper scoping/prioritization.

@RaduBerinde
Copy link
Member

This is not an argument for the correct semantics, it's an implementation detail. You could easily have a with-specific function that removes duplicates instead of a simple append. You'd still want to disallow duplicate args within each directive, eg put v=1 k=other v=2.

@RaduBerinde
Copy link
Member

Or, you could leave the "prefer first entry" logic in place but still error out if you see the same value in a single directive.

@knz
Copy link
Contributor Author

knz commented Nov 14, 2019

Yes ok that's true. But do you think it's worth to add the extra logic to detect duplicates, where in the common case this will not occur?

@knz
Copy link
Contributor Author

knz commented Nov 15, 2019

Actually I like it, it's not even super expensive to do.

@knz knz mentioned this pull request Nov 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants