Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: fix and enhance expression normalization. #7512

Merged
merged 1 commit into from
Jul 10, 2016

Conversation

knz
Copy link
Contributor

@knz knz commented Jun 28, 2016

This patch is needed so that the expressions in the Sqllogictest do
not cause the server to panic.

Summary:

  • Bug fixes:
    • an invalid cast was attempted for some expressions; is now fixed.
    • NULL caused invalid transformations, NULL is now normalized on a
      fast path.
    • transformations that rebalance arithmetic across a comparison now
      first check that the proposed replacement arithmetic expression
      has a valid overload w.r.t typing.
  • Improvements: more simplifications.

Fixes #7412.


This change is Reviewable

@knz
Copy link
Contributor Author

knz commented Jun 28, 2016

cc @petermattis @nvanbenschoten.

@knz knz force-pushed the fix-normalize branch from 03cc46d to 662245b Compare June 28, 2016 21:12
@tamird
Copy link
Contributor

tamird commented Jun 28, 2016

Reviewed 2 of 4 files at r1, 2 of 2 files at r2.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


sql/parser/col_types.go, line 275 [r2] (raw file):

      return &BytesColType{"BYTES"}
  }
  panic("unknown type")

log the type


sql/parser/normalize.go, line 611 [r2] (raw file):

}

// CanBeNumericZero returns true if the expr may be a number and

i don't think that's what it does...


sql/parser/normalize.go, line 654 [r2] (raw file):

      return NewDInt(0)
  }
  panic("zero not defined here")

log the type


Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from 662245b to c8d00e9 Compare June 29, 2016 20:03
@knz
Copy link
Contributor Author

knz commented Jun 29, 2016

Review status: 0 of 4 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


sql/parser/col_types.go, line 275 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

log the type

Done.

sql/parser/normalize.go, line 611 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i don't think that's what it does...

This is what it should do. Is there a bug in the code?

sql/parser/normalize.go, line 654 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

log the type

Done.

Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from c8d00e9 to d409e9d Compare June 29, 2016 20:18
@tamird
Copy link
Contributor

tamird commented Jun 29, 2016

Reviewed 4 of 4 files at r3.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


sql/parser/col_types.go, line 275 [r2] (raw file):

Previously, knz (kena) wrote…

Done.

ditto

sql/parser/normalize.go, line 611 [r2] (raw file):

Previously, knz (kena) wrote…

This is what it should do. Is there a bug in the code?

It currently returns true if the argument is e.g. a `*DString`

sql/parser/normalize.go, line 654 [r2] (raw file):

Previously, knz (kena) wrote…

Done.

This convention is weak, but we use `log.Fatalf` to avoid the double call.

Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from d409e9d to e85afbc Compare June 29, 2016 21:23
@knz
Copy link
Contributor Author

knz commented Jun 29, 2016

Review status: 2 of 5 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


sql/parser/col_types.go, line 275 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

ditto

Done.

sql/parser/normalize.go, line 611 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

It currently returns true if the argument is e.g. a *DString

Aha, I see. I have extended the logic and also the explanatory comment accordingly. Thanks for spotting this.

sql/parser/normalize.go, line 654 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

This convention is weak, but we use log.Fatalf to avoid the double call.

Done.

Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from e85afbc to 05e2f7f Compare June 29, 2016 21:29
@knz
Copy link
Contributor Author

knz commented Jun 29, 2016

Review status: 2 of 5 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending.


sql/parser/normalize.go, line 715 [r5] (raw file):

  }
  log.Fatalf("zero not defined for Datum type %T", d)
  return nil // unreachable

@tamird hmm are you ok with having to explicitly write "return" after a call to log.Fatalf? Can we inform the compiler this does not return?


Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Jun 29, 2016

Reviewed 3 of 3 files at r5.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


sql/parser/normalize.go, line 676 [r5] (raw file):

      }
      if _, ok := d.(dividerDatum); ok {
          log.Errorf("bug: unknown dividerDatum in IsValidDivider(): %T", d)

oh good god. how about an error return?


sql/parser/normalize.go, line 715 [r5] (raw file):

Previously, knz (kena) wrote…

@tamird hmm are you ok with having to explicitly write "return" after a call to log.Fatalf? Can we inform the compiler this does not return?

Yeah, that's a bummer. I don't know if we can.

By the way, I'm pretty grumpy about these panics, since future changes (like adding a type) will suddenly open up a myriad ways of crashing the server. How about we bite the bullet and plumb errors through? Can be a followup.


Comments from Reviewable

@knz
Copy link
Contributor Author

knz commented Jun 29, 2016

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


sql/parser/normalize.go, line 676 [r5] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

oh good god. how about an error return?

But that's the thing here. Normalization is just an optimization, there is no reason to cause the (likely valid) user queries to fail with an error although we didn't finish the job here properly.

sql/parser/normalize.go, line 715 [r5] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Yeah, that's a bummer. I don't know if we can.

By the way, I'm pretty grumpy about these panics, since future changes (like adding a type) will suddenly open up a myriad ways of crashing the server. How about we bite the bullet and plumb errors through? Can be a followup.

Ok perhaps there is wisdom for this one, since cannot complete the work safely at this point. What is the new best way to signal an "internal" error not caused by a problem in the user's query?

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Jun 29, 2016

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


sql/parser/normalize.go, line 715 [r5] (raw file):

Previously, knz (kena) wrote…

Ok perhaps there is wisdom for this one, since cannot complete the work safely at this point. What is the new best way to signal an "internal" error not caused by a problem in the user's query?

just return an error =/

Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from 05e2f7f to 50ffd0f Compare June 30, 2016 15:15
@knz
Copy link
Contributor Author

knz commented Jun 30, 2016

Review status: 3 of 5 files reviewed at latest revision, 2 unresolved discussions.


sql/parser/normalize.go, line 676 [r5] (raw file):

Previously, knz (kena) wrote…

But that's the thing here. Normalization is just an optimization, there is no reason to cause the (likely valid) user queries to fail with an error although we didn't finish the job here properly.

I added the error as discussed on the mailing list.

sql/parser/normalize.go, line 715 [r5] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

just return an error =/

Done.

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Jun 30, 2016

Reviewed 2 of 2 files at r6.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


sql/parser/normalize.go, line 705 [r6] (raw file):

      }
      if _, ok := d.(dividerDatum); ok {
          return true, errors.Errorf("internal error: unknown dividerDatum in IsValidDivider(): %T", d)

what is IsValidDivider()?


Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from 50ffd0f to 3457e13 Compare June 30, 2016 15:21
@knz
Copy link
Contributor Author

knz commented Jun 30, 2016

Review status: 4 of 5 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


sql/parser/normalize.go, line 705 [r6] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

what is IsValidDivider()?

It's the name of the function, obviously. Why do you ask?

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Jun 30, 2016

Reviewed 1 of 1 files at r7.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


sql/parser/normalize.go, line 705 [r6] (raw file):

Previously, knz (kena) wrote…

It's the name of the function, obviously. Why do you ask?

For a friend.

Comments from Reviewable

@nvanbenschoten
Copy link
Member

Do you mind running a SQL benchmark diff over this change, just to make sure there weren't any exceptionally inefficient normalization steps introduced.


Reviewed 1 of 2 files at r6, 4 of 4 files at r8.
Review status: all files reviewed at latest revision, 15 unresolved discussions, all commit checks successful.


sql/parser/col_types.go, line 259 [r8] (raw file):

  switch d.(type) {
  case *DInt:
      return &IntColType{"INT", 0}, nil

Don't we already have all of these defined (see intColTypeInt, floatColTypeFloat, etc.)? If the result of this function is always immutable, they could be returned to avoid allocations and duplicated logic.


sql/parser/expr.go, line 994 [r8] (raw file):

}

// newBinExprIfValidOverloads constructs a new BinaryExpr if and only

s/newBinExprIfValidOverloads/newBinExprIfValidOverload/


sql/parser/expr.go, line 995 [r8] (raw file):

// newBinExprIfValidOverloads constructs a new BinaryExpr if and only
// if the pair of arguments have a valid implementation.

s/implementation/implementation for the given BinaryOperator/


sql/parser/expr.go, line 1006 [r8] (raw file):

          fn:       fn,
      }
      expr.typ = fn.ReturnType

nit: We usually use the overloadImpl.returnType() method for this use case


sql/parser/normalize.go, line 34 [r8] (raw file):

func (expr *CoalesceExpr) normalize(v *normalizeVisitor) TypedExpr {
  for i := range expr.Exprs {

I like this normalization, but it's a bit more involved than the others. It could use a short comment.


sql/parser/normalize.go, line 37 [r8] (raw file):

      subExpr := expr.TypedExprAt(i)

      if i == len(expr.Exprs)-1 {

nit: Something like if last := (i == len(expr.Exprs)-1); last { can often be a harmless way to improve skim-ability.


sql/parser/normalize.go, line 67 [r8] (raw file):

          return expr
      }
      if d, err := GetBool(cond.(Datum)); err == nil {

Doesn't .Eval() return a Datum?


sql/parser/normalize.go, line 95 [r8] (raw file):

      }
      switch b := val.(type) {
      // -(a - b) -> (b - a)

Double checking: This property is true of all data types, right?


sql/parser/normalize.go, line 162 [r8] (raw file):

          cbz, err := CanBeZeroDivider(right)
          if err != nil {
              final, v.err = expr, err

We don't need final = expr in these situations.


sql/parser/normalize.go, line 228 [r8] (raw file):

      }
      if right != DNull {
          if d, err := GetBool(right.(Datum)); err == nil {

same as above, isn't right already a Datum


sql/parser/normalize.go, line 489 [r8] (raw file):

func (expr *ParenExpr) normalize(v *normalizeVisitor) TypedExpr {
  newExpr := expr.TypedInnerExpr()
  if normalizeable, ok := newExpr.(normalizableExpr); ok {

I was recently thinking about this. Does the walk infrastructure/normalizeVisitor take care of recursing into modified returned expressions?


sql/parser/normalize.go, line 621 [r8] (raw file):

      return GT, nil
  default:
      return op, errors.Errorf("internal error: unable to invert: %s", op)

If this is a common case, then allocating a new error every time is going to be an issue.


sql/parser/normalize.go, line 728 [r8] (raw file):

          return *t == 0, nil
      }
      if _, ok := d.(dividerDatum); ok {

Can this be added to the type switch?


sql/parser/type_check.go, line 58 [r8] (raw file):

// are added.
type dividerDatum interface {
  isDividerDatum()

nit: This doesn't seem to fit our naming convention for "annotation methods". It should probably just be dividerDatum().


sql/parser/type_check.go, line 61 [r8] (raw file):

}

var _ dividerDatum = NewDInt(0)

Can these be changed to var _ dividerDatum = TypeInt, etc.?


Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from 2c0c916 to 5092b40 Compare July 8, 2016 15:43
@knz
Copy link
Contributor Author

knz commented Jul 8, 2016

Running the benchmarks now. Will post the results later.


Review status: 1 of 5 files reviewed at latest revision, 15 unresolved discussions, some commit checks pending.


sql/parser/col_types.go, line 259 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Don't we already have all of these defined (see intColTypeInt, floatColTypeFloat, etc.)? If the result of this function is always immutable, they could be returned to avoid allocations and duplicated logic.

Done.

sql/parser/expr.go, line 994 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/newBinExprIfValidOverloads/newBinExprIfValidOverload/

Done.

sql/parser/expr.go, line 995 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/implementation/implementation for the given BinaryOperator/

Done.

sql/parser/expr.go, line 1006 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: We usually use the overloadImpl.returnType() method for this use case

Done.

sql/parser/normalize.go, line 34 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I like this normalization, but it's a bit more involved than the others. It could use a short comment.

Done.

sql/parser/normalize.go, line 37 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: Something like if last := (i == len(expr.Exprs)-1); last { can often be a harmless way to improve skim-ability.

Nice! I wasn't aware of this idiom.

sql/parser/normalize.go, line 67 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Doesn't .Eval() return a Datum?

It does! Thanks for the hint.

sql/parser/normalize.go, line 95 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Double checking: This property is true of all data types, right?

Woops, I checked and indeed it's not guaranteed. I added a conditional for this.

sql/parser/normalize.go, line 162 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We don't need final = expr in these situations.

Done.

sql/parser/normalize.go, line 228 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

same as above, isn't right already a Datum

It was to avoid an extra temporary. I added it.

sql/parser/normalize.go, line 489 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I was recently thinking about this. Does the walk infrastructure/normalizeVisitor take care of recursing into modified returned expressions?

It doesn't; but neither should it need to. The logic where it would make a difference is already inlined in the normalize function (for ComparisonExpr mostly). For the other cases it would result in duplicated work.

Note how I did have to insert the extra related necessary calls for normalize of RangeCond.


sql/parser/normalize.go, line 621 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

If this is a common case, then allocating a new error every time is going to be an issue.

There was a panic() here before, so it's not common.

sql/parser/normalize.go, line 728 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can this be added to the type switch?

Done.

sql/parser/type_check.go, line 58 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: This doesn't seem to fit our naming convention for "annotation methods". It should probably just be dividerDatum().

Done.

sql/parser/type_check.go, line 61 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can these be changed to var _ dividerDatum = TypeInt, etc.?

Done.

Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from 5092b40 to 6913ca1 Compare July 8, 2016 16:21
@knz
Copy link
Contributor Author

knz commented Jul 8, 2016

Review status: 1 of 5 files reviewed at latest revision, 15 unresolved discussions, some commit checks pending.


sql/parser/type_check.go, line 61 [r8] (raw file):

Previously, knz (kena) wrote…

Done.

Scrub that; no it can't. TypeXXX instances are typed as Datum, here we want the particular types.

Comments from Reviewable

@knz
Copy link
Contributor Author

knz commented Jul 8, 2016

So the only benchmark that has expressions which would test something related are Select2 and Select3. This is what I measure:

name                 old time/op  new time/op  delta
Select2_Cockroach-8   467µs ± 1%   469µs ± 1%  +0.32%  (p=0.031 n=19+18)
Select3_Cockroach-8   611µs ± 1%   612µs ± 1%    ~     (p=0.461 n=19+20)

@nvanbenschoten
Copy link
Member

:lgtm: Besides a few comments.


Review status: 1 of 5 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


sql/parser/expr.go, line 996 [r10] (raw file):

// newBinExprIfValidOverload constructs a new BinaryExpr if and only
// if the pair of arguments have a valid implementation for the given
// BinaryOperator..

s/.././


sql/parser/normalize.go, line 728 [r8] (raw file):

Previously, knz (kena) wrote…

Done.

Even better, does `case dividerDatum:` work?

sql/parser/normalize.go, line 113 [r10] (raw file):

              return b
          }
      // - (- a) -> a

Same question here actually. Is this true for all datum types?


sql/parser/type_check.go, line 61 [r8] (raw file):

Previously, knz (kena) wrote…

Scrub that; no it can't. TypeXXX instances are typed as Datum, here we want the particular types.

Ahh ok that's too bad.

Comments from Reviewable

@knz knz force-pushed the fix-normalize branch from 6913ca1 to 78db8f6 Compare July 8, 2016 19:53
@knz
Copy link
Contributor Author

knz commented Jul 8, 2016

Review status: 1 of 5 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


sql/parser/expr.go, line 996 [r10] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/.././

Done.

sql/parser/normalize.go, line 728 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Even better, does case dividerDatum: work?

Apparently it does, but somehow it feels a bit uncomfortable: the dividerDatum case is inclusive of the cases before. Are you sure that would be an improvement?

sql/parser/normalize.go, line 113 [r10] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Same question here actually. Is this true for all datum types?

Yes that is true.

Comments from Reviewable

@nvanbenschoten
Copy link
Member

Review status: 1 of 5 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


sql/parser/normalize.go, line 728 [r8] (raw file):

Previously, knz (kena) wrote…

Apparently it does, but somehow it feels a bit uncomfortable: the dividerDatum case is inclusive of the cases before. Are you sure that would be an improvement?

That seems the most natural to me. It's basically a restricted default case, which is exactly what we want and definitely an improvement. You could add a `// Catch unhandled dividerDatum impl types` comment if you want.

The only other thing that might be clearer would be to have another type assertion to dividerDatum above (which wraps this entire type switch), and use a default case.


Comments from Reviewable

This patch is needed so that the expressions in the Sqllogictest do
not cause the server to panic.

Summary:

- Bug fixes:

  - an invalid cast was attempted for some expressions; is now fixed.
  - NULL caused invalid transformations, NULL is now normalized on a
    fast path.
  - transformations that rebalance arithmetic across a comparison now
    first check that the proposed replacement arithmetic expression
    has a valid overload w.r.t typing.

- Improvements: more simplifications.
@knz knz force-pushed the fix-normalize branch from 78db8f6 to b6fcfbd Compare July 9, 2016 11:44
@knz
Copy link
Contributor Author

knz commented Jul 9, 2016

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


sql/parser/normalize.go, line 728 [r8] (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

That seems the most natural to me. It's basically a restricted default case, which is exactly what we want and definitely an improvement. You could add a // Catch unhandled dividerDatum impl types comment if you want.

The only other thing that might be clearer would be to have another type assertion to dividerDatum above (which wraps this entire type switch), and use a default case.

Ah I like your last proposal. I just implemented that, PTAL.

Comments from Reviewable

@nvanbenschoten
Copy link
Member

LGerTM


Reviewed 1 of 1 files at r10, 3 of 3 files at r12.
Review status: 4 of 5 files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


Comments from Reviewable

@knz knz merged commit 3348759 into cockroachdb:master Jul 10, 2016
@knz knz deleted the fix-normalize branch July 10, 2016 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: panic on comparison lookup
4 participants