sql: fix the handling of integer types #28690

knz · 2018-08-16T11:01:15Z

Addresses a large chunk of #26925.
Fixes #25098.
Informs #24686.

Prior to this patch, CockroachDB maintained an unnecessary distinction
between "INT" and "INTEGER", between "BIGINT" and "INT8", etc.

This distinction is not only unnecessary but also costly: we were
paying the price of a "name" attribute in coltypes.TInt, with a string
comparison and hash table lookup on every use of the type.

What really matters is that the type shows up properly in
introspection; this has already been ensured by various
OID-to-pgcatalog mappings and the recently introduced
InformationSchemaTypeName().

Any distinction beyond that is unnecessary and can be dropped from the
implementation.
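
To illustrate the idea (not the actual patch), here is a minimal sketch in which alias names are resolved once, at parse time, to a shared name-free type value, so later uses compare values instead of strings. TInt here is a simplified stand-in for coltypes.TInt, and ParseIntType, intTypesByName and the Width field are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// TInt is a simplified, hypothetical stand-in for coltypes.TInt.
// It carries no "name" attribute; aliases collapse to the same value.
type TInt struct {
	Width int // width in bits; 0 means "not specified" in this sketch
}

var (
	Int  = &TInt{}          // shared by INT and INTEGER
	Int8 = &TInt{Width: 64} // shared by INT8 and BIGINT
)

// intTypesByName is consulted only while parsing a type name.
var intTypesByName = map[string]*TInt{
	"INT":     Int,
	"INTEGER": Int,
	"INT8":    Int8,
	"BIGINT":  Int8,
}

// ParseIntType resolves an alias to its shared type value.
func ParseIntType(name string) (*TInt, bool) {
	t, ok := intTypesByName[strings.ToUpper(name)]
	return t, ok
}

func main() {
	a, _ := ParseIntType("integer")
	b, _ := ParseIntType("INT")
	// After parsing, comparing types is a pointer comparison.
	fmt.Println(a == b)
}
```

Introspection output would still come from separate mappings, as described above; this sketch only covers the in-memory representation.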

Release note: None

knz · 2018-08-16T13:10:38Z

The PR is not yet ready for review. Will ping.

knz · 2018-08-16T14:36:00Z

ok I think the review can start now.

What's missing:

fixing some more tests
introducing a SQL migration to populate the missing Width field on FLOAT column descriptors.

BramGruneir

Reviewed 39 of 44 files at r1, 20 of 20 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/create_table.go, line 1405 at r1 (raw file):

	evalCtx *tree.EvalContext,
) error {
	if d.HasDefaultExpr() {

This code movement seems unrelated to this PR, but this check should indeed come at the start.

pkg/sql/information_schema.go, line 288 at r1 (raw file):

					tree.DNull,                                          // character_set_name
					dStringPtrOrEmpty(column.ComputeExpr),               // generation_expression
					tree.NewDString(column.Type.SQLString()),            // crdb_sql_type

This change needs to be addressed in docs, and should probably be mentioned in the release notes.

pkg/sql/show_columns.go, line 37 at r1 (raw file):

  IF(inames[1] IS NULL, ARRAY[]:::STRING[], inames) AS indices
FROM
  (SELECT column_name, crdb_sql_type, data_type, is_nullable, column_default, generation_expression, ordinal_position,

Is data_type used at all anymore?

pkg/sql/coltypes/aliases.go, line 143 at r1 (raw file):

// NewFloat creates a type alias for FLOAT with the given precision.
func NewFloat(prec int64) (*TFloat, error) {
	if prec < 1 {

just for style, perhaps a switch here would be a bit cleaner

pkg/sql/coltypes/arith.go, line 15 at r1 (raw file):

// permissions and limitations under the License.

package coltypes

This is just me personally, but I would really like to see each type get its own file. This goes for all types. Maybe a future PR.

pkg/sql/coltypes/arith.go, line 30 at r1 (raw file):

func (node *TBool) TypeName() string { return "BOOL" }

// PGTypeName implements the ColTypeFormatter interface.

I'm a fan of ensuring an interface is implemented using the dummy declaration. Just makes me feel safer.
var _ foo.RequiredInterface = myType{}
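
A self-contained sketch of that pattern. The ColTypeFormatter interface is reduced here to the one method visible in the quoted snippet, which is an assumption about its shape. Because TypeName has a pointer receiver, the assertion has to use a *TBool value.

```go
package main

import "fmt"

// ColTypeFormatter is a reduced stand-in for the real interface;
// only TypeName from the quoted snippet is modeled.
type ColTypeFormatter interface {
	TypeName() string
}

// TBool mirrors the pointer-receiver style of the quoted code.
type TBool struct{}

// TypeName implements the ColTypeFormatter interface.
func (node *TBool) TypeName() string { return "BOOL" }

// Compile-time assertion: the build fails if *TBool ever stops
// satisfying ColTypeFormatter. The blank identifier keeps no value.
var _ ColTypeFormatter = (*TBool)(nil)

func main() {
	var f ColTypeFormatter = &TBool{}
	fmt.Println(f.TypeName())
}
```

The assertion costs nothing at run time; it only moves a missing-method error to the declaration site of the type.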

pkg/sql/logictest/testdata/planner_test/distsql_agg, line 126 at r2 (raw file):

SELECT url FROM [EXPLAIN (DISTSQL) SELECT avg(c), sum(c), avg(d), sum(d) FROM data]
----
https://cockroachdb.github.io/distsqlplan/decode.html#eJy8lF-L4jAUxd_3U8h9WiFg80fH6VNkYEHYGRf_PC1lyZpLEZxGkhR2GfzuS1thWrFpwbpPmtuc_E7ODfcDMqPxTb2jg_gnUCDAgAAHAgIITCEhcLJmj84ZW2ypBEv9B-KIwCE75b4oJwT2xiLEH-AP_ogQw1b9PuIalUY7iYCARq8OxxJzsod3Zf9KrbwCAqvcxyPJiRSQnAmY3H-e6rxKEWJ6Jv3JizS1mCpv7GTaBG92r18lHQOBl9XubXv5X1ZZrcrGrUZYq5FPfp4Zq9GibsCTc9gqjdq8bnavv5YXXxe3vFEXxWqNmUYbjySdvCw2xe7F5tv31WI7H5ORpGQk-USK4qf1crxxOdq_v3TY_naQa6HNHttf1j8CNmwEHeRaBE-PjYD3j4APG0EHuRbB_LERiP4RiGEj6CDXInj-f4PuhpE1upPJHF4NvNsnR8UgRJ1iNTWdye0ef1izLzHVclXqyoJG56uvtFoss-pTYbAupkExa4jptZiFyR1oHlSLsFjc43saFM_C5Nk95KegeB4mz-8hP4d7FXU8k_Aju2Yn5y__AgAA___R7uSX

Why did these change?

pkg/sql/opt/exec/execbuilder/testdata/distsql_agg, line 126 at r2 (raw file):

SELECT url FROM [EXPLAIN (DISTSQL) SELECT avg(c), sum(c), avg(d), sum(d) FROM data]
----
https://cockroachdb.github.io/distsqlplan/decode.html#eJy8lF-L4jAUxd_3U8h9WiFg80fH6VNkYEHYGRf_PC1lyZpLEZxGkhR2GfzuS1thWrFpwbpPmtuc_E7ODfcDMqPxTb2jg_gnUCDAgAAHAgIITCEhcLJmj84ZW2ypBEv9B-KIwCE75b4oJwT2xiLEH-AP_ogQw1b9PuIalUY7iYCARq8OxxJzsod3Zf9KrbwCAqvcxyPJiRSQnAmY3H-e6rxKEWJ6Jv3JizS1mCpv7GTaBG92r18lHQOBl9XubXv5X1ZZrcrGrUZYq5FPfp4Zq9GibsCTc9gqjdq8bnavv5YXXxe3vFEXxWqNmUYbjySdvCw2xe7F5tv31WI7H5ORpGQk-USK4qf1crxxOdq_v3TY_naQa6HNHttf1j8CNmwEHeRaBE-PjYD3j4APG0EHuRbB_LERiP4RiGEj6CDXInj-f4PuhpE1upPJHF4NvNsnR8UgRJ1iNTWdye0ef1izLzHVclXqyoJG56uvtFoss-pTYbAupkExa4jptZiFyR1oHlSLsFjc43saFM_C5Nk95KegeB4mz-8hP4d7FXU8k_Aju2Yn5y__AgAA___R7uSX

Why did these change?

pkg/sql/opt/memo/private_storage_test.go, line 373 at r1 (raw file):

	nf30, _ := coltypes.NewFloat(16)
	test(coltypes.Float, nf30, false)
	test(nf30, nf30, true)

What is this line testing?

pkg/sql/parser/parse_test.go, line 1139 at r2 (raw file):

		{`SELECT NUMERIC 'foo'`, `SELECT DECIMAL 'foo'`},
		{`SELECT REAL 'foo'`, `SELECT FLOAT4 'foo'`},
		{`SELECT DOUBLE PRECISION 'foo'`, `SELECT FLOAT8 'foo'`},

Maybe add an array or two as well?

pkg/sql/parser/sql.y, line 5895 at r2 (raw file):

| const_datetime
| const_json
| BLOB

There's no way to just collapse all BLOB, BYTES and BYTEA into one, is there? It would just look so much nicer.

pkg/sql/sem/tree/col_types_test.go, line 30 at r1 (raw file):

	testData := []struct {
		str          string
		norm         string

Please add a comment explaining that an empty value here means the name matches.

pkg/sql/sem/tree/parse_array.go, line 151 at r1 (raw file):

// string representation of the type. Used by dump. It only
// supports those type names that can appear immediately before `[]`.
func ArrayElementTypeStringToColType(s string) (coltypes.T, error) {

I think this should be a function in coltypes. Ideally inside arrays.go.

pkg/sql/sem/tree/testdata/pretty/7.ref.golden, line 136 at r2 (raw file):

	NOTHING

26:

Why did you remove this test case?

pkg/sql/sqlbase/coltype_conversion.go, line 26 at r2 (raw file):

	"github.com/pkg/errors"
)

All of the functions in this file should have unit tests.

pkg/sql/sqlbase/coltype_conversion.go, line 31 at r2 (raw file):

// After calling this, the caller must also call PopulateTypeAttrs
// below.
func DatumTypeToColumnType(ptyp types.T) (ColumnType, error) {

This has a mix of some returns inside the cases and some out.

Can you move them all in and get rid of the ctyp variable entirely?
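
A minimal sketch of the shape I have in mind, with every case returning directly and no intermediate variable. Kind, ColumnType and the lower-case datumTypeToColumnType are hypothetical simplifications for this example, not the real types.T or sqlbase.ColumnType.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for types.T.
type Kind int

const (
	KindBool Kind = iota
	KindInt
	KindString
	KindUnknown
)

// ColumnType is a simplified stand-in for sqlbase.ColumnType.
type ColumnType struct {
	SemanticType string
	Width        int32
}

// datumTypeToColumnType returns from every case, so there is no
// shared result variable to track across the switch.
func datumTypeToColumnType(k Kind) (ColumnType, error) {
	switch k {
	case KindBool:
		return ColumnType{SemanticType: "BOOL"}, nil
	case KindInt:
		return ColumnType{SemanticType: "INT", Width: 64}, nil
	case KindString:
		return ColumnType{SemanticType: "STRING"}, nil
	default:
		return ColumnType{}, fmt.Errorf("unsupported kind %d", k)
	}
}

func main() {
	ct, err := datumTypeToColumnType(KindInt)
	fmt.Println(ct, err)
}
```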

pkg/sql/sqlbase/coltype_conversion.go, line 92 at r2 (raw file):

			case 16:
				base.VisibleType = ColumnType_SMALLINT
			case 32:

these two lying cases can be joined together.

pkg/sql/sqlbase/coltype_conversion.go, line 180 at r2 (raw file):

// InfoSchemaColumnType returns the string suitable to populate the data_type column
// of information_schema.columns.

I'm confused about information_schema.columns.

Are postgres type names compatible but we're not? Because these look like postgres type names.

For pg_catalog, let's go with postgres' type names, but for information schema, let's go with our type names unless there's another standard we should match.

pkg/sql/sqlbase/coltype_conversion.go, line 279 at r2 (raw file):

			// Pre-2.1 columns: the width is not set yet and instead there
			// is a precision. Reverse-engineer the width from that.
			if c.Precision < 0 {

switch here might make it a lot nicer to read

Also, this matches FloatProperties(), any way to DRY it up? Especially considering that this has some extra strange logic in it.

pkg/sql/sqlbase/structured.go, line 2221 at r2 (raw file):

		} else if c.Precision == 0 {
			// Special case of poorly initialized coltypes pre-2.1.
			width = 64

why not just return here? And for all the cases? It will greatly simplify the logic:

func (c *ColumnType) FloatProperties() (int32, int32) {
  if c.Width == 0 {
    // Pre-2.1 columns: the width is not set yet and instead there
    // is a precision. Reverse-engineer the width from that.
    if c.Precision < 0 {
      panic(fmt.Sprintf("programming error: invalid float precision: %d", c.Precision))
    }
    if c.Precision == 0 {
      // Special case of poorly initialized coltypes pre-2.1.
      return int32(64), int32(53)
    }
    if c.Precision <= 24 {
      return int32(32), int32(24)
    }
    if c.Precision <= 54 {
      return int32(64), int32(53)
    }
    panic(fmt.Sprintf("programming error: invalid float precision: %d", c.Precision))
  }

  if c.Width == 64 {
    return int32(64), int32(53)
  }
  return c.Width, int32(24)
}

might even be worth adding a switch into the inner if
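
For illustration, the same logic written with switches. This is a sketch that keeps the bounds from the suggestion above, including the upper limit of 54. The ColumnType struct here is a two-field stand-in for the generated protobuf type.

```go
package main

import "fmt"

// ColumnType is a reduced stand-in with only the fields used here.
type ColumnType struct {
	Width     int32
	Precision int32
}

// FloatProperties returns the width and precision, both in bits.
func (c *ColumnType) FloatProperties() (int32, int32) {
	switch c.Width {
	case 64:
		return 64, 53
	case 0:
		// Pre-2.1 columns: the width is not set yet and instead there
		// is a precision. Reverse-engineer the width from that.
		switch {
		case c.Precision < 0 || c.Precision > 54:
			panic(fmt.Sprintf("programming error: invalid float precision: %d", c.Precision))
		case c.Precision == 0:
			// Special case of poorly initialized coltypes pre-2.1.
			return 64, 53
		case c.Precision <= 24:
			return 32, 24
		default:
			return 64, 53
		}
	default:
		return c.Width, 24
	}
}

func main() {
	for _, c := range []ColumnType{{Width: 64}, {Width: 32}, {Precision: 10}, {Precision: 30}, {}} {
		w, p := c.FloatProperties()
		fmt.Println(c, w, p)
	}
}
```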

pkg/sql/sqlbase/structured.go, line 2240 at r2 (raw file):

// data types. Returns false if the data type is not numeric, or if the precision
// of the numeric type is not bounded.
func (c *ColumnType) NumericPrecision() (int32, bool) {

Does this ever return true for 0? If not, why not remove the bool return value entirely?

pkg/sql/sqlbase/structured.proto, line 68 at r2 (raw file):

  // Note: this enum is becoming deprecated. Do not add items
  // to it without consideration.
  // TODO(knz): remove this.

Please extend this comment to say when it can be removed.

pkg/sql/sqlbase/structured.proto, line 80 at r2 (raw file):

  optional SemanticType semantic_type = 1 [(gogoproto.nullable) = false];
  // BIT, INT, FLOAT, DECIMAL, CHAR, BINARY, FLOAT.

nit: this isn't a sentence, no period at the end

pkg/sql/sqlbase/structured.proto, line 82 at r2 (raw file):

  // BIT, INT, FLOAT, DECIMAL, CHAR, BINARY, FLOAT.
  optional int32 width = 2 [(gogoproto.nullable) = false];
  // DECIMAL.

nit: this isn't a sentence, no period at the end

Perhaps
// DECIMAL, pre-2.1 FLOAT (incorrectly)

BramGruneir

We also need to ensure that the docs team is aware of the changes, and I think the release note needs to be greatly expanded.

knz · 2018-08-16T19:12:59Z

Thank you for your initial review. I'll iterate on that today.

Regarding the doc work I think I will simply take care of doing all the doc updates myself. There is no way the doc team will have the bandwidth to process such a change themselves.

In the meantime you can have a look at the SERIAL PR #28575

knz

Some reactions already, will process the rest tomorrow.

pkg/sql/show_columns.go, line 37 at r1 (raw file):