schemadiff: normalize index option value (string) #11675

Merged
merged 5 commits into from
Nov 10, 2022

Conversation

shlomi-noach
Contributor

Description

With this PR schemadiff normalizes (to lower case) the value of an index option. For example:

fulltext key name_ft(name) WITH PARSER NGRAM is normalized into
fulltext key name_ft(name) WITH PARSER ngram
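
As a rough illustration of why the normalization matters when comparing schemas, here is a minimal sketch. The `indexOption` type and the `normalize` helper are hypothetical stand-ins invented for this example, not the actual schemadiff or sqlparser types; the sketch only assumes that an option carries a name and a string value.

```go
package main

import (
	"fmt"
	"strings"
)

// indexOption is a hypothetical, simplified stand-in for a parsed index
// option; only the two string fields matter for this sketch.
type indexOption struct {
	Name   string // e.g. "WITH PARSER"
	String string // e.g. "NGRAM"
}

// normalize lowercases the option name and its string value in place,
// so that options differing only by letter case become identical.
func normalize(opt *indexOption) {
	opt.Name = strings.ToLower(opt.Name)
	opt.String = strings.ToLower(opt.String)
}

func main() {
	from := indexOption{Name: "WITH PARSER", String: "NGRAM"}
	to := indexOption{Name: "with parser", String: "ngram"}

	fmt.Println(from == to) // false: the two differ only by letter case

	normalize(&from)
	normalize(&to)
	fmt.Println(from == to) // true: no spurious difference remains
}
```

Without the normalization step, a diff between the two definitions would report a change even though both describe the same index.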

Related Issue(s)

Tracking: #10203

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Shlomi Noach <[email protected]>
@vitess-bot
Contributor

vitess-bot bot commented Nov 9, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@@ -482,6 +482,7 @@ func (c *CreateTableEntity) normalizeIndexOptions() {
 		idx.Info.Type = strings.ToLower(idx.Info.Type)
 		for _, opt := range idx.Options {
 			opt.Name = strings.ToLower(opt.Name)
+			opt.String = strings.ToLower(opt.String)
Contributor

Hmm, would this always lowercase things if you have for example a comment set? It seems like we should not normalize that then. I wonder if we should make this specific only to WITH PARSER?

Contributor Author

Per our parser:

vitess/go/vt/sqlparser/sql.y

Lines 2299 to 2332 in 1ace3b4

index_option:
  using_index_type
  {
    $$ = $1
  }
| KEY_BLOCK_SIZE equal_opt INTEGRAL
  {
    // should not be string
    $$ = &IndexOption{Name: string($1), Value: NewIntLiteral($3)}
  }
| COMMENT_KEYWORD STRING
  {
    $$ = &IndexOption{Name: string($1), Value: NewStrLiteral($2)}
  }
| VISIBLE
  {
    $$ = &IndexOption{Name: string($1) }
  }
| INVISIBLE
  {
    $$ = &IndexOption{Name: string($1) }
  }
| WITH PARSER ci_identifier
  {
    $$ = &IndexOption{Name: string($1) + " " + string($2), String: $3.String()}
  }
| ENGINE_ATTRIBUTE equal_opt STRING
  {
    $$ = &IndexOption{Name: string($1), Value: NewStrLiteral($3)}
  }
| SECONDARY_ENGINE_ATTRIBUTE equal_opt STRING
  {
    $$ = &IndexOption{Name: string($1), Value: NewStrLiteral($3)}
  }

It looks like the String value is only ever used (at this time?) in a WITH PARSER clause. No other index clause/option utilizes it.

Contributor Author

hmmm also here:

vitess/go/vt/sqlparser/sql.y

Lines 7240 to 7244 in 1ace3b4

using_index_type:
  USING sql_id
  {
    $$ = &IndexOption{Name: string($1), String: string($2.String())}
  }

USING ..., so this affects e.g. USING HASH -> USING hash
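
To make the point about `String` versus `Value` concrete, here is a small sketch. The `option` type is a hypothetical simplification: in the grammar quoted above, `Value` holds a literal node, whereas here it is a plain string for brevity. The helper name `lowerNameAndString` is also invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// option is a hypothetical stand-in for IndexOption. In the grammar above,
// String is set only by USING and WITH PARSER, while COMMENT and the
// *_ATTRIBUTE options set Value instead.
type option struct {
	Name   string
	String string
	Value  string // simplified: a literal node in the real AST
}

// lowerNameAndString lowercases Name and String and leaves Value alone,
// so a comment text keeps its original letter case.
func lowerNameAndString(opts []option) {
	for i := range opts {
		opts[i].Name = strings.ToLower(opts[i].Name)
		opts[i].String = strings.ToLower(opts[i].String)
	}
}

func main() {
	opts := []option{
		{Name: "USING", String: "HASH"},
		{Name: "WITH PARSER", String: "NGRAM"},
		{Name: "COMMENT", Value: "Keep My Case"},
	}
	lowerNameAndString(opts)
	for _, o := range opts {
		fmt.Printf("%q %q %q\n", o.Name, o.String, o.Value)
	}
}
```

Under these assumptions, `USING HASH` and `WITH PARSER NGRAM` are lowercased, while the comment text is untouched because it never lives in `String`.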

Member

I don't see any issues with lower-casing USING or WITH PARSER values

from: "create table t1 (id int primary key, name tinytext not null)",
to: "create table t1 (id int primary key, name tinytext not null, fulltext key name_ft(name) with parser ngram)",
diff: "alter table t1 add fulltext key name_ft (`name`) with parser ngram",
cdiff: "ALTER TABLE `t1` ADD FULLTEXT KEY `name_ft` (`name`) WITH PARSER NGRAM",
Contributor

I think we also want to / need to change the canonical output format of this. It looks like the parser formatter always upcases this. I think we should add something like the following as well:

diff --git i/go/vt/sqlparser/ast_format.go w/go/vt/sqlparser/ast_format.go
index e94a1b24ab..65756a46a9 100644
--- i/go/vt/sqlparser/ast_format.go
+++ w/go/vt/sqlparser/ast_format.go
@@ -826,7 +826,7 @@ func (idx *IndexDefinition) Format(buf *TrackedBuffer) {
        for _, opt := range idx.Options {
                buf.astPrintf(idx, " %s", opt.Name)
                if opt.String != "" {
-                       buf.astPrintf(idx, " %s", opt.String)
+                       buf.astPrintf(idx, " %#s", opt.String)
                } else if opt.Value != nil {
                        buf.astPrintf(idx, " %v", opt.Value)
                }
diff --git i/go/vt/sqlparser/tracked_buffer_test.go w/go/vt/sqlparser/tracked_buffer_test.go
index 02ea192a5d..614a215c0e 100644
--- i/go/vt/sqlparser/tracked_buffer_test.go
+++ w/go/vt/sqlparser/tracked_buffer_test.go
@@ -220,6 +220,10 @@ func TestCanonicalOutput(t *testing.T) {
                        "select char(77, 121, 83, 81, '76' using utf8mb4) from dual",
                        "SELECT CHAR(77, 121, 83, 81, '76' USING utf8mb4) FROM `dual`",
                },
+               {
+                       "create table t1 (id int primary key, name tinytext not null, fulltext key name_ft(name) with parser ngram)",
+                       "CREATE TABLE `t1` (\n\t`id` int PRIMARY KEY,\n\t`name` tinytext NOT NULL,\n\tFULLTEXT KEY `name_ft` (`name`) WITH PARSER ngram\n)",
+               },
        }
 
        for _, tc := range testcases {
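
The effect of switching from `%s` to `%#s` can be pictured with the sketch below. It is an assumption drawn from this discussion (the formatter upcases the value in canonical output, and the flagged verb is expected to leave it as written), not a copy of the sqlparser implementation. The `formatter` type, `writeString`, and `render` are hypothetical names used only here.

```go
package main

import (
	"fmt"
	"strings"
)

// formatter is a hypothetical, much-reduced stand-in for a buffer with a
// canonical output mode. It is not the sqlparser TrackedBuffer.
type formatter struct {
	canonical bool
	sb        strings.Builder
}

// writeString models the assumed difference between "%s" and "%#s":
// without the flag, canonical mode upper-cases the string; with the flag
// (caseSensitive), the string is written unchanged.
func (f *formatter) writeString(s string, caseSensitive bool) {
	if f.canonical && !caseSensitive {
		s = strings.ToUpper(s)
	}
	f.sb.WriteString(s)
}

// render formats a WITH PARSER option whose value was already lowercased.
func render(canonical, caseSensitive bool) string {
	f := &formatter{canonical: canonical}
	f.writeString("WITH PARSER", false)
	f.sb.WriteString(" ")
	f.writeString("ngram", caseSensitive)
	return f.sb.String()
}

func main() {
	fmt.Println(render(true, false)) // WITH PARSER NGRAM
	fmt.Println(render(true, true))  // WITH PARSER ngram
}
```

The second line matches the canonical output expected by the proposed test case, where the keyword stays upper case and the parser name stays lower case.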

Contributor Author

added

@shlomi-noach
Contributor Author

Normalization now takes place in both ast_format and in schemadiff. Only two syntaxes are affected by this change:

  • USING ... for "normal" index types (USING HASH, USING BTREE)
  • WITH PARSER ... for fulltext index types (WITH PARSER NGRAM)

schemadiff tests adapted. sqlparser tests unchanged and all good.

@shlomi-noach shlomi-noach merged commit a2fad7e into vitessio:main Nov 10, 2022
@shlomi-noach shlomi-noach deleted the schemadiff-fulltext branch November 10, 2022 06:59