Add to non-reserved keywords; rework functionName and symbolPrimitive parse rules #1457

Merged
alancai98 merged 3 commits into v1 from non-reserved-keywords on May 9, 2024

Conversation

alancai98
Member

Description

Other Information

  • Updated Unreleased Section in CHANGELOG: [NO]

    • No, v1 branch
  • Any backward-incompatible changes? [NO?]

    • Some behavior changes: certain invalid queries will no longer produce a parse error (they will still fail with an evaluation error)
  • Any new external dependencies? [NO]

  • Do your changes comply with the Contributing Guidelines
    and Code Style Guidelines? [YES]

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alancai98 alancai98 self-assigned this May 6, 2024

github-actions bot commented May 6, 2024

Conformance comparison report-Cross Engine

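|             | Base (legacy) | eval   | +/-    |
|-------------|---------------|--------|--------|
| % Passing   | 92.51%        | 90.70% | -1.80% |
| ✅ Passing  | 5382          | 5278   | -104   |
| ❌ Failing  | 436           | 541    | 105    |
| 🔶 Ignored  | 0             | 0      | 0      |
| Total Tests | 5818          | 5819   | 1      |
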
Number passing in both: 5071

Number failing in both: 229

Number passing in legacy engine but fail in eval engine: 312

Number failing in legacy engine but pass in eval engine: 207
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
207 test(s) were failing in legacy but now pass in eval. Before merging, confirm they are intended to pass.
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
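
The percentages in the cross-engine report follow directly from the passing and total counts. A minimal sketch of that arithmetic, assuming two-decimal rounding (`passRate` and `pct` are hypothetical helper names, not part of the CI tooling):

```kotlin
import java.util.Locale

// Hypothetical helper: share of passing tests as a percentage.
fun passRate(passing: Int, total: Int): Double = 100.0 * passing / total

// Formats with two decimals; a fixed locale keeps '.' as the decimal separator.
fun pct(value: Double): String = String.format(Locale.US, "%.2f%%", value)

fun main() {
    val legacy = passRate(5382, 5818) // counts taken from the report above
    val eval = passRate(5278, 5819)
    println(pct(legacy))        // 92.51%
    println(pct(eval))          // 90.70%
    println(pct(eval - legacy)) // -1.80%
}
```

The delta is computed from the unrounded rates, which is why it reads -1.80% rather than the -1.81% obtained by subtracting the two rounded figures.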

Conformance comparison report-Cross Commit-LEGACY

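|             | Base (2879f3a) | 3d70502 | +/-    |
|-------------|----------------|---------|--------|
| % Passing   | 92.52%         | 92.51%  | -0.02% |
| ✅ Passing  | 5383           | 5382    | -1     |
| ❌ Failing  | 435            | 436     | 1      |
| 🔶 Ignored  | 0              | 0       | 0      |
| Total Tests | 5818           | 5818    | 0      |
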
Number passing in both: 5382

Number failing in both: 435

Number passing in Base (2879f3a) but now fail: 1

Number failing in Base (2879f3a) but now pass: 0
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️. The following test(s) were previously passing but now fail:

  • MYSQL_SELECT_29, compileOption: LEGACY

Conformance comparison report-Cross Commit-EVAL

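|             | Base (2879f3a) | 3d70502 | +/-   |
|-------------|----------------|---------|-------|
| % Passing   | 90.70%         | 90.70%  | 0.00% |
| ✅ Passing  | 5278           | 5278    | 0     |
| ❌ Failing  | 541            | 541     | 0     |
| 🔶 Ignored  | 0              | 0       | 0     |
| Total Tests | 5819           | 5819    | 0     |
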
Number passing in both: 5278

Number failing in both: 541

Number passing in Base (2879f3a) but now fail: 1

Number failing in Base (2879f3a) but now pass: 1
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️. The following test(s) were previously passing but now fail:

  • Example 6 — Value Coercion, compileOption: LEGACY
The following test(s) were previously failing but now pass. Before merging, confirm they are intended to pass:
  • Example 6 — Value Coercion, compileOption: LEGACY

@RCHowell RCHowell self-requested a review May 7, 2024 15:20
partiql-parser/src/main/antlr/PartiQL.g4 (outdated, resolved)
Comment on lines 566 to 578
override fun visitQuotedIdentifier(ctx: org.partiql.parser.internal.antlr.PartiQLParser.QuotedIdentifierContext): Identifier.Symbol = translate(ctx) {
    identifierSymbol(
        ctx.IDENTIFIER_QUOTED().getStringValue(),
        Identifier.CaseSensitivity.SENSITIVE
    )
}

override fun visitUnquotedIdentifier(ctx: org.partiql.parser.internal.antlr.PartiQLParser.UnquotedIdentifierContext): Identifier.Symbol = translate(ctx) {
    identifierSymbol(
        ctx.text,
        Identifier.CaseSensitivity.INSENSITIVE
    )
}
Member

Hmm, why do we need the visitAs when we have explicit return types? I wonder if adding the two alternatives caused this. I wonder if we should define our own getIdentifier that is more efficient than visitAs.

So it would be like

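fun identifier(ctx: ParserRuleContext): Identifier = when (ctx) {
    is IdentifierQuotedContext -> identifier(ctx)
    is IdentifierUnquotedContext -> identifier(ctx)
    // `when` used as an expression must be exhaustive, so reject any other context
    else -> error("Expected an identifier context, got ${ctx::class.simpleName}")
}

fun identifier(ctx: IdentifierQuotedContext): Identifier = ....

fun identifier(ctx: IdentifierUnquotedContext): Identifier.Symbol = ....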

We also should take care where qualified names appear and make sure our grammar rules follow this.

Member

Like, I think we should be replacing visitSymbolPrimitive with visitIdentifierUnquoted, and it shouldn't be any more than a name change. Is there a reason this did not work?

Member Author

There wasn't a reason this did not work previously. I was just following the visitAs<...>() pattern I often saw in the visitors. I'll update to use a getIdentifier (or the like) rather than visitAs.

Member Author

New commit 445be01 uses the private visitSymbolPrimitive function, which calls the corresponding visitIdentifierQuoted and visitIdentifierUnquoted. Also changed some existing usages of visitAs<Identifier.Symbol> to use visitSymbolPrimitive.
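
The dispatch described in this thread can be sketched roughly as follows. This is a self-contained illustration, not the code from commit 445be01: the context classes and the `Symbol` type are hypothetical stand-ins for the ANTLR-generated contexts and `Identifier.Symbol`, and the quote handling is a simplification.

```kotlin
// Hypothetical stand-ins for the ANTLR-generated context classes.
sealed interface SymbolPrimitiveCtx { val text: String }
data class IdentifierQuotedCtx(override val text: String) : SymbolPrimitiveCtx
data class IdentifierUnquotedCtx(override val text: String) : SymbolPrimitiveCtx

// Hypothetical stand-in for Identifier.Symbol and its case sensitivity.
enum class CaseSensitivity { SENSITIVE, INSENSITIVE }
data class Symbol(val name: String, val sensitivity: CaseSensitivity)

// Quoted identifiers keep their exact spelling; only the outer quotes are dropped
// (escaped quotes inside the identifier are ignored in this sketch).
fun visitIdentifierQuoted(ctx: IdentifierQuotedCtx): Symbol =
    Symbol(ctx.text.removeSurrounding("\""), CaseSensitivity.SENSITIVE)

// Unquoted identifiers are matched case-insensitively.
fun visitIdentifierUnquoted(ctx: IdentifierUnquotedCtx): Symbol =
    Symbol(ctx.text, CaseSensitivity.INSENSITIVE)

// Type-based dispatch with a statically known return type, so no visitAs-style cast is needed.
fun visitSymbolPrimitive(ctx: SymbolPrimitiveCtx): Symbol = when (ctx) {
    is IdentifierQuotedCtx -> visitIdentifierQuoted(ctx)
    is IdentifierUnquotedCtx -> visitIdentifierUnquoted(ctx)
}

fun main() {
    println(visitSymbolPrimitive(IdentifierQuotedCtx("\"Foo\"")))
    println(visitSymbolPrimitive(IdentifierUnquotedCtx("foo")))
}
```

Because the stand-in hierarchy is sealed, the `when` is exhaustive without an `else` branch; with the real ANTLR contexts, which are not sealed, a fallback branch would be required.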

@@ -766,8 +768,7 @@ functionCall

// SQL-99 10.4 — <routine name> ::= [ <schema name> <period> ] <qualified identifier>
functionName
: (qualifier+=symbolPrimitive PERIOD)* name=( CHAR_LENGTH | CHARACTER_LENGTH | OCTET_LENGTH | BIT_LENGTH | UPPER | LOWER | SIZE | EXISTS | COUNT | MOD ) # FunctionNameReserved
Member Author

(self-review) -- this is the rule we were aiming to remove from the ANTLR grammar in this PR

| qualifier=AT_SIGN? key=nonReserved # VariableKeyword
;

nonReserved
Member Author

(self-review) -- the non-reserved words were taken directly from the SQL-99 grammar. At the end are some PartiQL-specific keywords marked as non-reserved.

Note that PostgreSQL, along with other SQL dialects, uses its own set of non-reserved keywords that diverges from the SQL standards. The reserved vs. non-reserved keyword lists also change between different SQL standards. We chose to follow SQL-99, as we do in other places in the project.
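
To illustrate what marking a keyword as non-reserved means in practice, here is a small sketch: a non-reserved keyword is still a keyword token, but it may also be used where an identifier is expected. The keyword sets below are deliberately tiny and are assumptions for illustration only, not the lists added to PartiQL.g4 in this PR.

```kotlin
// Illustrative sets only (assumed membership); the real lists live in the grammar.
val reservedKeywords = setOf("SELECT", "FROM", "WHERE")
val nonReservedKeywords = setOf("COUNT", "MOD", "UPPER")

enum class WordKind { RESERVED_KEYWORD, NON_RESERVED_KEYWORD, REGULAR_IDENTIFIER }

// Keywords are matched case-insensitively, as in SQL.
fun classify(word: String): WordKind = when (word.uppercase()) {
    in reservedKeywords -> WordKind.RESERVED_KEYWORD
    in nonReservedKeywords -> WordKind.NON_RESERVED_KEYWORD
    else -> WordKind.REGULAR_IDENTIFIER
}

// Only reserved keywords are rejected as unquoted names.
fun usableAsUnquotedName(word: String): Boolean =
    classify(word) != WordKind.RESERVED_KEYWORD

fun main() {
    for (word in listOf("select", "count", "price")) {
        println("$word -> ${classify(word)}, usable as name: ${usableAsUnquotedName(word)}")
    }
}
```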

Comment on lines 770 to 771
 functionName
-    : (qualifier+=symbolPrimitive PERIOD)* name=( CHAR_LENGTH | CHARACTER_LENGTH | OCTET_LENGTH | BIT_LENGTH | UPPER | LOWER | SIZE | EXISTS | COUNT | MOD ) # FunctionNameReserved
-    | (qualifier+=symbolPrimitive PERIOD)* name=symbolPrimitive # FunctionNameSymbol
+    : (qualifier+=symbolPrimitive PERIOD)* name=symbolPrimitive
Member

I believe you can go one step further and replace uses of functionName with qualifiedName.

Member Author

Ah, you're right. I didn't realize the definition was the same. Replaced usages in 7b615f8.
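
The shape shared by the simplified functionName rule, `(qualifier+=symbolPrimitive PERIOD)* name=symbolPrimitive`, can be pictured as a list of qualifier steps followed by a final name. A minimal sketch, assuming the symbols have already been extracted as strings (`QualifiedName` and `toQualifiedName` here are hypothetical, not the project's AST types):

```kotlin
// Hypothetical model: zero or more qualifier steps, then the name itself.
data class QualifiedName(val qualifier: List<String>, val name: String)

// Mirrors the rule: every symbol except the last is a qualifier, the last is the name.
fun toQualifiedName(symbols: List<String>): QualifiedName {
    require(symbols.isNotEmpty()) { "A qualified name needs at least one symbol" }
    return QualifiedName(qualifier = symbols.dropLast(1), name = symbols.last())
}

fun main() {
    println(toQualifiedName(listOf("upper")))
    println(toQualifiedName(listOf("my_schema", "my_func")))
}
```

Since a function name and a qualified name decompose identically under this rule, a single grammar rule can serve both uses.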

@alancai98 alancai98 requested a review from johnedquinn May 9, 2024 22:45
@alancai98 alancai98 dismissed RCHowell’s stale review May 9, 2024 23:08

Feedback addressed in prior commits

@alancai98 alancai98 merged commit 4f89c2d into v1 May 9, 2024
14 checks passed
@alancai98 alancai98 deleted the non-reserved-keywords branch May 9, 2024 23:08
3 participants