Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing closing bracket leads to crash of js sql-parse #66

Closed
proddata opened this issue Aug 6, 2024 · 2 comments · Fixed by #96
Closed

Missing closing bracket leads to crash of js sql-parse #66

proddata opened this issue Aug 6, 2024 · 2 comments · Fixed by #96
Assignees
Labels
antlr4 Related to the antlr4 framework bug Something isn't working

Comments

@proddata
Copy link
Member

proddata commented Aug 6, 2024

Example:

CREATE TABLE t01 (
   "x" OBJECT (DYNAMIC),
   "y" OBJECT (DYNAMIC) AS ("z" ARRAY(OBJECT (DYNAMIC))      
   );

Leads to:

file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:85514
      e2.ctx.parser.getTokenStream().getText(
         ^

TypeError: Cannot read properties of null (reading 'ctx')
    at ExceptionCollectorListener.syntaxError (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:85514:10)
    at file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:1647:37
    at Array.map (<anonymous>)
    at Rt.syntaxError (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:1647:22)
    at _SqlBaseParser.notifyErrorListeners (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:3569:39)
    at Ae.reportUnwantedToken (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:3136:10)
    at Ae.sync (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:3109:16)
    at _SqlBaseParser.createStmt (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:67301:30)
    at _SqlBaseParser.statement (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:60659:16)
    at _SqlBaseParser.statements (file:///Users/georg/sqlparsetest/node_modules/@cratedb/cratedb-sqlparse/dist/sqlparse.js:58435:12)
@surister surister added the bug Something isn't working label Aug 6, 2024
@surister surister self-assigned this Aug 6, 2024
@surister
Copy link
Collaborator

surister commented Aug 6, 2024

Interestingly, in this situation when collecting errors, antrl4 gives us an None e

def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
    error = ParsingException(
        msg=msg,
        offending_token=offendingSymbol,
        e=e, # <--------------------
        query=e.ctx.parser.getTokenStream().getText(e.ctx.start, e.offendingToken.tokenIndex)
    )

    raise error
column = {int} 4
e = {NoneType} None
line = {int} 5
msg = {str} "extraneous input ';' expecting {',', ')'}"
offendingSymbol = {CommonToken} [@41,117:117=';',<301>,5:4]
recognizer = {SqlBaseParser} <cratedb_sqlparse.generated_parser.SqlBaseParser.SqlBaseParser object at 0x7a927b0715b0>
self = {ExceptionErrorListener} <cratedb_sqlparse.parser.ExceptionErrorListener object at 0x7a927b1ed430>

We still have the error message and could just compose the exception ourselves, the only issue is that we currently use the e object to get the query and later match the exception with the original query.

@surister surister added the antlr4 Related to the antlr4 framework label Aug 6, 2024
@surister
Copy link
Collaborator

surister commented Sep 10, 2024

Potential fix:

def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        if e:
            query = recognizer.getTokenStream().getText(e.ctx.start, offendingSymbol.tokenIndex)

        else:
            # If antlr4 doesn't give us an error object, we heuristically create a query, or a piece of it
            # so we increase the chances of it being correctly assigned.
            # It means that theoretically if you input two wrong queries that antlr4 manages
            # to distinguish as two different statements (which is hard already), and both are similar
            # the errors could be matched wrongly. Still pretty rare, and it is very hard to come up with
            # an actual query that does it.

            # The newly generated query will be either the offendingToken + one token to the left
            # or offendingToken + two tokens to the left, if the second is possible it takes precedence.
            min_token_to_check = max(1, offendingSymbol.tokenIndex - 2)
            tokens = recognizer.getTokenStream().tokens[
                     min_token_to_check: offendingSymbol.tokenIndex:offendingSymbol.tokenIndex + 1]
            query = "".join(map(lambda x: x.text, tokens))

        error = ParsingException(
            msg=msg,
            offending_token=offendingSymbol,
            e=e if e else type('NotViableInput', (Exception,), {})(),
            query=query
        )
        self.errors.append(error)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
antlr4 Related to the antlr4 framework bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants