Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bugfix] Treat standalone carriage return as newline #7

Merged
merged 3 commits into from
Sep 15, 2021
Merged

[bugfix] Treat standalone carriage return as newline #7

merged 3 commits into from
Sep 15, 2021

Conversation

benjreinhart
Copy link
Contributor

The existing code correctly handles CRLF, treating it as a single newline. However, it does not appear to treat a standalone carriage return as a newline. According to the spec, carriage returns should be treated as a newline.

Steps to reproduce:

> kdl.parse("foo\r\nbar")
{
  output: [
    { name: 'foo', properties: {}, values: [], children: [] },
    { name: 'bar', properties: {}, values: [], children: [] }
  ],
  errors: []
}

> kdl.parse("foo\rbar")
{
  output: undefined,
  errors: [
    EarlyExitException: Expecting: expecting at least one iteration which starts with one of these possible Token sequences::
      <[BOM] ,[WhiteSpace] ,[OpenMultiLineComment] ,[EscLine]>
    but found: 'bar'
        <...stacktrace...>  {
      token: [Object],
      resyncedTokens: [],
      previousToken: [Object],
      context: [Object]
    }
  ]
}

@larsgw
Copy link
Collaborator

larsgw commented Sep 15, 2021

Thank you for catching this! Could you also update the RegExp for the LineComment? It should match everything from // up to but not including the linespace, so the pattern should get a lot simpler (since it does not have to consider \r separately from \r\n now).

parser.js Outdated
@@ -21,7 +21,7 @@ const BOM = createToken({
const NewLine = createToken({
name: 'NewLine',
// eslint-disable-next-line no-control-regex
pattern: /\x0D\x0A|[\x0A\x0C\x85\u2028\u2029]/
pattern: /\x0D\x0A|[\x0A\x0D\x0C\x85\u2028\u2029]/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, sorry to be pedantic but could you insert it after \x0C to keep it ordered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, thought I had ordered it

@benjreinhart
Copy link
Contributor Author

benjreinhart commented Sep 15, 2021

Updated from your feedback.

Might want to double check that regex update, but I think it looks good:

> "foo // line comment here\r\nbar".match(/\/\/[^\x0A\x0C\x0D\x85\u2028\u2029]*/)[0]
'// line comment here'

> "foo // line comment here\nbar".match(/\/\/[^\x0A\x0C\x0D\x85\u2028\u2029]*/)[0]
'// line comment here'

> "foo // line comment here\rbar".match(/\/\/[^\x0A\x0C\x0D\x85\u2028\u2029]*/)[0]
'// line comment here'

@larsgw larsgw merged commit 6f00292 into kdl-org:main Sep 15, 2021
@benjreinhart benjreinhart deleted the carriage-return branch September 16, 2021 02:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants