Should the whitespaces before backslash hard line break be removed? #724
Comments
As far as I am aware, this is not explained anywhere in the spec. I don’t think it needs to be. In my mind, it’s similar to putting anything else at the end of a line, such as:
a &
b
->
<p>a &
b</p>
Or:
a b
c
->
<p>a b
c</p>
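But it still matters in some situations. For example, take this input (`a` followed by four trailing spaces, then a line ending and `b`; the trailing spaces are not visible here):
a
b
Output to AST:
[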
{
"type": "paragraph",
"start": {
"line": 0,
"column": 0,
"offset": 0
},
"end": {
"line": 1,
"column": 1,
"offset": 7
},
"children": [
{
"text": "a",
"start": {
"line": 0,
"column": 0,
"offset": 0
},
"end": {
"line": 0,
"column": 1,
"offset": 1
}
},
{
"type": "hardLineBreak",
"start": {
"line": 0,
"column": 1,
"offset": 1
},
"end": {
"line": 0,
"column": 5,
"offset": 5
},
"markers": [
{
"start": {
"line": 0,
"column": 1,
"offset": 1
},
"end": {
"line": 0,
"column": 5,
"offset": 5
},
"text": "    "
}
]
},
{
"text": "b",
"start": {
"line": 1,
"column": 0,
"offset": 6
},
"end": {
"line": 1,
"column": 1,
"offset": 7
}
}
]
}
]
Offsets 1 through 4 all fall inside the hardLineBreak marker.
Now take the backslash form (`a`, three spaces, a backslash, then a line ending and `b`):
a   \
b
If we do not count these preceding spaces as part of the hard line break, the AST output will be:
[
{
"type": "paragraph",
"start": {
"line": 0,
"column": 0,
"offset": 0
},
"end": {
"line": 1,
"column": 1,
"offset": 7
},
"children": [
{
"text": "a   ",
"start": {
"line": 0,
"column": 0,
"offset": 0
},
"end": {
"line": 0,
"column": 4,
"offset": 4
}
},
{
"type": "hardLineBreak",
"start": {
"line": 0,
"column": 4,
"offset": 4
},
"end": {
"line": 0,
"column": 5,
"offset": 5
},
"markers": [
{
"start": {
"line": 0,
"column": 4,
"offset": 4
},
"end": {
"line": 0,
"column": 5,
"offset": 5
},
"text": "\\"
}
]
},
{
"text": "b",
"start": {
"line": 1,
"column": 0,
"offset": 6
},
"end": {
"line": 1,
"column": 1,
"offset": 7
}
}
]
}
]
This way, only offset 4 (the backslash) is the hard line break marker. There might be no difference when rendered to HTML, but in a Markdown editor it might matter.
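To make the two readings concrete, here is a minimal Python sketch that computes the marker span for a backslash hard line break under each one. The function name `backslash_break_span` and the `include_preceding_ws` flag are made up for illustration; they come from neither the spec nor any existing parser. Offsets follow the ASTs above (start inclusive, end exclusive).

```python
from typing import Optional, Tuple


def backslash_break_span(line: str, include_preceding_ws: bool) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of a backslash hard line break marker.

    `line` is one line of paragraph text WITHOUT its line ending.
    Returns None when the line does not end in an unescaped backslash.
    """
    if not line.endswith("\\"):
        return None
    # Count the trailing run of backslashes: an even run means the final
    # backslash is itself escaped, so it is a literal, not a hard break.
    run = len(line) - len(line.rstrip("\\"))
    if run % 2 == 0:
        return None
    end = len(line)
    start = end - 1  # the backslash itself
    if include_preceding_ws:
        # Hypothetical convention: absorb spaces/tabs before the backslash.
        while start > 0 and line[start - 1] in " \t":
            start -= 1
    return (start, end)


line = "a   \\"  # "a", three spaces, one backslash
print(backslash_break_span(line, include_preceding_ws=True))
print(backslash_break_span(line, include_preceding_ws=False))
```

With the flag on, the span covers the spaces and the backslash, like the first AST; with it off, only the backslash is the marker and the spaces stay in the preceding text node, like the second AST.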
If you have a problem with an AST, this is not the place to report it. This spec does not define ASTs.
I meant that I do not have a clear specification to follow when parsing the backslash hard line break into an AST. Whether or not the spaces before the backslash count as part of the hard line break produces different ASTs.
I think this is a problem in your AST, and unrelated to this specification. I don’t believe there is anything that has to happen in this project. If you’re interested in AST tools that do generate such an AST, and AST tools that do serialize with backslashes, you might find my projects
Thanks a lot! Yep, the trailing whitespace should always be removed unless there is a specific reason to keep it. This is my project dart_markdown, a Markdown to AST parser, which is definitely inspired by your mdast.
In other words, should a parser treat the backslash and the preceding whitespace together as one hard line break, or leave the preceding whitespace alone and parse only the backslash as a hard line break?
For example, should it parse
a \
b
into
or
The two-or-more-spaces hard line break is clear: all the whitespace before the line ending represents the hard line break. But I didn't find any mention of this for the backslash hard line break anywhere.
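For comparison, the space-based case described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `space_break_span` is a hypothetical helper, it looks at a single line without its line ending, and it ignores block context (for example, a hard line break does not apply on the last line of a paragraph).

```python
from typing import Optional, Tuple


def space_break_span(line: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of a space-based hard line break marker.

    `line` is one line of paragraph text WITHOUT its line ending.
    Two or more trailing spaces form the marker; the whole trailing run
    is included in the span (start inclusive, end exclusive).
    """
    stripped = line.rstrip(" ")
    trailing = len(line) - len(stripped)
    if trailing < 2:
        return None  # zero or one trailing space: not a hard break
    return (len(stripped), len(line))


print(space_break_span("a    "))  # "a" followed by four spaces
print(space_break_span("a "))     # a single trailing space
```

Here the marker span is unambiguous because the trailing spaces are the marker; the open question in this issue is whether the backslash form should absorb preceding whitespace in the same way.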