[Question] Consistent Behavior of End-of-Line Characters Across Block-Level Tokens #3506

Bistard · 2024-10-27T09:51:26Z

Marked version: 14.1.2

Background

This is not a bug report, but rather a point of confusion on my part. Consider the following text and its tokenization result:

const token = lexer.lex('paragraph1\n');
// tokenization result
{type:"paragraph", raw:"paragraph1\n", text:"paragraph1", tokens:[
  {type:"text", raw:"paragraph1", text:"paragraph1"}
]}

I notice that the end-of-line character \n exists only in token.raw and is not detectable in the child tokens or in token.text. This was also confirmed in a previous issue I opened.
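To make the observation concrete, here is a minimal sketch that checks which fields of a token end with '\n'. The token literal is copied from the result above rather than produced by calling marked, and trailingNewlineReport is a hypothetical helper name, not part of marked's API.

```javascript
// Token copied from the tokenization result above (not generated here).
const paragraphToken = {
  type: "paragraph",
  raw: "paragraph1\n",
  text: "paragraph1",
  tokens: [{ type: "text", raw: "paragraph1", text: "paragraph1" }],
};

// Hypothetical helper: reports which parts of a token end with "\n".
function trailingNewlineReport(token) {
  const endsWithNewline = (value) =>
    typeof value === "string" && value.endsWith("\n");
  return {
    raw: endsWithNewline(token.raw),
    text: endsWithNewline(token.text),
    // True if any direct child carries the newline in raw or text.
    children: (token.tokens || []).some(
      (child) => endsWithNewline(child.raw) || endsWithNewline(child.text)
    ),
  };
}

console.log(trailingNewlineReport(paragraphToken));
// raw: true, text: false, children: false
```

For this token, only raw reports the trailing newline, which matches what I described.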

Expected behavior

My question is: does this behavior hold for EVERY block-level token? That is, for every block-level token, when a '\n' character is at the end of that block, is it always accessible and detectable only in the token.raw property?

Example

I tested list, paragraph, heading, codeBlock, and blockQuote on the official demo website. They seem to match my expectations.

For example, the tokenization results for the heading, blockQuote, and codeBlock inputs in my case are the following:

// '# heading\n'
{type:"heading", raw:"# heading\n", depth:1, text:"heading", tokens:[
  {type:"text", raw:"heading", text:"heading"}
]}
// '> paragraph1\n'
{type:"paragraph", raw:"'> paragraph1\n", text:"'> paragraph1", tokens:[
  {type:"text", raw:"'> paragraph1", text:"&#39;&gt; paragraph1"}
]}
// '```ts\nconsole.log(1)\n```\n'
[
{type:"code", raw:"```ts\nconsole.log(1)\n```\n", lang:"ts", text:"console.log(1)"}
]

But when I tried the html token, it seems to be an exception:

// '<div>hi</div>\n'
[
{type:"html", block:true, raw:"<div>hi</div>\n", pre:false, text:"<div>hi</div>\n"}
]
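Until the behavior is made consistent, a consumer could normalize token.text on its own side. The following is only a sketch under that assumption: textWithoutTrailingNewline is a hypothetical helper, not something marked provides, and the token literal is copied from the html result above.

```javascript
// Token copied from the html result above (not generated here).
const htmlToken = {
  type: "html",
  block: true,
  raw: "<div>hi</div>\n",
  pre: false,
  text: "<div>hi</div>\n",
};

// Hypothetical helper: returns token.text with trailing line endings removed,
// or undefined when the token has no string text property.
function textWithoutTrailingNewline(token) {
  if (typeof token.text !== "string") return undefined;
  // Strips one or more trailing "\n" or "\r\n" sequences at the end of the string.
  return token.text.replace(/(?:\r?\n)+$/, "");
}

console.log(JSON.stringify(textWithoutTrailingNewline(htmlToken)));
// "<div>hi</div>"
```

Tokens whose text already lacks the trailing newline pass through unchanged, so the helper can be applied to every block-level token regardless of type.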

Additional notes

The hr token has only the token.raw property and no token.text property, so this block-level token is outside the scope of my question:

// '---\n'
{type:"hr", raw:"---"}
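To summarize the cases so far, here is a small sketch that classifies each sample by where its trailing newline shows up. The token literals are copied from the results in this issue (child tokens omitted for brevity) rather than generated by marked, and classify is a hypothetical helper name.

```javascript
// Samples copied from the results in this issue; child tokens omitted.
const samples = [
  { type: "paragraph", raw: "paragraph1\n", text: "paragraph1" },
  { type: "heading", raw: "# heading\n", depth: 1, text: "heading" },
  { type: "html", block: true, raw: "<div>hi</div>\n", pre: false, text: "<div>hi</div>\n" },
  { type: "hr", raw: "---" },
];

// Hypothetical helper: describes where the trailing "\n" is visible.
function classify(token) {
  if (typeof token.text !== "string") return "no text property";
  if (!token.raw.endsWith("\n")) return "raw has no trailing newline";
  return token.text.endsWith("\n")
    ? "newline kept in text"
    : "newline only in raw";
}

const summary = Object.fromEntries(
  samples.map((token) => [token.type, classify(token)])
);
console.log(summary);
// paragraph and heading: "newline only in raw"
// html: "newline kept in text"
// hr: "no text property"
```

Only html lands in the "newline kept in text" group, which is the inconsistency I am asking about.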


UziTech · 2024-10-30T04:48:40Z

I don't think it is consistent. If you would like to create a PR to make it consistent we could get it in the next major version. 😁👍

Bistard · 2024-10-30T04:54:58Z

OK. In the next few days or weeks, I will look into the source code and try to make it consistent through a PR.

Bistard changed the title ~~[Question] Is every block-level token ignoring the end of the line~~ [Question] Consistent Behavior of End-of-Line Characters Across Block-Level Tokens Oct 27, 2024

UziTech added the proposal label Oct 30, 2024

markedjs deleted a comment Oct 30, 2024
