[Question] Extra lex results when dealing with text within a list #2684

Bistard · 2022-12-14T04:47:55Z

Marked version: 4.0

Describe the bug
Given the following plain text:

This is a paragraph token
* This is a text token

The following data is the lexing result that I copied from the marked demo website:

[
  {type:"paragraph", raw:"This is a paragraph token\n", text:"This is a paragraph token", tokens:[
    {type:"text", raw:"This is a paragraph token", text:"This is a paragraph token"}
  ]},
  {type:"list", raw:"* This is a text token", ordered:false, start:"", loose:false, items:[
    {type:"list_item", raw:"* This is a text token", task:false, checked:undefined, loose:false, text:"This is a text token", tokens:[
      {type:"text", raw:"This is a text token", text:"This is a text token", tokens:[
        {type:"text", raw:"This is a text token", text:"This is a text token"}
      ]}
    ]}
  ]}
]

Looking at the lexed output for the list, I am not sure what the expected behavior is:

If the text token is expected, then its child token seems entirely redundant.
If the text token is not expected, should it be a paragraph token instead?

P.S. I checked the same input on the CommonMark demo website. Its result for the same plain text is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">

<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text>This is a paragraph token</text>
  </paragraph>
  <list type="bullet" tight="true">
    <item>
      <paragraph>
        <text>This is a text token</text>
      </paragraph>
    </item>
  </list>
</document>

To Reproduce
Steps to reproduce the behavior: Copy the plain text above into the marked demo website.

Bistard · 2022-12-14T04:58:33Z

I have a follow-up question:

I found that in marked.d.ts, the interface for a text token is:

interface Text {
  type: 'text';
  raw: string;
  text: string;
  tokens?: Token[] | undefined;
}

In what situation does a text token have a list of child tokens? In my understanding, a text token is more like an inline token. If it is supposed to have child tokens, shouldn't it be a paragraph token, which is a real block token?
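
For code that consumes these tokens, the optional `tokens` field can be handled with a type guard. The following is a minimal sketch, assuming a local `TextToken` interface that mirrors the declaration above; the helper names `hasChildren` and `leafTexts` are hypothetical and not part of marked.

```typescript
// Local stand-in for the text token declaration quoted above (not imported from marked).
interface TextToken {
  type: "text";
  raw: string;
  text: string;
  tokens?: TextToken[] | undefined;
}

// Hypothetical helper: narrows a text token to one that carries child tokens.
function hasChildren(token: TextToken): token is TextToken & { tokens: TextToken[] } {
  return Array.isArray(token.tokens);
}

// Hypothetical helper: collects the text of the innermost text tokens.
function leafTexts(token: TextToken): string[] {
  if (!hasChildren(token)) {
    return [token.text];
  }
  const result: string[] = [];
  for (const child of token.tokens) {
    for (const text of leafTexts(child)) {
      result.push(text);
    }
  }
  return result;
}

// The text token from the list item in the lexer output above.
const listItemText: TextToken = {
  type: "text",
  raw: "This is a text token",
  text: "This is a text token",
  tokens: [
    { type: "text", raw: "This is a text token", text: "This is a text token" },
  ],
};

const leaves = leafTexts(listItemText);
```

With this guard, a consumer can treat both shapes uniformly without assuming that `tokens` is always present.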

I am not really familiar with markdown parsing and lexing. If I have stated anything that is terribly wrong, please point it out 😃.

Bistard · 2022-12-14T05:00:28Z

P.P.S. This question might be similar to #2670.

UziTech · 2022-12-14T07:42:32Z

This is working as intended. In marked there are block text tokens, inline text tokens, and block paragraph tokens for plain text, depending on the context. Block paragraph tokens are wrapped in <p> tags. Block and inline text tokens are not wrapped in anything. Block text tokens can have other inline tokens inside of them. I think the only time we actually have block text tokens is in lists, since we don't want them wrapped in <p> tags unless the list is loose. I am simplifying here because markdown rules can become strange when dealing with edge cases. But it is intentional to have block text tokens that contain inline text tokens in lists.
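
The distinction described above can be illustrated with a small renderer. This is only a sketch under stated assumptions, not marked's actual implementation: the token interfaces are simplified local stand-ins, and `render` and `renderToken` are hypothetical names. It wraps paragraph tokens in <p> tags, leaves block text tokens unwrapped while rendering their inline children, and emits inline text tokens as-is.

```typescript
// Simplified local token shapes for illustration only; not marked's own typings.
interface TextToken {
  type: "text";
  raw: string;
  text: string;
  tokens?: Token[]; // present on block text tokens, absent on inline text tokens
}
interface ParagraphToken {
  type: "paragraph";
  raw: string;
  text: string;
  tokens: Token[];
}
interface ListItemToken {
  type: "list_item";
  raw: string;
  text: string;
  tokens: Token[];
}
interface ListToken {
  type: "list";
  raw: string;
  ordered: boolean;
  loose: boolean;
  items: ListItemToken[];
}
type Token = TextToken | ParagraphToken | ListItemToken | ListToken;

// Hypothetical renderer; HTML escaping is omitted to keep the sketch short.
function render(tokens: Token[]): string {
  return tokens.map(renderToken).join("");
}

function renderToken(token: Token): string {
  switch (token.type) {
    case "paragraph":
      // Block paragraph tokens are wrapped in <p>.
      return `<p>${render(token.tokens)}</p>`;
    case "list": {
      const tag = token.ordered ? "ol" : "ul";
      return `<${tag}>${token.items.map(renderToken).join("")}</${tag}>`;
    }
    case "list_item":
      return `<li>${render(token.tokens)}</li>`;
    case "text":
      // Block text: render the inline children without a wrapper.
      // Inline text: emit the text itself.
      return token.tokens ? render(token.tokens) : token.text;
  }
}

// Token tree matching the lexer output quoted in the issue (extra fields dropped).
const tokens: Token[] = [
  {
    type: "paragraph",
    raw: "This is a paragraph token\n",
    text: "This is a paragraph token",
    tokens: [
      { type: "text", raw: "This is a paragraph token", text: "This is a paragraph token" },
    ],
  },
  {
    type: "list",
    raw: "* This is a text token",
    ordered: false,
    loose: false,
    items: [
      {
        type: "list_item",
        raw: "* This is a text token",
        text: "This is a text token",
        tokens: [
          {
            type: "text",
            raw: "This is a text token",
            text: "This is a text token",
            tokens: [
              { type: "text", raw: "This is a text token", text: "This is a text token" },
            ],
          },
        ],
      },
    ],
  },
];

const html = render(tokens);
// html === "<p>This is a paragraph token</p><ul><li>This is a text token</li></ul>"
```

In this sketch the block text token is what keeps the tight list item free of a <p> wrapper, while its inline child carries the actual text.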

UziTech added the works as intended label Dec 14, 2022

UziTech closed this as completed Dec 17, 2022
