gatsby-transformer-remark not transforming excerpts #4459

vojtechruz · 2018-03-09T07:26:35Z

I am using gatsby-transformer-remark with the following configuraton:

{
      resolve: `gatsby-transformer-remark`,
      options: {
        excerpt_separator: `<!--more-->`,
        plugins: [
          {
            resolve: `gatsby-remark-images`,
            options: {
              maxWidth: 590,
            },
          },
          {
            resolve: `gatsby-remark-responsive-iframe`,
            options: {
              wrapperStyle: `margin-bottom: 1.0725rem`,
            },
          },
          'gatsby-remark-prismjs',
          'gatsby-remark-copy-linked-files',
        ],
      },
    },

According to this post https://using-remark.gatsbyjs.org/excerpts/ I am querying excerpts from my articles like this:


export const pageQuery = graphql`
  query IndexQuery {
    site {
      siteMetadata {
        title
      }
    }
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      edges {
        node {
          excerpt
          fields {
            slug
          }
          frontmatter {
            date(formatString: "DD MMMM, YYYY")
            title
            tags
            featuredImage {
                childImageSharp{
                    sizes(maxWidth: 200) {
                        ...GatsbyImageSharpSizes_tracedSVG
                    }
                }
            }
          }
        }
      }
    }
  }
`

My article in markdown uses a separater defined in config above:


---
title: Break the Java Generics Naming Convention?
date: "2016-04-30T22:12:03.284Z"
tags: ['Java']
path: '/break-java-generics-naming-convention'
featuredImage: './break-java-generics-maning-convention.jpg'
---

Choosing **descriptive**, *intent-revealing* names is one of the key principles of writing clean code, which is easy to read and understand and usually requires little to no comments. According to the official convention, generic type parameter names are supposed to be just one capital letter. Should you break this convention to make your names more descriptive?
<!--more-->

However, the excerpt I am getting is plain text and not transformed from markdown:

Is this intended behavior? How can I get excerpt which is already formatted? Thank you.

"gatsby": "^1.9.223",
"gatsby-transformer-remark": "^1.7.34",

Node: v9.5.0
NPM: 5.6.0

The text was updated successfully, but these errors were encountered:

vojtechruz · 2018-03-12T12:55:06Z

This only happens when using:
excerpt_separator

Otherwise, excerpt contains plain text with stripped markdown formatting.

vojtechruz · 2018-03-12T13:08:32Z

It is not so bad that you lose formatting such as bold or italics, but what's not nice is that escape slashes are preserved:
Eg:
How to make the most of the IntelliJ IDEA\'s view...

m-allanson · 2018-03-16T19:18:45Z

It looks like the gatsby-transformer-remark's excerpt handling is here. If you or anyone else is interested in doing some investigation, a PR would be great.

docwhat · 2018-03-17T23:15:11Z

I've been digging around and I don't see anything obvious.

It took me a while, but I figured out that the excerpt_separator is actually defined and used by grey-matter.

I can't figure out where the transformation happens and what is passed to grey-matter.

Can someone point me in the right direction? This is really annoying.

It's broken due to bug gatsbyjs/gatsby#4459 — excerpts using the separator are not transformed from Markdown to HTML.

omeid · 2018-04-22T15:00:33Z

Although not directly related to this issue, the markdown plugin doesn't parse the title or any part of the frontmatter either.

Undistraction · 2018-04-29T20:04:04Z

The docs here say

you can specify an excerpt_separator, as well as any of the other options mentioned here, in the gatsby-config.js file.

However neither excerpt_separator or excerpt, docs, which should allow you to pass a function for generating the excerpt result it anything other than the default 140 char excerpt.

"gatsby": "^1.9.131",
"gatsby-transformer-remark": "^1.7.15"

arlobelshee · 2018-05-29T02:39:25Z

The issue is at extend-node-type.js, line 128. It passes node.internal.content through the remark processor at that point, getting an AST, which it then does further stuff with and eventually transforms to Html, HtmlAst, or other forms (eg, in getHtml() on line 276).

The problem is that it doesn't do anything with node.excerpt. So that just passes through the whole process. And, at line 339, it then resolves the promise with the unprocessed excerpt.

To fix this, we'd have to either add the excerpt to the cache (as its own entry, or by storing a pair of ASTs), or just assume that the excerpt will only really be parsed once and do it directly. I assume we still want to keep the async processing and all the plugins, so it seems to make sense to me to change the cached value to be a pair of ASTs, with two separate promise chains to generate them.

Extracting a function that applies all the plugins, then does the remark processing, then applies all the plugins again, would make it easier to re-use the entire functionality to do both parts - especially as this is all async.

I don't know this code at all, so would prefer to leave changing it to someone who knows better. Hope this helps unblock the bug - I really want it fixed!

arlobelshee · 2018-05-29T04:17:54Z

I just submitted PR #5586 with a possible fix for this. It implements my above-mentioned plan. However, I am not a committer, so am not sure if it works.

arlobelshee · 2018-05-30T13:54:53Z

I could use some help from someone who knows the code better. It seems like my change has somehow altered the type of node.excerpt (even though I didn't change its declared type from string). I have no idea where and how the schema expectations are set. I also don't know whether it is a good thing for the excerpt to now have HTML in it, or to create a new node.excerptFormatted which does.

Please help!

strubell · 2018-07-08T00:35:40Z

This is still a problem. Any chance of this PR getting merged, or can anyone share a workaround?

MrSaints · 2018-07-08T11:10:08Z

@strubell, here's a workaround I've been using:

node.html.split("<!--more-->")[0]

🙈

karl-run · 2018-08-31T18:57:24Z

That works for simple things. But if you have things like anchor links in your post the URL needs to be the slug + the link of the markdown. I guess one could massage the links in the freshly split HTML but that seems nasty.

resir014 · 2018-10-06T11:48:41Z

Hey, any updates on this? I just stumbled upon this issue as well.

This is the beginnings of a fix for #4459 Right now, the fix works when excerpt_separator is provided. I have not yet implemented a fix for when the excerpt is generated using the `pruneLength` setting, though I have thoughts on how to solve it. My plan to is do a pre-order traversal over the html AST, accumulating nodes until the pruneLength is reached. I would love some preliminary feedback to get a sense of how the contributors feel about this PR so far. Here are some of my initial thoughts: 1. The work in the `excerpt` resolve function is a little hacky. We are directly invoking a variety of functions get our HTML. I'm curious if anyone has a more streamlined approach for this. 2. We are invoking `getMarkdownAST` (previously getAST) twice, once for the excerpt and once for the entire page. I'm worried that this is going to break non-idempotent existing plugins since they will now run twice. 3. I tore apart the `getAST` function a bit and think I did a good job simplifying it. The big change is that we don't generate multiple promises anymore. i think this goes a long way towards improving readability. I'd love to hear people's thoughts.

gatsbot · 2019-01-20T17:16:48Z

Old issues will be closed after 30 days of inactivity. This issue has been quiet for 20 days and is being marked as stale. Reply here or add the label "not stale" to keep this issue open!

vojtechruz · 2019-01-21T08:38:28Z

not stale

This is the beginnings of a fix for gatsbyjs#4459 Right now, the fix works when excerpt_separator is provided. I have not yet implemented a fix for when the excerpt is generated using the `pruneLength` setting, though I have thoughts on how to solve it. My plan to is do a pre-order traversal over the html AST, accumulating nodes until the pruneLength is reached. I would love some preliminary feedback to get a sense of how the contributors feel about this PR so far. Here are some of my initial thoughts: 1. The work in the `excerpt` resolve function is a little hacky. We are directly invoking a variety of functions get our HTML. I'm curious if anyone has a more streamlined approach for this. 2. We are invoking `getMarkdownAST` (previously getAST) twice, once for the excerpt and once for the entire page. I'm worried that this is going to break non-idempotent existing plugins since they will now run twice. 3. I tore apart the `getAST` function a bit and think I did a good job simplifying it. The big change is that we don't generate multiple promises anymore. i think this goes a long way towards improving readability. I'd love to hear people's thoughts.

viktorbengtsson · 2019-01-23T09:12:57Z

+1 Not stale.

lednhatkhanh · 2019-03-03T02:33:38Z

Hello, any updates on this.

DSchau · 2019-03-04T17:40:25Z

Yeah!

So - we added an HTML addition to the excerpt functionality relatively recently. Using this HTML excerpt functionality will transform the excerpt from plain text to HTML, if there's HTML to be transformed. For example, in the above example, the excerpt would translate italics/bold/etc. into their corresponding HTML tags.

It can be used like so:

{
  # grab your markdownRemark node however you prefer
  markdownRemark(id: "123412341234") {
    excerpt(format: HTML)
  }
}

This still works with the excerpt_seperator functionality.

Here's an example repo demonstrating this functionality

Here's the Markdown file showing the separator

Going to close this as answered, but please feel free to reply if we can help further. Thanks all!

StormPooper · 2019-03-05T10:27:42Z

@DSchau any suggestions for if we just want the text of the excerpt as per the default excerpt behaviour, but with a defined separator for some posts and relying on the prune value for the rest?

markvital · 2019-04-02T16:52:30Z

This still works with the excerpt_seperator functionality.

@DSchau this still doesn't work with format:PLAIN with custom separator.

I cloned [your example](Here's an example repo demonstrating this functionality) and changed format from HTML to plain in BlogIndex:

          excerpt(format: PLAIN)
          fields {
            slug
          }

Textual excerpts don't show properly:

Am I doing something wrong?

DSchau · 2019-04-02T16:55:35Z

@markvital what would you expect to happen there?

That is plain text. Would you expect Markdown specific stuff to be stripped?

markvital · 2019-04-02T17:05:02Z

Oh I see.
@DSchau Yes, I expected all tags to be removed, so I can insert clean text into meta tags, as described in docs.
This is how it worked for me, until I set "excerpt_separator": ` in gatsby-config.js.

corvidism · 2019-05-12T14:23:36Z

@DSchau awesome, I just stumbled on this problem and I'm glad I don't have to work around it! One problem though: if I set my excerpts to (format: HTML) and use a custom excerpt_separator, when the file doesn't contain the separator it creates the excerpt from the last paragraph of the text. It used the first 140 characters before I changed the format.

docwhat · 2019-05-12T19:41:50Z

One problem though: if I set my excerpts to (format: HTML) and use a custom excerpt_separator, when the file doesn't contain the separator it creates the excerpt from the last paragraph of the text. It used the first 140 characters before I changed the format.

That’s issue #12386

_{Sent with GitHawk}

riywo · 2019-06-03T01:37:59Z

I don't think the original issue is solved by excerpt(format: HTML) since HTML tags are not suitable for SEO metadata. We probably need fix for PLAIN's behaviour.

Like @markvital mentioned, if I don't specify excerpt_separator, the markdown annotations like italic/bold/link are processed to HTML and excerpt only contains visible text strings (like innerHTML of a tag). However, if I specify excerpt_separator, the original markdown content before the separator is returned as excerpt without any markdown and HTML processing (the raw markdown text).

Both behaviours should be consistent, preferably the former, I think.

riywo · 2019-06-04T09:01:22Z

Since this issue has been closed, do we need another issue?

FYI: This is an example unit test to see how it works differently with excerpt_separator riywo@4178bbd

 FAIL  src/__tests__/extend-node.js (6.786s)
  ● Excerpt is generated correctly from schema › excerpt does have missing words and extra spaces with excerpt_separator

    expect(received).toMatch(expected)

    Expected value to match:
      "Where oh where is that pony?"
    Received:
      "
    Where oh [*where*](nick.com) **_is_** ![that pony](pony.png)?

    "

      419 |       `,
      420 |     node => {
    > 421 |       expect(node.excerpt).toMatch(`Where oh where is that pony?`)
          |                            ^
      422 |     },
      423 |     { pluginOptions: { excerpt_separator: `<!-- end -->` } }
      424 |   )

      at toMatch (src/__tests__/extend-node.js:421:28)
      at test (src/__tests__/extend-node.js:91:11)

docwhat · 2019-06-04T16:39:23Z

Since this issue has been closed, do we need another issue? ...

@riywo I think so. Either a new format TEXT or fix PLAIN and add MARKDOWN

_{Sent with GitHawk}

riywo · 2019-06-12T01:23:09Z

FYI: I have created a PR to fix this behavior #14723

fk added the API/Plugins label Mar 14, 2018

docwhat added a commit to docwhat/website that referenced this issue Mar 17, 2018

turn off excerpt separator

fa2d3e0

It's broken due to bug gatsbyjs/gatsby#4459 — excerpts using the separator are not transformed from Markdown to HTML.

m-allanson removed the API/Plugins label Apr 13, 2018

Undistraction mentioned this issue May 12, 2018

[gatsby-transformer-remark] Add excerpt option to prune to sentence or paragraph #5393

Closed

arlobelshee mentioned this issue May 29, 2018

Bugfix 4459 #5586

Closed

pieh mentioned this issue Oct 9, 2018

Blog excerpt formatting issue #8964

Closed

nadrane mentioned this issue Nov 5, 2018

Fix/4459/excerpt formatting ignored #9716

Merged

gatsbot bot added the stale? Issue that may be closed soon due to the original author not responding any more. label Jan 20, 2019

resir014 added not stale and removed stale? Issue that may be closed soon due to the original author not responding any more. labels Jan 23, 2019

DSchau closed this as completed Mar 4, 2019

riywo mentioned this issue Jun 12, 2019

Change gatsby-transformer-remark excerpt behavior #14723

Merged

LekoArts removed the not stale label May 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gatsby-transformer-remark not transforming excerpts #4459

gatsby-transformer-remark not transforming excerpts #4459

vojtechruz commented Mar 9, 2018 •

edited

Loading

vojtechruz commented Mar 12, 2018

vojtechruz commented Mar 12, 2018 •

edited

Loading

m-allanson commented Mar 16, 2018

docwhat commented Mar 17, 2018

omeid commented Apr 22, 2018

Undistraction commented Apr 29, 2018 •

edited

Loading

arlobelshee commented May 29, 2018

arlobelshee commented May 29, 2018

arlobelshee commented May 30, 2018

strubell commented Jul 8, 2018

MrSaints commented Jul 8, 2018 •

edited

Loading

karl-run commented Aug 31, 2018

resir014 commented Oct 6, 2018

gatsbot bot commented Jan 20, 2019

vojtechruz commented Jan 21, 2019 •

edited

Loading

viktorbengtsson commented Jan 23, 2019

lednhatkhanh commented Mar 3, 2019

DSchau commented Mar 4, 2019

StormPooper commented Mar 5, 2019

markvital commented Apr 2, 2019

DSchau commented Apr 2, 2019

markvital commented Apr 2, 2019 •

edited

Loading

corvidism commented May 12, 2019

docwhat commented May 12, 2019

riywo commented Jun 3, 2019

riywo commented Jun 4, 2019

docwhat commented Jun 4, 2019

riywo commented Jun 12, 2019

gatsby-transformer-remark not transforming excerpts #4459

gatsby-transformer-remark not transforming excerpts #4459

Comments

vojtechruz commented Mar 9, 2018 • edited Loading

vojtechruz commented Mar 12, 2018

vojtechruz commented Mar 12, 2018 • edited Loading

m-allanson commented Mar 16, 2018

docwhat commented Mar 17, 2018

omeid commented Apr 22, 2018

Undistraction commented Apr 29, 2018 • edited Loading

arlobelshee commented May 29, 2018

arlobelshee commented May 29, 2018

arlobelshee commented May 30, 2018

strubell commented Jul 8, 2018

MrSaints commented Jul 8, 2018 • edited Loading

karl-run commented Aug 31, 2018

resir014 commented Oct 6, 2018

gatsbot bot commented Jan 20, 2019

vojtechruz commented Jan 21, 2019 • edited Loading

viktorbengtsson commented Jan 23, 2019

lednhatkhanh commented Mar 3, 2019

DSchau commented Mar 4, 2019

StormPooper commented Mar 5, 2019

markvital commented Apr 2, 2019

DSchau commented Apr 2, 2019

markvital commented Apr 2, 2019 • edited Loading

corvidism commented May 12, 2019

docwhat commented May 12, 2019

riywo commented Jun 3, 2019

riywo commented Jun 4, 2019

docwhat commented Jun 4, 2019

riywo commented Jun 12, 2019

vojtechruz commented Mar 9, 2018 •

edited

Loading

vojtechruz commented Mar 12, 2018 •

edited

Loading

Undistraction commented Apr 29, 2018 •

edited

Loading

MrSaints commented Jul 8, 2018 •

edited

Loading

vojtechruz commented Jan 21, 2019 •

edited

Loading

markvital commented Apr 2, 2019 •

edited

Loading