Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix images in the RSS feed #1245

Merged
merged 12 commits into from
May 6, 2020
Merged

Fix images in the RSS feed #1245

merged 12 commits into from
May 6, 2020

Conversation

rogermparent
Copy link
Contributor

@rogermparent rogermparent commented May 5, 2020

It seems many Gatsby sites don't really have images within their RSS feeds' content, and this turned out to be slightly more involved than I thought it would.

Basically, this branch changes the way the feed content is parsed. Instead of taking pre-rendered HTML from GraphQL, it uses the Rehype htmlAst that gatsby-transformer-remark already makes for us and transforms links and images into a simpler form of HTML that works with RSS.

To get images working, I had to do two things to the feed's AST before rendering to HTML:

  1. Change root-relative src attributes (/example) to absolute links (https://dvc.org/example)
  2. "Unwrap" responsive image wrappers by replacing them with the img tag fallback within.

Fixes the images part of #1177, and since titles are already fixed it fixes #1177 entirely.

This runs the feed's htmlAst (from the remark transformer) through a processor
that changes all root-relative links (/example) into absolute
links (https://dvc.org/example), which is what many non-browser consumers of
HTML expect.
This commit combines the root-relative rewriting with now-working image
unwrapping to deliver working images to the RSS feed!

The processor definition is also moved into the feed scope to take advantage of
the feed query to get the root URL.
Since moving it out of gatsby-config won't be as simple as I thought, I decided
to leave it out of the scope of this branch.
@shcheklein shcheklein temporarily deployed to dvc-landing-feed-images-gm9bcy May 5, 2020 20:35 Inactive
@shcheklein
Copy link
Member

Looks too complicated and affects the website (optimization for smaller screen sizes, etc?). Also, two notes:

  1. I see that other images are rendered in RSS even in the current version.
  2. We should be applying these changes to the RSS render, w/o affecting and complicating even further the website renderer.

@shcheklein
Copy link
Member

  1. If we can't do these two points ^^ - let's just ignore RSS issue for now - it's not the biggest priority.

@rogermparent
Copy link
Contributor Author

rogermparent commented May 5, 2020

This change is only on the RSS render, you can see the site's images are still totally responsive on the deploy preview. These changes all live in gatsby-plugin-feed's config, and do not touch the site render.

@shcheklein
Copy link
Member

Oh, sorry, @rogermparent - function is already big enough that I missed the context.

How is it possible though that other images do work in RSS w/o applying this transformation?

can it be somehow related to some custom image processing plugins/logic that we already have?

@rogermparent
Copy link
Contributor Author

rogermparent commented May 5, 2020

Oh, sorry, @rogermparent - function is already big enough that I missed the context.

How is it possible though that other images do work in RSS w/o applying this transformation?

can it be somehow related to some custom image processing plugins/logic that we already have?

No worries! I'm aware gatsby-config is getting packed, but there are some "global" variables that need some refactoring before we can move things into their own files, and I decided it was out of the scope of this branch.

I'm not sure which images you're talking about that do work, as none of the images in the body work for me. Embeds work, but those are just iframes. Any images that aren't wrapped with a gatsby-image component, like external ones, also should work currently although I haven't seen them.

The problem is twofold- the HTML from the remark plugin has gatsby-image responsive images, which are actually a bunch of different elements wrapped together, embedded in its HTML. This works fantastically for real web browsers, but for more simple embedded browsers like RSS readers the trick actually breaks things.

The other half is root-relative links, which are a browser-specific feature that falls apart outside of that context. Once you have some way to reliably alter HTML properties (rehype, in this case, as it's the highly compatible sister project to remark), it's incredibly easy to make all root-relative links into absolute by simply prepending the site URL to them, much like how gatsby-plugin-feed uses links outside of the content and requires the same site_url variable to do so.

@shcheklein
Copy link
Member

I'm not sure which images you're talking about that do work

hmm : 🤔 This is the most recent post in feedly for me -

Screen Shot 2020-05-05 at 3 28 13 PM

@rogermparent
Copy link
Contributor Author

rogermparent commented May 5, 2020

I just looked at that image, and it's broken for me on a linux-based local reader called FeedReader.
That image in particular is an SVG, which can't be processed by Sharp (nor would it make sense to) and doesn't become a Gatsby Image like raster formats.

This means those particular SVG images bypass the Gatsby-image problem but still suffer from the root-relative problem. The latter must be an issue that only certain readers suffer from.

gatsby-config.js Outdated Show resolved Hide resolved
gatsby-config.js Outdated Show resolved Hide resolved
yarn.lock Outdated Show resolved Hide resolved
gatsby-config.js Outdated Show resolved Hide resolved
gatsby-config.js Outdated Show resolved Hide resolved
Also changed a bit around implementation:

1. Replaced `rehype-urls`, which depended on older rehype libs, with a local
   implementation of the same thing for root-to-absolute linking.

2. The new feed util leans on the existing rehype-to-html function, and now the
   new processor `.use`s less `unified` plugins as it only does AST processing.
@rogermparent rogermparent temporarily deployed to dvc-landing-feed-images-gm9bcy May 6, 2020 20:08 Inactive
Copy link
Contributor Author

@rogermparent rogermparent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted the function out and fixed many issues, but could use a couple final tweaks.

gatsby-config.js Outdated Show resolved Hide resolved
gatsby-config.js Outdated Show resolved Hide resolved
yarn.lock Outdated Show resolved Hide resolved
@rogermparent rogermparent temporarily deployed to dvc-landing-feed-images-gm9bcy May 6, 2020 20:34 Inactive
gatsby-config.js Outdated Show resolved Hide resolved
const { select, selectAll } = require('hast-util-select')

const rootToAbsolute = siteUrl => tree => {
selectAll('a', tree).forEach(node => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put a link here gatsbyjs/gatsby#14133 to remove this when it's fixed upstream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!

return tree
}

function makeFeedHtml(htmlAst, siteUrl) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really annoying that gastby plugin feed does not handle this ... kinda feels like all of this is a requirement for that plugin to be useful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree to a certain extent, especially for our use case, but RSS feeds are very often used for something like podcasts where the encoded content is more of a summary than the whole blog (ads are a lot harder in an RSS reader, after all), and the encoded content is kept simple.

This isn't to say the default and suggested queries of gatsby-plugin-feed are good, I think naively passing through the content's raw html string is kind of poor way to go about things, but it's not as if the plugin is useless without this kind of content reprocessing.

...fallbackImg,
properties: {
...fallbackImg.properties,
style: 'max-width: 100%; margin: auto;'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq- does it makes/do we respect our custom image sizing logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just changes the wrapper's internal image to have max-width: 100% so larger images don't extend offscreen on narrower displays and margin: auto so smaller ones are centered.
The original max-width on the wrapper that the existing plugin operates on is untouched, as well as any other properties on the wrapper.

@rogermparent rogermparent temporarily deployed to dvc-landing-feed-images-gm9bcy May 6, 2020 21:26 Inactive
@shcheklein shcheklein merged commit 2ea63aa into master May 6, 2020
@shcheklein shcheklein deleted the feed-images branch February 28, 2021 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rss: no title, no image in the feed for new posts
2 participants