Markdown
Markdown is a lightweight, easy-to-use syntax for styling all forms of writing on modern web platforms. Check out this excellent guide by GitHub to learn everything about Markdown.
HTML::Pipeline intro
HTML::Pipeline is a set of HTML processing filters and utilities. It includes a small framework for defining DOM-based content filters and applying them to user-provided content. Read an introduction to HTML::Pipeline in this blog post. GitHub uses HTML::Pipeline to implement Markdown.
Sometimes users may be tempted to try something like:
<img src='' onerror='alert(1)' />
which is a common trick for creating a popup box on the page. We don't want every user to see that popup.
Due to the nature of Markdown, HTML is allowed. You can use HTML::Pipeline's built-in SanitizationFilter to sanitize it.
But the problem with SanitizationFilter is that disallowed tags are discarded. That is fine for the regular use case of "HTML sanitization", where we want to let users enter some HTML. But we never want HTML at all. Any HTML entered should be displayed as-is.
For example, writing:
hello <script>i am sam</script>
should not result in the usual sanitized output (GitHub's behavior):
hello
Instead, it should output the escaped HTML, so the reader sees the tags as literal text:
hello <script>i am sam</script>
So here we take a different approach:
We can add a NoHtmlFilter that simply replaces < with &lt;:
```ruby
class NoHtmlFilter < TextFilter
  def call
    # keep `>` since markdown needs that for blockquotes
    @text.gsub('<', '&lt;')
  end
end
```
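To see what this substitution does without loading HTML::Pipeline, here is a minimal plain-Ruby sketch of the same `gsub`. The `escape_lt` helper is a hypothetical name used only for illustration; it is not part of the original code.

```ruby
# Hypothetical helper mirroring the gsub inside NoHtmlFilter#call.
# Only `<` is escaped; `>` is left alone so that Markdown
# blockquotes ("> quoted text") keep working.
def escape_lt(text)
  text.gsub('<', '&lt;')
end

input = "hello <script>i am sam</script>\n> a blockquote"
escaped = escape_lt(input)
puts escaped
```

Because the opening `<` of every tag is gone, the browser shows the markup as text instead of interpreting it, while the leading `>` of the blockquote line is untouched.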
Sanitization
While we can display escaped HTML, we still need to add sanitization.
Add SanitizationFilter after our Markdown has been translated into HTML:
```ruby
# Gemfile
gem "sanitize"

# RenderMarkdown
class RenderMarkdown
  ...
  def call
    pipeline = HTML::Pipeline.new [
      NohtmlMarkdownFilter,
      HTML::Pipeline::SanitizationFilter,
    ]
    ...
  end
  ...
end
```
So that our HTML is safe!
Nice to have
Syntax Highlight with Rouge
No more Pygments dependency: syntax highlighting is done with Rouge.
```ruby
# Gemfile
gem "html-pipeline-rouge_filter"

# RenderMarkdown
class RenderMarkdown
  ...
  def call
    pipeline = HTML::Pipeline.new [
      NohtmlMarkdownFilter,
      HTML::Pipeline::SanitizationFilter,
      HTML::Pipeline::RougeFilter
    ]
    ...
  end
  ...
end
```
Twemoji instead of gemoji (more emojis)
While HTML::Pipeline originally came with an EmojiFilter, which uses gemoji under the hood, there is an alternative solution, twemoji.
```ruby
# Gemfile
gem "twemoji"

# new file
class EmojiFilter < HTML::Pipeline::Filter
  def call
    Twemoji.parse(doc,
      file_ext: context[:file_ext] || "svg",
      class_name: context[:class_name] || "emoji",
      img_attrs: context[:img_attrs] || {},
    )
  end
end

# RenderMarkdown
class RenderMarkdown
  ...
  def call
    pipeline = HTML::Pipeline.new [
      NohtmlMarkdownFilter,
      HTML::Pipeline::SanitizationFilter,
      EmojiFilter,
      HTML::Pipeline::RougeFilter
    ]
    ...
  end
  ...
end
```
Implementing Markdown
Content goes into our pipeline and HTML comes out, as simple as that!
Let's implement RenderMarkdown.
Install HTML::Pipeline & dependency for Markdown
First we'll need to install HTML::Pipeline and the associated dependencies for each feature:
1-min HTML::Pipeline tutorial
Filters can be combined into a pipeline:
Each filter hands its output to the next filter's input:
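The chaining idea can be sketched in plain Ruby without the gem. The classes below (`UpcaseFilter`, `ParagraphFilter`, `TinyPipeline`) are hypothetical stand-ins, not HTML::Pipeline's API; they only illustrate how each filter hands its output to the next one.

```ruby
# Hypothetical stand-ins: these are NOT HTML::Pipeline's real classes.
class UpcaseFilter
  def call(input)
    input.upcase
  end
end

class ParagraphFilter
  def call(input)
    "<p>#{input}</p>"
  end
end

class TinyPipeline
  def initialize(filters)
    @filters = filters
  end

  # Feed the input through every filter in order; each filter's
  # return value becomes the next filter's input.
  def call(input)
    @filters.reduce(input) { |text, filter| filter.new.call(text) }
  end
end

pipeline = TinyPipeline.new([UpcaseFilter, ParagraphFilter])
result = pipeline.call("hello")
puts result
```

Order matters: with the filters swapped, the paragraph tags themselves would pass through `UpcaseFilter`. The same reasoning is why the position of each filter in the real pipeline needs care.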
RenderMarkdown
We can then implement the RenderMarkdown class by leveraging HTML::Pipeline:
To use it:
It works and it is very easy!
Avoid HTML markup
Put this NoHtmlFilter before our Markdown filter:
We keep > since Markdown needs it for blockquotes. Let's try this:
While < and > got escaped, it still looks the same from the user's perspective.
But what if we want to talk about some HTML in a code tag?
The & in the code tag also got escaped, and we don't want that. Let's fix this:
This is awesome, but here comes another bug report: autolink does not work anymore:
The fix is to add a space after our unique string when replacing the <:
Now autolink works as usual:
But other cases come in. Final version:
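The final version itself is in the linked implementation. Purely as an illustration of the unique-string idea described above, here is a rough plain-Ruby sketch. Everything in it is an assumption: `PLACEHOLDER`, `toy_render`, `naive`, and `with_placeholder` are hypothetical names, the toy renderer only understands backtick code spans, and the autolink fix (the extra space after the unique string) is not covered.

```ruby
# Hypothetical placeholder; the write-up only calls it a "unique string".
PLACEHOLDER = "NOHTMLPLACEHOLDER"

HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Toy stand-in for a Markdown renderer: it only handles `code` spans
# and HTML-escapes their contents.
def toy_render(text)
  text.gsub(/`([^`]*)`/) { "<code>#{$1.gsub(/[&<>]/, HTML_ESCAPES)}</code>" }
end

# Naive approach: escape `<` up front. Inside a code span the renderer
# escapes the `&` again, so the reader sees a literal "&lt;".
def naive(text)
  toy_render(text.gsub("<", "&lt;"))
end

# Placeholder approach: hide `<` behind a unique string, render,
# then swap the placeholder for `&lt;` once rendering is done.
def with_placeholder(text)
  toy_render(text.gsub("<", PLACEHOLDER)).gsub(PLACEHOLDER, "&lt;")
end

input = "use `<b>` for bold"
puts naive(input)
puts with_placeholder(input)
```

In the naive version the code span ends up containing `&amp;lt;`, which is the double-escaping bug; deferring the `&lt;` substitution until after rendering avoids it.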
Wrap Up
We now have a markdown that can:
See JuanitoFatas/markdown@eb7f434...377125 for full implementation!