This repository has been archived by the owner on Jul 27, 2024. It is now read-only.

Add ParserBlockingJavaScript Check #146

Merged: 2 commits into master from feature/parser-blocking-scripts-rule on Feb 11, 2021

Contributor

@charlespwd charlespwd commented Feb 8, 2021

Say goodbye to slow sites. Add friction to underperforming HTML.

Fixes #78

Contributor

@jplhomer jplhomer left a comment

😍 Dang, this is really awesome.

Contributor

@macournoyer macournoyer left a comment

I don't think it's a good idea to start using regexes. How about <script>s inside comments? Inside strings? How about < string ...? We should use a real HTML parser. Like we do for Liquid and JSON.

We already have an HTML parser in the project: https://github.com/Shopify/theme-check/blob/master/lib/theme_check/checks/valid_html_translation.rb#L26-L25.

We could introduce a new HtmlCheck that has on_... callback methods like LiquidCheck & JsonCheck.

@charlespwd
Contributor Author

I'm not sure that's possible. How can we make an AST from something like this?

<html>
<head>
  {% if some_property %}<script src=""></script>{% endif %}
  <scr{% if some_property %}ipt{% endif %}></script>
  <script{% if async %} async{% endif %}{% if defer %} defer{% endif %}></script>
</head>
</html>

This is technically valid Liquid, but I don't think you could parse an AST out of it, since you'd need to context-switch between HTML and Liquid. Plus, what if the Liquid makes the tag insertion conditional? For our linter's purposes, is it there or is it not? What about conditional attributes on a tag?

I know the regex-based approach is not ideal and has holes, but the intention is not to block loopholes. It's more about preventing you from doing something you weren't even aware of, or from making a mistake.

If your goal is to bypass the rule, you'd have an easier time using {% comment %}theme-check-disable{% endcomment %}

I would even go as far as saying that if the script is parser blocking in a comment and we tell you, then you should just remove it or add a comment that makes it OK.
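
For illustration, here is a minimal sketch of the kind of regex-based detection argued for above. It is not the check merged in this PR: the constant names, the method name, and the exact patterns are assumptions made up for this example. It flags every opening script tag that has a src attribute but neither async nor defer, including tags that sit inside HTML comments.

```ruby
# Hypothetical sketch: names and patterns are illustrative, not the merged check.

# An opening <script ...> tag. The match stops at the first ">", so a ">"
# inside a Liquid tag or an attribute value would cut it short.
SCRIPT_OPEN_TAG = /<script\b[^>]*>/i
HAS_SRC = /\bsrc\s*=/i
ASYNC_OR_DEFER = /\b(?:async|defer)\b/i

# Returns the opening tags of external scripts that lack async/defer.
def parser_blocking_script_tags(source)
  source.scan(SCRIPT_OPEN_TAG).select do |tag|
    tag.match?(HAS_SRC) && !tag.match?(ASYNC_OR_DEFER)
  end
end

template = <<~LIQUID
  <script src="a.js"></script>
  <script src="{{ 'b.js' | asset_url }}" defer></script>
  <!-- <script src="c.js"></script> -->
LIQUID

# Prints the tags for a.js and c.js; the deferred script is not reported.
puts parser_blocking_script_tags(template)
```

The attribute test is deliberately loose: the word "async" or "defer" anywhere in the tag counts, which matches the stated goal of catching the basic mistakes rather than closing every loophole.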

@macournoyer
Contributor

It is static analysis, so technically it should analyze the code in all the conditional branches:

{% if some_property %}
<!-- Would analyze this, as an independent HTML fragment -->
<script src=""></script>
{% else %}
<!-- ... and this, as another independent HTML fragment -->
<script src=""></script>
{% endif %}

However for those it would indeed be very complex:

  <scr{% if some_property %}ipt{% endif %}></script>
  <script{% if async %} async{% endif %}{% if defer %} defer{% endif %}></script>

But that would also be tricky w/ the regexes, no?

Do you have other HTML checks in mind?
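
The branch-by-branch analysis described above could be sketched roughly as follows. This is only an illustration under strong assumptions: it handles a flat, non-nested if/elsif/else block, and the names IF_BLOCK, BRANCH_SEPARATOR, and branch_fragments are hypothetical. It does not use the Liquid parser, which a real implementation presumably would.

```ruby
# Hypothetical sketch: splits a flat (non-nested) {% if %} block into one
# HTML fragment per branch, so each branch can be analyzed on its own.

# Captures everything between {% if ... %} and the first {% endif %}.
IF_BLOCK = /\{%-?\s*if\b.*?-?%\}(.*?)\{%-?\s*endif\s*-?%\}/m
# Matches the {% elsif ... %} and {% else %} tags that separate branches.
BRANCH_SEPARATOR = /\{%-?\s*(?:elsif\b.*?|else\s*)-?%\}/m

def branch_fragments(source)
  source.scan(IF_BLOCK).flat_map do |captures|
    captures.first.split(BRANCH_SEPARATOR).map(&:strip)
  end
end

template = <<~LIQUID
  {% if some_property %}
  <script src="a.js"></script>
  {% else %}
  <script src="b.js" defer></script>
  {% endif %}
LIQUID

# Prints the two branch bodies, each usable as an independent HTML fragment.
puts branch_fragments(template)
```

Tags that are themselves split across Liquid branches, like the two complex cases above, produce fragments that are not well-formed HTML, which is the difficulty the thread is pointing at.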

@charlespwd
Contributor Author

charlespwd commented Feb 10, 2021

But that would also be tricky w/ the regexes, no?

Like I said, the idea is not really to prevent loopholes. It's to make sure that the basic stuff doesn't go through. Which the regex does kind of well.

Other checks that I have in mind that I may, definitely, implement:

  • File size limits on JS bundles
  • File size limits on CSS bundles

Stuff I'm thinking of maybe implementing:

  • Using srcset on <img> tags
  • Google optimize warning :p

Kicking the can:

  • Loading assets from domains other than shopify

In this sense, the only other ones that would be regex-based are the srcset, Google Optimize, and loading assets from domains other than Shopify checks.

But in most of these cases, you'd have to use Liquid for the attributes of the tag (src, srcset, etc.).

@charlespwd
Contributor Author

charlespwd commented Feb 10, 2021

Also a common thing that happens is this:

<script src="{{ 'foo.js' | asset_url }}" async></script>

Which can't be parsed as an independent fragment.

Say goodbye to slow sites. Adds friction to underperforming HTML.

Fixes #78
@charlespwd charlespwd force-pushed the feature/parser-blocking-scripts-rule branch from 87ac252 to f229e1e Compare February 11, 2021 12:59
@macournoyer
Contributor

For all the variables ({{ ... }}), we could just ignore those, the same as if x = "" in {{ x }}. So we would end up parsing:

<script src="" async></script>

As a fragment.

But, I get it. The solution is slowly turning more complicated than the problem.

Regexp sounds good for now. We can revisit once we have a few more HTML checks in place.
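
The idea above of treating every {{ ... }} as an empty string can be sketched in a few lines. The names LIQUID_OUTPUT and blank_liquid_output are hypothetical, and the pattern is a simplification that assumes no "}}" appears inside a Liquid string literal.

```ruby
# Hypothetical sketch: replace every Liquid output tag with an empty string,
# as if each variable evaluated to "".
LIQUID_OUTPUT = /\{\{.*?\}\}/m

def blank_liquid_output(source)
  source.gsub(LIQUID_OUTPUT, "")
end

tag = %q(<script src="{{ 'foo.js' | asset_url }}" async></script>)

# Prints: <script src="" async></script>
puts blank_liquid_output(tag)
```

The result is plain HTML that an HTML parser could accept as a fragment, at the cost of losing whatever the variables would have rendered.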

Contributor

@macournoyer macournoyer left a comment

LETS KILL THOSE BLOCKING SCRIPTS!!1 🔥

@charlespwd
Contributor Author

True, true! But then what if "async" is the output of the drop?

CHECKMATE!

😛

LET'S GET EM! 🔥
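
To make that objection concrete, here is a small sketch of what happens when the attribute itself comes from Liquid output. Everything in it is an assumption for illustration: settings.script_loading is an invented setting name, and blank_output_tags and looks_blocking? are hypothetical helpers.

```ruby
# Hypothetical sketch: once output tags are blanked, an attribute rendered
# by Liquid disappears, so the tag looks parser-blocking even if the
# variable would have rendered "async".
LIQUID_OUTPUT_TAG = /\{\{.*?\}\}/m

def blank_output_tags(source)
  source.gsub(LIQUID_OUTPUT_TAG, "")
end

# Loose test: true when the tag mentions neither async nor defer.
def looks_blocking?(tag)
  !tag.match?(/\b(?:async|defer)\b/i)
end

tag = %q(<script src="{{ 'foo.js' | asset_url }}" {{ settings.script_loading }}>)
blanked = blank_output_tags(tag)

puts blanked                  # prints: <script src="" >
puts looks_blocking?(blanked) # prints: true
```

A static check cannot know what the variable renders, so either approach has to pick a default for this case.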

@charlespwd charlespwd merged commit e2474d4 into master Feb 11, 2021
@mmorissette

This is so good 🚀

@macournoyer macournoyer temporarily deployed to rubygems February 16, 2021 14:44 Inactive
@charlespwd charlespwd deleted the feature/parser-blocking-scripts-rule branch February 18, 2021 21:06
Development

Successfully merging this pull request may close these issues.

Opinionated HTML rules
4 participants