Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(meta_generator.js): Do not check if meta_generator has been injected before. #4078

Conversation

dailyrandomphoto
Copy link
Member

@dailyrandomphoto dailyrandomphoto commented Jan 14, 2020

What does it do?

Fix #4056 . Related #4070, #3129, #3315, #4051, #4048.

  • Change 1:
-  if (!config.meta_generator
-    || data.match(/<meta([\s]+|[\s]+[^<>]+[\s]+)name=['|"]?generator['|"]?/i))
+  if (!config.meta_generator) {

meta_generator.js was introduced in #3129.
Why do we need to check if meta_generator has been injected before?
Because while generating an HTML, the after_render:html filter will be executed many times at rendering .md file, partial files, so generator meta tag will be injected multiple times. (See #4048.)

Another filter, after_route_render, was introduced in #4051, which ensures that the meta_generator is executed once per HTML. Therefore, this check is no longer needed, and performance is improved without it.

How about if meta_generator() helper was used?
Set meta_generator: false in _config.yml.

  • Change 2:
-  return data.replace(/<head>(?!<\/head>).+?<\/head>/s, str => str.replace('</head>', `${hexoGeneratorTag}</head>`));
+  return data.replace('</head>', hexoGeneratorTag);

The previous regex was introduced in #3697 which ignores empty tags.
The empty <head> tags appears because cheerio and after_render:html filter are used.
The after_render:html filter makes rendered results of partial files processed by meta_generator.js, and when a third-party plugin uses cheerio 1.0.0-rc.1+, <html><head><body> are automatically added to the rendered results of partials.

<html><head></head><body>rendered results of partial file</body></html>

As the use of after_route_render and removal of cheerio, empty <head> tags no longer need to be considered.

Commit history of meta_generator.js: part 1 part 2

How to test

git clone -b perf-meta_generator-filter https://github.com/dailyrandomphoto/hexo.git
cd hexo
npm install
npm test

Pull request tasks

  • Add test cases for the changes.
  • Passed the CI test.

@coveralls
Copy link

coveralls commented Jan 14, 2020

Coverage Status

Coverage increased (+0.0006%) to 97.701% when pulling 915c507 on dailyrandomphoto:perf-meta_generator-filter into 584b093 on hexojs:master.

@SukkaW
Copy link
Member

SukkaW commented Jan 15, 2020

You see, maybe the theme has already inserted the meta[name="generator"] tag while user just forget to disable config.meta_generator. We'd better not to insert duplicated tag.

Also, manipulating ctx.config directly in meta_generator() helper looks not so appropriate.

@dailyrandomphoto
Copy link
Member Author

You see, maybe the theme has already inserted the meta[name="generator"] tag while user just forget to disable config.meta_generator. We'd better not to insert duplicated tag.

You're right. However, this case is rare, and duplicate meta tags do not cause any problems.
We should not sacrifice performance and increase the complexity of code for trivial problems.
Also, duplicate meta tags can give users clues to help disable the meta generator filter in _config.yml(set meta_generator: false), to improve performance(#3919 ).

@dailyrandomphoto
Copy link
Member Author

Also, manipulating ctx.config directly in meta_generator() helper looks not so appropriate.

What does it mean?

@SukkaW
Copy link
Member

SukkaW commented Jan 16, 2020

@dailyrandomphoto Although simply changing meta_generator from true to false will not cause any problem, we should avoid write config back to ctx.config, especially changing the format of the config. hexojs/hexo-generator-feed#115

Also, I have tested the performance impact of registered filter. As you can see, external_link filters are always registered no matter ctx.config.external_link is enabled or not. However, there is no performance regression since there is an early return inside.

@SukkaW
Copy link
Member

SukkaW commented Jan 16, 2020

However, there is no performance regression since there is an early return inside.

Based on this, I have already added a cache in my #4070 in order to make sure return early before data.replace.

@SukkaW SukkaW closed this Jan 16, 2020
@SukkaW SukkaW reopened this Jan 16, 2020
@dailyrandomphoto
Copy link
Member Author

Although simply changing meta_generator from true to false will not cause any problem, we should avoid write config back to ctx.config, especially changing the format of the config.

This PR does not write config back to ctx.config.

The main change for this PR is

-    || data.match(/<meta([\s]+|[\s]+[^<>]+[\s]+)name=['|"]?generator['|"]?/i))

@SukkaW
Copy link
Member

SukkaW commented Jan 16, 2020

@dailyrandomphoto

We could only call data.match for once. I mean we could assume if meta[name="generator"] has been added to one page, then the tag must be added to all pages. That's what I did in #4070, simply cache the value.

@@ -24,15 +24,15 @@ describe('Meta Generator', () => {
resultType.should.eql('undefined');
});

it('no duplicate generator tag', () => {
it.skip('no duplicate generator tag', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to retain the tests. it is possible for theme to have any of these mistakes, we need to account for them so that we don't make it worse.

@SukkaW
Copy link
Member

SukkaW commented Mar 30, 2020

This PR might be superseded by #4208.

@stevenjoezhang
Copy link
Member

#4208 merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use Cache to determine if filter is enabled
5 participants