Warn or error if generated HTML page is too large #2142

Closed
mortenpi opened this issue Jun 15, 2023 · 2 comments · Fixed by #2205
Labels
Format: HTML Related to the default HTML output Type: Enhancement
Milestone

Comments

@mortenpi
Member

We include the text/html output of plots etc. directly in the generated HTML pages, which means it's quite easy to end up with HTML files that are tens of MBs (or more) if, for example, you have a plot with too many points.

We should at least print a warning if the file size goes above some reasonable (and configurable) limit. But it might even be a good idea to make this an error by default (in which case we'd want this in 1.0.0).

Not sure what a reasonable default limit would be. One option would be to do a quick survey of existing doc deployments.
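
To make the proposal concrete, here is a minimal Python sketch of a configurable size check, plus a helper for the kind of survey mentioned above. It is an illustration only, not the project's implementation: the names `classify_size` and `survey` are hypothetical, and the two limits are arbitrary placeholders, since no default has been agreed on here.

```python
from pathlib import Path

# Placeholder limits (assumptions for illustration; no default has been chosen).
WARN_LIMIT_BYTES = 200 * 1024      # 200 KiB
ERROR_LIMIT_BYTES = 1024 * 1024    # 1 MiB


def classify_size(size_bytes, warn_limit=WARN_LIMIT_BYTES, error_limit=ERROR_LIMIT_BYTES):
    """Return "ok", "warn", or "error" for a page of the given size.

    Passing None for a limit disables that check.
    """
    if error_limit is not None and size_bytes > error_limit:
        return "error"
    if warn_limit is not None and size_bytes > warn_limit:
        return "warn"
    return "ok"


def survey(build_dir):
    """List (path, size_in_bytes) for every .html file under build_dir, largest first."""
    pages = [(p, p.stat().st_size) for p in Path(build_dir).rglob("*.html")]
    return sorted(pages, key=lambda item: item[1], reverse=True)
```

Running `survey` over a few existing documentation builds would give a rough size distribution from which a default limit could be picked.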

cc @bauglir

@mortenpi mortenpi added this to the 1.0.0 milestone Jun 15, 2023
@mortenpi mortenpi added the Format: HTML Related to the default HTML output label Jun 15, 2023
@bauglir
Contributor

bauglir commented Jun 16, 2023

> We should at least print a warning if the file size goes above some reasonable (and configurable) limit. But it might even be a good idea to make this an error by default (in which case we'd want this in 1.0.0).

A warning sounds like a good place to start, although it could easily get buried among other warnings when building documentation. At least some of the documentation builds that I see produce very large amounts of log output. So an error might be better, especially in CI. It could be a nuisance for long builds if such an error fails the entire build partway through, so perhaps this should be a check at the end of the build? Or at least one that doesn't error until the end, so that build artifacts are still available for inspection.

I'm not sure if you'd want to be able to explicitly turn this "off", or whether that'd just amount to setting a very high limit.
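
The "don't error until the end" idea could look roughly like the following Python sketch. Everything in it is hypothetical (`SizeCheck`, `PageSizeError`, and the `limit_bytes` parameter are names made up for illustration): oversized pages are only recorded while the build runs, and a single error is raised after all pages have been written. Setting the limit to `None` is one possible way to turn the check off without resorting to a very high limit.

```python
class PageSizeError(Exception):
    """Raised at the end of a build if any page exceeded the size limit."""


class SizeCheck:
    def __init__(self, limit_bytes):
        # limit_bytes=None disables the check entirely.
        self.limit_bytes = limit_bytes
        self.oversized = []

    def record(self, page, size_bytes):
        """Call once per written page; never raises, so the build keeps going."""
        if self.limit_bytes is not None and size_bytes > self.limit_bytes:
            self.oversized.append((page, size_bytes))

    def finish(self):
        """Call after all pages are written; raises one error listing every offender."""
        if self.oversized:
            lines = [f"{page}: {size} bytes" for page, size in self.oversized]
            raise PageSizeError("pages over the size limit:\n" + "\n".join(lines))
```

Because `finish` runs only after the last page is written, the full set of build artifacts is on disk when the error is reported.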

> One option would be to do a quick survey of existing doc deployments.

The primary trigger for us to run into this was SEO. That is obviously a much bigger topic, but this (hopefully) might be something that is relatively easy to catch. I'm seeing some statements that page size by itself is not actually a factor here. That suggests the large file sizes I'm seeing are only a symptom of the actual issue, which appears to be the ratio between "user readable content" and "HTML content". I interpret that as effectively element.innerText.length / element.innerHTML.length.

For what it's worth, in terms of absolute file sizes, I start getting warnings for files just over 2 MB. But if we're able to do the other comparison, that might actually be more worthwhile.

For the ratio I actually have no idea what the exact limit should be. A quick search didn't surface any hard numbers. For the particular cases that I'm looking at we're talking about ratios of content to HTML of less than 1%.
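
As a rough illustration of the ratio described above, the following Python sketch approximates innerText.length / innerHTML.length using only the standard library's `html.parser`. It rests on assumptions: real `innerText` depends on CSS and rendering, so this only counts text nodes outside `<script>` and `<style>`, and the function name `text_to_html_ratio` is made up for the example.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def text_to_html_ratio(html):
    """Approximate ratio of readable text length to total HTML length."""
    if not html:
        return 0.0
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    text = "".join(collector.parts)
    return len(text) / len(html)
```

A page that is mostly an inlined plot, for example one short paragraph followed by an SVG with thousands of path elements, comes out well below 0.01 with this measure, which matches the "less than 1%" cases mentioned above.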

@Dattax

Dattax commented Aug 3, 2023

Hi @mortenpi - when you're back - what can we do to close out this issue? A warning is nice, but doesn't solve the actual problem...we can def ask the SEO specialists about this one if needed.
