-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] Switch XML minifier from pretty-data to minify-xml #488
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your statement about whitespace and
tags is wrong. pretty-data does removes whitespace if it is the only text content between tags. For xhtml:pre Tags, this is still wrong.Does minify-xml behave differently? According to your description of the
removeWhitespaceBetweenTags
option, I don't think so.
Ahhh I see what you mean Frank! I think you are right. Let me optimize that in |
I don't think that minify-xml should know our usage of xhtml in XMLViews, or should it? |
No, it shouldn't... however "mixed content" (an element not only containing whitespaces and other elements) is a general concept in XML, so I am thinking about adding an option to generally ignore removing whitespaces in mixed content. I think that'd be generally favorable, what's your opinion? Without the flag set (default option), tags like: <Tag>Hello <Tag/> <Tag/> World</Tag> Would not get optimized. With the flag set it would behave the same as <Tag>Hello <Tag/><Tag/> World</Tag> With such an option I can only think of certain special cases where this logic might fail: <Tag>Hello <Tag> <Tag/> </Tag> World</Tag> Would still get reduced to: <Tag>Hello <Tag><Tag/></Tag> World</Tag> Alternative would be to completely igore all tags inside of mixed tags, which might be the right thing to do, if I think about it... Or we stick with your pre-check... I am not sure. What do you think? I'll have to read a little bit more about XHTML, maybe I have to even add an option to ignore certain tags from minification. I am not entirely sure how |
But wait... if you really want to have tags rendered in a <pre> definitely < would newd to get encoded as <. Otherwise it's out of spec. also inside <pre>. The only other special case I can think of is: <pre> So a <pre> with only whitespaces, which would not classify as mixed conent. This is again where I would need a flag to specify a list of "don't optimize these tags"... I'm unsure... Let me think about it and read how XML1.1 deals with mixed content in general. Edit: Hell, even GitHub is confused what we are discussing here... |
Trying to explain encoding issues in a WYSIWYG editor is hell, yes. To me the <pre> check seemed fairly simple, it should match only in very rare cases and skip the optimisation for them. With minify-xml, it could even just toggle the |
You are right... it's such a narrow use case, that putting so much "special case handling effort" in I will think about it one last time when I am back to my PC. Otherwise let me add that and fix the bugs you reported in minify-xml! Thanks Frank! |
While changing this, I also found that your pre-tag check did never work with pre-tags containing attributes. Changed: const xmlHtmlPrePattern = /<(?:\w+:)?pre>/; To: const xmlHtmlPrePattern = /<(?:\w+:)?pre\b/; I will soon amend my change. Just need to fix two more tests. |
Fixed all issues you addressed also in kristian/minify-xml#1 and kristian/minify-xml#2 @codeworrior! Thanks again for the code review! 👍 With version 2.1.0 I also added an option so empty XML elements are collapsed by default (<xml></xml> to <xml/>). |
Hey @kristian, Have you already done some benchmarking? It would be interesting to see the impact on size and runtime performance. We usually use the tool hyperfine for measuring runtime changes, like here: #390 (comment) For example |
I did a benchmark with an application that contains 93 relevant *.xml files:
|
Mhm, the diff in size doesn't really make sense to me? The two regex which where performed in |
I mean, the difference is just 50 kilobytes. I find that neglectable. I'll send you a link to the project via chat. It's not open source. |
Yes, but I mean the reason for switching to a better XML minifier was to reduce the bootstrap size ;) So I would like to look into what is the root cause of the larger size. Thanks! |
@RandomByte I was finally able to do some more benchmarking, sorry for the delay! First I checked the project you sent me and try to reproduce your benchmark results. Generally I noticed two things. The change you sent me was based on quite an old version of the CLI, I don't know if you linked your original run locally, but if you didn't this could have been the cause of the increased size. Also I noticed, that (at least what I can see from your hyperfine command (if you have not altered it), that you have not specified the hyperfine --prepare 'rmdir /s /q dist || rm -rf ./dist || echo' --warmup 1 '( set UI5_CLI_NO_LOCAL=X || UI5_CLI_NO_LOCAL=X ) && node ./node_modules/@ui5/cli/bin/ui5.js build self-contained' --export-markdown ./measure.md Executing it on the current builder release (2.1.0) and my fork, it yielded the following results:
So in my tests the performance stayed the same, if not sligthly better, whereas the whole boostrap size was reduced by about 50kb, which lies in the margin what I would have expected, based on the optimizations I did in |
@RandomByte any update on this PR? 👍 |
I'll try and replicate your results.
But is that true? XML minification should also take place for the Component-preload and library-preload generation (default |
Hm, indeed I might have messed something up in my first benchmark. I repeated it and also did another one for the self-contained build like you did. I can confirm that in both cases, the result is about 50 KB smaller. For the Component-preload.js of this application, that's a reduction of 1,16%. However in both cases the build took almost a second longer. For the default preload build that's 26% of the total build time. We will discuss this in the team and might do some additional tests with other projects. I chose this project since it has an uncommonly high amount of XML views and fragments (93). The question is how more common applications with <15 XML views are affected. self-contained Build
Preload Build (default)
I directly called the CLI in my development setup where all UI5 Tooling modules are cloned and linked between each other. So basically |
I just linked the builder module and used the same CLI. But anyways... for me the runtime was even faster when using my new module, which does not make sense, because at least my module contains more regular expressions executed. I can only say, I also always got the warning message of hyperfine, that there was quite a high variation between runs, thus the results might not be significant enough. I can't really tell whether the variation in build time is due to my change, or whether it is due to variations in disk speed, etc.? Also, even if the performance is decreased, I cannot tell if it is a linear performance decrease, what I mean is, whether my change generally adds 1 second "flat" to the build time (e.g. the whole build takes 120s normally, and with my build it takes 121s), or if it depends on the number of resources etc. All I can say, that for me performance measurements in NodeJS always have a high number of variables involved, so a differentiated test is very hard to do. What might be reasonable to do, instead of executing a "test build" to analyze the performance, let's try to execute the realted unit-tests only. This (for my opionion) should give a much more accurate performance measurement result. |
For your results this might be the case, since it reports a deviation of ± 1.877. Are you using windows? If yes, did you execute this benchmark using the native CMD or some BASH emulation like the git-scm BASH? Anyways, my results are pretty consistent across all benchmarks and show only a small deviation of less than a second. Even the one I did almost a month ago shows a 700ms increase (given that only the dist-size and not the runtime was incorrect in that benchmark). I was also wondering wether it's one second flat or linear with the amount of XMLs to process. This should become clear once we benchmark some more applications. |
@RandomByte I rebased the commit on |
@RandomByte did you have the chance of repeating your measurements with the latest version of |
FYI, I again ran the minification on our current bootstrap. With the new old xml minifcation bootstrap size 4,97 MB (5.214.753 bytes) Reduction of 38,06 kB (38.060 bytes) only by the introduction of the new minifier. I think that's a very good result. 👍 |
I did another benchmark with the same app I tested earlier. The size reduction stayed the same, but it seems your change indeed improved the runtime. The runtime is now almost identical (0%-2% difference) and the size consistently smaller (0,5% of Preload Build (default)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks Merlin! Thanks Frank! :) 50kB I find "reasonable", but it also depends on how optimized your code already is. E.g. what happens very often in our use case is, that developers simply copy a view and forget to adapt their namespace imports. Over a couple hundrets of views this adds up! 👍 Thanks for the efforts 🥇 |
To also share the result of our team internal discussion: We will go ahead and merge this PR. Based on my last measurements there seems to be a (small) improvement in XML size reduction at an equal runtime. Also pretty-data might not be maintained anymore. Let me try and resolve the CI issue and I'll merge it as a new feature. Thank you for your contribution @kristian! 👍 |
All in all pretty-data is a pretty bad (phun intended) XML minifier. It does for instance not minify tag contents, nor it optimizes anything else than removing comments and whitespaces between tags. Tags like: <Tag attributeA = "..." attributeB = "..." /> Will not get reduced in size. Switched to minify-xml (a package which I specifically created for this patch, to do a better job at minfying) on the other hands removes comments and whitespaces between tags, as well as collapses whitespaces in tags and removes unused namespaces from views. These are all pretty common use-cases for UI5 xml views, thus I clearly think it's the better option compared to pretty-data. Also this change removes the strange assumption that pretty-data (or now minify-xml) would modify any tag content (like pre tags), which both libraries clearly don't do. Fixed typo in test (autpSplitter).
PRs where this process was already used: * SAP/ui5-builder#488 * SAP/ui5-builder#390 JIRA: CPOUI5FOUNDATION-261
All in all pretty-data is a pretty bad (phun intended) XML minifier. It
does for instance not minify tag contents, nor it optimizes anything
else than removing comments and whitespaces between tags. Tags like:
Will not get reduced in size. Switched to minify-xml (a package which
I specifically created for this patch, to do a better job at minfying)
on the other hands removes comments and whitespaces between tags, as
well as collapses whitespaces in tags and removes unused namespaces from
views. These are all pretty common use-cases for UI5 xml views, thus I
clearly think it's the better option compared to pretty-data.
Also this change removes the strange assumption that pretty-data (or now
minify-xml) would modify any tag content (like pre tags), which both
libraries clearly don't do. Fixed typo in test (autpSplitter).
Thank you for your contribution! 🙌
To get it merged faster, kindly review the checklist below:
Pull Request Checklist