XZ compression support #8
Comments
I made some example implementations:
Side note: I found working with the command-line arguments section of the script pretty easy, so I'm thankful it is set up so neatly to begin with! |
Hi @DeeDeeG
Making xz the default is my preferred method, if it is robust enough and not too complicated! Looking at other managers:
I do not want all the complicated version testing like nvm does, thinking just the |
Thanks for the reply. Here is an implementation with those checks:
If both of these pass, xz use is allowed. I also added My thoughts regarding this implementation:
|
Update about macOS: On recent macOS, the In any case...
(Oddly, and mildly off-topic/a bit of a sidebar: downloading the xz archive and running I don't think macOS tar properly knows what to make of the -J flag, or how to recognize an xz archive that is stored locally on disk... But it can extract the archive if told "this is the archive I want to extract" with the I don't 100% understand why it all behaves that way, but the behavior has been mapped out. This requires a bit of extra handling, so I will think about it and look for a good solution for the "supports xz" test. I am thinking of blanket whitelisting macOS as supporting xz, but this is probably not totally accurate? Not sure how far back macOS tar can decompress xz. Edit: Came back to add a link/source |
I had noticed the Mac setup when we were looking at adding the environment variable to You can see why I was happier initially to add xz as an env var than on by default! 🙂 I did a bit of research and found tar mentions libarchive, and that lists support for xz from 2009. I have not tied that back to tar and Mac and Linux et al, so can't draw any conclusions yet. https://github.com/libarchive/libarchive/wiki/ReleaseNotes#libarchive-270 |
Short summary of this comment: It looks like we can enable xz on macOS 10.7 and above. Found some info about OS X and libarchive versions. According to this: https://opensource.apple.com/ OS X 10.6.x shipped with libarchive 2.6.2. And OS X 10.7.x shipped with libarchive 2.8.3. (2.8.3 is still used in Mojave, albeit with various, minor-looking, darwin-specific patches accumulated over the years if I understand correctly. For example, see libarchive.plist from macOS Mojave 10.14.5 for details.) So without being able to test a bunch of old mac computers, I'd speculate the correct cutoff is OSX 10.7 and above can use xz, 10.6 and below can't. Implementation note: It looks like the user's macOS/OSX version can be read on the command-line in a script-friendly way: https://www.cyberciti.biz/faq/mac-osx-find-tell-operating-system-version-from-bash-prompt/ Like so: I also poked around in the libarchive source history, and found references to On a mac running El Capitan, I found the following:
|
Good digging, thanks. I did some digging tonight and came across a few interesting links. I have not formed conclusions yet, but starting to wonder whether full "properly" is too hard! Interesting that node are talking about reducing number of options.
|
Mini update to the update about macOS: As this StackOverflow answer rightly points out, the xz/lzma feature of bsdtar is configurable at compile time. The feature is present back to macOS 10.7, but it is disabled at compile time until macOS 10.9. So I'd say move the cutoff version to macOS 10.9 and above. (Wish I had an array of old Macs to test this on!) I can test a bunch of old Linux releases, so once the more theoretical/research parts are done, I plan to do that (test a bunch of Linux distros against the draft PR) and confirm this works. I don't feel that working this out for SunOS or AIX is worth it, or feasible, since as far as I can see, Regarding AIX: There are no xz tarballs for AIX. (e.g. this is the only AIX tarball for node v12.9.1: node-v12.9.1-aix-ppc64.tar.gz.) So xz is off the table for AIX. But we can tell that the arch for AIX should always be "ppc64". That could be addressed in a separate PR.
We can always do
That's what I presumed when I began, but I just figured I'd give it a good try and see how far I get. I'm happy with progress so far. |
Here are the work-in-progress implementations:
|
Short summary: Linux looks good with these changes. (I know that's a broad thing to say, but I did test quite a few distros. All Linux distros that I tested use GNU tar. GNU tar uses xz from the PATH by default (unless configured to do otherwise; no distro appears to have done otherwise). The test added to nvh for this issue, which similarly looks for xz on the PATH, had 1:1 correlation with the given Linux distro's [GNU] tar successfully using xz to extract tarballs.) (SmartOS is a different story, since it's not Linux, and has its own (unique?) tar.) (As mentioned in previous comments, macOS uses bsdtar and acts a bit differently compared to Linux, so I would like to test on old macOS.) I have been able to test the script as updated for this issue. I tested it on quite a variety of Linux distros, including some older ones to check how far back xz support goes. Here are my results:
Legend:
~ Statuses:
Notes on OSes:
None of the operating systems/distros tested were adversely affected by the changes for this issue. (i.e. no OS had less ability to successfully download node with nvh after the changes for this issue, compared to the script as it is on the develop branch.) Some distros failed to download node with nvh, but this was due to various things unrelated to the changes proposed for this issue. (These issues could be worked around with varying degrees of difficulty.) For example: RHEL 7 was a bit wonky because it had no rsync installed. (unrelated to xz/this issue). SmartOS (what Node means when they host tarballs for "sunos") was another case entirely. (Technically not Linux, though.) It rolled its own unique (Apparently SmartOS isn't intended to be used directly as a general-purpose OS; rather, you are supposed to install VMs and containers on it. But the tarballs for SmartOS are up at nodejs.org/dist, so I thought I'd test anyway.) Would still want to test this on older macOS, in an ideal world. But I am not exactly sure if I will get the opportunity. I know you can make bootable USB installers for old macOS, so maybe I will try it in Terminal on the bootable installer. (Don't want to actually re-install macOS multiple times.) Misc note: xz support was added to GNU tar in version 1.22, back in March 2009. Here's the changelog entry. [Edited 8 September to correct/clarify results and explanatory notes for Karmic Koala.] |
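A minimal sketch of the PATH-based xz test described in the Linux results above. The function name `linux_can_use_xz` is hypothetical and is not taken from nvh; the sketch only assumes, as observed in the tested distros, that GNU tar locates `xz` through the PATH.

```shell
# Hypothetical helper (illustrative name, not from nvh).
# Succeeds (exit status 0) when an `xz` executable is found on the PATH,
# which is what GNU tar relies on to decompress .tar.xz archives.
linux_can_use_xz() {
  command -v xz >/dev/null 2>&1
}

if linux_can_use_xz; then
  echo "xz found on PATH: .tar.xz downloads look usable"
else
  echo "xz not found on PATH: fall back to .tar.gz"
fi
```

Because `command -v` is a shell builtin, the check itself does not depend on any external tool being installed.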
I was able to test on OS X. Here are my results: Click to expand.
Legend:
Notes:
OS Notes:
I found that Mavericks (OS X 10.9.x) can extract xz tarballs piped into So I believe the platform checks in place for macOS are correct. [Edited 8 September to clarify that testing was done in VirtualBox, and that nvh was tested after installing to the hard disk, rather than testing "on the iso", or in the Terminal available on the installer image.] |
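For illustration, the macOS cutoff discussed in this thread (xz allowed on 10.9 and later) could be sketched roughly as follows. The function name `macos_version_supports_xz` is hypothetical, and treating any major version above 10 as supported is an assumption made here, not something tested in the thread. The version string is read with `sw_vers -productVersion`.

```shell
# Hypothetical helper (illustrative name, not from nvh).
# Succeeds when a macOS version string such as "10.9.5" is 10.9 or later.
macos_version_supports_xz() {
  local version="$1"
  local major minor rest
  # Split "10.9.5" into major=10, minor=9, rest=5.
  IFS=. read -r major minor rest <<< "$version"
  minor="${minor:-0}"
  # Reject empty or non-numeric components.
  [ -n "$major" ] || return 1
  case "${major}${minor}" in
    *[!0-9]*) return 1 ;;
  esac
  # Assumption: any major version after 10 also supports xz.
  if [ "$major" -gt 10 ]; then
    return 0
  fi
  if [ "$major" -eq 10 ] && [ "$minor" -ge 9 ]; then
    return 0
  fi
  return 1
}

# Only query sw_vers on macOS, where that tool exists.
if [ "$(uname -s)" = "Darwin" ]; then
  if macos_version_supports_xz "$(sw_vers -productVersion)"; then
    echo "macOS tar is expected to handle xz"
  else
    echo "macOS is too old for xz; use gzip"
  fi
fi
```

Comparing the numeric components separately avoids the string-comparison trap where "10.10" would sort before "10.9".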
I have confidence in these platform checks, but I admit there is increased complexity by trying to check the platform for xz support. It is a low impact problem if it's wrong and off-by-default, but it does become a big deal if it's wrong and on-by-default. Other version managers seem to have decided it was worth it. I do think it works pretty well, but would like this to have a way to turn it off. A) just so people have control, and B) For the (seemingly unlikely) scenario where the platform checks are wrong. I feel that the environment variable, a command-line switch, or both would be adequate end-user control, which I have already made work in drafts posted to this issue. |
This comment is mostly about whether to do checks that help us when the user has installed a non-platform-default Click to expand.For what it's worth: Right now I only anticipate the platform checks to be wrong if the user has installed a non-platform-default tar that has no xz support.* I don't know why someone would do that, but... you never know? And the other scenario is if Apple breaks my macOS version check. e.g. If they increment the major version ("10") of macOS, or release/switch to a new platform other than macOS, or stop having the * (We could check for this by making sure Linux users' Again, I am really unclear why someone would go through the trouble of installing non-platform-standard Review of prior art... Do they check which tar is being used? NVS decided to check if local NVM mostly doesn't do so. see here. (I note that they only platform check by whether |
Some great research, and I love the Linux and macOS tables of tests. Good work tracking down the release note for GNU tar adding xz support. Likewise, digging into the |
Not sure if you discovered it, but there is support for trying things across multiple docker containers (currently a hard-coded list in the script).
|
The https errors are presumably due to old certificates in the docker containers. The best fix is to update the certificates, like here:
or as you mentioned, configure curl or wget or nvh to be insecure:
or change the protocol to http when defining
|
Is |
Indeed, I would also point out that running the Last note: I had it not set to xz on by default, just to make sure I was awake and testing with/without xz on all platforms. |
On second thought: If you have interest in a command-line option for xz, such as They are not in Decided questions:
Other Questions:
Research question:
|
I noticed this after doing much of the testing with LiveCDs in VirtualBox. It looks really convenient. (Probably a good candidate for CI such as Travis CI, if you wanted to run it on all Pull Requests, but I would equally understand leaving it to be run manually. Personal preferences and all.) I had tried doing (I realize now my docker setup requires running with sudo, e.g. For the tests themselves, I am happy to report they all passed on Some tests didn't work for me outside of docker, so I wrote up the details of that below (in an expandable section of this comment). Click to expand bug info
Whereas
I don't think
when I ran So I added that to the
Apparently, and perhaps because this is not the first test file being run, these installs are being conditionally skipped. I commented out the I came up with this patch to make things work outside of docker (Ubuntu Linux in my case): DeeDeeG/nvh@develop...DeeDeeG:test-patch-for-linux-outside-of-docker Unclear if this defeats some optimization for use in docker, but it doesn't break the tests in docker, at least. |
Extra nerdy research for Linux that I refrained from posting thus far (in favor of real-world tests, but this is how I knew which OS versions to test to get interesting results):
|
For ease of review, I prepared some updated implementations. They differ in how many ways the end-user can control which compression algorithm is used.
(Edit to add: Commits are separated neatly, so any of these features can easily be included in a final PR. Neither of these two extremes necessarily needs to be the final form of this feature.) |
I was able to test on OS X Mountain Lion (10.8.5). I can confirm it does not support extracting xz tarballs. (This has been added to the results chart above.) That supports a macOS cutoff of 10.9 or above, as is present in my existing implementations for this issue. For thoroughness' sake: Mountain Lion does not support extracting xz-compressed tarballs locally, via Further details: OS X Mountain Lion does not have I suppose we could test for this:
But again, I suspect this will gate the same systems, and it should be roughly equivalent. This might make it easier to support platforms other than Linux/macOS/SmartOS/AIX, but the tarballs are only prebuilt for those platforms anyway. So IMO being more obvious that we are targeting certain distros/releases of Linux and macOS makes the code more readable and perhaps more to-the-point. I like a principle of code-what-you-mean, so that the code is more readable, and to discourage expansive scope creep of the given project. Just my personal thoughts on this. And FWIW, neatly version-testing tar would take several more lines in Edit: here's an implementation with tar version checks instead of OS checks: Not really OS-neutral for macOS, since I don't believe other platforms ship libraries as I do note that the homebrew repos offer both GNU tar and
|
(I have been busy on another project. I'll try and take a look at this soon. Thanks for all the investigations!) |
|
Implementation feedback
Yes, I am also leaning towards auto-detection for xz rather than just manual. I think an env override is worthwhile. I don't feel a command-line switch is high value, but the implementation is tidy and more findable when you hit problems, so I'm OK with keeping that too. I do not want short flag of
Rather than testing for xz url using is_ok, I would prefer to test target node version to avoid the network call. I feel a bit guilty for suggesting performance over robustness, but want to minimise the network calls where reasonable. Ask questions if any of these seem dubious or unclear, I might have misunderstood. :-) |
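A rough sketch of what testing the target node version locally (instead of making a network call) might look like. The helper `version_at_least` and the variable `XZ_MIN_NODE_VERSION` are hypothetical names, and the `4.0.0` threshold is only a placeholder that would need to be confirmed against the actual download listings. The sketch assumes plain numeric `major.minor.patch` versions with an optional leading `v`.

```shell
# Hypothetical helper (illustrative name, not from nvh).
# Succeeds when dotted version $1 is greater than or equal to $2.
# Accepts an optional leading "v", as in "v12.9.1".
version_at_least() {
  local have="${1#v}" want="${2#v}"
  local h1 h2 h3 w1 w2 w3
  IFS=. read -r h1 h2 h3 <<< "$have"
  IFS=. read -r w1 w2 w3 <<< "$want"
  # Missing components count as zero, so "12" is treated as "12.0.0".
  h1="${h1:-0}"; h2="${h2:-0}"; h3="${h3:-0}"
  w1="${w1:-0}"; w2="${w2:-0}"; w3="${w3:-0}"
  # Compare component by component; the first difference decides.
  if [ "$h1" -ne "$w1" ]; then [ "$h1" -gt "$w1" ]; return; fi
  if [ "$h2" -ne "$w2" ]; then [ "$h2" -gt "$w2" ]; return; fi
  [ "$h3" -ge "$w3" ]
}

# Placeholder threshold (an assumption, not a verified cutoff).
XZ_MIN_NODE_VERSION="4.0.0"

if version_at_least "v12.9.1" "$XZ_MIN_NODE_VERSION"; then
  echo "target version is new enough to try the xz tarball"
else
  echo "target version predates the assumed xz cutoff; use gzip"
fi
```

The trade-off is the one raised above: a local comparison is cheaper than a network probe, but it is only as accurate as the hard-coded threshold.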
Responding to feedback, then will work on an implementation:
✔️ Agreed on these points. Thank you.
Okay. This is certainly doable, if a little less compact/neat in the setup phase. Will try a proof-of-concept for this in my next iteration. Wrote up a bunch below on why alternatives are more complicated than parameter expansion, but this is all theory, and it's probably not a huge deal in practice. (I think we need to use Lots of thoughts about truthiness/non-null in bash, and what to do about it in nvh (click to expand) Aside (some background info) about
|
Example implementation with true/false for I'm inclined to go with "true/false" for this feature at the moment, if only to match the rest of the script's internals, but I suppose "1/0" would also be okay. (The internals are a code style/developer consideration, but what we expect users to set their environment variables to is also a UI/UX consideration.) Might not need the Aside about performance of "full-fat string comparisons" (click to expand) As for "full-fat string comparisons"... For some reason it's been drilled into my head that "string comparisons are bad" because they're supposedly "very slow." I tried changing |
I had something like this in mind. Note, this code fragment is not passing in the target version to
|
Style: I am using https://google.github.io/styleguide/shell.xml?showone=Test,_%5B_and_%5B%5B#Test,_%5B_and_%5B%5B |
I think you have all the pieces I would want, scattered among the branches! Should I try and sort out the pieces so you can put together a PR? |
Sure, that sounds good. It's getting into the smaller details now where I think a PR makes sense. |
Re: this comment (which suggests using a single if-else[...] statement to handle Unclear if the "xz on" code path should "force" xz on, i.e. skip the minimum node version check (or xz URL Pros:
Con:
Related: If we don't distinguish between "forced on" and "auto on" later in the script, the |
Re: #8 (comment) again: I just noticed we can't use the function Working on putting together a PR. |
Interesting question, and I have been thinking about that. I started out thinking that the command line should override all checks, but have changed my mind. I think it is reasonable that the version check is independent of the auto check. The auto check looks at the system's support for xz, and can be overridden by the user due to preference or incorrect or incomplete detection. The version check reflects a limitation in the availability of that archive format. (While in theory there could be mirrors with different combinations of archive formats, that is only worth revisiting when it proves to be an actual use case.) |
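One way to picture the separation described above, as a sketch only: a user preference replaces the auto-detection result, while the version check is applied independently in every case. The function `choose_compression` and its three "true"/"false" string arguments are hypothetical and are not taken from nvh.

```shell
# Hypothetical helper (illustrative name and arguments, not from nvh).
# Usage: choose_compression USER_PREF SYSTEM_SUPPORTS_XZ VERSION_HAS_XZ
#   USER_PREF           "true", "false", or "" when the user expressed no preference
#   SYSTEM_SUPPORTS_XZ  "true" or "false" (result of platform auto-detection)
#   VERSION_HAS_XZ      "true" or "false" (whether an xz archive exists for the version)
# Prints "xz" or "gz".
choose_compression() {
  local user_pref="$1" system_supports="$2" version_has_xz="$3"
  local want_xz="$system_supports"
  # An explicit user preference overrides auto-detection only.
  if [ -n "$user_pref" ]; then
    want_xz="$user_pref"
  fi
  # The version check is independent and always applies.
  if [ "$want_xz" = "true" ] && [ "$version_has_xz" = "true" ]; then
    echo "xz"
  else
    echo "gz"
  fi
}

choose_compression "" "true" "true"        # auto-detected, archive available
choose_compression "true" "false" "false"  # user wants xz, but no xz archive exists
```

In this arrangement a wrong auto-detection can always be corrected by the user, but nothing the user sets can request an archive format that was never published for that version.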
Not a consideration at this time. On a related note, there was some support in |
Simple answer is no. I don't see issues reported from people having issues integrating with forks, including from the recent major changes. Focus on making the code simple and robust for our use case. 😄 |
In other words, trying to get io.js doesn't really work anymore? (because the "product name" would be "iojs" rather than "node" or something like that?) I am trying setting iojs.org as the custom mirror at the moment, but it fails with this: whereas the proper url would have v[num]/iojs-v[num]-[os]-[arch].[ext] |
A musing, rather than a suggestion. If the internal and external formats for a variable are different, then it may be appropriate to use a different name. For example if externally It is convenient to reuse the name when supplying a default and doing minor normalising, like with NVH_NODE_MIRROR. |
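To make the musing concrete, here is a small sketch in which the external setting keeps the name `NVH_USE_XZ` while the normalised internal value lives under a different name. The internal name `USE_XZ`, the function `normalize_xz_setting`, and the set of accepted spellings ("true"/"1" and "false"/"0") are all assumptions for illustration.

```shell
# Hypothetical helper (illustrative name, not from nvh).
# Maps an external setting to the internal "true"/"false" form.
# Prints an empty string when the value is unset or unrecognised,
# meaning "no user preference; leave it to auto-detection".
normalize_xz_setting() {
  case "$1" in
    true|1)  echo "true" ;;
    false|0) echo "false" ;;
    *)       echo "" ;;
  esac
}

# External name: NVH_USE_XZ (user-facing environment variable).
# Internal name: USE_XZ (hypothetical), always "true", "false", or empty.
USE_XZ="$(normalize_xz_setting "${NVH_USE_XZ:-}")"
echo "internal USE_XZ='${USE_XZ}'"
```

Keeping the two names distinct means later code never has to guess whether it is looking at the raw user input or the normalised value.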
It was more a general comment that |
Adapted from this comment: shadowspawn#8 (comment) Co-authored-by: John Gee <[email protected]>
Skips node version checks, which would otherwise gate xz usage. In other words: Overrides any checks that would normally be run, so that nvh always attempts to use xz. --- Partly adapted from this comment (with liberties taken on implementation): shadowspawn#8 (comment) Co-authored-by: John Gee <[email protected]> --- Also updates "--use-xz" to not set the "xz forced on" state. (This matches the flag's description in the help messages.)
Preference for xz downloads, if detected, released in Thank you for your contributions. |
Hi again.
I recently made the pull request to add an xz compression option to n. I noticed NVH doesn't have this option at the moment.
I wondered about the way to do it "properly," since it's a bit obscure the way it's implemented in n right now. As such, I decided to open an issue and not jump straight to a pull request this time.
The feature could be added similarly to how it was done over at tj/n: read NVH_USE_XZ from the environment, and decide to use xz or gzip based on that.
But I also thought about the following:
Overall, I figured that working it out in this repo would be a low-pressure way to do xz compression the "right way". As a fork, there is a bit more freedom. And I suppose whatever solution is decided here can probably be ported to tj/n relatively easily if desired.
If you have thoughts on the preferred way to do this, I would be interested in working on and/or collaborating on an implementation.
Best regards.