Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readlink has no -e option on busybox/alpine, realpath has no options at all #49

Closed
pawamoy opened this issue Nov 29, 2018 · 28 comments
Closed

Comments

@pawamoy
Copy link
Contributor

pawamoy commented Nov 29, 2018

The resolve_link functions do not work on Alpine / Busybox (Alpine uses Busybox), because
Busybox's realpath has no option at all and its readlink just has the following (not -e):

BusyBox v1.28.4 (2018-07-17 15:21:40 UTC) multi-call binary.

Usage: readlink [-fnv] FILE

Display the value of a symlink

	-f	Canonicalize by following all symlinks
	-n	Don't add newline
	-v	Verbose

So I guess we'll have complexify the function a bit. Also I noticed you did not update the resolve_link function in basher, it still uses greadlink.

Here is a first draft:

resolve_link() {
  if type -p realpath >/dev/null; then
    realpath "$1"
  elif type -p readlink >/dev/null; then
    readlink -f "$1"
    # -e is actually not mandatory for our use-case, but -f is, to resolve . and ..
  # else ??
  fi
}
@juanibiapina
Copy link
Member

juanibiapina commented Nov 30, 2018

Maybe just:

resolve_link() {
  if type -p realpath >/dev/null; then
    realpath "$1"
  else
    readlink -f "$1"
  fi
}

I didn't update resolve link in the binary file because it's copied over from rbenv and if it matches their use case, it should match ours.

@pawamoy
Copy link
Contributor Author

pawamoy commented Nov 30, 2018

I didn't update resolve link in the binary file because it's copied over from rbenv and if it matches their use case, it should match ours.

Yep I saw that while experimenting, it shouldn't be changed indeed 🙂

I like defaulting to readlink. I can open a PR if you want! We only need to update the function in basher-link and basher-link.bats

@pjeby
Copy link
Contributor

pjeby commented Dec 1, 2018

FYI, bashup/realpaths has a well-tested ultraportable microlibrary for handling symlink resolution that requires only readlink with no args, and supports both OS X and busybox. It's CC0 so you can literally copy and paste whatever you want from it, and for your usecase you only need the first 31 lines (or just the first 20 if you don't care about absolute paths).

@pawamoy
Copy link
Contributor Author

pawamoy commented Dec 1, 2018

It's up to @juanibiapina. Using your library can be a good idea for portability reasons. Though we would have to decide how we use it (copy/paste in two files? share code in one file and source it from elsewhere?) and I'd leave that design decision to @juanibiapina too 🙂

Also, could be nice to update the Travis config to test on different distributions!

@juanibiapina
Copy link
Member

For now I'd say to go with the simpler solution. If we ever need even more portability, we can revisit this decision.

The major difficulty I see is how to include dependent code in basher itself without complicating the installation process. I've tried this before and couldn't find a good way. We could always copy the code, but then this create other kinds of problems.

@pjeby
Copy link
Contributor

pjeby commented Dec 3, 2018

I've got a solution for that, too. 😉

The way my projects do dependencies like that, is that during development, dependencies are checked out in a project-local, git-ignored basher directory (.deps/), and then any needed files are copied by a build script to the distribution directories and saved in revision control. To update a dependency, you just check out and build the project, then commit the copied file(s). This means that a plain checkout is still a valid, ready-to-use distribution, without needing --recursive. But you can still easily update the dependency.

The bashup/.devkit toolset makes this whole process easy to do; in fact it's a big part of what it was made for. A project using .devkit that needed to add realpaths would just add this to package.sh:

# BUILD_DEPS is a :-separated list, can use '@ref' to pin a branch or tag
# These projects' files can then be accessed under .deps/, and their BINS under .deps/bin
BUILD_DEPS=bashup/realpaths

And this to the end of .dkrc:

# Copy updated dependency to libexec
on build cp .deps/bin/bashup.realpaths libexec/

And then anybody who checked out the package could update the dependency by running script/build in a fresh checkout, and committing the change. In this way, your existing distribution process isn't affected, but you're not reliant on manual copy-paste to follow changes. You can also use script/clean at any time to get rid of cached dependencies and force an update.

Of course, you don't need .devkit in order to do all that; you can just add stuff to your makefile to do the same things!

The real point is just that this overall approach lets you bundle dependencies in git, without needing to copy-and-paste the updates, and without needing --recursive checkouts. Someone who only needs to use basher can still clone basher and run it; the dependency-fetching and copying would only happen when someone is developing basher, and they run the appropriate makefile target or .devkit script/.

@juanibiapina
Copy link
Member

I like some of the ideas there. I'll keep that in mind!

@kadaan
Copy link

kadaan commented Feb 2, 2019

@juanibiapina What about falling back to perl for at least the osx (or any system with perl) case...

resolve_link() {
  if type -p realpath >/dev/null; then
    realpath "$1"
  else
    directory="$(readlink -f "$1" 2>&1)"
    if [ "$?" -ne "0" ] && [ "${directory#*readlink: illegal option -- f*}" != "$directory" ]; then
      if type -p perl >/dev/null; then
        perl -MCwd -le 'print Cwd::abs_path shift' "$1";
      else
        echo "$directory"
      fi
    else
      echo "$directory"
    fi
  fi
}

Might even be able to just try perl if the readlink -f call fails without checking for the failure message readlink: illegal option -- f. That would probably be fine, but might slow things down for readlink -f calls that were already returning errors other than illegal option -- f.

@pawamoy
Copy link
Contributor Author

pawamoy commented Feb 2, 2019

Well, perl is not available in official Bash Docker images 😕

@kadaan
Copy link

kadaan commented Feb 2, 2019

@pawamoy but in that case, readlink -f will work. And if not, the type -p perl will fail and the current behavior (failing tests...) will occur.

@pjeby
Copy link
Contributor

pjeby commented Feb 2, 2019

True, but if you're writing that much code you might as well use something like:

resolve_link() {
	local target srcdir resolved
	while [[ -L "$1" ]] && target="$(readlink -- "$1")"; do
		srcdir="$(dirname "$1")"
		# Resolve relative to symlink's directory
		[[ $srcdir != . && $target != /* ]] && resolved=$srcdir/$target || resolved=$target
		# Break out if we found a symlink loop
		for target; do [[ $resolved != "$target" ]] || break 2; done
		# Add to the loop-detect list and tail-recurse
		set -- "$resolved" "$@"
	done
	cd "$(dirname "$1")" >/dev/null
	echo "$PWD/$(basename "$1")"
}

which isn't much bigger and only needs readlink.

@kadaan
Copy link

kadaan commented Feb 2, 2019

Nice code. In this case you would need a lot more tests. An alternative to add OS X support easily is to check for greadlink, that way I can install core utils with brew and things will just work. I might be wrong, but it looks like that is what bats does.

@pjeby
Copy link
Contributor

pjeby commented Feb 2, 2019

The code I posted is a rip-off of realpath.follow from realpaths, which has a rather thorough test suite. So copying the original code instead of using the ripped-off version would also work, without requiring brew.

Conversely, checking for a bunch of different alternatives to readlink means tests need to be run on those OSes, but plain readlink -- X is the same everywhere you go, even OS X without homebrew.

@kadaan
Copy link

kadaan commented Feb 2, 2019 via email

@kadaan
Copy link

kadaan commented Feb 2, 2019

BTW, the greadlink version, to add OS X support, is much shorter:

resolve_link() {
  if type -p realpath >/dev/null; then
    realpath "$1"
  elif type -p greadlink >/dev/null; then
    greadlink -f "$1"
  else
    readlink -f "$1"
  fi
}

@pjeby
Copy link
Contributor

pjeby commented Feb 2, 2019

Technically, that's homebrew support, not actually OS X support. When used on OS X without greadlink, the above code fails and then the code calling resolve_link silently breaks in some fashion. The slightly-longer code to emulate readlink -f would work correctly on OS X without homebrew or greadlink. And in the event that readlink itself isn't there, it fails safely by at least returning the unresolved link.

Currently, basher has three separate partial implementations of functionality from realpaths, and none of them are directly tested. It would probably make more sense to embed realpaths in the main executable and then use the functions elsewhere, e.g. realpath.location to find basher's executable directory, realpath.resolved to get a source directory for linking, and realpath.canonical to do file path comparisons in the tests.

Come to think of it, looking more closely at where symlink resolution is used in basher link, I'm not sure it should be doing symlink resolution at all! Tools usually shouldn't flatten symlinks implicitly because there may be reasons for the original source to be accessed via symlink.

For example, one hosting provider I had would sometimes move my shell account to a different box, with different physical hard drive names, and then every tool I used that blindly unpacked symlinks quit working (I'm looking at you, Resilio Sync!) because it was pointing to physical directories that no longer existed. If the tool just took what I gave it in the first place (a virtual path), that problem wouldn't have happened.

That would basically leave the following two uses for symlink resolution:

  • Finding the basher executable directory
  • Comparing canonical paths in the tests

For the former usecase, the relevant bits of realpath can be embedded, and for the latter, they can be sourced. But with it embedded in the main basher executable, the functions could be made available to child processes via export -f, so at least duplication in other commands wouldn't be needed.

At this point, though, we're just going back and forth so I'm going to stop here. It's up to the project devs how to resolve this (no pun intended). 😉

@juanibiapina
Copy link
Member

Thanks for all comments and discussions. This is really great. @pjeby great summary!

I'll continue advocating for less code, and probably no copy and paste, as long as the use cases are very simple. It hasn't so far been a requirement, so I'm unwilling to introduce complexity. If rbenv can make do, so should we.

@kadaan
Copy link

kadaan commented Feb 3, 2019

@juanibiapina @pawamoy Everything you guys are saying makes sense, but at the moment an entire platform is unusable without forking basher, even though there is a simple change that can make it possible to support that platform. I'd advocate that adding greadlink support doesn't increase complexity and gives the possibility for OS X support.

@juanibiapina Given the reference to rbenv, it (and all derivatives) work on OS X because of code like READLINK=$(type -p greadlink readlink | head -1). If basher does something similar, OS X is capable of working today.

@juanibiapina @pawamoy At the moment, alpine support for basher can be added by installing coreutils. So, just like OS X, support for basher is possible right now. Even better than OS X, this can be done with zero code changes to basher.
kapture 2019-02-03 at 9 46 15

I would argue that the overhead of adding coreutils in minimal enough that until a better solution is agreed upon, alpine has the capability of running basher

Before installing coreutils

$ docker ps -s
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES               SIZE
7c8ae7a23230        alpine:latest       "/bin/sh"           18 seconds ago      Up 17 seconds                           alpine              0B (virtual 5.53MB)

After installing coreutils

$ docker ps -s
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES               SIZE
7c8ae7a23230        alpine:latest       "/bin/sh"           58 seconds ago      Up 57 seconds                           alpine              2.47MB (virtual 8MB)

@pawamoy
Copy link
Contributor Author

pawamoy commented Feb 3, 2019

@kadaan I think we started wrong. It wasn't clear to me that Basher is not working (or partially not working) on OSX. Maybe you could open a new issue describing the problem, so we can start fresh again? If I understood correctly, the current resolve_link function does not work on OSX?

Also @pjeby made a really interesting point, asking if it is even needed to resolve a symlink? At first when I opened a PR to change basher link, I wanted to resolve the path given as argument to basher link so we could actually use . and .. as argument, but maybe they are the only two cases where we want to resolve it to a canonical path!

Let me know if I got things wrong of course!

(also @kadaan thanks for the note about coreutils, I didn't think of that and it's actually very handy, I would totally be OK to install coreutils if needed in my Docker images. I would prefer that it works by default everywhere of course, including OSX 🙂 )

@pjeby
Copy link
Contributor

pjeby commented Feb 3, 2019

It doesn't need a "canonical" path even in those cases, just an absolute path, which can be easily had via cd dirname; pwd. And the parent basher process only needs to follow symlinks in order to find its executable directory, which the code I gave suffices for.

As for copy-paste, basher already has copy-pasted this very function into three different files. So it would probably be be a good idea to have less copy-pasting. An easy way to do that is to put the function in a file, say libexec/follow-link, where it can be sourced by commands, e.g.:

#!/usr/bin/env bash

follow_link() {
    # ...
}

abs_dirname() {
    # ...
}

if [[ $0 == "$BASH_SOURCE" ]]; then
    bin_path="$(abs_dirname "$0")"
    export PATH="${bin_path}:${PATH}"
    source "${bin_path}/basher" "$@"
fi

This file can now be sourced by the tests and basher-link. And by making it the symlink target of bin/basher, the bootstrap is done from the same place.

Voila, no more copy-paste, regardless of which of the many proposed versions of follow_link are used.

@pjeby
Copy link
Contributor

pjeby commented Feb 4, 2019

I've now created a proof-of-concept PR for this approach (#60), which eliminates the existing code duplication, uses absolute paths for the base dir of basher link (instead of canonical physical paths), and should work with busybox, docker alpine bash (without needing coreutils), and OS X (without needing homebrew or greadlink). It passes all existing tests, and works from a plain checkout with no changes to the existing installation steps.

The PR actually reduces the total lines of non-vendor code in basher, as well as eliminating any dependency on platform-specific tools like realpath and greadlink for symlink resolution.

Hopefully, having a concrete PR will aid in a prompt resolution (no pun intended 😉) to the issue of cross-platform symlink resolution. If anyone would like to try it out for themselves, you can get a copy via:

git clone -b realpaths https://github.com/bashup/basher

Please let me know if you find any issues not covered by the automated tests.

@kadaan
Copy link

kadaan commented Feb 4, 2019

@pjeby Works great when I run it on OS X. I do get this logging on the first run of make:

$ make
echo "#!/usr/bin/env bash" >libexec/basher.realpath
echo "# This file is automatically generated by makefile; do not edit!" >>libexec/basher.realpath
echo >>libexec/basher.realpath
cat vendor/bashup/realpaths/realpaths libexec/basher.stub >>libexec/basher.realpath
chmod +x libexec/basher.realpath
bats tests
 ✓ without arguments prints usage
 ...

Seems like this is new. What is the purpose?

@pjeby
Copy link
Contributor

pjeby commented Feb 4, 2019

It's rebuilding the bootstrap script that's used to find the real location of the libexec directory at runtime. I've now changed it to be a separate make target (make vendor), as it doesn't really need to be updated every time somebody checks out the code for development.

@juanibiapina
Copy link
Member

@kadaan I use basher o OSX, why do you say it is not supported?

@kadaan
Copy link

kadaan commented Feb 7, 2019

@juanibiapina Because without installing coreutils and overwriting the default install of readlink the basher-link commands don't work.

@juanibiapina
Copy link
Member

got it. So coreutils has always been an implicit requirement it seems. It's probably my fault because I never have a mac without it.

@kadaan
Copy link

kadaan commented Feb 7, 2019

Cool. The issues might be that coreutils is required, but not documented, and that it doesn't look for greadlink, expecting that coreutils was installed over the existing tools.

@pjeby
Copy link
Contributor

pjeby commented Feb 7, 2019

It's only been an implicit requirement for link and the tests, though: if you didn't use either of those, the abs_dirname function in basher still works -- and it's effectively a pure-bash emulation of dirname "$(readlink -f "$1")". (So there's already a workaround for the lack of realpath/readlink -f.)

If the coreutils requirement is documented, then the stuff in the main basher script can go away too, since it's emulating `dirname "$(realpath "$1")" and could just be replaced with using that directly.

However, since there's already emulation of part of coreutils in basher, I don't see a point to adding the requirement; to me it makes more sense to just use a better emulation, and to do it without copying the code into each subcommand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants