Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

setup.sh: substitute uses only valid bash names #14907

Merged
merged 2 commits into from
May 13, 2016

Conversation

Profpatsch
Copy link
Member

@Profpatsch Profpatsch commented Apr 22, 2016

bash variable names may only contain alphanumeric ASCII-symbols and _,
and must not start with a number. Nix expression attribute names however
might contain nearly every character (in particular spaces and dashes).

Previously, a substitution that was not a valid bash name would be
expanded to an empty string. This commit introduce a check that throws
a (hopefully) helpful error when a wrong name is used in a substitution.
That was discovered by chexxor when it happened to me in a substitution.

Additionally, substituteAll previously filtered out all environment
variables starting with an uppercase letter. That increased the number
of silently(!) ignored envvars (and nix attributes) even further.
Because it was quite arbitrary, it is removed.

This will trigger a mass rebuild.

@mention-bot
Copy link

By analyzing the blame information on this pull request, we identified @edolstra, @vcunat and @urkud to be potential reviewers

@abbradar abbradar added the 0.kind: enhancement Add something new label Apr 23, 2016
@vcunat vcunat added the 1.severity: mass-rebuild This PR causes a large number of packages to rebuild label Apr 23, 2016
@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

I'm convinced the filtering is meant to reduce the risk of accidental substitution. It's a convention that standard global env vars are all-uppercase, and typically you have large amount of those in the env. On the other hand, the main intended purpose of substituteAll seems to act on converted nix variables which rarely start with an uppercase character.

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

The first change seems good to me. It's a bit unfortunate for subsituteAll that we prefer to use dashes in nix bindings. For example, I now tested passing a dashed attribute to a derivation and it's just not added to the env silently (or invisible to bash).

@Profpatsch
Copy link
Member Author

Profpatsch commented Apr 23, 2016

I'm convinced the filtering is meant to reduce the risk of accidental substitution.

@aszlig brought the same objection. However in my opinion the worst an automatic computing system can to is silently ignore user input.
So unless you can think of a way to throw an error for these cases I’m in strong favor of risking wrong substitutions.
Even better would be to have a clean namespace for these substitutions in the first place. Maybe that can be done, too.

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

If we do not change this filtering, we should at least document it in http://nixos.org/nixpkgs/manual/#fun-substituteAll

To me there seems a more pressing "silent" problem: if you forget to put $var into env or make a typo somewhere, any @var@ occurrences just won't be substituted. We might make substituteAll check afterwards that no such references have remained.

@Profpatsch
Copy link
Member Author

if you forget to put $var into env or make a typo somewhere, any @var@ occurrences just won't be substituted.
We might make substituteAll check afterwards that no such references have remained.

How about files where @something@ is intended to be left as-is? It’s the classic “symbol-space used by multiple layers” problem.

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

I don't think you should use substituteAll in such cases, as anyone can define $something for some other reason.

bash variable names may only contain alphanumeric ASCII-symbols and _,
and must not start with a number. Nix expression attribute names however
might contain nearly every character (in particular spaces and dashes).

Previously, a substitution that was not a valid bash name would be
expanded to an empty string. This commit introduce a check that throws
a (hopefully) helpful error when a wrong name is used in a substitution.
@Profpatsch Profpatsch force-pushed the setup-bash-names branch 2 times, most recently from bc6fffd to 3bb93c1 Compare April 23, 2016 16:03
@Profpatsch
Copy link
Member Author

Profpatsch commented Apr 23, 2016

@vcunat I added the filter again and included documentation.

The first fallout of the changes is already visible, the documentation travis build is failing. If I did everything right, all of these failures should have been bugs from the get-go, though.
Maybe we should create a hydra branch to fix all those first?

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

Are you sure about the failures? It seems like stdenv won't even bootstrap.

building path(s) ‘/nix/store/r6p6ipkqaf7ivbszwf25nnwf16fvr604-bootstrap-gcc-wrapper’

substitution variables must be valid bash names, "ccLDFlags+" isn't.

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

OMG, the implementation has been crappy all along – it uses env call to get all env variables, but that isn't one-var-per-line if there are multi-line strings in the vars...

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

Let me properly test a fix for that.

@vcunat
Copy link
Member

vcunat commented Apr 23, 2016

It got to building the final gcc so far, so it's probably OK. @Profpatsch: checkout Profpatsch#1 ;-)

The filtering of environment variables that start with an uppercase
letter is documented in the manual.
@Profpatsch
Copy link
Member Author

@vcunat I changed it, should we make Travis build everything? :)

@Profpatsch
Copy link
Member Author

@vcunat What’s the next steps here? Should we merge into staging?

@joachifm
Copy link
Contributor

@Profpatsch sounds like the next logical step

@abbradar abbradar mentioned this pull request Apr 25, 2016
7 tasks
@abbradar
Copy link
Member

Closed by 62616ec. Thanks!

@abbradar abbradar closed this Apr 25, 2016
@vcunat
Copy link
Member

vcunat commented Apr 28, 2016

I think this PR is what causes that mass-breakage on darwin: http://hydra.nixos.org/build/34945843/nixlog/1/raw I assume the bash used at that point behaves differently, or something similar, as there seem to be no larger problems on the two linux platforms.

@vcunat
Copy link
Member

vcunat commented Apr 28, 2016

Unfortunately, this leaves me in kind-of stuck situation, as I can't well test darwin stuff and I don't like bringing that mass failure to master either...

@Profpatsch
Copy link
Member Author

Hm, do you think the set does not work the same on darwin bash? That’s nearly all we changed. Maybe someone on Mac can post an output of set?

@Profpatsch
Copy link
Member Author

And maybe an output of

for e in $(set); do
  if ! [[ "$varName" =~ ^[a-zA-Z_]+[a-zA-Z0-9_]*$ ]]; then
    echo "$varName"
  fi
done

@aszlig
Copy link
Member

aszlig commented Apr 28, 2016

I think using set instead of env is very dangerous, because it not only shows exported variables but also shell-internal variables. In addition to that it handles different types with different syntax.

What I think is the problem on Darwin is that the contents of a function is recognized as a variable name here. For example if I run set within bash on my system I get the following along the lines:

command_not_found_handle () 
{ 
    local p=/run/current-system/sw/bin/command-not-found;
    if [ -x $p -a -f /nix/var/nix/profiles/per-user/root/channels/nixos/programs.sqlite ]; then
        $p "$@";
        if [ $? = 126 ]; then
            "$@";
        else
            return 127;
        fi;
    else
        echo "$1: command not found" 1>&2;
        return 127;
    fi
}

While command_not_found_handle isn't usually set inside stdenv, other functions may have been set and whenever there is a line within that function without prior whitespace it might end up getting included in substitute.

@aszlig
Copy link
Member

aszlig commented May 13, 2016

@abbradar, @vcunat: What's the minimum bash version we should aim for if we want to support Darwin as well?

@vcunat
Copy link
Member

vcunat commented May 13, 2016

I don't know, but in this case the largest stumbling block seems that env doesn't support -0 at some point during the bootstrap.

@vcunat vcunat merged commit a2d38bc into NixOS:master May 13, 2016
@aszlig
Copy link
Member

aszlig commented May 13, 2016

@vcunat: Hm, what about doing that portion in C instead... like this (untested):

#include <stdio.h>

int main(int argc, char **argv, char **envp) {
    char *tmp = NULL;
    fputs("local -a args=(", stdout);
    while (*envp != NULL) {
        if (tmp != NULL) putchar(' ');
        fputs("'--subst-var' '", stdout);
        tmp = *envp++;
        while (*tmp != 0 && *tmp != '=') {
            if (*tmp == '\'')
                fputs("'\\''", stdout);
            else
                putchar(*tmp);
            tmp++;
        }
        putchar('\'');
    }
    puts(")");
    return 0;
}

This could then be simply evaled and we directly get out the args variable, although it feels a bit like breaking a fly on the wheel :-D

@aszlig
Copy link
Member

aszlig commented May 13, 2016

OTOH... what about replacing the whole set of substitute* functions by a single small C program (substitute with the same functionality as before and substitute -i as an alias for substituteInPlace and substitute -a as an alias for substituteAll)? I mean the functionality is very simple and we don't need to make it ugly to read by even uglier shell workarounds.

@vcunat
Copy link
Member

vcunat commented May 13, 2016

Yeah, writing a C utility for all those substitute* functions seems a good longer-term goal. (Github won't allow to reopen this PR anymore.)

@vcunat
Copy link
Member

vcunat commented May 13, 2016

We might finally make it an error when substituteInPlace does no changes :-)

@aszlig
Copy link
Member

aszlig commented May 13, 2016

@vcunat: Mhm, that might be a good idea for a lot of these sed calls as well. But let's first look if there is already something that does what we want before we start writing our own.

@domenkozar
Copy link
Member

We really need a good DSL like Elm wraps JavaScript, but for Shell. Something simple, but well typed.

@aszlig
Copy link
Member

aszlig commented May 13, 2016

@vcunat
Copy link
Member

vcunat commented May 13, 2016

If you know a good typed shell-suitable language, please, tell me. (Getting further from the topic, I'm afraid.)

@eikek
Copy link
Member

eikek commented May 13, 2016

Scala with https://github.com/lihaoyi/Ammonite ;-)

@Profpatsch
Copy link
Member Author

lol scala.
Let’s put everything in a JVM and call it Noracle.

@domenkozar
Copy link
Member

Please, let's move this discussion elsewhere (sorry for bringing it up). Let's focus on the goal of the PR. Thank you!

@copumpkin
Copy link
Member

I'm getting a bunch of

building path(s) ‘/nix/store/cx3az6zxzzzyxzrk14bda2x9k6vb8v20-clang-wrapper-3.8.0’
WARNING: substitution variables should be valid bash names,
  "ccLDFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccCFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccLDFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccCFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccLDFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccCFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccLDFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccCFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccLDFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccCFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccLDFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.
WARNING: substitution variables should be valid bash names,
  "ccCFlags+" isn't and therefore was skipped; it might be caused
  by multi-line phases in variables - see #14907 for details.

Seemingly from:

      ccCFlags+=" -B${cc_solib}/lib"

in cc-wrapper.nix.

@edolstra
Copy link
Member

@copumpkin Seeing the same thing.

@vcunat
Copy link
Member

vcunat commented Oct 16, 2016

Yes, explained in 577fefe1d36. (We've always been attempting those bogus substitutions, only it was silent.)

@copumpkin
Copy link
Member

@vcunat I'm not sure I understand what's actually happening here. Does that mean someone's trying to pass in a key called "ccCFlags+" from the Nix side? Isn't my example above just using the += bash notation?

@vcunat
Copy link
Member

vcunat commented Oct 21, 2016

@copumpkin: it's a false positive. The problem is that the string is in a multi-line variable (a build phase) and env separates the listing by newlines. As seen above, I wanted to fix that by using env -0 but that didn't work out as the parameter wasn't supported early in bootstrap (for Darwin, IIRC).

@copumpkin
Copy link
Member

Ah yes, if you use /usr/bin/env, we have no control (literally, the OS wouldn't let us overwrite it even if we wanted to) over where that points in Darwin 😦

@vcunat
Copy link
Member

vcunat commented Oct 21, 2016

In this case env was just searched on current $PATH.

@copumpkin
Copy link
Member

Hmm, then I'm confused, because we use the usual GNU coreutils everywhere inside of Nix:

[nix-shell:~]$ uname -a
Darwin truffles 16.0.0 Darwin Kernel Version 16.0.0: Mon Aug 29 17:56:20 PDT 2016; root:xnu-3789.1.32~3/RELEASE_X86_64 x86_64 i386 MacBookPro11,3 Darwin

[nix-shell:~]$ env -0 | head -n 1
GHC_DOT_APP=/Applications/ghc-7.10.2.appTERM_PROGRAM=Apple_TerminalSSL_CERT_FILE=/Users/copumpkin/.nix-profile/etc/ca-bundle.crtNIX_CC=/nix/store/gppkhlphb093bq35l4836s378fra84w5-clang-wrapper-3.7.1__sandboxProfile=(allow file-read* (literal "/usr/lib/libncurses.5.4.dylib"))

@vcunat
Copy link
Member

vcunat commented Oct 21, 2016

@copumpkin: yes, but it wasn't so at some point during bootstrapping, at least according to my comment on 81df035. You can try before that commit yourself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: enhancement Add something new 1.severity: mass-rebuild This PR causes a large number of packages to rebuild
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants