fix(run): improve escaping for script arguments #4135

Merged
merged 1 commit into from
Oct 3, 2017

rhendric
Contributor

Summary

Extra command-line arguments to scripts were not being escaped correctly. This patch adds robust shell quoting logic for both Windows and Linux/macOS.

Test plan

On *nix, create a package.json containing "scripts":{"echo":"echo"}. Run yarn run -s echo -- '$X \"blah\"'. Expect to observe  \blah\ prior to this patch, and $X \"blah\" after it.

Testing on Windows should be similar, but may require fancier escaping to get the arguments into yarn in the first place. (I don't have access to a Windows box to verify the exact procedure to follow, sorry—but I did confirm that my automated tests succeed in AppVeyor.)

@rhendric rhendric force-pushed the master branch 6 times, most recently from 72bb441 to 156bb01 on August 10, 2017 13:02
Member

@BYK BYK left a comment

Thanks a lot for doing this!

function quoteForCmd(arg: string): string {
// See the below blog post for an explanation of what's going on here:
// eslint-disable-next-line max-len
// https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

Member

Hah! I'm glad there is another one who read the same article 🗡

I think you should release this as a library so yarn and others can use it from there because I've yet to see a proper Windows-compatible shell escaping library on NPM.

Contributor Author

I completely agree; I started writing that library this morning. I was a little shy about writing it first and then proposing that yarn take a dependency on a 0.0.1 library that nobody's used yet.

Member

Well then, add lots of tests and have a high coverage and mark it as 1.0 ;)
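
For readers who have not seen the linked article, its core rule for arguments parsed by the Microsoft C runtime can be sketched roughly as below. This is only an illustration, not the code in this PR: `quoteForWindowsArgv` is a hypothetical name, and the sketch covers just the argv-level quoting. The article also describes a further step of caret-escaping `cmd.exe` metacharacters, which is left out here.

```javascript
// Hypothetical helper (not the PR's quoteForCmd): quotes a single argument so
// that programs following the MSVCRT / CommandLineToArgvW parsing rules read
// it back unchanged. It does NOT protect against cmd.exe metacharacters.
function quoteForWindowsArgv(arg) {
  // Non-empty arguments without whitespace or double quotes pass through.
  if (arg !== '' && !/[\s"]/.test(arg)) {
    return arg;
  }
  const escaped = arg
    // Backslashes directly before a quote are doubled, then the quote is escaped.
    .replace(/(\\*)"/g, '$1$1\\"')
    // Trailing backslashes are doubled so they cannot escape the closing quote.
    .replace(/(\\*)$/, '$1$1');
  return `"${escaped}"`;
}
```

Backslashes that are not followed by a double quote are left alone, which is why ordinary Windows paths survive without extra escaping.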

return arg;
}

const quoteForShell = process.platform === 'win32' ? quoteForCmd : arg => `'${arg.replace(/'/g, "'\\''")}'`;

Member

Maybe we should have the following in constants.js:

export const IS_WINDOWS = process.platform === 'win32';

What do you think?

Member

Looks like there's a bit more to bash escaping too: https://github.com/xxorax/node-shell-escape/blob/master/shell-escape.js

Contributor Author

Sounds reasonable to me. There are a few instances of that check throughout the code; I looked to make sure there wasn't some other pattern already in place that I should follow.

If I understand correctly, I think those extra replaces in node-shell-escape are just cosmetic; they remove some redundant quotes but don't affect the final interpreted string. I'll probably add something like that though if I do go and release this as a library.

Member

> Sounds reasonable to me. There are a few instances of that check throughout the code; I looked to make sure there wasn't some other pattern already in place that I should follow.

On second thought, I think that change deserves its own diff so I'll work on it. For now, your diff is fine. :)

> If I understand correctly, I think those extra replaces in node-shell-escape are just cosmetic; they remove some redundant quotes but don't affect the final interpreted string. I'll probably add something like that though if I do go and release this as a library.

Fine by me if you say they are safe. It's frightening for me to realize that I know more about Windows escaping than Bash escaping at this point :D
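
The POSIX branch of `quoteForShell` discussed in this thread works because nothing is special inside single quotes, so the only character that needs care is the single quote itself: it is closed, emitted as `\'`, and reopened. A small sketch of that idea follows. The names `quoteForPosixShell` and `roundTrip` are illustrative, and the round-trip helper assumes a POSIX `sh` is on the PATH.

```javascript
const { execFileSync } = require('child_process');

// Same expression as the non-Windows branch in the diff, under an
// illustrative name: wrap in single quotes and rewrite each ' as '\''.
function quoteForPosixShell(arg) {
  return `'${arg.replace(/'/g, "'\\''")}'`;
}

// Feeds the quoted argument through a real shell and returns what the
// command received. Assumes `sh` exists, so this is not usable on Windows.
function roundTrip(arg) {
  const script = `printf '%s' ${quoteForPosixShell(arg)}`;
  return execFileSync('sh', ['-c', script], { encoding: 'utf8' });
}
```

If the quoting is right, `roundTrip(x)` should give back `x` for any string without NUL bytes, including the `$X \"blah\"` case from the test plan.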

function joinArgs(args: Array<string>): string {
- return args.reduce((joinedArgs, arg) => joinedArgs + ' "' + arg.replace(/"/g, '\\"') + '"', '');
+ return args.length ? ' ' + args.map(quoteForShell).join(' ') : '';

Member

I don't think you need the args.length ? check. [].map(whatever).join(' ') still returns ''.

Contributor Author

Yes, but the entire function would return the ' ' that gets prepended to the joined string.

Member

Oh, sorry I missed that part! That said, then the name of this function should be stringifyArgs or getArgsString to convey the meaning better.

Optionally, you can pass in the first part and name the function serializeShellCommand and do something like:

function serializeShellCommand(cmd: string, args: Array<string>): string {
    return [cmd].concat(args.map(quoteForShell)).join(' ');
}

That would take care of all of it. Anyways, I'm just rambling. Just changing the name of the function would be enough for me, but I'm still a bit concerned that people may miss that extra space at the beginning or not understand the reason at first glance.
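
To make the leading-space point in this thread concrete, here is a short comparison of the three shapes discussed. It is a sketch only: `quote` stands in for `quoteForShell` (POSIX branch), and `stringifyArgsUnchecked` is a hypothetical name for the variant without the length check.

```javascript
// Stand-in for quoteForShell on non-Windows platforms.
const quote = arg => `'${arg.replace(/'/g, "'\\''")}'`;

// Shape used in the diff: a suffix to append to a command, empty when
// there are no arguments.
function stringifyArgs(args) {
  return args.length ? ' ' + args.map(quote).join(' ') : '';
}

// Without the length check, the prepended space survives for an empty list.
function stringifyArgsUnchecked(args) {
  return ' ' + args.map(quote).join(' ');
}

// Reviewer's alternative: join the command and its arguments in one step,
// so no leading space has to be remembered by the caller.
function serializeShellCommand(cmd, args) {
  return [cmd].concat(args.map(quote)).join(' ');
}
```

With an empty argument list, `stringifyArgs` yields an empty string, `stringifyArgsUnchecked` yields a single space, and `serializeShellCommand('echo', [])` yields just `echo`.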

@@ -0,0 +1,5 @@
{
"scripts": {
"write-args": "node write-args.js"

Member

Can do node -p -e "process.argv[2], process.argv.slice(3).join(' ')" instead and then read the output from stdout. I'd prefer not to write to file system if/when possible.

Contributor Author

Good suggestion. I'll give it a try.

Member

Turns out it should just be node -p "..." without the -e part.
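
A rough sketch of the stdout-based approach from this thread, outside of yarn: spawn `node -p` with some arguments and read back what the child process saw. The helper name `echoArgsViaStdout` is made up for illustration, and the child is started without a shell here, so this shows only the "read from stdout instead of a file" half of the idea, not the quoting under test.

```javascript
const { execFileSync } = require('child_process');

// Runs `node -p <expr> <args...>` directly (no shell) and returns the
// arguments as the child saw them. With -p/-e there is no script path, so
// the extra arguments start at process.argv[1].
// Arguments beginning with "-" are not handled here, since node may try to
// interpret them as its own options.
function echoArgsViaStdout(args) {
  const expr = 'JSON.stringify(process.argv.slice(1))';
  const stdout = execFileSync(process.execPath, ['-p', expr, ...args], {
    encoding: 'utf8',
  });
  return JSON.parse(stdout);
}
```

Serializing with `JSON.stringify` avoids ambiguity when an argument itself contains spaces or newlines, which a plain `join(' ')` would hide.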

@@ -13,6 +13,7 @@ jasmine.DEFAULT_TIMEOUT_INTERVAL = 90000;
const execCommand: $FlowFixMe = require('../../src/util/execute-lifecycle-script').execCommand;

const path = require('path');
const q = process.platform === 'win32' ? '"' : "'";

Member

Please rename this to shellQuotes or something. q is a very ambiguous variable name.

Contributor Author

I'm with you in general, but in this case, this guy gets interpolated several times into several strings below, and having it be a short name really helps readability in my opinion. Compare:

const args = ['cat-names', config, `${q}${script}${q} ${q}--filter${q} ${q}cat names${q}`, config.cwd];

const args = [
  'cat-names',
  config,
  `${shellQuote}${script}${shellQuote} ${shellQuote}--filter${shellQuote} ${shellQuote}cat names${shellQuote}`,
  config.cwd,
];

Easy enough to change if you still disagree though.

Member

Well, let's agree to add a comment above it stating this then and call it a day? :)

}
}
}
cases.sort(() => Math.random() - 0.5);

Member

This is a very bad way to randomize an array. Also, why do we need randomization anyways?

Contributor Author

It's bad if true randomness is required, but it's good enough for a quick scramble (would you prefer I implement/require a Knuth shuffle just for this test?). I'm randomizing because I'm not actually testing every combination; I didn't want to take too long on this test but I also wanted to get a decent sampling of the space. Basically it's a poor man's property-based test. If you think it's a good idea, I'd be happy to pull in jsverify or jssmartcheck and write a real property-based test. Alternatively, if this is a library, the heavy property checking can go there and yarn can test a much smaller sampling of cases.

Member

Any non-determinism in tests would create doubt when running things on the CI: did it fail for reals or was that a random blip? Did I catch a real edge case thanks to randomness, or did the CI have a bad network connection? So let's have this test cover all possible combinations instead of random sampling for each test run, or agree to have the same random sample for each run.
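
One way to get "the same random sample for each run" is a Fisher-Yates shuffle driven by a seeded generator instead of `Math.random()`. The sketch below is an assumption about how that could look, not what the PR ended up doing; `mulberry32` is a small, widely circulated public-domain PRNG, and `seededShuffle` is a hypothetical helper name.

```javascript
// Small deterministic PRNG (mulberry32). Returns a function producing
// floats in [0, 1); the same seed always gives the same sequence.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy of the input, using the seeded generator.
function seededShuffle(items, seed) {
  const rand = mulberry32(seed);
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}
```

Taking the first N items of `seededShuffle(cases, someFixedSeed)` would then give a sample that is scrambled but identical on every CI run.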

return run(config, reporter, flags, args);
});

jasmine.DEFAULT_TIMEOUT_INTERVAL = 30000;

Member

Why?

Contributor Author

Ugh. Something is happening here that I don't understand.

Originally I wrote this test to invoke the script serially, and that was consistently timing out on AppVeyor, even with a much higher DEFAULT_TIMEOUT_INTERVAL (2.5 minutes!), and even with only three iterations of the script invocation. Once I parallelized, the tests stopped timing out and in fact finished much faster than timeout/degree of parallelism, so there was probably something I was doing wrong with the serial implementation (I was awaiting on all the right things, I thought!), but I was nervous that there was a chance that Windows just wanted to take its time spawning child processes on that build machine sometimes. So I took this down a little but it is still higher than the current test times seem to justify. What's your general sense here; should I try to keep the value conservatively high to reduce the chance of random build failure, or lower to try to crack down on build time?

Member

Well, now I see. Actually, I think you can benefit from using test.concurrent: split your tests into individual, programmatically generated test cases and wrap them in a describe block so you can use beforeEach hooks; jest should then take care of the rest. If the tests are still slow even after that and after switching away from the file system (remember, we agreed to try using stdout instead above?), I'm okay with leaving the timeout higher.

cases.sort(() => Math.random() - 0.5);
const tempDir = await makeTemp();
const prefix = path.resolve(__dirname, '../..');
const origPrefix = process.env.PREFIX;

Member

Don't modify process.env directly, please. There should be a way to pass this to runRun.

Contributor Author

Can you give me more of a hint? I looked through the code for such a way and couldn't find a straightforward one.

Member

Search for getGlobalPrefix in the code. Spoiler: you can use flags or config to override the prefix.

@BYK
Member

BYK commented Sep 12, 2017

Hey, @rhendric! Long time no see. Any updates?

@rhendric
Contributor Author

Hi! Sorry for the delay; I still expect to finish the swing on this but I think it'll be another week or so before I have the time.

@BYK
Member

BYK commented Sep 12, 2017

No worries, just wanted to check if you're still working on it :)

@rhendric
Contributor Author

Hi, I'm back on the case! Quick question: I see that there is now a comment in __tests__/integration.js linking back to this PR, claiming that this is blocking those tests from being enabled on Windows, but I don't see the connection; those tests look like they don't use any characters that are likely to cause problems. Looks like it was added in #4152. @arcanis, or anyone, what's the story with that? Should I try to remove that gating as part of this PR?

@rhendric rhendric changed the title from "Fix: improve escaping for script arguments" to "[WIP] Fix: improve escaping for script arguments" on Sep 29, 2017
@rhendric
Contributor Author

I figured out what the deal with those tests is. It's a combination of the below:

  • the tests use native echo,
  • echo on Windows does not strip quotation marks from its arguments (in Windows, unlike in Unix-based OSes, the choice of how to handle command line argument quotation is left to the executable being invoked, not to the invoking shell—strictly speaking, all commands in Windows receive a single command line instead of an array of command line arguments), and
  • Yarn puts quotes around all of its arguments whether they need them or not.

I see three ways to get these tests passing, the first of which I'm going to go with unless I hear objections:

  1. Make Yarn not put quotes around arguments unless necessary (my library handles this, and I've put it through quite a series of torture tests). This has the advantage of being less likely to surprise the next Yarn user who tries to use echo in a script. This is also perhaps the intent indicated by leaving these tests disabled until my work here is done, because either of the below things could have been done earlier.
  2. Replace echo in these tests with something else: node -p, perhaps, or add a dependency on echo-cli.
  3. Just make the tests expect to see '"--opt"' on Windows instead of '--opt'.
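
Option 1 above, quoting only when necessary, can be sketched for the POSIX side as follows. This is not puka's implementation and not the code that landed. `quoteIfNeeded` is a hypothetical name, the set of "safe" characters is a deliberately conservative assumption, and the Windows rules would differ.

```javascript
// Leaves an argument untouched when it consists only of characters that a
// POSIX shell does not treat specially; otherwise falls back to single
// quoting. The allow-list is an assumption chosen to err on the safe side.
function quoteIfNeeded(arg) {
  if (/^[A-Za-z0-9_\/:=@%+.,-]+$/.test(arg)) {
    return arg;
  }
  return `'${arg.replace(/'/g, "'\\''")}'`;
}
```

With this shape, an argument such as `--opt` is passed through bare, so a program that prints its command line verbatim would show `--opt` rather than a quoted form, while `cat names` or `$X` would still be quoted.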

**Summary**

Extra command-line arguments to scripts were not being escaped
correctly. This patch uses puka to add robust shell quoting for both
Windows and Linux/macOS.

**Test plan**

On *nix, create a `package.json` containing `"scripts":{"echo":"echo"}`. Run
`yarn run -s echo -- '$X \"blah\"'`. Expect to observe ` \blah\` prior
to this patch, and `$X \"blah\"` after it.

Testing on Windows should be similar, but may require fancier escaping
to get the arguments into yarn in the first place.
@buildsize

buildsize bot commented Oct 2, 2017

This change will increase the build size from 9.83 MB to 9.9 MB, an increase of 73.16 KB (1%)

| File name | Previous Size | New Size | Change |
| --- | --- | --- | --- |
| yarn-[version].noarch.rpm | 848.57 KB | 856.33 KB | 7.76 KB (1%) |
| yarn-[version].js | 3.74 MB | 3.77 MB | 25.48 KB (1%) |
| yarn-legacy-[version].js | 3.79 MB | 3.82 MB | 25.45 KB (1%) |
| yarn-v[version].tar.gz | 854.27 KB | 862.21 KB | 7.94 KB (1%) |
| yarn_[version]all.deb | 645.38 KB | 651.91 KB | 6.53 KB (1%) |

@rhendric rhendric changed the title from "[WIP] Fix: improve escaping for script arguments" to "Fix: improve escaping for script arguments" on Oct 2, 2017
@rhendric
Contributor Author

rhendric commented Oct 2, 2017

@BYK, re-review please?

Member

@BYK BYK left a comment

Looks great! @arcanis can you also give it a look before merging?

@BYK
Member

BYK commented Oct 3, 2017

@rhendric thank you so much for this!

@BYK BYK changed the title from "Fix: improve escaping for script arguments" to "fix(run): improve escaping for script arguments" on Oct 3, 2017
Member

@arcanis arcanis left a comment

lgtm !

@BYK BYK merged commit 38790e8 into yarnpkg:master Oct 3, 2017
joaolucasl pushed a commit to joaolucasl/yarn that referenced this pull request Oct 27, 2017