Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallbacks to /tmp if the preferred cache folder isn't writable #4143

Merged
merged 9 commits into from
Aug 17, 2017

Conversation

arcanis
Copy link
Member

@arcanis arcanis commented Aug 10, 2017

Summary

Fixes #4141. If $(yarn cache dir) isn't writable, Yarn will crash and burn with EACCESS / EPERM. This patch adds a fallback mechanism so that we try different folders until we find one that fits. As a first step, I've added /tmp/.yarn-cache.

Test plan

Tested with

chmod 000 /Users/mael/Library/Caches/Yarn && ~/yarn/bin/yarn

@arcanis arcanis requested a review from BYK August 10, 2017 11:29
@BYK BYK self-assigned this Aug 10, 2017
src/config.js Outdated

try {
await fs.mkdirp(tentativeCacheFolder);
await fs.writeFile(path.join(tentativeCacheFolder, constants.WTEST_FILENAME), `testing write access`);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use fs.access with both read and write checks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking of it, I think this would still have issues (since the cache folders have to be assumed empty, we would have to make the check on the directory itself, which might yield different results than what we would have add on a file). However, I'll also add a read on the file to account for the read permissions. Does that sounds ok?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this with @BYK we came to the conclusion that --cache-folder would keep its current semantic (and would crash if the directory is not writable) since it would break the commonly assumed contract, potentially exposing the application to vulnerabilities (the /tmp folder could be world-readable whereas the --cache-folder wouldn't). However, we'll also add a --preferred-cache-folder option with less restrictions. Will even make it easier to test the feature :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking of it, I think this would still have issues (since the cache folders have to be assumed empty, we would have to make the check on the directory itself, which might yield different results than what we would have add on a file). However, I'll also add a read on the file to account for the read permissions. Does that sounds ok?

I don't see how this can be the case? If you can create a file in the directory it should be equivalent to fs.access with write permissions. Same goes for read. We can make the case for space exhaustion in the cache directory (if it is on a different drive etc.) but this doesn't protect us against that anyways.

Copy link
Member Author

@arcanis arcanis Aug 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can create a file in the directory it should be equivalent to fs.access with write permissions.

Yes but the file we've created (ie the file in the folder) might not be accessible. Umask, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really think we should just use fs.access here.

src/config.js Outdated
const tentativeCacheFolder = constants.PREFERRED_MODULE_CACHE_DIRECTORIES[t];

try {
await fs.mkdirp(tentativeCacheFolder);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd separate this part out and try cleaning up the cache directory if we created it but it is not usable. Kind of an edge case tho, so not feeling too strongly about this.

if (!cacheRootFolder) {
throw new MessageError(this.reporter.lang('cacheFolderMissing'));
} else {
this._cacheRootFolder = String(cacheRootFolder);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need the String() casting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there before. Probably because of Flow, I'd guess :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be we can remove it now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tried, still fails.

src/constants.js Outdated
const preferredCacheDirectories = [getDirectory('cache')];

if (process.platform !== 'win32') {
preferredCacheDirectories.unshift('/tmp/.yarn-cache');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Use os.tmpdir() please. Note that this is also platform independent.

src/constants.js Outdated
if (process.platform === 'darwin') {
return path.join(userHome, 'Library', 'Caches', 'Yarn');
preferredCacheDirectories.unshift(path.join(userHome, 'Library', 'Caches', 'Yarn'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of changes our previous logic so not sure about that. We should directly fallback to tmp when our first choice fails. With this one, it will try this on OS X and then try the Linux version second.

opts.cacheFolder || this.getOption('cache-folder', true) || constants.MODULE_CACHE_DIRECTORY,
);

let cacheRootFolder = opts.cacheFolder || this.getOption('cache-folder', true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be pushed to the top of the list we try and we should fall back to tmp at all times for resilliency. Right now this override seems to be ignored.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree - if the explicitely specified cache folder cannot be used, that's an hard error. This option contract forces us to use this folder. By (almost) silently using a different directory, we might end up with subtle bugs. For example, two Yarn process might run concurrently with different caches. In such a scenario, if for some reason their respective cache could not be reached, they would end up corrupting themselves.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the readers of this thread, if there are any: we have talked about getting a unique folder once we fallback to tmp like by using a timestamp or the PID of the process, or both, but then decided to do this: #4143 (comment)

We should still probably try to ensure a unique folder in tmp when we fallback for safety.

@BYK
Copy link
Member

BYK commented Aug 10, 2017

We also need automated tests for this.

src/config.js Outdated

let cacheRootFolder = opts.cacheFolder || this.getOption('cache-folder', true);

for (let t = 0; t < constants.PREFERRED_MODULE_CACHE_DIRECTORIES.length && !cacheRootFolder; ++t) {
Copy link
Member

@BYK BYK Aug 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& !cacheRootFolder part totally gets lost in the code. Please make this an explicit if statement for clarity.

Also half-joking, half-serious: why are we using t as the counter and not i? 😀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, but I don't really know 😆 I've been using it for single-level loop since I started programming (and u/v/w or x/y/z when multiple loops are combined).

Could be amusing to run a search on the github repos to find what variable name is preferred by developpers :)

Copy link
Member Author

@arcanis arcanis Aug 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking on stack exchange, it seems that literally noone else is using t as a variable name ... I wonder how I came to that. Meh ¯\(ツ)/¯

Copy link
Member

@BYK BYK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rejecting mostly for the /tmp usage and omitting Windows when using os.tmpdir().

@@ -70,3 +70,20 @@ test('--mutex network', async () => {
execa(command, ['add', 'foo'].concat(args), options),
]);
});

test('cache folder fallback', async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no expect statements here and it is not clear what this test is testing. My guess is "don't fail when preferred cache folder cannot be used" but it'd be nice to make that more visible. We should also, probably make sure the warning is printed.

src/cli/index.js Outdated
@@ -60,6 +60,7 @@ export function main({
'--modules-folder <path>',
'rather than installing modules into the node_modules folder relative to the cwd, output them here',
);
commander.option('--preferred-cache-folder <path>', 'specify a custom folder to store the yarn cache if possible');
commander.option('--cache-folder <path>', 'specify a custom folder to store the yarn cache');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be update the description of this to make the difference between --preferred-cache-folder and this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hallo und weg

src/config.js Outdated
try {
await fs.mkdirp(tentativeCacheFolder);
// eslint-disable-next-line
await fs.access(tentativeCacheFolder, fs.constants.R_OK | fs.constants.W_OK | fs.constants.X_OK);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to check for execution? (fs.constants.X_OK)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't traverse folders if they are not executable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, we may wanna add a comment about this since I'm not sure if "You can't traverse folders if they are not executable." is common knowledge. What do you think?

@@ -174,6 +174,11 @@ const messages = {
workspaceNameMandatory: 'Missing name in workspace at $0, ignoring.',
workspaceNameDuplicate: 'There are more than one workspace with name $0',

cacheFolderSkipped: 'Skipping preferred cache folder $0 because it is not writable.',
cacheFolderMissing:
"Yarn hasn't been able to find a cache folder. Please use an explicit --cache-folder option to tell it what location to use, or make one of the preferred locations writable.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yarn hasn't been able to find a cache folder.

find a cache folder it can use

Please use an explicit --cache-folder option

use the explicit --cache-folder option

cacheFolderSkipped: 'Skipping preferred cache folder $0 because it is not writable.',
cacheFolderMissing:
"Yarn hasn't been able to find a cache folder. Please use an explicit --cache-folder option to tell it what location to use, or make one of the preferred locations writable.",
cacheFolderSelected: 'Selected the next writable cache folder in the list, will be $0.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to make this a verbose message and say "Using cache folder: $0".

src/constants.js Outdated

if (process.platform !== 'win32') {
preferredCacheDirectories.push(path.join(os.tmpdir(), '.yarn-cache'));
preferredCacheDirectories.push('/tmp/.yarn-cache');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this anymore.

src/constants.js Outdated
preferredCacheDirectories.push(getDirectory('cache'));

if (process.platform !== 'win32') {
preferredCacheDirectories.push(path.join(os.tmpdir(), '.yarn-cache'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why omit Windows??

src/util/fs.js Outdated
@@ -13,6 +13,12 @@ const globModule = require('glob');
const os = require('os');
const path = require('path');

export const constants = typeof fs.constants !== 'undefined' ? fs.constants : {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment about the reason for this switch please. (Node 4/Node 5+)

Copy link
Member

@BYK BYK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have some points to follow up on but I think it is good overall.

Needs fixing the CI failure on Windows before the merge.


const options = {cwd};

{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand these custom blocks. Why do we need them and how do they work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The let variables declared inside are lexically bound. It allows me to reuse the variable name and underlying that those two blocks are checking the same thing. It could as well be a function, but since there are only two invocations I figured it was simpler to just duplicate the part of the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if you can make this a function since it would make the code easier to follow. I don't remember seen usage unnamed, custom blocks in a project I have worked on and it would really surprise me as it did right now. I understand that we can leverage the block-scoping with let and const but I think this makes the code less maintainable and harder to reason about.

expect(stderrOutput.toString()).not.toMatch(/Skipping preferred cache folder/);
}

await fs.chmod(cacheFolder, 0o000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like these should be two separate test cases:

describe('cache folder fallback', () => {
  test('use --preferred-cache-folder when available', async () => {
    ...
  });

  test('fallback to default cache when --preferred-cache-folder is not available', async () => {
    ...
  });
});


const [stdoutOutput, stderrOutput] = await Promise.all([stdoutPromise, stderrPromise]);

expect(stdoutOutput.toString().trim()).toEqual(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably use .toMatchSnapshot() in both cases which is easier?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure snapshots are needed here since the result shouldn't ever change. I prefer to test with the explicit value because otherwise, it wouldn't test this PR (for example, the test suite fails on Windows because for some reason the fallback is used when it shouldn't. I wouldn't have seen this problem with snapshots).

@@ -55,7 +55,7 @@ export const {run, setFlags, examples} = buildSubCommands('cache', {
},

dir(config: Config, reporter: Reporter) {
reporter.log(config.cacheFolder);
reporter.log(config.cacheFolder, {force: true});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not happy with this solution since I think it needs a better overall solution as we have discussed. Let's have a TODO comment here saying "Don't use force option and fix this ASAP" or something along those lines so we don't forget?

@@ -60,7 +60,8 @@ export function main({
'--modules-folder <path>',
'rather than installing modules into the node_modules folder relative to the cwd, output them here',
);
commander.option('--cache-folder <path>', 'specify a custom folder to store the yarn cache');
commander.option('--preferred-cache-folder <path>', 'specify a custom folder to store the yarn cache if possible');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to say "prefer a custom cache folder when it is available". We can make this change in a follow-up.

src/config.js Outdated
try {
await fs.mkdirp(tentativeCacheFolder);
// eslint-disable-next-line
await fs.access(tentativeCacheFolder, fs.constants.R_OK | fs.constants.W_OK | fs.constants.X_OK);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, we may wanna add a comment about this since I'm not sure if "You can't traverse folders if they are not executable." is common knowledge. What do you think?

}

if (cacheRootFolder && t > 0) {
this.reporter.warn(this.reporter.lang('cacheFolderSelected', cacheRootFolder));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: make this information an info or debug level message for all cases to signal which cache folder is being used for each run.

}

return getDirectory('cache');
preferredCacheDirectories.push(path.join(os.tmpdir(), '.yarn-cache'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably append a timestamp or something to the '.yarn-cache' name to make it unique across instances and runs? What do you think? (can be a follow-up)

@@ -195,7 +195,7 @@ export default class BaseReporter {
success(message: string) {}

// a simple log message
log(message: string) {}
log(message: string, {force = false}: {force?: boolean} = {}) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above comment: let's add a TODO discouraging using this option.

@@ -13,6 +13,15 @@ const globModule = require('glob');
const os = require('os');
const path = require('path');

export const constants =
typeof fs.constants !== 'undefined'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment about Node 4 compatibility here would be great so in the future we don't get confused about this code.

@arcanis arcanis force-pushed the tmp-cache branch 7 times, most recently from 8770d5c to 06ee79d Compare August 17, 2017 10:08
@arcanis arcanis merged commit a9e343c into yarnpkg:master Aug 17, 2017
@shaunc
Copy link

shaunc commented Sep 18, 2017

Is this potentially a security hole (as /tmp is often writable by other users)?

@BYK
Copy link
Member

BYK commented Sep 18, 2017

Is this potentially a security hole (as /tmp is often writable by other users)?

If you specify a cache folder explicitly with --cache-folder Yarn fails if it cannot use it, precisely because of this concern. For the common case, I'm not so sure.

@shaunc
Copy link

shaunc commented Sep 18, 2017

I would prefer that the default be secure and explicit config required for insecure usage, rather than the other way around. (E.g. have --no-cache-folder or some such to allow search... or maybe more sensible --cache-folder-fallback ...)

@BYK
Copy link
Member

BYK commented Sep 18, 2017

I would prefer that the default be secure and explicit config required for insecure usage, rather than the other way around. (E.g. have --no-cache-folder or some such to allow search... or maybe more sensible --cache-folder-fallback ...)

For --no-cache, follow yarnpkg/rfcs#53 please. We intend to ship that now. For the fallback part, are you suggesting we have --no-cache-fallback by default and when people enable it explicitly, use the fallback mechanisms?

@shaunc
Copy link

shaunc commented Sep 18, 2017

you suggesting we have --no-cache-fallback by default and when people enable it explicitly, use the fallback mechanisms?

Yes ... or at the very least warn unless they have explicitly enabled. IMHO if someone doesn't have default cache location writable then 1) there is a reason for it or 2) there is no reason for it, but, either way, getting the user to think about it seems a good idea. Using /tmp without the user being conscious of it seems like a hack.

@BYK
Copy link
Member

BYK commented Sep 18, 2017

Yes ... or at the very least warn unless they have explicitly enabled.

This is what happens right now.

IMHO if someone doesn't have default cache location writable then 1) there is a reason for it or 2) there is no reason for it, but, either way, getting the user to think about it seems a good idea. Using /tmp without the user being conscious of it seems like a hack.

Again, a warning is printed and we are thinking about a --strict mode, where having warnings would cause yarn to quit early with an error code. I think that combination would give you what you expect, right?

@shaunc
Copy link

shaunc commented Sep 19, 2017

Ah ... vg -- didn't realize there was an automatic warning. Yes -- "strict mode" sounds like a good idea; its documentation should point out that it has security benefits (as well as any other features -- I can imagine that over time various checks that would be a pain to configure separately could be included). Seems like it should be recommended for production builds. (The question would be whether it should be the default...)

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants