Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[node_modules] Implement transparent handling of some node_modules roots non-existence #993

Merged
merged 3 commits into from
Feb 25, 2020

Conversation

larixer
Copy link
Member

@larixer larixer commented Feb 25, 2020

What's the problem this PR addresses?

This PR adds support for common CI workflow, when user caches only some of node_modules roots, top-level node_modules root for example and "forgets" to cache them all (or it is difficult for the user to do so)

How did you fix it?

When we read previous install state, we now run fast check to see wether all the node_modules roots are present on the file system and if some of them do not - we account that the tree branch that starts from non-existent module root is not here and should be installed.

cc @nicolo-ribaudo

@larixer larixer marked this pull request as ready for review February 25, 2020 09:16
@larixer larixer requested a review from arcanis February 25, 2020 09:16
@nicolo-ribaudo
Copy link
Contributor

nicolo-ribaudo commented Feb 25, 2020

Thanks for the PR! I will check how long it takes with this patch and with a cache on Travis later today.

(or it is difficult for the user to do so)

It's almost impossible, because it isn't possible to do with globs: we would need to hard-code all the workspace, and always keep the list up to date.

I also have another idea: what if we store workspaces dependencies not inside /[workspace name]/node_modules but inside /node_modules/.yarn-workspace-dependencies/[workspace] and then hard-link them to /[workspace name]/node_modules?

Then, when running yarn install, if the .yarn-state.yml file is still valid then it's enough to re-create the missing hard-links (at most one per workspace).

This would allow storing a complete cache of all the node_modules folders.

EDIT: Maybe I could try implementing it using a plugin.

@arcanis
Copy link
Member

arcanis commented Feb 25, 2020

Then, when running yarn install, if the .yarn-state.yml file is still valid then it's enough to re-create the missing hard-links (at most one per workspace).

I'm worried this logic would be quite complicated for a somewhat uncommon use case (plus, ideally very few packages would need to be stored inside the individual workspace directories) 🤔

@larixer
Copy link
Member Author

larixer commented Feb 25, 2020

I agree with @arcanis, this will be a non-standard node_modules layout, which might be painful to support by default since, it might raise many different edge cases. So, yeah, maybe supporting it in a different plugin would be a way to go, if you want.

@nicolo-ribaudo
Copy link
Contributor

nicolo-ribaudo commented Feb 25, 2020

Ok 👍

Note that even if I might go with the custom plugin, I think that this PR is still needed since currently monorepo caches which were working with yarn 1 are broken "by default", and it's hard to understand why.

@larixer
Copy link
Member Author

larixer commented Feb 25, 2020

@nicolo-ribaudo

It's almost impossible, because it isn't possible to do with globs: we would need to hard-code all the workspace, and always keep the list up to date.

I agree, it's almost impossible to list them all with the globs, yet it's possible to have fairly good dir list to cache via:
node_modules packages/*/node_modules

Edit: actually its possible I think, one should just glob all the workspaces. For example, in the case of babel, we have:

  "workspaces": [
    "codemods/*",
    "eslint/*",
    "packages/*"
  ],

so, theoretically node_modules codemods/*/node_modules eslint/*/node_modules packages/*/node_modules, should cover all the node_modules dirs, isn't it?

This PR will gives some extra confidence, so it's nice to have in any case.

@larixer larixer requested a review from arcanis February 25, 2020 11:34
@nicolo-ribaudo
Copy link
Contributor

Oh, what I meant is "Travis CI doesn't support globs" 😅
Also, they stated that they aren't going to support them.

@larixer
Copy link
Member Author

larixer commented Feb 25, 2020

@nicolo-ribaudo Oh, well, in this case I would have cached on Travis at least some of these:

➜  babel git:(yarn-2-compat) ✗ du -hs node_modules codemods/*/node_modules eslint/*/node_modules packages/*/node_modules | sort -hr | head -10
579M	node_modules
261M	eslint/babel-eslint-shared-fixtures/node_modules
259M	packages/babel-preset-env/node_modules
207M	packages/babel-standalone/node_modules
 72M	packages/babel-plugin-transform-flow-strip-types/node_modules
 70M	packages/babel-plugin-transform-flow-comments/node_modules
 70M	packages/babel-plugin-proposal-export-namespace-from/node_modules
 65M	packages/babel-plugin-proposal-nullish-coalescing-operator/node_modules
 62M	packages/babel-plugin-transform-modules-systemjs/node_modules
 62M	packages/babel-plugin-proposal-optional-chaining/node_modules

@nicolo-ribaudo
Copy link
Contributor

Good idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants