Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use wax for globbing #5014

Merged
merged 18 commits into from
May 24, 2023
Merged

Use wax for globbing #5014

merged 18 commits into from
May 24, 2023

Conversation

arlyon
Copy link
Contributor

@arlyon arlyon commented May 18, 2023

🚨 This is still in progress, and a few tests still need to be updated. I will add commentary to the test suite where we are making changes and for what reasons. The general deviations are listed before. This is quite intensive and manual (over 100 tests that need to be understood, explained, and updated).

This is a PR against intregration/globbing

Description

The first implementation of globbing used a library called glob-match. It was chosen due to its large test suite derived from micromatch in the node ecosystem. However:

  • globs are evaluated for correctness at match-time (no precompile step)
    • invalid globs are only rejected if the matcher traverses far enough into the glob to encounter the invalid syntax
    • there are still a handful of bugs
  • it is not consistent with doublestar which, while also targeting micromatch, has different behaviour in certain cases
  • it is very permissive, with lots of nonstandard syntax
  • we maintain our own fork that adds a number of features
    • unicode grapheme support
    • zero-length bracket matches
    • distinction between not matching and invalid glob

Wax Deviations From Doublestar / Micromatch

  • 🚨 Partial unicode support, but notably doesn’t match a wildcard (?) to a grapheme cluster. For example, 🇳🇴, which consists of 🇳 followed by 🇴 does not match.
  • 🚨 a[!a]b matches a/b (!!!)
  • aa**cc does not match aa/bb/cc, so to match the behaviour you must replace it with aa*/**/*/cc as well as aa*cc
  • **/ does match against a/a meaning that trailing slashes are ignored
    • similarly, */** matches against foo (the doublestar matches against nothing, and the slash is collapsed
  • !a is invalid out of the box since they don’t support ! ******and ^
  • {,a} is invalid (brackets cannot have zero-length matches)
  • a/**/**/b is not supported (consecutive doublestars are rejected rather than collapsed)
  • a*{*,b} is rejected because the capture is ambiguous (ie acc could capture a, ac, or acc)
  • test/{foo/**,bar}/baz ← this one is confusing to me
    • malformed glob expression: adjacent component boundaries / or * test/{foo/**,bar}/baz

Testing Instructions

The tests in globwalk should be exhaustive.

@arlyon arlyon requested review from a team as code owners May 18, 2023 13:35
@arlyon arlyon marked this pull request as draft May 18, 2023 13:35
@vercel
Copy link

vercel bot commented May 18, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
examples-nonmonorepo 🔄 Building (Inspect) Visit Preview 💬 Add feedback May 24, 2023 3:14pm
examples-vite-web 🔄 Building (Inspect) Visit Preview 💬 Add feedback May 24, 2023 3:14pm
9 Ignored Deployments
Name Status Preview Comments Updated (UTC)
examples-basic-web ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-cra-web ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-designsystem-docs ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-gatsby-web ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-kitchensink-blog ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-native-web ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-svelte-web ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
examples-tailwind-web ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm
turbo-site ⬜️ Ignored (Inspect) Visit Preview May 24, 2023 3:14pm

@arlyon arlyon changed the base branch from main to integration/globbing May 18, 2023 13:37
@github-actions
Copy link
Contributor

✅ This change can build next-swc

@github-actions
Copy link
Contributor

github-actions bot commented May 18, 2023

🟢 CI successful 🟢

Thanks

@gsoltis
Copy link
Contributor

gsoltis commented May 23, 2023

Benchmark failures are from switching to main and not finding the dependency that this PR is removing. I think safe to ignore. We also are not changing turbopack code.

@arlyon arlyon marked this pull request as ready for review May 24, 2023 14:04
@arlyon
Copy link
Contributor Author

arlyon commented May 24, 2023

This is pretty much ready for inclusion. I have split out commits for each class of test case change. I have a single remaining question regarding these tests:

https://github.com/vercel/turbo/blob/3cba0f067c7dbea1b79421fdaef08652dcd7bd52/crates/turborepo-globwalk/src/lib.rs#L1162-L1190

@gsoltis we have spoken at length about explicitly handling potentially traversing out of the base path but it seems from these test cases we are expected not to support that?

@gsoltis gsoltis merged commit b3f30d4 into integration/globbing May 24, 2023
@gsoltis gsoltis deleted the globs/wax branch May 24, 2023 16:17
gsoltis pushed a commit that referenced this pull request Jun 1, 2023
Co-authored-by: Greg Soltis <Greg Soltis>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants