Implement bash (with globstar) style globbing #2522
This is failing due to warnings and |
@@ -163,6 +163,7 @@ library
Distribution.Compat.Exception
Distribution.Compat.ReadP
Distribution.Compiler
Distribution.Glob
Please move under Distribution.Utils.Glob.
Ok, will do.
Some documentation and a changelog update for this would be nice.
Do you plan to implement it?
Ok, I'll do some documentation and a changelog update. I was planning on implementing a cabal-version constraint too, yes. Regarding that, do you think it would be ok to simply refuse to build a project if:
Presumably this could lead to cabal files which worked fine before 1.23, and would have continued to work fine after 1.23, but would cause builds to fail upon upgrading |
Implement a reduced form of GNU bash style globbing. The supported features should account for 99% of use cases. Still to do: minimum cabal-version constraint. This seems to be necessary because, for example, this commit changes the meaning of *.js, which previously would not have matched jquery.cookie.js, and now does.
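To make the `*.js` example concrete, here is a minimal sketch of the two behaviours on bare file names. Both functions are hypothetical models written for illustration, not the code in this PR: `oldStarMatch` assumes the previous rule compares everything from the first dot onwards against the extension, and `newStarMatch` assumes `*` matches any string. Bash's special handling of leading dots is ignored here.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical model of the previous rule: "*.ext" matches only when the
-- text from the FIRST dot onwards is exactly ".ext".
oldStarMatch :: String -> FilePath -> Bool
oldStarMatch ext file = dropWhile (/= '.') file == ext

-- Hypothetical model of the bash-style rule: "*" matches any string, so
-- "*.ext" matches every name that ends in ".ext".
newStarMatch :: String -> FilePath -> Bool
newStarMatch ext file = ext `isSuffixOf` file
```

Under these assumptions, `oldStarMatch ".js" "jquery.cookie.js"` is `False` (the text from the first dot is `.cookie.js`), while `newStarMatch ".js" "jquery.cookie.js"` is `True`. Both agree on `main.js`.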
* Move Distribution.Glob to Distribution.Utils.Glob
* Add documentation for Distribution.Simple.Utils.matchDirFileGlob
Also, a couple of other things I've remembered:
GitHub isn't displaying my most recent commits, for some reason. There should be 4 showing up now, rather than 2. See master...hdgarrood:bash-globbing instead if necessary.
Hmm, are you sure there's no way around it? This could be a blocker.
Will look at it in more detail soon.
Re version constraints - the way I see it: (supposing this goes into 1.23, which I am aware is perhaps a little premature ;) Any package using the new globbing will have to use a minimum Cabal version constraint. If we don't do that, then bad things will happen if A authors a package with Cabal >= 1.23 and B tries to build it with Cabal < 1.23. New syntax would fail (I think), and Come to think of it, maybe we can get around this - if we're building a package with a Cabal file which does use globs, and which does not have a minimum Cabal version constraint of at least 1.23, we could get around the issue by issuing a notice or warning to the command line saying something like "Glob syntax changed in Cabal 1.23. You are currently using the old syntax. To silence this warning, do {something}. To use the new syntax, add a minimum Cabal version constraint |
With this change, do Does this change fix #2030? |
I'm pretty sure all of those fields should now accept the same syntax, yes (though I haven't been able to work out how to test that...). I believe this fixes #2030 too.
What I think would be ideal:
@@ -260,6 +264,7 @@ test-suite unit-tests
UnitTests.Distribution.Compat.ReadP
UnitTests.Distribution.Simple.Program.Internal
UnitTests.Distribution.Utils.NubList
UnitTests.Distribution.Glob
This should now be UnitTests.Distribution.Utils.Glob.
matches x (CharLiteral y) = x == y
matches x (Range start end) = start <= x && x <= end

-- | A safe version of 'tail'.
This is tailMay from the safe library. Since we can't add a dependency on safe, let's rename it to tailMay and move it to Distribution.Simple.Utils.
Ah cool, didn't know that, thanks! Will do.
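For reference, a minimal sketch of what a `tailMay` with the same behaviour as the one in the safe library looks like. This is an illustration of the function under discussion, not necessarily the exact definition committed in this PR.

```haskell
-- | A safe version of 'tail': returns 'Nothing' for the empty list
-- instead of raising an exception.
tailMay :: [a] -> Maybe [a]
tailMay []       = Nothing
tailMay (_ : xs) = Just xs
```

With this definition, `tailMay "abc"` is `Just "bc"` and `tailMay ""` is `Nothing`.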
... and add to Distribution.Utils.Safe
@@ -0,0 +1,6 @@
module Distribution.Utils.Safe where
Since Distribution.Simple.Utils already imports Distribution.Utils.Glob, to avoid circular imports, I added tailMay to this new module, and then also re-exported it from Distribution.Simple.Utils. Is that ok?
Yes, absolutely.
Just checking I've understood you - in this case, which semantics should be used? |
New semantics is used in all cases, but a warning should be printed when it differs from the old. |
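A rough sketch of that policy, assuming the old and new semantics are each available as a predicate on file names. The function `matchWithWarning` and both predicate parameters are hypothetical names introduced here for illustration; they are not part of Cabal.

```haskell
import Data.List ((\\))

-- | Always return the matches under the new semantics, plus a warning
-- message when the old semantics would have selected a different set of
-- files. Both matchers are supplied by the caller (hypothetical).
matchWithWarning
  :: (FilePath -> Bool)          -- ^ matcher under the old semantics
  -> (FilePath -> Bool)          -- ^ matcher under the new semantics
  -> [FilePath]                  -- ^ candidate files
  -> ([FilePath], Maybe String)
matchWithWarning oldMatch newMatch files = (newHits, warning)
  where
    newHits = filter newMatch files
    oldHits = filter oldMatch files
    -- Files selected by exactly one of the two semantics.
    changed = (newHits \\ oldHits) ++ (oldHits \\ newHits)
    warning
      | null changed = Nothing
      | otherwise    =
          Just ("glob semantics changed; affected files: " ++ unwords changed)
```

The caller would print the warning, if any, and carry on with the first component, so the build never depends on the old behaviour.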
Hi all, I must say that like some of the commenters here, I've never understood the actual motivation for restricting globs. I mean, I get that it's usually a good thing to avoid excessive complexity, but if there's a reasonable test suite that particular objection fades away. Protecting users from themselves is not a useful goal when those users are presumably expert users. If users are worried that e.g. password files (or the like) get included, they should be doing actual audits of their Continuous Integration builds and/or source distribution files (cabal sdist; check tarball for unexpected files), not relying on a general tool like Cabal to "protect them" (which it won't anyway since it doesn't actually check for password-like things). Anyway, that was my 0.02€'s worth. :) |
I think @mietek makes a very strong case for dropping globs altogether, in favor of allowing directories. I propose going even further. Let us deprecate
I think the last point is the most important. Our language (Haskell) strictly regulates allowed behavior, forcing us to organize our programs so that it is difficult to make certain classes of error. Why do we accept less from our package manager? |
"... which takes exactly one directory name..." No, please don't. This imposes more restrictions on anyone who's developing using a simple http server just serving their local files.
@BardurArantsson: Can you explain? I don’t see a problem with @ttuegel’s proposal. |
How? The only reason to include |
Maybe I misunderstood, but forcing a single directory for "include everything" (which is what I understood your suggestion to mean) would mean that I would have to go out of my way to force all the html, css, &c to live in a single directory, which I may not want while I'm actually developing things. (I get that this is probably good practice in general, but it just seems like an arbitrary restriction for no good reason... other than an extreme aversion to complexity.)
(Sorry about the immediate follow-up.) One of my use cases is actually a little bit outside the stereotypical web-like cases: I want to distribute a standard config file and some bits which are just incidental things that were easier to distribute as "raw" files instead of trying to embed them properly (as ByteStrings or such, through TH). See the .cabal file of the "hums" package: https://hackage.haskell.org/package/hums-0.7.0
@BardurArantsson Would it suffice for your use cases if we allowed specifying
to include those three directories. An aside, I just discovered that we already have a The existing behavior is definitely too complicated. If a regular Cabal contributor and a long-time Haskell user doesn't understand the behavior of (what should be) a simple field, it's too complicated. |
I think it’s reasonable to allow only a single |
I agree, I like @ttuegel's proposal a lot as well. If I was developing some kind of web thing, I wouldn't mind putting all of html, css, javascript as subdirectories of a single |
Looking at @mietek's examples, it looks like the only glob features that could be useful are
Which seems to confirm my opinion that we only need a subset of the Bash pattern syntax. The issue of |
Sorry for not getting back until now. Yes, this would be sufficient for my purposes, and I suspect most web-app-like things. (Though there may be issues if one is storing, say, high-quality SVGs together with rendered PNGs and only wants to distribute the PNGs.) And, yes, I need multiple dirs. (I still don't see why allowing full glob-syntax would be horribly bad, but whatever. I'm still vehemently opposed to glob-like syntax that doesn't adhere to anything found in an actual real shell. Principle of Least Surprise, people!) |
Closing this, because a) we seem to be at an impasse, and b) I think specifying whole directories is probably better anyway.
I'll leave my code here, just in case anyone does want it.
Sad to see this die by bikeshedding :(. It's still a silly bug that we can't get reliable inclusion of data files using wildcards. Didn't everyone more-or-less agree that multiple data-dirs would be sufficient?
No, @BardurArantsson. @ttuegel proposed a new |
These are inspired by a plan described in a comment in haskell#2522, and only implement a quite limited form of recursive matching: only a single ** wildcard is accepted, it must be the final directory, and, if a ** wildcard is present, the file name must include a wildcard. Or-patterns are not implemented, for simplicity. Closes haskell#3178, haskell#2030.
These are inspired by a plan described in a comment in #2522, and only implement a quite limited form of recursive matching: only a single ** wildcard is accepted, it must be the final directory, and, if a ** wildcard is present, the file name must include a wildcard. Or-patterns are not implemented, for simplicity. Closes #3178, #2030.
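The three restrictions in that commit message can be sketched as a small validity check on a pattern string. This is a hedged illustration, not the code from the referenced commits: `validRecursiveGlob` and `splitSegments` are hypothetical names, the pattern is assumed to use `/` as separator, and "includes a wildcard" is approximated as "contains a `*`".

```haskell
-- Split a pattern on '/' into its path segments (always non-empty).
splitSegments :: String -> [String]
splitSegments s = case break (== '/') s of
  (seg, [])       -> [seg]
  (seg, _ : rest) -> seg : splitSegments rest

-- | Check the restrictions described above:
--   * at most one "**" segment,
--   * if present, it must be the final directory segment,
--   * and the file-name segment must then contain a wildcard.
validRecursiveGlob :: String -> Bool
validRecursiveGlob pat =
  case length (filter (== "**") segs) of
    0 -> True   -- no recursive wildcard, nothing to check here
    1 -> not (null dirs) && last dirs == "**" && '*' `elem` file
    _ -> False  -- more than one "**" is rejected
  where
    segs = splitSegments pat
    dirs = init segs   -- directory segments
    file = last segs   -- file-name segment
```

Under these assumptions `data/**/*.html` is accepted, while `data/**/index.html`, `**/js/*.js` and `a/**/b/**/*.js` are rejected. A `**` in the file-name position (as in `data/**`) is also rejected, which is a choice made by this sketch rather than something the commit message states.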
Fixes #784, #2030, resolves #713, subsumes #1343, #1344. Picks up where #1975 left off.
Implement a reduced form of GNU bash style globbing. The supported features should account for 99% of use cases.
Still to do: minimum cabal-version constraint. This seems to be necessary because, for example, this commit changes the meaning of *.js, which previously would not have matched jquery.cookie.js, and now does.
* and ** from appearing as the very last item of a glob pattern, to avoid including unwanted items? The documentation in this PR currently describes this limitation but I haven't implemented it yet.
data-files, extra-source-files, extra-doc-files (I did this manually).