Developers can enumerate directories and files using globbing patterns #21362
Comments
This article may come in handy: Glob Matching Can Be Simple And Fast Too |
Heh. Reading that article was exactly what prompted me to file this issue 😂 |
I think this would make a great API addition. I would like to see the implementation tied to an abstraction and not directly to the CoreFx IO classes. Then a specific implementation of the globbing API could be delivered to support the CoreFx IO API while leaving it open for other uses. For example, Microsoft.Extensions.FileSystemGlobbing does a great job with this because you only have to implement two abstract classes to get it to work. Some things I think are missing from Microsoft.Extensions.FileSystemGlobbing and would love to see:
For example, pattern expansion can be represented by braces so you can have patterns like:
which selects all Likewise, pattern exclusion would look like:
which selects all Both of these concepts are (mostly) implemented for Microsoft.Extensions.FileSystemGlobbing in Reliak.FileSystemGlobbingExtensions. There's also a really good implementation of brace expansion in Minimatch. FWIW, I've also got an implementation for Microsoft.Extensions.FileSystemGlobbing based on combining features from those two libraries here. |
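To make the brace-expansion idea above concrete, here is a minimal sketch. It is not the Minimatch or Reliak.FileSystemGlobbingExtensions implementation; `BraceExpansion` and `Expand` are hypothetical names, and the sketch assumes no escaping, no numeric ranges such as `{1..3}`, and that an unbalanced `{` is treated as a literal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class BraceExpansion
{
    // Expands the first top-level {a,b,...} group, then recurses so that
    // nested and later groups are expanded as well.
    public static IEnumerable<string> Expand(string pattern)
    {
        int open = pattern.IndexOf('{');
        if (open < 0)
        {
            yield return pattern; // nothing to expand
            yield break;
        }

        // Find the matching '}' and the commas that belong to this group.
        int depth = 0;
        int close = -1;
        var separators = new List<int>();
        for (int i = open; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '{')
            {
                depth++;
            }
            else if (c == '}')
            {
                depth--;
                if (depth == 0) { close = i; break; }
            }
            else if (c == ',' && depth == 1)
            {
                separators.Add(i);
            }
        }

        if (close < 0)
        {
            yield return pattern; // assumption: unbalanced braces stay literal
            yield break;
        }

        string prefix = pattern.Substring(0, open);
        string suffix = pattern.Substring(close + 1);
        separators.Add(close); // the closing brace ends the last alternative

        int start = open + 1;
        foreach (int end in separators)
        {
            string alternative = pattern.Substring(start, end - start);
            foreach (string expanded in Expand(prefix + alternative + suffix))
            {
                yield return expanded;
            }
            start = end + 1;
        }
    }
}
```

Under these assumptions, `BraceExpansion.Expand("src/**/*.{cs,vb}")` yields `src/**/*.cs` and `src/**/*.vb`, which can then be handed to any matcher that does not understand braces itself.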
Yes! |
Given the shift to CLI, both for the tooling and as a target for .NET Core apps, this would make perfect sense. |
Is there some kind of quasi-standard written down for globbing behavior? Are forward slashes important (they aren't for native Windows apps like MSBuild, but are for Git on Windows, probably just undone work)? Does "**" literally match "everything including slash" and "*" literally match "everything excluding slash"? I believe that's what we did in MSBuild and I think that's Git behavior also. Could "**" support be added to the existing File IO APIs without unacceptable breaks, and is that important? I like the idea of also offering it for use in other contexts, e.g., Git obviously does not always run globs over the file system. |
As for perf, of course it's faster to use the native API where possible. I believe for MSBuild we did something like: used the OS if there were no * after the last slash; otherwise cropped the pattern at the first slash after the first * then used the OS to enumerate all files below it into a list then used a regex on the results to handle any subsequent * and **. Curious what Unix shells, perl etc do. It would be easy to check. |
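A rough sketch of the crop-then-filter strategy described above, not MSBuild's actual code. It simplifies the cropping rule: the pattern is split at the last slash before the first wildcard, so the OS only ever sees a literal base directory. `CroppedEnumeration`, `SplitAtFirstWildcard` and the `isMatch` callback are hypothetical names; the callback stands in for the regex step and receives the remaining pattern plus a '/'-separated path relative to the base directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, for illustration only.
public static class CroppedEnumeration
{
    // Splits "src/lib/**/*.cs" into ("src/lib", "**/*.cs").
    // Assumes '/' separators in the pattern.
    public static (string BaseDir, string Rest) SplitAtFirstWildcard(string pattern)
    {
        int wildcard = pattern.IndexOfAny(new[] { '*', '?' });
        if (wildcard < 0)
        {
            return (pattern, string.Empty); // no wildcard: a literal path
        }

        int slash = pattern.LastIndexOf('/', wildcard);
        if (slash < 0)
        {
            return (".", pattern); // wildcard in the first segment
        }
        if (slash == 0)
        {
            return ("/", pattern.Substring(1)); // rooted pattern such as "/*.cs"
        }
        return (pattern.Substring(0, slash), pattern.Substring(slash + 1));
    }

    // Lets the OS enumerate everything below the literal base directory,
    // then filters the results with the caller-supplied matcher.
    public static IEnumerable<string> Enumerate(
        string pattern, Func<string, string, bool> isMatch)
    {
        var (baseDir, rest) = SplitAtFirstWildcard(pattern);
        if (rest.Length == 0)
        {
            if (File.Exists(pattern)) yield return pattern;
            yield break;
        }

        foreach (string file in Directory.EnumerateFiles(
                     baseDir, "*", SearchOption.AllDirectories))
        {
            string relative = Path.GetRelativePath(baseDir, file)
                .Replace(Path.DirectorySeparatorChar, '/');
            if (isMatch(rest, relative)) yield return file;
        }
    }
}
```

The trade-off is the one raised in the comment: everything below the base directory is listed before filtering, so a pattern whose first wildcard appears early pays for a large enumeration.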
I don't think there's any formal specification. I guess the best bet is going by something like minimatch's test suite which seems to be pretty in-line with bash, sh, ksh etc. There's even some comments on compliance with other fnmatch/glob implementations in their README.md
When it comes to forward/backward slashes, I think it makes sense to do what |
Why is that? I assumed that in Git and Sublime on Windows this was just a hangover from their non-Windows heritage. It's certainly handy to be able to paste in a Windows path and add ** on the end or suchlike. |
How do you handle escaping if you handle both forward and backward slashes as separators? |
@JeremyKuhne this is an IO area we might want to invest in.. |
I'd also like to point out that Directory.EnumerateFiles(folderPath, pattern, SearchOption.AllDirectories) isn't good at best-effort attempts to enumerate files. If a path is too long or if you are unauthorized to access a file, EnumerateFiles will throw and you won't be able to continue enumerating after that throwing file. |
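For readers on .NET Core 2.1 or later, the `EnumerationOptions` overload of `Directory.EnumerateFiles` addresses the access-denied half of this complaint. A minimal sketch follows; `BestEffort` is a hypothetical wrapper name, and whether this also covers the path-too-long case on a given platform is not something the sketch claims.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical wrapper, for illustration only.
public static class BestEffort
{
    // Recursively enumerates files and skips entries that cannot be
    // accessed instead of throwing in the middle of the enumeration.
    public static IEnumerable<string> EnumerateFiles(string folderPath, string pattern)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = true, // same reach as SearchOption.AllDirectories
            IgnoreInaccessible = true,    // skip access-denied files and directories
        };
        return Directory.EnumerateFiles(folderPath, pattern, options);
    }
}
```

A call such as `BestEffort.EnumerateFiles(folderPath, "*.cs")` then keeps going past directories the caller is not allowed to read.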
Note that we're reviewing an extensibility mechanism for enumeration that will allow building globbing solutions. See #24429. |
Currently .NET Core implements DOS-like globbing on Windows and fnmatch-like globbing on Unix. PowerShell implementation Case matching is needed for IntelliSense scenarios. For performance, .NET Core could use a low-level API for enumeration:
FTS supports hard/soft link cycle detection (and case matching). In PowerShell we have to implement this with an inode cache and checks. |
POSIX apparently doesn't support globstar, and neither does PowerShell. Personally, I think it's important. |
We don't have time to implement a new globbing spec in 2.1, but I wonder how difficult it would be to add a generic MatchType.RegularExpression to MatchType.Win32 and MatchType.Dos. @JeremyKuhne ? |
Very easy. Performance would not be great until we get a span Regex implementation (as we'd have to create strings), but having one would allow us to light up existing usages when we do get one... |
Currently Bash has the support.
If we can add it now and optimize performance later, I vote for adding it. /cc @SteveL-MSFT |
@JeremyKuhne do you want to get the API approved for it (the enum member) then? It would have to be done by end of month. |
I'll try. We'd have to take a regex dependency to do so. |
One more implementation for your list: What syncProj currently does not support is listing file extensions with "{cpp,h}", and I've decided not to support that syntax. It could be reduced to the regex pattern (cpp|h), but that matches either .h or .cpp, whichever exists, whereas the typical use in syncProj is to match both files, .cpp and .h, not just one of them. I could add support for a.{c,h} as inclusion of the files a.c and a.h, but then what if we have {c*,h}? That goes in a more complex direction, and for me it's easier to sort it out by actually searching for files and listing them. But if you manage to write a function similar to mine, or even improve on it, let me know. (Without a heavy class hierarchy like the one in Microsoft.Extensions.FileSystemGlobbing.) |
Just another interesting package: https://github.com/dazinator/DotNet.Glob |
The hope is that we are able to add to
public enum MatchType
{
    // These exist
    Simple,
    Win32,
    // New
    Globbing,
    MSBuildGlobbing,
    Regex
}
Adding |
Question about custom delegate is - would we want to pass custom options to the delegate? |
I would like to see an API that checks if an input matches a glob pattern without touching the file system, similar to
public class Glob
{
    public Glob(string pattern);
    public bool IsMatch(string input);
    public static bool IsMatch(string input, string pattern);
}
|
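One way the proposed shape could be filled in is by translating the glob to a regular expression. The sketch below is an assumption-laden illustration, not an existing .NET API: it treats `/` as the only separator, lets `*` and `?` stop at a separator, lets `**/` match zero or more whole directory segments, and supports neither escaping, character classes nor braces.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Illustrative sketch of the proposed class; the semantics are assumptions.
public sealed class Glob
{
    private readonly Regex _regex;

    public Glob(string pattern)
    {
        _regex = new Regex(ToRegex(pattern), RegexOptions.CultureInvariant);
    }

    public bool IsMatch(string input) => _regex.IsMatch(input);

    public static bool IsMatch(string input, string pattern) =>
        new Glob(pattern).IsMatch(input);

    private static string ToRegex(string pattern)
    {
        var sb = new StringBuilder("^");
        for (int i = 0; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '*')
            {
                bool globstar = i + 1 < pattern.Length && pattern[i + 1] == '*';
                if (!globstar)
                {
                    sb.Append("[^/]*"); // "*": anything except a separator
                    continue;
                }

                i++; // consume the second '*'
                if (i + 1 < pattern.Length && pattern[i + 1] == '/')
                {
                    i++; // consume the '/' as part of "**/"
                    sb.Append("(?:[^/]+/)*"); // zero or more directory segments
                }
                else
                {
                    sb.Append(".*"); // trailing "**": anything, separators included
                }
            }
            else if (c == '?')
            {
                sb.Append("[^/]"); // exactly one non-separator character
            }
            else
            {
                sb.Append(Regex.Escape(c.ToString())); // literal character
            }
        }
        return sb.Append(@"\z").ToString();
    }
}
```

With these rules, `Glob.IsMatch("src/a/b/x.csproj", "src/**/*.csproj")` and `Glob.IsMatch("src/x.csproj", "src/**/*.csproj")` both return true, while `Glob.IsMatch("a/b.cs", "*.cs")` returns false because `*` does not cross a separator.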
@yufeih did you consider using the |
@jozkee I'm currently using Glob. |
Should this be labeled "User Story" and have a customer-focused title in the user story form ie "PERSONA-VERB-NOUN" |
I'm finding a short-coming of
|
I'd like to start a discussion on including a file system globbing API in .NET (Core). If you look at implementations mentioned on Wikipedia, every "mainstream platform" has an entry, but not .NET.
There's quite a few (more or less) successful implementations around (see below), some even from Microsoft, but I think something as fundamental as this, should ship with the framework.
There's already partial globbing support using the following methods
Directory.GetFiles
Directory.EnumerateFiles
Directory.GetFileSystemEntries
Directory.EnumerateFileSystemEntries
Directory.GetDirectories
Directory.EnumerateDirectories
They all have a searchPattern argument, but it lacks support for recursive globs (** aka "globstar"), brace expansion etc. This can be achieved using the SearchOption argument, but the API is hard to use when you want to support (often user-defined) recursive patterns like /src/**/*.csproj.
I'd ❤️ to hear people's opinions here...
searchPattern in the methods mentioned above (without it being a breaking change)?
Examples
And tons of other implementations...