This commit completes the initial move of glob matching to an external crate, including fixing up cross platform support, polishing the external crate for others to use and fixing a number of bugs in the process. Fixes #87, #127, #131
1 parent bc5accc, commit e96d930
Showing 13 changed files with 585 additions and 362 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,10 +3,17 @@ name = "globset" | |
version = "0.1.0" | ||
authors = ["Andrew Gallant <[email protected]>"] | ||
|
||
[lib] | ||
name = "globset" | ||
bench = false | ||
|
||
[dependencies] | ||
aho-corasick = "0.5.3" | ||
fnv = "1.0" | ||
lazy_static = "0.2" | ||
log = "0.3" | ||
memchr = "0.1" | ||
regex = "0.1.77" | ||
|
||
[dev-dependencies] | ||
glob = "0.2" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
globset | ||
======= | ||
Cross platform single glob and glob set matching. Glob set matching is the | ||
process of matching one or more glob patterns against a single candidate path | ||
simultaneously, and returning all of the globs that matched. | ||
|
||
[![Linux build status](https://api.travis-ci.org/BurntSushi/ripgrep.png)](https://travis-ci.org/BurntSushi/ripgrep) | ||
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/BurntSushi/ripgrep?svg=true)](https://ci.appveyor.com/project/BurntSushi/ripgrep) | ||
[![](https://img.shields.io/crates/v/globset.svg)](https://crates.io/crates/globset) | ||
|
||
Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org). | ||
|
||
### Documentation | ||
|
||
[https://docs.rs/globset](https://docs.rs/globset) | ||
|
||
### Usage | ||
|
||
Add this to your `Cargo.toml`: | ||
|
||
```toml | ||
[dependencies] | ||
globset = "0.1" | ||
``` | ||
|
||
and this to your crate root: | ||
|
||
```rust | ||
extern crate globset; | ||
``` | ||
|
||
### Example: one glob | ||
|
||
This example shows how to match a single glob against a single file path. | ||
|
||
```rust | ||
use globset::Glob; | ||
|
||
let glob = try!(Glob::new("*.rs")).compile_matcher(); | ||
|
||
assert!(glob.is_match("foo.rs")); | ||
assert!(glob.is_match("foo/bar.rs")); | ||
assert!(!glob.is_match("Cargo.toml")); | ||
``` | ||
|
||
### Example: configuring a glob matcher | ||
|
||
This example shows how to use a `GlobBuilder` to configure aspects of match | ||
semantics. In this example, we prevent wildcards from matching path separators. | ||
|
||
```rust | ||
use globset::GlobBuilder; | ||
|
||
let glob = try!(GlobBuilder::new("*.rs") | ||
.literal_separator(true).build()).compile_matcher(); | ||
|
||
assert!(glob.is_match("foo.rs")); | ||
assert!(!glob.is_match("foo/bar.rs")); // no longer matches | ||
assert!(!glob.is_match("Cargo.toml")); | ||
``` | ||
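To make the effect of `literal_separator` concrete, here is a minimal sketch of a wildcard matcher written against the standard library only. It is not how `globset` is implemented (the crate compiles globs to regular expressions); `wildcard_match` is a hypothetical helper that supports just `*` and `?` and assumes `/` is the only path separator.

```rust
/// Hypothetical helper, not part of globset: matches `pattern` against `path`,
/// supporting only the `*` and `?` wildcards. When `literal_separator` is
/// true, neither wildcard is allowed to match a `/`.
fn wildcard_match(pattern: &[u8], path: &[u8], literal_separator: bool) -> bool {
    match pattern.split_first() {
        // An exhausted pattern matches only an exhausted path.
        None => path.is_empty(),
        Some((&b'*', rest)) => {
            // Let `*` consume path[..i] for every possible i, stopping at a
            // `/` when separators must be matched literally.
            for i in 0..=path.len() {
                if wildcard_match(rest, &path[i..], literal_separator) {
                    return true;
                }
                if i < path.len() && literal_separator && path[i] == b'/' {
                    break;
                }
            }
            false
        }
        Some((&b'?', rest)) => match path.split_first() {
            // `?` matches exactly one byte, but not `/` in literal mode.
            Some((&c, tail)) if !(literal_separator && c == b'/') => {
                wildcard_match(rest, tail, literal_separator)
            }
            _ => false,
        },
        Some((&p, rest)) => match path.split_first() {
            // Any other pattern byte must match the path byte exactly.
            Some((&c, tail)) if c == p => wildcard_match(rest, tail, literal_separator),
            _ => false,
        },
    }
}
```

With this sketch, `wildcard_match(b"*.rs", b"foo/bar.rs", false)` succeeds because `*` may consume `foo/bar`, while the same call with `true` fails because `*` stops at the `/`.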
|
||
### Example: match multiple globs at once | ||
|
||
This example shows how to match multiple glob patterns at once. | ||
|
||
```rust | ||
use globset::{Glob, GlobSetBuilder}; | ||
|
||
let mut builder = GlobSetBuilder::new(); | ||
// A GlobBuilder can be used to configure each glob's match semantics | ||
// independently. | ||
builder.add(try!(Glob::new("*.rs"))); | ||
builder.add(try!(Glob::new("src/lib.rs"))); | ||
builder.add(try!(Glob::new("src/**/foo.rs"))); | ||
let set = try!(builder.build()); | ||
|
||
assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]); | ||
``` | ||
|
||
### Performance | ||
|
||
This crate implements globs by converting them to regular expressions, and | ||
executing them with the | ||
[`regex`](https://github.com/rust-lang-nursery/regex) | ||
crate. | ||
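As a rough illustration of the glob-to-regex idea, the sketch below translates a small subset of glob syntax into a regex string. It is an assumption-laden simplification, not the crate's actual translation: `glob_to_regex` is a hypothetical name, only `*`, `**`, and `?` are handled, character classes and `{a,b}` alternates are escaped rather than interpreted, and `**` is treated as a plain "match anything" token.

```rust
/// Hypothetical sketch, not globset's real translator: converts a glob using
/// `*`, `**`, and `?` into an anchored regex string.
fn glob_to_regex(glob: &str, literal_separator: bool) -> String {
    let mut re = String::from("^");
    let mut chars = glob.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // `**` always crosses path separators (simplified here).
            '*' if chars.peek() == Some(&'*') => {
                chars.next();
                re.push_str(".*");
            }
            // A single `*` stops at `/` only when separators are literal.
            '*' if literal_separator => re.push_str("[^/]*"),
            '*' => re.push_str(".*"),
            '?' if literal_separator => re.push_str("[^/]"),
            '?' => re.push('.'),
            // Escape characters that are special in regex syntax.
            '.' | '+' | '(' | ')' | '|' | '[' | ']' | '{' | '}' | '^' | '$' | '\\' => {
                re.push('\\');
                re.push(c);
            }
            _ => re.push(c),
        }
    }
    re.push('$');
    re
}
```

Under these assumptions, `*.rs` becomes `^.*\.rs$` by default and `^[^/]*\.rs$` when separators are literal. The resulting string could then be handed to a regex engine for matching.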
|
||
For single glob matching, performance of this crate should be roughly on par | ||
with the performance of the | ||
[`glob`](https://github.com/rust-lang-nursery/glob) | ||
crate. (`*_regex` correspond to benchmarks for this library while `*_glob` | ||
correspond to benchmarks for the `glob` library.) | ||
Optimizations in the `regex` crate may propel this library past `glob`, | ||
particularly when matching longer paths. | ||
|
||
``` | ||
test ext_glob ... bench: 425 ns/iter (+/- 21) | ||
test ext_regex ... bench: 175 ns/iter (+/- 10) | ||
test long_glob ... bench: 182 ns/iter (+/- 11) | ||
test long_regex ... bench: 173 ns/iter (+/- 10) | ||
test short_glob ... bench: 69 ns/iter (+/- 4) | ||
test short_regex ... bench: 83 ns/iter (+/- 2) | ||
``` | ||
|
||
The primary performance advantage of this crate is when matching multiple | ||
globs against a single path. With the `glob` crate, one must match each glob | ||
synchronously, one after the other. In this crate, many can be matched | ||
simultaneously. For example: | ||
|
||
``` | ||
test many_short_glob ... bench: 1,063 ns/iter (+/- 47) | ||
test many_short_regex_set ... bench: 186 ns/iter (+/- 11) | ||
``` | ||
|
||
### Comparison with the [`glob`](https://github.com/rust-lang-nursery/glob) crate | ||
|
||
* Supports alternate "or" globs, e.g., `*.{foo,bar}`. | ||
* Can match non-UTF-8 file paths correctly. | ||
* Supports matching multiple globs at once. | ||
* Doesn't provide a recursive directory iterator of matching file paths, | ||
although I believe this crate should grow one eventually. | ||
* Supports case insensitive and require-literal-separator match options, but | ||
**doesn't** support the require-literal-leading-dot option. |
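
For readers unfamiliar with alternate "or" globs, the sketch below shows what a pattern such as `*.{foo,bar}` means by expanding it into the equivalent list of plain globs. This only illustrates the semantics; it is not claimed to be how this crate handles alternates. `expand_alternates` is a hypothetical helper that assumes groups are not nested and that braces are never escaped.

```rust
/// Hypothetical helper, not part of globset: expands `{a,b,...}` groups in a
/// glob into the list of equivalent globs without alternates. Nested groups
/// and escaped braces are not handled.
fn expand_alternates(glob: &str) -> Vec<String> {
    // Locate the first `{` and the first `}`; without a well-ordered pair
    // there is nothing to expand.
    let (open, close) = match (glob.find('{'), glob.find('}')) {
        (Some(o), Some(c)) if o < c => (o, c),
        _ => return vec![glob.to_string()],
    };
    let (prefix, suffix) = (&glob[..open], &glob[close + 1..]);
    // Substitute each alternative, then recurse to expand any later groups.
    glob[open + 1..close]
        .split(',')
        .flat_map(|alt| expand_alternates(&format!("{}{}{}", prefix, alt, suffix)))
        .collect()
}
```

For example, `expand_alternates("*.{foo,bar}")` yields `*.foo` and `*.bar`, so a path matches the original glob exactly when it matches one of the expanded globs.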