When working with language definitions, you sometimes have parts which are used across multiple patterns (e.g. names). This redundancy makes it harder to change languages and blows up the file size.
My proposal is to add a function to build patterns.
Behavior
The function build(basePattern, replacements) will take a regular expression basePattern and an object containing regular expressions (or strings) replacements.
The source of the base pattern will contain placeholders where replacements will be inserted. Each placeholder holds the key of the replacement to use.
This is a simple text replacement, so it can produce surprising results.
To minimize this risk, there are some restrictions, and each replacement will be wrapped in a non-capturing group.
The source of the base pattern with all placeholders replaced plus the flags of the base pattern will form the pattern to be returned.
The flags of replacements will be ignored.
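A minimal sketch of the behavior described above, to make the substitution step concrete. It assumes the `<<key>>` placeholder syntax used in the examples of this proposal and does not perform any restriction checks; the error message is only illustrative.

```javascript
// Sketch of build(): replaces every <<key>> placeholder in the base pattern's
// source with the matching replacement, wrapped in a non-capturing group.
function build(basePattern, replacements) {
  const source = basePattern.source.replace(/<<(\w+)>>/g, (match, key) => {
    const replacement = replacements[key];
    if (replacement === undefined) {
      throw new Error('replacements["' + key + '"] is undefined');
    }
    // A RegExp contributes only its source (its flags are ignored);
    // anything else is treated as a string.
    const replacementSource =
      replacement instanceof RegExp ? replacement.source : String(replacement);
    // The group makes a quantifier after the placeholder apply to the whole replacement.
    return '(?:' + replacementSource + ')';
  });
  // Only the flags of the base pattern are kept.
  return new RegExp(source, basePattern.flags);
}

const pattern = build(/a<<b>>?/i, { b: /b+/i });
console.log(pattern.source, pattern.flags);
```

Because the callback form of `String.prototype.replace` is used, `$` sequences inside a replacement source are inserted literally rather than being interpreted.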
Restrictions
To minimize the risks of text replacement in the pattern source, there will be several restrictions on both the base pattern and the replacements:
Placeholder positioning:
Cannot be inside a char set.
Cannot be after an unescaped backslash.
Replacements:
No capturing groups. They could mess with backreferences in the base expression.
No backreferences.
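The restrictions above could be checked with a small scanner over the pattern source. This is only a sketch: the helper names `checkReplacement` and `checkPlaceholderPositions` are hypothetical, the `<<key>>` placeholder syntax is the one used in the examples of this proposal, and `\k<` is conservatively treated as a named backreference even though it is not one in every pattern.

```javascript
// Hypothetical helper: lists restriction violations in a replacement's source.
function checkReplacement(source) {
  const problems = [];
  let inCharSet = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') {
      const next = source[i + 1];
      // Outside a char set, \1 to \9 and \k<name> are treated as backreferences.
      const named = next === 'k' && source[i + 2] === '<';
      if (!inCharSet && (/[1-9]/.test(next) || named)) {
        problems.push('backreference at index ' + i);
      }
      i++; // skip the escaped character
    } else if (inCharSet) {
      if (ch === ']') inCharSet = false;
    } else if (ch === '[') {
      inCharSet = true;
    } else if (ch === '(') {
      // (?: (?= (?! (?<= (?<! do not capture; ( and (?<name> do.
      if (!/^\?(?:[:=!]|<[=!])/.test(source.slice(i + 1))) {
        problems.push('capturing group at index ' + i);
      }
    }
  }
  return problems;
}

// Hypothetical helper: lists badly positioned placeholders in a base pattern's source.
function checkPlaceholderPositions(source) {
  const problems = [];
  const placeholder = /^<<\w+>>/;
  let inCharSet = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') {
      if (placeholder.test(source.slice(i + 1))) {
        problems.push('placeholder after backslash at index ' + (i + 1));
      }
      i++; // skip the escaped character
    } else if (inCharSet) {
      if (ch === ']') inCharSet = false;
      else if (placeholder.test(source.slice(i))) {
        problems.push('placeholder inside char set at index ' + i);
      }
    } else if (ch === '[') {
      inCharSet = true;
    }
  }
  return problems;
}

console.log(checkReplacement(/(b)/.source));
console.log(checkPlaceholderPositions(/[<<0>>]/.source));
```

Both helpers return a list of messages instead of throwing, so a caller can decide whether a violation is an error or only a warning.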
Examples
I will use the placeholder <<\w+>> where \w+ will be used as the key to get a replacement.
(If you have a better placeholder, please tell me. I only chose this one because it isn't regex syntax and can be easily spotted. But I don't really like it...)
build(/a<<b>>?/i, {b: /b+/i}) == /a(?:b+)?/i
build(/a<<0>>?/i, [/b+/.source]) == /a(?:b+)?/i
build(/<<0>>/m, [/^\w+/]) == /(?:^\w+)/m // warning: the meaning of ^ might have changed
build(/(a)<<0>>\1/, [/(b)/]) // error: replacement contains capturing group
build(/a<<1>>/, [/b/]) // error: replacements["1"] undefined
Why regular expressions all the way?
It will be cheaper to use strings as base patterns and replacements. That's true.
But strings do not provide two things that regular expressions do.
Regexes are easier to write.
Inline regexes are more convenient than strings and are supported by IDEs with features like syntax highlighting.
Regexes catch errors early on.
Each of the patterns will be compiled by the browser (or Node), so they are guaranteed to have correct syntax.
Of course, these things only really matter to developers but not to the end user whose computer will have to deal with the additional overhead.
Minimizing overhead
To minimize the overhead created by using and checking a bunch of regular expressions, we can do multiple things:
Replace regexes with strings using gulp.
How does gulp know what patterns are replacements?
Well, just write e.g. /pattern/.source and gulp will then convert it. Of course, it will only do that for the minified version. (gulp: Inline regex source #1537)
Do restriction checks as tests.
As long as patterns are created and used in a deterministic fashion, it's ok to check all of them once using npm test.