Support numbered recursion
slevithan committed Nov 4, 2024
1 parent f939b55 commit e417d8b
Showing 6 changed files with 31 additions and 36 deletions.
19 changes: 7 additions & 12 deletions README.md
@@ -580,7 +580,7 @@ Notice that nearly every feature below has at least subtle differences from Java
<td align="middle">☑️</td>
<td align="middle">☑️</td>
<td>
Many common uses supported<br>
Common uses supported<br>
</td>
</tr>
<tr valign="top">
@@ -772,7 +772,7 @@ Notice that nearly every feature below has at least subtle differences from Java
</tr>

<tr valign="top">
<th align="left" rowspan="3">Recursion</th>
<th align="left" rowspan="2">Recursion</th>
<td>Full pattern</td>
<td>
<code>\g&lt;0></code>,<br>
@@ -785,17 +785,12 @@ Notice that nearly every feature below has at least subtle differences from Java
</td>
</tr>
<tr valign="top">
<td>Numbered, relative</td>
<td><code>(…\g&lt;1>?…)</code>, etc.</td>
<td align="middle">❌</td>
<td align="middle">❌</td>
<td>Named, numbered, relative</td>
<td>
● Not yet supported<br>
<code>(?&lt;a>…\g&lt;a>?…)</code>,<br>
<code>(…\g&lt;1>?…)</code>,<br>
<code>(…\g&lt;-1>?…)</code>, etc.
</td>
</tr>
<tr valign="top">
<td>Named</td>
<td><code>(?&lt;a>…\g&lt;a>?…)</code>, etc.</td>
<td align="middle">☑️</td>
<td align="middle">☑️</td>
<td>
@@ -892,7 +887,7 @@ The table above doesn't include all aspects that Oniguruma-To-ES emulates (inclu
3. With target `ES2018`, the specific POSIX classes `[:graph:]` and `[:print:]` are an error if option `allowBestEffort` is `false`, and they use ASCII-based versions rather than the Unicode versions available for target `ES2024` and later.
4. Target `ES2018` doesn't support nested negated character classes.
5. It's not an error for *numbered* backreferences to come before their referenced group in Oniguruma, but an error is the best path for Oniguruma-To-ES because (1) most placements are mistakes and can never match (based on the Oniguruma behavior for backreferences to nonparticipating groups), (2) erroring matches the behavior of named backreferences, and (3) the edge cases where they're matchable rely on rules for backreference resetting within quantified groups that are different in JS and aren't emulatable. Note that it's not a backreference in the first place if using `\10` or higher and not as many capturing groups are defined to the left (it's an octal or identity escape).
6. Recursion depth is limited, and specified by option `maxRecursionDepth`. Any use of recursion results in an error if `maxRecursionDepth` is `null` or `allowBestEffort` is `false`. Additionally, some forms of recursion are not yet supported, including mixing recursion with backreferences, using multiple recursions in the same pattern, and recursion by group number. Because recursion is bounded, patterns that fail due to infinite recursion in Oniguruma might find a match in Oniguruma-To-ES. Future versions will detect this and throw an error.
6. The maximum recursion depth is specified by option `maxRecursionDepth`. Use of recursion results in an error if `maxRecursionDepth` is `null` or `allowBestEffort` is `false`. Some forms of recursion (mixing recursion with backreferences, and using multiple recursions in the same pattern) are not yet supported. Note that, because recursion is bounded, patterns that fail due to infinite recursion in Oniguruma might find a match in Oniguruma-To-ES. Future versions will detect this and throw an error.
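
One way to picture the depth limit described in note 6 is to expand a recursive pattern by hand. The sketch below is only an illustration in plain JavaScript: `expandBounded` is a hypothetical helper, and nothing here is claimed to be how Oniguruma-To-ES or `regex-recursion` actually build their output. It unrolls the pattern `a\g<0>?b` to a fixed number of levels, so deeper nesting than the limit simply cannot match.

```javascript
// Hypothetical helper (not part of Oniguruma-To-ES or regex-recursion):
// unrolls the recursive pattern `a\g<0>?b` by hand up to a fixed depth.
function expandBounded(open, close, depth) {
  // Depth 1 is the pattern with the optional recursive call dropped
  let source = `${open}${close}`;
  for (let i = 1; i < depth; i++) {
    // Each further level wraps the previous expansion in an optional group
    source = `${open}(?:${source})?${close}`;
  }
  return source;
}

// Anchor the expansion so the whole string must match
const depth3 = new RegExp(`^(?:${expandBounded('a', 'b', 3)})$`);

console.log(expandBounded('a', 'b', 3)); // a(?:a(?:ab)?b)?b
console.log(depth3.test('aaabbb'));      // true  (3 levels fit within depth 3)
console.log(depth3.test('aaaabbbb'));    // false (4 levels exceed depth 3)
```

The same bound is why a pattern that would recurse forever in Oniguruma can still produce a match here: the unrolled form always terminates.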

## ㊗️ Unicode / mixed case-sensitivity

10 changes: 5 additions & 5 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion package.json
@@ -44,7 +44,7 @@
"dependencies": {
"emoji-regex-xs": "^1.0.0",
"regex": "^4.4.0",
"regex-recursion": "^4.0.0"
"regex-recursion": "^4.1.0"
},
"devDependencies": {
"esbuild": "^0.24.0",
26 changes: 15 additions & 11 deletions spec/match-recursion.spec.js
@@ -25,12 +25,10 @@ describe('Recursion', () => {

describe('global', () => {
it('should match direct recursion', () => {
// Match an equal number of two different subpatterns
expect('aaabbb').toExactlyMatch(r`a\g<0>?b`);
expect('test aaaaaabbb').toFindMatch(r`a\g<0>?b`);
expect('aaabbb').toExactlyMatch(r`(?<n>a\g<0>?b)`);

// Match balanced brackets
const pattern = r`<(?:[^<>]|\g<0>)*>`;
expect([
'<>', '<<>>', '<a<b<c>d>e>', '<<<<<<a>>>bc>>>',
@@ -50,17 +48,25 @@ describe('Recursion', () => {
});

describe('numbered', () => {
// Current limitation of `regex-recursion`
it('should throw for numbered recursion', () => {
expect(() => compile(r`(a\g<1>?)`)).toThrow();
it('should match direct recursion', () => {
expect('aaa').toExactlyMatch(r`(a\g<1>?)`);
expect('aaabbb').toExactlyMatch(r`\A(a\g<1>?b)\z`);
expect('aaabb').not.toFindMatch(r`\A(a\g<1>?b)\z`);
});

it('should throw for indirect recursion', () => {
expect(() => compile(r`(a\g<2>(\g<1>?))`)).toThrow();
});
});

describe('relative numbered', () => {
// Current limitation of `regex-recursion`
it('should throw for relative numbered recursion', () => {
expect(() => compile(r`(a\g<-1>?)`)).toThrow();
it('should match direct recursion', () => {
expect('aaa').toExactlyMatch(r`(a\g<-1>?)`);
expect('aaabbb').toExactlyMatch(r`\A(a\g<-1>?b)\z`);
expect('aaabb').not.toFindMatch(r`\A(a\g<-1>?b)\z`);
});

it('should throw for indirect recursion', () => {
expect(() => compile(r`(a\g<+1>(\g<-2>?))`)).toThrow();
});
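
The numbered and relative specs above come in pairs: `(a\g<2>(\g<1>?))` and `(a\g<+1>(\g<-2>?))` exercise the same indirect recursion. Read that way, a relative reference can be turned into an absolute group number by counting the capturing groups opened to its left. The sketch below assumes exactly that reading; `toAbsoluteGroupNumber` is a hypothetical name and is not taken from this repository.

```javascript
// Hypothetical helper: resolves a relative group reference such as -1 or +1
// to an absolute group number, given how many capturing groups have been
// opened to the left of the reference. This mapping is an assumption inferred
// from the paired specs, not code from Oniguruma-To-ES.
function toAbsoluteGroupNumber(relative, groupsOpenedToLeft) {
  if (!Number.isInteger(relative) || relative === 0) {
    throw new Error('Relative reference must be a nonzero integer');
  }
  const absolute = relative < 0
    // -1 is the most recently opened group, -2 the one before it, and so on
    ? groupsOpenedToLeft + relative + 1
    // +1 is the next group that will be opened to the right
    : groupsOpenedToLeft + relative;
  if (absolute < 1) {
    throw new Error('Relative reference points before the first group');
  }
  return absolute;
}

// `(a\g<-1>?)`: one group is open at the reference, so -1 is group 1
console.log(toAbsoluteGroupNumber(-1, 1)); // 1
// `(a\g<+1>(\g<-2>?))`: +1 with one group open is group 2,
// and -2 with two groups open is group 1
console.log(toAbsoluteGroupNumber(+1, 1)); // 2
console.log(toAbsoluteGroupNumber(-2, 2)); // 1
```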

@@ -72,7 +78,6 @@ describe('Recursion', () => {

describe('named', () => {
it('should match direct recursion', () => {
// Match an equal number of two different subpatterns as the entire string
expect('aaabbb').toExactlyMatch(r`\A(?<r>a\g<r>?b)\z`);
expect('aaabb').not.toFindMatch(r`\A(?<r>a\g<r>?b)\z`);
});
@@ -84,8 +89,7 @@ describe('Recursion', () => {

// Current limitation of `regex-recursion`
it('should throw for multiple direct, non-overlapping recursions', () => {
// [TODO] `regex-recursion` has a bug and lets invalid JS syntax through so using `toRegExp` instead of `compile`
expect(() => toRegExp(r`(?<r1>a\g<r1>?)(?<r2>a\g<r2>?)`)).toThrow();
expect(() => compile(r`(?<r1>a\g<r1>?)(?<r2>a\g<r2>?)`)).toThrow();
});

it('should throw for multiple indirect, overlapping recursions', () => {
2 changes: 1 addition & 1 deletion src/generate.js
@@ -395,7 +395,7 @@ function genRecursion({ref}, state) {
if (!state.allowBestEffort) {
throw new Error('Use of recursion requires option allowBestEffort');
}
// Use syntax supported by `regex-recursion`
// Using the syntax supported by `regex-recursion`
return ref === 0 ? `(?R=${rDepth})` : r`\g<${ref}&R=${rDepth}>`;
}
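
To make the return line of `genRecursion` concrete, here is a standalone sketch of the two token shapes it emits. It assumes that `r` in the source is a raw-string template tag equivalent to `String.raw` and that `rDepth` is the configured depth limit; neither is shown in this diff. `recursionToken` is a hypothetical name used only for illustration.

```javascript
// Illustrative re-creation of the string built by the return line above.
// Assumptions: `r` behaves like String.raw, and `depth` stands in for `rDepth`.
function recursionToken(ref, depth) {
  return ref === 0
    // Whole-pattern recursion
    ? `(?R=${depth})`
    // Recursion into a specific group, by name or by number
    : String.raw`\g<${ref}&R=${depth}>`;
}

console.log(recursionToken(0, 5));   // (?R=5)
console.log(recursionToken(1, 5));   // \g<1&R=5>
console.log(recursionToken('r', 5)); // \g<r&R=5>
```

With this commit a numeric `ref` other than `0` reaches this code path, since the guard in `createRecursion` that rejected it is removed.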

8 changes: 2 additions & 6 deletions src/transform.js
@@ -385,8 +385,8 @@ const SecondPassVisitor = {
// `openDirectCaptures`, and rename `reffedNodesByBackreference` so it can also be used to
// track the reffed node. Like with backrefs, can then modify the `ref` in the final pass
// to use the recalculated group number. But this relies on `regex-recursion` supporting
// multiple non-overlapping recursions and recursion by number. For now, the resulting
// error is caught by `regex-recursion`
// multiple non-overlapping recursions. For now, the resulting error is caught by
// `regex-recursion`
replaceWith(createRecursion(ref));
// This node's kids have been removed from the tree, so no need to traverse them
skip();
@@ -662,10 +662,6 @@ function cloneCapturingGroup(obj, originMap, up, up2) {
}

function createRecursion(ref) {
if (typeof ref === 'number' && ref !== 0) {
// Limitation of `regex-recursion`; remove if future versions support
throw new Error('Unsupported recursion by number; use name instead');
}
return {
type: AstTypes.Recursion,
ref,
