diff --git a/spec.html b/spec.html
index 16f6376797..8788ceafb1 100644
--- a/spec.html
+++ b/spec.html
@@ -766,7 +766,7 @@
Grammar Notation
`9`
If the phrase “[empty]” appears as the right-hand side of a production, it indicates that the production's right-hand side contains no terminals or nonterminals.
- If the phrase “[lookahead = _seq_]” appears in the right-hand side of a production, it indicates that the production may only be used if the token sequence _seq_ is a prefix of the immediately following input token sequence. Similarly, “[lookahead ∈ _set_]”, where _set_ is a finite nonempty set of token sequences, indicates that the production may only be used if some element of _set_ is a prefix of the immediately following token sequence. For convenience, the set can also be written as a nonterminal, in which case it represents the set of all token sequences to which that nonterminal could expand. It is considered an editorial error if the nonterminal could expand to infinitely many distinct token sequences.
+ If the phrase “[lookahead = _seq_]” appears in the right-hand side of a production, it indicates that the production may only be used if the token sequence _seq_ is a prefix of the immediately following input token sequence. Similarly, “[lookahead ∈ _set_]”, where _set_ is a finite nonempty set of token sequences, indicates that the production may only be used if some element of _set_ is a prefix of the immediately following token sequence. For convenience, the set can also be written as a nonterminal, in which case it represents the set of all token sequences to which that nonterminal could expand. It is considered an editorial error if the nonterminal could expand to infinitely many distinct token sequences. [Editorial note: these restrictions need to be loosened for some of the productions in : possibly by allowing _set_ to be specified via a sentential form (not just a nonterminal), and by allowing infinite-but-regular sets.]
These conditions may be negated. “[lookahead ≠ _seq_]” indicates that the containing production may only be used if _seq_ is not a prefix of the immediately following input token sequence, and “[lookahead ∉ _set_]” indicates that the production may only be used if no element of _set_ is a prefix of the immediately following token sequence.
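The four lookahead forms described above can be read as simple prefix tests on the remaining input. The following is a minimal sketch of that reading, not part of the specification text: the helper names (`has_prefix`, `lookahead_in`, `lookahead_not_in`) are hypothetical, and tokens are modeled as single characters purely for illustration.

```python
def has_prefix(tokens, pos, seq):
    """True if seq is a prefix of the input tokens starting at pos."""
    return tuple(tokens[pos:pos + len(seq)]) == tuple(seq)


def lookahead_in(tokens, pos, seqs):
    """[lookahead ∈ set]: some element of the set is a prefix of the input.

    [lookahead = seq] is the special case of a one-element set.
    """
    return any(has_prefix(tokens, pos, seq) for seq in seqs)


def lookahead_not_in(tokens, pos, seqs):
    """[lookahead ∉ set]: no element of the set is a prefix of the input.

    [lookahead ≠ seq] is the special case of a one-element set.
    """
    return not lookahead_in(tokens, pos, seqs)


# Illustration modeled on the restriction [lookahead <! {`b`, `B`}]:
# position 1 is the token immediately after the backslash.
word_boundary = list("\\bx")
digit_class = list("\\d")
blocked = not lookahead_not_in(word_boundary, 1, [("b",), ("B",)])
allowed = lookahead_not_in(digit_class, 1, [("b",), ("B",)])
```

Under this reading, a restriction written with a nonterminal stands for the set of every token sequence that nonterminal can expand to, which is why the finiteness requirement matters for a naive enumeration like the one sketched here.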
As an example, given the definitions:
@@ -46946,7 +46946,7 @@ Syntax
Regular Expressions Patterns
- The syntax of is modified and extended as follows. These changes introduce ambiguities that are broken by the ordering of grammar productions and by contextual information. When parsing using the following grammar, each alternative is considered only if previous production alternatives do not match.
+ The syntax of is modified and extended as follows.
This alternative pattern grammar and semantics only changes the syntax and semantics of BMP patterns. The following grammar extensions include productions parameterized with the [UnicodeMode] parameter. However, none of these extensions change the syntax of Unicode patterns recognized when parsing with the [UnicodeMode] parameter present on the goal symbol.
Syntax
@@ -46976,13 +46976,13 @@ Syntax
ExtendedAtom[N] ::
`.`
- `\` AtomEscape[~UnicodeMode, ?N]
- `\` [lookahead == `c`]
+ `\` [lookahead <! {`b`, `B`}] AtomEscape[~UnicodeMode, ?N]
+ `\` [lookahead == `c`] [lookahead <! `c` ControlLetter]
CharacterClass[~UnicodeMode]
`(` GroupSpecifier[~UnicodeMode] Disjunction[~UnicodeMode, ?N] `)`
`(` `?` `:` Disjunction[~UnicodeMode, ?N] `)`
InvalidBracedQuantifier
- ExtendedPatternCharacter
+ [lookahead <! InvalidBracedQuantifier] ExtendedPatternCharacter
InvalidBracedQuantifier ::
`{` DecimalDigits[~Sep] `}`
@@ -46994,11 +46994,15 @@ Syntax
AtomEscape[UnicodeMode, N] ::
[+UnicodeMode] DecimalEscape
- [~UnicodeMode] DecimalEscape [> but only if the CapturingGroupNumber of |DecimalEscape| is ≤ CountLeftCapturingParensWithin(the |Pattern| containing |DecimalEscape|)]
+ [~UnicodeMode] ConstrainedDecimalEscape
CharacterClassEscape[?UnicodeMode]
- CharacterEscape[?UnicodeMode, ?N]
+ [+UnicodeMode] CharacterEscape[?UnicodeMode, ?N]
+ [~UnicodeMode] [lookahead <! ConstrainedDecimalEscape] CharacterEscape[?UnicodeMode, ?N]
[+N] `k` GroupName[?UnicodeMode]
+ ConstrainedDecimalEscape ::
+ DecimalEscape [> but only if the CapturingGroupNumber of |DecimalEscape| is ≤ CountLeftCapturingParensWithin(the |Pattern| containing |DecimalEscape|)]
+
CharacterEscape[UnicodeMode, N] ::
ControlEscape
`c` ControlLetter
@@ -47006,7 +47010,7 @@ Syntax
HexEscapeSequence
RegExpUnicodeEscapeSequence[?UnicodeMode]
[~UnicodeMode] LegacyOctalEscapeSequence
- IdentityEscape[?UnicodeMode, ?N]
+ [lookahead <! HexEscapeSequence] [lookahead <! RegExpUnicodeEscapeSequence] IdentityEscape[?UnicodeMode, ?N]
IdentityEscape[UnicodeMode, N] ::
[+UnicodeMode] SyntaxCharacter
@@ -47014,20 +47018,23 @@ Syntax
[~UnicodeMode] SourceCharacterIdentityEscape[?N]
SourceCharacterIdentityEscape[N] ::
- [~N] SourceCharacter but not `c`
- [+N] SourceCharacter but not one of `c` or `k`
+ [~N] SourceCharacter but not one of `0` `1` `2` `3` `4` `5` `6` `7` `c` `f` `n` `r` `t` `v` `d` `s` `w` `D` `S` `W`
+ [+N] SourceCharacter but not one of `0` `1` `2` `3` `4` `5` `6` `7` `c` `f` `n` `r` `t` `v` `d` `s` `w` `D` `S` `W` `k`
+ `or`
+ [~N] SourceCharacter but not one of OctalDigit or ControlEscape or CharacterClassEscape or `c`
+ [+N] SourceCharacter but not one of OctalDigit or ControlEscape or CharacterClassEscape or `c` or `k`
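The two candidate spellings of SourceCharacterIdentityEscape above (the explicit character list and the form written with nonterminals) are intended to exclude the same characters. The sketch below checks that claim under stated assumptions: the expansions of OctalDigit, ControlEscape, and non-Unicode CharacterClassEscape are taken from the main grammar rather than from this diff, and the function name `is_source_character_identity_escape` is hypothetical.

```python
# Assumed expansions from the main grammar (not shown in this diff).
OCTAL_DIGIT = set("01234567")
CONTROL_ESCAPE = set("fnrtv")
CHARACTER_CLASS_ESCAPE = set("dDsSwW")  # non-Unicode mode only

# First spelling: the explicit character lists from the added lines.
explicit_no_n = set("01234567" "c" "fnrtv" "dsw" "DSW")
explicit_n = explicit_no_n | {"k"}

# Second spelling: the same exclusions written with nonterminals.
symbolic_no_n = OCTAL_DIGIT | CONTROL_ESCAPE | CHARACTER_CLASS_ESCAPE | {"c"}
symbolic_n = symbolic_no_n | {"k"}

# Both spellings should denote the same excluded set.
spellings_agree = (explicit_no_n == symbolic_no_n) and (explicit_n == symbolic_n)


def is_source_character_identity_escape(ch, named_groups):
    """True if the single character ch may follow `\\` as an identity escape.

    named_groups models the [N] grammar parameter.
    """
    excluded = symbolic_n if named_groups else symbolic_no_n
    return len(ch) == 1 and ch not in excluded
```

With the [N] parameter absent, `k` remains an identity escape, while `c`, the octal digits, and the control and character-class escape letters are excluded in both cases.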
ClassAtomNoDash[UnicodeMode, N] ::
SourceCharacter but not one of `\` or `]` or `-`
`\` ClassEscape[?UnicodeMode, ?N]
- `\` [lookahead == `c`]
+ `\` [lookahead == `c`] [lookahead <! `c` ClassControlLetter] [lookahead <! `c` ControlLetter]
ClassEscape[UnicodeMode, N] ::
`b`
[+UnicodeMode] `-`
[~UnicodeMode] `c` ClassControlLetter
CharacterClassEscape[?UnicodeMode]
- CharacterEscape[?UnicodeMode, ?N]
+ [lookahead != `b`] CharacterEscape[?UnicodeMode, ?N]
ClassControlLetter ::
DecimalDigit