diff --git a/README.md b/README.md index 607da44..fac9509 100644 --- a/README.md +++ b/README.md @@ -13,8 +13,115 @@ Oniguruma-To-ES deeply understands all of the hundreds of large and small differ ## Contents +- [Install and use](#install-and-use) +- [API](#api) - [Options](#options) -- [Unicode, mixed case-sensitivity](#unicode-mixed-case-sensitivity) +- [Unicode / mixed case-sensitivity](#unicode--mixed-case-sensitivity) + +## Install and use + +```sh +npm install oniguruma-to-es +``` + +```js +import {compile, toRegExp} from 'oniguruma-to-es'; +``` + +In browsers: + +```html + +``` + +
+ Using a global name (no import) + +```html + + +``` +
+ +## API + +### `compile` + +Transpiles an Oniguruma regex pattern and flags to native JavaScript. + +```ts +function compile( + pattern: string, + flags?: OnigurumaFlags, + options?: CompileOptions +): { + pattern: string; + flags: string; +}; +``` + +The returned `pattern` and `flags` can be provided directly to the `RegExp` constructor. + +`OnigurumaFlags` are `i`, `m`, and `x` in any order (all optional). Oniguruma's flag `m` is equivalent to JavaScript's flag `s`. + +#### Type `CompileOptions` + +```ts +type CompileOptions = { + allowBestEffort?: boolean; + maxRecursionDepth?: number | null; + optimize?: boolean; + target?: 'ES2018' | 'ES2024' | 'ESNext'; +}; +``` + +See [Options](#options) for more details. + +### `toRegExp` + +Transpiles an Oniguruma regex pattern and flags and returns a native JavaScript `RegExp`. + +```ts +function toRegExp( + pattern: string, + flags?: string, + options?: CompileOptions +): RegExp; +``` + +Flags are any combination of Oniguruma flags `i`, `m`, and `x`, and JavaScript flags `d` and `g`. Oniguruma's flag `m` is equivalent to JavaScript's flag `s`. + +> [!TIP] +> Try it in the [demo REPL](https://slevithan.github.io/oniguruma-to-es/demo/). + +### `toOnigurumaAst` + +Generates an Oniguruma AST from an Oniguruma pattern and flags. + +```ts +function toOnigurumaAst( + pattern: string, + flags?: OnigurumaFlags +): OnigurumaAst; +``` + +### `toRegexAst` + +Generates a [`regex`](https://github.com/slevithan/regex) AST from an Oniguruma pattern and flags. + +```ts +function toRegexAst( + pattern: string, + flags?: OnigurumaFlags +): RegexAst; +``` + +`regex` syntax and behavior is a strict superset of native JavaScript `RegExp`, so the AST is very close to representing native ESNext JavaScript but with some added features (atomic groups, possessive quantifiers, recursion). 
The `regex` AST doesn't use some `regex` features like flag `x` or subroutines because they follow PCRE behavior and work somewhat differently than in Oniguruma. The AST represents what's needed to precisely reproduce the Oniguruma behavior. ## Options @@ -49,7 +156,7 @@ If `null`, any use of recursion throws. If an integer between `2` and `100` (and
More details -Using a higher limit is not a problem if needed. Although there can be a performance cost (generally small unless exacerbating an existing problem with superlinear backtracking), there is no effect on regexes that don't use recursion. +Using a high limit is not a problem if needed. Although there can be a performance cost (minor unless it's exacerbating an existing issue with runaway backtracking), there is no effect on regexes that don't use recursion.
### `optimize` @@ -68,7 +175,7 @@ Sets the JavaScript language version for generated patterns and flags. Later tar More details - `ES2018`: Uses JS flag `u`. - - Emulation restrictions: Character class intersection, nested negated classes, and Unicode properties added after ES2018 are not allowed. + - Emulation restrictions: Character class intersection, nested negated character classes, and Unicode properties added after ES2018 are not allowed. - Generated regexes might use ES2018 features that require Node.js 10 or a browser version released during 2018 to 2023 (in Safari's case). Minimum requirement for any regex is Node.js 6 or a 2016-era browser. - `ES2024`: Uses JS flag `v`. - No emulation restrictions. @@ -78,20 +185,20 @@ Sets the JavaScript language version for generated patterns and flags. Later tar - Generated regexes might use features that require Node.js 23 or a 2024-era browser (except Safari, which lacks support). -## Unicode, mixed case-sensitivity +## Unicode / mixed case-sensitivity Oniguruma-To-ES fully supports mixed case-sensitivity (and handles the Unicode edge cases) regardless of JavaScript [target](#target). It also restricts Unicode properties to those supported by Oniguruma and the target JavaScript version. Oniguruma-To-ES focuses on being lightweight to make it better for use in browsers. This is partly achieved by not including heavyweight Unicode character data, which imposes a couple of minor/rare restrictions: -- Character class intersection and nested negated classes are unsupported with target `ES2018`. Use target `ES2024` or later if you need support for these Oniguruma features. +- Character class intersection and nested negated character classes are unsupported with target `ES2018`. Use target `ES2024` or later if you need support for these Oniguruma features. 
- A handful of Unicode properties that target a specific character case (ex: `\p{Lower}`) can't be used case-insensitively in patterns that contain other characters with a specific case that are used case-sensitively. - In other words, almost every usage is fine, including `A\p{Lower}`, `(?i:A\p{Lower})`, `(?i:A)\p{Lower}`, `(?i:A(?-i:\p{Lower}))`, and `\w(?i:\p{Lower})`, but not `A(?i:\p{Lower})`. - Using these properties case-insensitively is basically never done intentionally, so you're unlikely to encounter this error unless it's catching a mistake. ## Similar projects -[js_regex](https://github.com/jaynetics/js_regex) transpiles [Onigmo](https://github.com/k-takata/Onigmo) regexes to JavaScript (Onigmo is a fork of Oniguruma that has slightly different syntax/behavior). js_regex is written in Ruby and relies on Ruby's Onigmo parser, which means regexes must be pre-transpiled to use them in JavaScript. In contrast, Oniguruma-To-ES is written in JavaScript, so it can be used at runtime. js_regex also produces regexes with more edge cases that don't perfectly follow Oniguruma's behavior, in addition to the Oniguruma/Onigmo differences. +[js_regex](https://github.com/jaynetics/js_regex) transpiles [Onigmo](https://github.com/k-takata/Onigmo) regexes to JavaScript (Onigmo is a fork of Oniguruma that has slightly different syntax/behavior). js_regex is written in Ruby and relies on Ruby's built-in Onigmo parser, which means regexes must be transpiled ahead of time to use them in JavaScript. In contrast, Oniguruma-To-ES is written in JavaScript, so it can be used at runtime. js_regex also produces regexes with more edge cases that don't perfectly follow Oniguruma's behavior, in addition to the Oniguruma/Onigmo differences. 
## About diff --git a/dist/index.min.js b/dist/index.min.js index 8508132..7fc40c0 100644 --- a/dist/index.min.js +++ b/dist/index.min.js @@ -1,4 +1,4 @@ -var OnigurumaToES=(()=>{var ne=Object.defineProperty;var mt=Object.getOwnPropertyDescriptor;var Et=Object.getOwnPropertyNames;var wt=Object.prototype.hasOwnProperty;var At=(e,t)=>{for(var r in t)ne(e,r,{get:t[r],enumerable:!0})},St=(e,t,r,s)=>{if(t&&typeof t=="object"||typeof t=="function")for(let n of Et(t))!wt.call(e,n)&&n!==r&&ne(e,n,{get:()=>t[n],enumerable:!(s=mt(t,n))||s.enumerable});return e};var _t=e=>St(ne({},"__esModule",{value:!0}),e);var Rr={};At(Rr,{compile:()=>ke,toOnigurumaAst:()=>Ct,toRegExp:()=>vr,toRegexAst:()=>Lr});var se=String.raw,be=se`(?:\p{Emoji}\uFE0F\u20E3?|\p{Emoji_Modifier_Base}\p{Emoji_Modifier}?|\p{Emoji_Presentation})`,xe=se`\u{E0061}-\u{E007A}`,ye=()=>new RegExp(se`[\u{1F1E6}-\u{1F1FF}]{2}|\u{1F3F4}[${xe}]{2}[\u{E0030}-\u{E0039}${xe}]{1,3}\u{E007F}|${be}(?:\u200D${be})*`,"gu");var A=String.fromCodePoint,f=String.raw,q={ES2018:2018,ES2024:2024,ESNext:2025};function U(e,{enable:t,disable:r}){return{dotAll:!r?.dotAll&&!!(t?.dotAll||e.dotAll),ignoreCase:!r?.ignoreCase&&!!(t?.ignoreCase||e.ignoreCase)}}function L(e,t,r){return e.has(t)||e.set(t,r),e.get(t)}function j(e,t){return q[e]>=q[t]}function N(e,t){if(!e)throw new Error(t??"Value expected");return e}var kt=new Set([A(304),A(305)]);function oe(e){if(kt.has(e))return[e];let t=new Set,r=e.toLowerCase(),s=r.toUpperCase(),n=yt.get(r),o=xt.get(r);return[...s].length===1&&t.add(s),t.add(r),n&&t.add(n),o&&t.add(o),[...t]}var ie=new 
Set(["C","Other","Cc","Control","cntrl","Cf","Format","Cn","Unassigned","Co","Private_Use","Cs","Surrogate","L","Letter","LC","Cased_Letter","Ll","Lowercase_Letter","Lm","Modifier_Letter","Lo","Other_Letter","Lt","Titlecase_Letter","Lu","Uppercase_Letter","M","Mark","Combining_Mark","Mc","Spacing_Mark","Me","Enclosing_Mark","Mn","Nonspacing_Mark","N","Number","Nd","Decimal_Number","digit","Nl","Letter_Number","No","Other_Number","P","Punctuation","punct","Pc","Connector_Punctuation","Pd","Dash_Punctuation","Pe","Close_Punctuation","Pf","Final_Punctuation","Pi","Initial_Punctuation","Po","Other_Punctuation","Ps","Open_Punctuation","S","Symbol","Sc","Currency_Symbol","Sk","Modifier_Symbol","Sm","Math_Symbol","So","Other_Symbol","Z","Separator","Zl","Line_Separator","Zp","Paragraph_Separator","Zs","Space_Separator","ASCII","ASCII_Hex_Digit","AHex","Alphabetic","Alpha","Any","Assigned","Bidi_Control","Bidi_C","Bidi_Mirrored","Bidi_M","Case_Ignorable","CI","Cased","Changes_When_Casefolded","CWCF","Changes_When_Casemapped","CWCM","Changes_When_Lowercased","CWL","Changes_When_NFKC_Casefolded","CWKCF","Changes_When_Titlecased","CWT","Changes_When_Uppercased","CWU","Dash","Default_Ignorable_Code_Point","DI","Deprecated","Dep","Diacritic","Dia","Emoji","Emoji_Component","EComp","Emoji_Modifier","EMod","Emoji_Modifier_Base","EBase","Emoji_Presentation","EPres","Extended_Pictographic","ExtPict","Extender","Ext","Grapheme_Base","Gr_Base","Grapheme_Extend","Gr_Ext","Hex_Digit","Hex","IDS_Binary_Operator","IDSB","IDS_Trinary_Operator","IDST","ID_Continue","IDC","ID_Start","IDS","Ideographic","Ideo","Join_Control","Join_C","Logical_Order_Exception","LOE","Lowercase","Lower","Math","Noncharacter_Code_Point","NChar","Pattern_Syntax","Pat_Syn","Pattern_White_Space","Pat_WS","Quotation_Mark","QMark","Radical","Regional_Indicator","RI","Sentence_Terminal","STerm","Soft_Dotted","SD","Terminal_Punctuation","Term","Unified_Ideograph","UIdeo","Uppercase","Upper","Variation_Selector","VS","W
hite_Space","space","XID_Continue","XIDC","XID_Start","XIDS"]),ce=new Map;for(let e of ie)ce.set(K(e),e);var bt=new Set(["Basic_Emoji","Emoji_Keycap_Sequence","RGI_Emoji","RGI_Emoji_Flag_Sequence","RGI_Emoji_Modifier_Sequence","RGI_Emoji_Tag_Sequence","RGI_Emoji_ZWJ_Sequence"]),ue=new Map;for(let e of bt)ue.set(K(e),e);var Fe=new Set("Dogr Dogra Gong Gunjala_Gondi Hanifi_Rohingya Maka Makasar Medefaidrin Medf Old_Sogdian Rohg Sogd Sogdian Sogo Extended_Pictographic Elym Elymaic Hmnp Nand Nandinagari Nyiakeng_Puachue_Hmong Wancho Wcho Chorasmian Chrs Diak Dives_Akuru Khitan_Small_Script Kits Yezi Yezidi EBase EComp EMod EPres ExtPict Cpmn Cypro_Minoan Old_Uyghur Ougr Tangsa Tnsa Toto Vith Vithkuqi Gara Garay Gukh Gurung_Khema Hrkt Katakana_Or_Hiragana Kawi Kirat_Rai Krai Nag_Mundari Nagm Ol_Onal Onao Sunu Sunuwar Todhri Todr Tulu_Tigalari Tutg Unknown Zzzz".split(" ")),xt=new Map([[A(223),A(7838)],[A(107),A(8490)],[A(229),A(8491)],[A(969),A(8486)]]),yt=new Map([v(453),v(456),v(459),v(498),...ae(8072,8079),...ae(8088,8095),...ae(8104,8111),v(8124),v(8140),v(8188)]),X=new Map([["alnum",f`[\p{Alpha}\p{Nd}]`],["alpha",f`\p{Alpha}`],["ascii",f`\p{ASCII}`],["blank",f`[\p{Zs}\t]`],["cntrl",f`\p{cntrl}`],["digit",f`\p{Nd}`],["graph",f`[\P{space}&&\P{cntrl}&&\P{Cn}&&\P{Cs}]`],["lower",f`\p{Lower}`],["print",f`[[\P{space}&&\P{cntrl}&&\P{Cn}&&\P{Cs}]\p{Zs}]`],["punct",f`[\p{P}\p{S}]`],["space",f`\p{space}`],["upper",f`\p{Upper}`],["word",f`[\p{Alpha}\p{M}\p{Nd}\p{Pc}]`],["xdigit",f`\p{AHex}`]]),Ne=new Set(["alnum","blank","graph","print","word","xdigit"]);function Ft(e,t){let r=[];for(let s=e;s<=t;s++)r.push(s);return r}function K(e){return e.replace(/[- _]+/g,"").toLowerCase()}function v(e){let t=A(e);return[t.toLowerCase(),t]}function ae(e,t){return Ft(e,t).map(r=>v(r))}var le=new Set(["Lower","Lowercase","Upper","Uppercase","Ll","Lowercase_Letter","Lt","Titlecase_Letter","Lu","Uppercase_Letter"]);var 
g={Alternator:"Alternator",Assertion:"Assertion",Backreference:"Backreference",Character:"Character",CharacterClassClose:"CharacterClassClose",CharacterClassHyphen:"CharacterClassHyphen",CharacterClassIntersector:"CharacterClassIntersector",CharacterClassOpen:"CharacterClassOpen",CharacterSet:"CharacterSet",Directive:"Directive",GroupClose:"GroupClose",GroupOpen:"GroupOpen",Subroutine:"Subroutine",Quantifier:"Quantifier",VariableLengthCharacterSet:"VariableLengthCharacterSet",EscapedNumber:"EscapedNumber"},S={any:"any",digit:"digit",hex:"hex",posix:"posix",property:"property",space:"space",word:"word"},z={keep:"keep",flags:"flags"},_={atomic:"atomic",capturing:"capturing",group:"group",lookahead:"lookahead",lookbehind:"lookbehind"},$e=new Map([["a",7],["b",8],["e",27],["f",12],["n",10],["r",13],["t",9],["v",11]]),Te="c.? | C(?:-.?)?",Le=f`[pP]\{(?:\^?[\x20\w]+\})?`,ve=f`u\{[^\}]*\}? | u\p{AHex}{0,4} | x\p{AHex}{0,2}`,Re=f`\d{1,3}`,De=f`\[\^?\]?`,Ue=/[?*+][?+]?|\{\d+(?:,\d*)?\}\??/,J=new RegExp(f` +var OnigurumaToES=(()=>{var ne=Object.defineProperty;var mt=Object.getOwnPropertyDescriptor;var Et=Object.getOwnPropertyNames;var wt=Object.prototype.hasOwnProperty;var At=(e,t)=>{for(var r in t)ne(e,r,{get:t[r],enumerable:!0})},St=(e,t,r,s)=>{if(t&&typeof t=="object"||typeof t=="function")for(let n of Et(t))!wt.call(e,n)&&n!==r&&ne(e,n,{get:()=>t[n],enumerable:!(s=mt(t,n))||s.enumerable});return e};var _t=e=>St(ne({},"__esModule",{value:!0}),e);var Rr={};At(Rr,{compile:()=>ke,toOnigurumaAst:()=>Ct,toRegExp:()=>vr,toRegexAst:()=>Lr});var se=String.raw,be=se`(?:\p{Emoji}\uFE0F\u20E3?|\p{Emoji_Modifier_Base}\p{Emoji_Modifier}?|\p{Emoji_Presentation})`,xe=se`\u{E0061}-\u{E007A}`,ye=()=>new RegExp(se`[\u{1F1E6}-\u{1F1FF}]{2}|\u{1F3F4}[${xe}]{2}[\u{E0030}-\u{E0039}${xe}]{1,3}\u{E007F}|${be}(?:\u200D${be})*`,"gu");var A=String.fromCodePoint,f=String.raw,q={ES2018:2018,ES2024:2024,ESNext:2025};function 
U(e,{enable:t,disable:r}){return{dotAll:!r?.dotAll&&!!(t?.dotAll||e.dotAll),ignoreCase:!r?.ignoreCase&&!!(t?.ignoreCase||e.ignoreCase)}}function L(e,t,r){return e.has(t)||e.set(t,r),e.get(t)}function j(e,t){return q[e]>=q[t]}function N(e,t){if(!e)throw new Error(t??"Value expected");return e}var kt=new Set([A(304),A(305)]);function oe(e){if(kt.has(e))return[e];let t=new Set,r=e.toLowerCase(),s=r.toUpperCase(),n=yt.get(r),o=xt.get(r);return[...s].length===1&&t.add(s),t.add(r),n&&t.add(n),o&&t.add(o),[...t]}var ie=new Set(["C","Other","Cc","Control","cntrl","Cf","Format","Cn","Unassigned","Co","Private_Use","Cs","Surrogate","L","Letter","LC","Cased_Letter","Ll","Lowercase_Letter","Lm","Modifier_Letter","Lo","Other_Letter","Lt","Titlecase_Letter","Lu","Uppercase_Letter","M","Mark","Combining_Mark","Mc","Spacing_Mark","Me","Enclosing_Mark","Mn","Nonspacing_Mark","N","Number","Nd","Decimal_Number","digit","Nl","Letter_Number","No","Other_Number","P","Punctuation","punct","Pc","Connector_Punctuation","Pd","Dash_Punctuation","Pe","Close_Punctuation","Pf","Final_Punctuation","Pi","Initial_Punctuation","Po","Other_Punctuation","Ps","Open_Punctuation","S","Symbol","Sc","Currency_Symbol","Sk","Modifier_Symbol","Sm","Math_Symbol","So","Other_Symbol","Z","Separator","Zl","Line_Separator","Zp","Paragraph_Separator","Zs","Space_Separator","ASCII","ASCII_Hex_Digit","AHex","Alphabetic","Alpha","Any","Assigned","Bidi_Control","Bidi_C","Bidi_Mirrored","Bidi_M","Case_Ignorable","CI","Cased","Changes_When_Casefolded","CWCF","Changes_When_Casemapped","CWCM","Changes_When_Lowercased","CWL","Changes_When_NFKC_Casefolded","CWKCF","Changes_When_Titlecased","CWT","Changes_When_Uppercased","CWU","Dash","Default_Ignorable_Code_Point","DI","Deprecated","Dep","Diacritic","Dia","Emoji","Emoji_Component","EComp","Emoji_Modifier","EMod","Emoji_Modifier_Base","EBase","Emoji_Presentation","EPres","Extended_Pictographic","ExtPict","Extender","Ext","Grapheme_Base","Gr_Base","Grapheme_Extend","Gr_Ext","H
ex_Digit","Hex","IDS_Binary_Operator","IDSB","IDS_Trinary_Operator","IDST","ID_Continue","IDC","ID_Start","IDS","Ideographic","Ideo","Join_Control","Join_C","Logical_Order_Exception","LOE","Lowercase","Lower","Math","Noncharacter_Code_Point","NChar","Pattern_Syntax","Pat_Syn","Pattern_White_Space","Pat_WS","Quotation_Mark","QMark","Radical","Regional_Indicator","RI","Sentence_Terminal","STerm","Soft_Dotted","SD","Terminal_Punctuation","Term","Unified_Ideograph","UIdeo","Uppercase","Upper","Variation_Selector","VS","White_Space","space","XID_Continue","XIDC","XID_Start","XIDS"]),ce=new Map;for(let e of ie)ce.set(K(e),e);var bt=new Set(["Basic_Emoji","Emoji_Keycap_Sequence","RGI_Emoji","RGI_Emoji_Flag_Sequence","RGI_Emoji_Modifier_Sequence","RGI_Emoji_Tag_Sequence","RGI_Emoji_ZWJ_Sequence"]),ue=new Map;for(let e of bt)ue.set(K(e),e);var Fe=new Set("Dogr Dogra Gong Gunjala_Gondi Hanifi_Rohingya Maka Makasar Medefaidrin Medf Old_Sogdian Rohg Sogd Sogdian Sogo Extended_Pictographic Elym Elymaic Hmnp Nand Nandinagari Nyiakeng_Puachue_Hmong Wancho Wcho Chorasmian Chrs Diak Dives_Akuru Khitan_Small_Script Kits Yezi Yezidi EBase EComp EMod EPres ExtPict Cpmn Cypro_Minoan Old_Uyghur Ougr Tangsa Tnsa Toto Vith Vithkuqi Gara Garay Gukh Gurung_Khema Hrkt Katakana_Or_Hiragana Kawi Kirat_Rai Krai Nag_Mundari Nagm Ol_Onal Onao Sunu Sunuwar Todhri Todr Tulu_Tigalari Tutg Unknown Zzzz".split(" ")),xt=new Map([[A(223),A(7838)],[A(107),A(8490)],[A(229),A(8491)],[A(969),A(8486)]]),yt=new Map([v(453),v(456),v(459),v(498),...ae(8072,8079),...ae(8088,8095),...ae(8104,8111),v(8124),v(8140),v(8188)]),X=new 
Map([["alnum",f`[\p{Alpha}\p{Nd}]`],["alpha",f`\p{Alpha}`],["ascii",f`\p{ASCII}`],["blank",f`[\p{Zs}\t]`],["cntrl",f`\p{cntrl}`],["digit",f`\p{Nd}`],["graph",f`[\P{space}&&\P{cntrl}&&\P{Cn}&&\P{Cs}]`],["lower",f`\p{Lower}`],["print",f`[[\P{space}&&\P{cntrl}&&\P{Cn}&&\P{Cs}]\p{Zs}]`],["punct",f`[\p{P}\p{S}]`],["space",f`\p{space}`],["upper",f`\p{Upper}`],["word",f`[\p{Alpha}\p{M}\p{Nd}\p{Pc}]`],["xdigit",f`\p{AHex}`]]),Ne=new Set(["alnum","blank","graph","print","word","xdigit"]);function Ft(e,t){let r=[];for(let s=e;s<=t;s++)r.push(s);return r}function K(e){return e.replace(/[- _]+/g,"").toLowerCase()}function v(e){let t=A(e);return[t.toLowerCase(),t]}function ae(e,t){return Ft(e,t).map(r=>v(r))}var le=new Set(["Lower","Lowercase","Upper","Uppercase","Ll","Lowercase_Letter","Lt","Titlecase_Letter","Lu","Uppercase_Letter"]);var g={Alternator:"Alternator",Assertion:"Assertion",Backreference:"Backreference",Character:"Character",CharacterClassClose:"CharacterClassClose",CharacterClassHyphen:"CharacterClassHyphen",CharacterClassIntersector:"CharacterClassIntersector",CharacterClassOpen:"CharacterClassOpen",CharacterSet:"CharacterSet",Directive:"Directive",GroupClose:"GroupClose",GroupOpen:"GroupOpen",Subroutine:"Subroutine",Quantifier:"Quantifier",VariableLengthCharacterSet:"VariableLengthCharacterSet",EscapedNumber:"EscapedNumber"},S={any:"any",digit:"digit",hex:"hex",posix:"posix",property:"property",space:"space",word:"word"},z={flags:"flags",keep:"keep"},_={atomic:"atomic",capturing:"capturing",group:"group",lookahead:"lookahead",lookbehind:"lookbehind"},$e=new Map([["a",7],["b",8],["e",27],["f",12],["n",10],["r",13],["t",9],["v",11]]),Te="c.? | C(?:-.?)?",Le=f`[pP]\{(?:\^?[\x20\w]+\})?`,ve=f`u\{[^\}]*\}? 
| u\p{AHex}{0,4} | x\p{AHex}{0,2}`,Re=f`\d{1,3}`,De=f`\[\^?\]?`,Ue=/[?*+][?+]?|\{\d+(?:,\d*)?\}\??/,J=new RegExp(f` \\ (?: ${Te} | ${Le} diff --git a/spec/match-assertion.spec.js b/spec/match-assertion.spec.js index 8bb9f53..8e6a806 100644 --- a/spec/match-assertion.spec.js +++ b/spec/match-assertion.spec.js @@ -7,7 +7,7 @@ beforeEach(() => { }); describe('Assertion', () => { - // For kinds `lookahead` and `lookbehind`, see `match-lookaround.spec.js` + // [Note] For kinds `lookahead` and `lookbehind`, see `match-lookaround.spec.js` describe('line_end', () => { it('should match at the end of the string', () => { diff --git a/spec/match-capturing-group.spec.js b/spec/match-capturing-group.spec.js new file mode 100644 index 0000000..5c31de1 --- /dev/null +++ b/spec/match-capturing-group.spec.js @@ -0,0 +1,22 @@ +import {r} from '../src/utils.js'; +import {matchers} from './helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('CapturingGroup', () => { + describe('named', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); + + describe('numbered', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); +}); diff --git a/spec/match-char-class-intersection.spec.js b/spec/match-char-class-intersection.spec.js new file mode 100644 index 0000000..d72e342 --- /dev/null +++ b/spec/match-char-class-intersection.spec.js @@ -0,0 +1,14 @@ +import {r} from '../src/utils.js'; +import {matchers} from './helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('CharacterClassIntersection', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); +}); diff --git a/spec/match-char-class-range.spec.js b/spec/match-char-class-range.spec.js new file mode 100644 index 0000000..01f2d80 --- /dev/null +++ b/spec/match-char-class-range.spec.js @@ -0,0 +1,14 @@ +import {r} from '../src/utils.js'; +import {matchers} from 
'./helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('CharacterClassRange', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); +}); diff --git a/spec/match-char-class.spec.js b/spec/match-char-class.spec.js index 20b3cc1..554b2b4 100644 --- a/spec/match-char-class.spec.js +++ b/spec/match-char-class.spec.js @@ -61,8 +61,8 @@ describe('CharacterClass', () => { }); }); - // TODO: Rest + // TODO: Add remaining }); - // TODO: Rest + // TODO: Add remaining }); diff --git a/spec/match-char-set.spec.js b/spec/match-char-set.spec.js index 5e0b33f..1003030 100644 --- a/spec/match-char-set.spec.js +++ b/spec/match-char-set.spec.js @@ -18,5 +18,5 @@ describe('CharacterSet', () => { }); }); - // TODO: Rest + // TODO: Add remaining }); diff --git a/spec/match-directive.spec.js b/spec/match-directive.spec.js new file mode 100644 index 0000000..da5efa2 --- /dev/null +++ b/spec/match-directive.spec.js @@ -0,0 +1,22 @@ +import {r} from '../src/utils.js'; +import {matchers} from './helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('Directive', () => { + describe('flags', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); + + describe('keep', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); +}); diff --git a/spec/match-flags.spec.js b/spec/match-flags.spec.js new file mode 100644 index 0000000..85982ad --- /dev/null +++ b/spec/match-flags.spec.js @@ -0,0 +1,14 @@ +import {r} from '../src/utils.js'; +import {matchers} from './helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('Flags', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); +}); diff --git a/spec/match-group.spec.js b/spec/match-group.spec.js new file mode 100644 index 0000000..d1d6529 --- /dev/null +++ b/spec/match-group.spec.js @@ -0,0 +1,28 @@ 
+import {r} from '../src/utils.js'; +import {matchers} from './helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('Group', () => { + describe('atomic', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); + + describe('flags', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); + + describe('noncapturing', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); +}); diff --git a/spec/match-quantifier.spec.js b/spec/match-quantifier.spec.js new file mode 100644 index 0000000..b410428 --- /dev/null +++ b/spec/match-quantifier.spec.js @@ -0,0 +1,28 @@ +import {r} from '../src/utils.js'; +import {matchers} from './helpers/matchers.js'; + +beforeEach(() => { + jasmine.addMatchers(matchers); +}); + +// TODO: Add me + +describe('Quantifier', () => { + describe('greedy', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); + + describe('lazy', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); + + describe('possessive', () => { + it('should', () => { + expect('').toExactlyMatch(r``); + }); + }); +}); diff --git a/src/tokenize.js b/src/tokenize.js index 9501ea6..81a6cfc 100644 --- a/src/tokenize.js +++ b/src/tokenize.js @@ -33,8 +33,8 @@ const TokenCharacterSetKinds = { }; const TokenDirectiveKinds = { - keep: 'keep', flags: 'flags', + keep: 'keep', }; const TokenGroupKinds = { diff --git a/src/unicode.js b/src/unicode.js index 9964b62..a56a1e1 100644 --- a/src/unicode.js +++ b/src/unicode.js @@ -174,7 +174,7 @@ const JsUnicodePropertiesPostEs2018 = new Set(( ' Cpmn Cypro_Minoan Old_Uyghur Ougr Tangsa Tnsa Toto Vith Vithkuqi' + // ES2023 scripts ' Gara Garay Gukh Gurung_Khema Hrkt Katakana_Or_Hiragana Kawi Kirat_Rai Krai Nag_Mundari Nagm Ol_Onal Onao Sunu Sunuwar Todhri Todr Tulu_Tigalari Tutg Unknown Zzzz' - // ES2024: None added except `JsUnicodePropertiesOfStrings` + // ES2024: None, but 
added `JsUnicodePropertiesOfStrings` ).split(' ')); const LowerToAlternativeUpperCaseMap = new Map([