Normative: Always check regular expression flags by "flags" #2791
This imo goes in the wrong direction - it creates more observable behavior, via the regex subclassing system that none of us want but it's too late to remove, and it may have the unfortunate side effect of causing websites using polyfills to break (I absolutely have tests, at least, that would break with this change, that rely on |
@ljharb This seems to me to reduce the amount of observable behavior? Previously |
lol i see that argument, and to be sure, the real reduced one would be checking if g and u are in the flags, fetched from the slot, rather than looking up global/unicode at all. |
I’d be stoked if we can get away with this (which as @gibson042 points out, actually seems plausible). |
@ljharb can you elaborate on the purpose of verifying that built-in |
@gibson042 The purpose of verifying it would be that it's in the spec, and thus that's the precise behavior engines are required to have. Since no engines I'm aware of have implemented this incorrectly, i don't think I have any code that would care - but an in-progress shim I'm working on for the regex symbol methods (which isn't public yet) would need to test for this PR's change, and then it would need to patch the relevant methods in every browser prior to it, even if otherwise no patch would be needed. |
It's rare for tests to verify that things don't happen, and the set of such (non-)events is unbounded (although in this particular case two get-unicode-error.js tests introduced by @jugglinmike in tc39/test262#352 do in fact attempt to do so). Regardless, except for hypothetical code that copies the built-in
Receiver that self-reports as non-global:
+Get "flags"
+# Built-in "flags" getter:
Get "global"
+Get "hasIndices"
+Get "global"
+Get "ignoreCase"
+Get "multiline"
+Get "dotAll"
+Get "unicode"
+Get "sticky"
Receiver that self-reports as global:
+Get "flags"
+# Built-in "flags" getter:
Get "global"
+Get "hasIndices"
+Get "global"
+Get "ignoreCase"
+Get "multiline"
+Get "dotAll"
Get "unicode"
+Get "sticky"
The responsible discrepancy seems to date back to ES2015, in which RegExp.prototype[@@split] reads "flags" but @@match and @@replace have conditional single-flag reads. The previous edition (ES5.1) didn't specify a |
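The property reads listed above can be observed from script. The following is a minimal sketch, not part of the PR: `makeLoggingReceiver` and `traceMatch` are hypothetical helpers, and the receiver is a plain object (not a real RegExp) whose `exec` always returns `null`. Its "flags" accessor delegates to the built-in getter so that the individual flag reads are logged too. The sequence printed depends on whether the engine running it implements this change, so no expected output is given.

```javascript
// Hypothetical helper: builds a plain-object receiver that logs property reads.
function makeLoggingReceiver(isGlobal) {
  const log = [];
  const builtinFlagsGetter =
    Object.getOwnPropertyDescriptor(RegExp.prototype, "flags").get;

  // exec always reports "no match", so @@match returns null either way.
  const receiver = { exec() { return null; } };

  const flagValues = {
    hasIndices: false, global: isGlobal, ignoreCase: false, multiline: false,
    dotAll: false, unicode: false, unicodeSets: false, sticky: false,
  };
  for (const [name, value] of Object.entries(flagValues)) {
    Object.defineProperty(receiver, name, {
      get() { log.push(name); return value; },
    });
  }

  // Log the "flags" read, then let the built-in getter read each flag.
  Object.defineProperty(receiver, "flags", {
    get() { log.push("flags"); return builtinFlagsGetter.call(receiver); },
  });
  return { receiver, log };
}

// Hypothetical helper: runs the built-in @@match against a logging receiver.
function traceMatch(isGlobal) {
  const { receiver, log } = makeLoggingReceiver(isGlobal);
  const result = RegExp.prototype[Symbol.match].call(receiver, "abc");
  return { result, log };
}

const nonGlobalTrace = traceMatch(false);
const globalTrace = traceMatch(true);
console.log("non-global:", nonGlobalTrace.log.join(", "));
console.log("global:    ", globalTrace.log.join(", "));
```

An engine that predates the change should log only the single-flag reads ("global", then "unicode" for the global receiver); an engine that includes it should log "flags" first, followed by the reads performed by the built-in getter.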
I would expect that the very inconsistency you're trying to fix here is why this specific non-happening would be tested for, but I agree it's going to be rare in existing code. |
On a re-review of this PR and the comment thread, I'm completely convinced that this is a good change that reduces observable lookups. It will likely have two negative effects for my packages:
imo, neither of these negative effects should obstruct consensus on this PR. |
test262 PR: tc39/test262#3618 |
7f45c77 to 35b7eb2
Implement the changes from <tc39/ecma262#2791>. Depends on D156843 Differential Revision: https://phabricator.services.mozilla.com/D156844
…ter only
https://bugs.webkit.org/show_bug.cgi?id=248605

Reviewed by NOBODY (OOPS!).

This change implements recent spec change [1], aligning RegExp.prototype's @@match / @@replace with @@split / @@matchall to use only "flags" getter to check for flags, which is observable. Ensures that DFG watches and constant-folds all related invoked prototype getters, which was proven to be neutral on both JetStream2 and Speedometer2.

[1]: tc39/ecma262#2791

* JSTests/test262/expectations.yaml: Mark 12 test cases as passing.
* Source/JavaScriptCore/builtins/BuiltinNames.h:
* Source/JavaScriptCore/builtins/RegExpPrototype.js:
(linkTimeConstant.hasObservableSideEffectsForRegExpMatch):
(linkTimeConstant.matchSlow):
(overriddenName.string_appeared_here.replace):
(linkTimeConstant.hasObservableSideEffectsForRegExpSplit):
* Source/JavaScriptCore/builtins/StringPrototype.js:
(linkTimeConstant.hasObservableSideEffectsForStringReplace):
* Source/JavaScriptCore/bytecode/LinkTimeConstant.h:
* Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:
(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):
* Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:
(JSC::DFG::FixupPhase::addStringReplacePrimordialChecks):
* Source/JavaScriptCore/runtime/JSGlobalObject.cpp:
(JSC::JSGlobalObject::init):
* Source/JavaScriptCore/runtime/JSGlobalObject.h:
* Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:
(JSC::JSGlobalObject::regExpProtoFlagsGetter const):
(JSC::JSGlobalObject::regExpProtoDotAllGetter const):
(JSC::JSGlobalObject::regExpProtoHasIndicesGetter const):
(JSC::JSGlobalObject::regExpProtoIgnoreCaseGetter const):
(JSC::JSGlobalObject::regExpProtoMultilineGetter const):
(JSC::JSGlobalObject::regExpProtoStickyGetter const):
Discussion at #2418 (comment) uncovered a strange inconsistency: unlike other methods on or related to regular expressions, RegExp.prototype[@@match] and RegExp.prototype[@@replace] bypass the "flags" property, instead reading directly (and only) from "global" and "unicode"—and doing so conditionally, checking the latter if and only if the former is coerced to boolean true. This is awkward for the introduction of v-mode regular expressions, which need to check a third flag in at least some cases and therefore require taking a position on when to get the property (which is observable).
But these are the only two methods that have such a problem... String.prototype.matchAll and String.prototype.replaceAll and RegExp.prototype[@@matchAll] and RegExp.prototype[@@split] get "flags", coerce to string, and then check the result for "u"/"g"/etc. (and the built-in RegExp.prototype.flags itself performs an unconditional Get of each specific flag property in a stable order that is compatible with the deviant operations [i.e., "global" before "unicode"]).
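As an illustration of the "flags"-based methods, here is a small sketch (not from the PR) showing that String.prototype.replaceAll and String.prototype.matchAll decide globalness from the "flags" string rather than from "global". The `shadowed` regex and the `throwsTypeError` helper are hypothetical names introduced only for this example; an own "flags" data property shadows the built-in getter.

```javascript
// A real global regex whose own "flags" data property shadows the built-in getter.
const shadowed = /a/g;
Object.defineProperty(shadowed, "flags", { value: "" });

// Hypothetical helper: reports whether calling fn throws a TypeError.
function throwsTypeError(fn) {
  try {
    fn();
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
}

// Both methods read "flags", coerce it to a string, and require it to contain "g".
const replaceAllRejects = throwsTypeError(() => "aaa".replaceAll(shadowed, "b"));
const matchAllRejects = throwsTypeError(() => "aaa".matchAll(shadowed));

// The built-in "global" accessor still reports true; it is simply not consulted.
const globalStillTrue = shadowed.global;

console.log({ replaceAllRejects, matchAllRejects, globalStillTrue });
```

Both calls are rejected even though the regex was created with the g flag, because the check looks only at the coerced "flags" value.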
I would like to fix the inconsistency and render the negotiation of #2418 (comment) moot by updating @@match and @@replace to align with the other methods in using "flags". This is a normative change, but one that seems likely to be web-compatible because any well-behaved regular expression analog is already required to support "flags" for the other methods, and deviation would actually require special effort: an author would need to copy the built-in @@match and/or @@replace but specifically remove or override the built-in flags getter to be inconsistent with independent global/unicode access. And if we don't fix this now, then the problem is only going to get worse as more flags are added.