-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JsEscaper duplicating json_encode() ? #52
Comments
@mindplay-dk that's an interesting question which I am not sure what is the best choice. |
@harikt mainly I have two reservations about aggressively escaping all unicode characters:
As far as I can figure, the only justification for a custom implementation, is the fact that However, Here's the (UTF-8 only) function I ended up with: /**
* Escape string for use in a JavaScript string literal context, e.g.:
*
* <script>
* document.title = 'Page: <%= html(js($title)) %>';
* </script>
*
* Note the use of both `html()` and `js()` here, since there are actually two nested
* contexts in this example: a JavaScript string in an HTML context.
*
* To illustrate why HTML context isn't simply assumed, here is an example using both
* `html()` and `js()`, since the following is JavaScript in an HTML-attribute context:
*
* <button onclick="alert('Hello, <?= attr(js($username)) ?>')">
*
* @param string|mixed $value UTF-8 string (or any string-castable value)
*
* @return string Javascript-string-escaped UTF-8 string
*/
function js($value)
{
$value = (string) $value;
if ($value === "") {
return $value;
}
$str = json_encode($value, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);
if (json_last_error() !== JSON_ERROR_NONE) {
throw new InvalidArgumentException(json_last_error_msg());
}
return preg_replace("#'#u", "\\'", substr($str, 1, -1));
} This won't escape unicode characters or slashes - but will escape the single quote, which isn't escaped by Here's my (Codeception Cest) spec, if that's helpful: public function escapeJavascriptStrings(UnitTester $I)
{
$I->assertSame('\\\'', js('\''), "single quotes are escaped");
$I->assertSame('\\"', js('"'), "double quotes are escaped");
$I->assertSame('\\\\', js('\\'), "backslashes are escaped");
$I->assertSame('æøå', js('æøå'), "UTF-8 characters are preserved");
$I->assertSame("", js(null), "null treated as empty string");
$I->assertSame("123", js(123), "numeric values treated as strings");
} |
@mindplay-dk your functions are quite impressive. But I am not sure if there is any edge case we are missing. I was also looking at https://docs.zendframework.com/zend-escaper/escaping-javascript/ . May be there is some where a test suite against xss where we could try. Thanks again for exploring your time against the same. |
@mindplay-dk The note from @harikt regarding the Zend Escaper protections (from which the escaping in this library is derived) seems accurate to me; i.e., json_encode() by itself is insufficient. @jakejohns If you don't mind I can close this issue, or you can. |
Yes. Maybe you didn't notice the examples in the doc-block? I regard JS and HTML escaping as being different things - the The reason I decided to go this way, is because I might have a view for an If I'm inserting into another context, I make two function-calls. This takes a little getting used to, but I personally found that this leads to a lower total number of functions, combinations of which cover just about any use-case. For example, it composes well when I'm building initialization code:
And then injecting that into an HTML page:
I mean, if you care about HTML views only, yeah, you might as well go ahead and have that function escape for HTML right away - I personally like having functions that do one thing. I have a total of four helper-functions for views and don't anticipate needing any more: I mix these as needed, which isn't "safe by default" in HTML contexts, but forces me to think about contexts wrapping contexts which I find reduces the risk of blindly trusting these functions for any given scenario. I realize that's mostly a matter of taste and personal preference :-) |
Maybe I'm missing your point buried in the wall-of-text. Can you state the change you'd like to see in a sentence or two, or in a PR? |
Heh, no, I'm not asking you to change anything, I was just trying to explain my own approach. The reason I even commented in the first place was simply because I was referencing your implementation, trying to learn if there was something I was missing, or why it is you need to custom-code stuff in PHP that's already available as functions. I think we just have different goals and objectives, so you guys should probably just keep doing what you're doing :-) |
Ah so! If you have links to share on your own implementations, I'm sure we'd be interested to seem them. |
The project is closed-source (for the time being) but here's a copy/paste of the view-functions: https://gist.github.com/mindplay-dk/2e3bdeec957d6f51197b4425c5192519 It's just a tiny set of four simple functions - the documentation is almost more important than the actual implementations, which are pretty trivial, so I copy/pasted the relevant section of the documentation as well. Like I said, I am by no means suggesting you abandon your own approach, I was merely interested in sharing an alternative. Using these functions does require a little more thinking - I happen to think that's a good thing, but completely understand if that's not everyone's cup of tea :-) |
Decided to package it real quick with tests and all. Here. |
Is
JsEscaper
duplicating the behavior ofjson_encode()
?Isn't this code essentially equivalent to, say,
substr(json_encode($value), 1, -1)
?Is there any benefit to re-implementing this?
Also, is it necessary to escape multi-byte UTF-8 characters in the first place?
json_encode()
lets you disable unnecessary unicode escapes with theJSON_UNESCAPED_UNICODE
flag.For non-UTF-8 (e.g. ASCII) output, unicode escaping is required - but the likely use-case is outputting UTF-8 strings, right? So why escape everything as ASCII?
The text was updated successfully, but these errors were encountered: