-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scheduled report description double encoding error and provide way to do proper sanitization bit by bit. #7997
Conversation
$isValuePresent = isset($parametersRequest[$name]) && (is_array($parametersRequest[$name]) || strlen($parametersRequest[$name]) !== 0); | ||
|
||
$value = $isValuePresent ? $parametersRequest[$name] : $defaultValue; | ||
if ($value instanceof NoDefaultValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this logic + throwing exception could be refactored in a method used here and in Common::getRequestVar
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's just the first line that was copied from getRequestVar, but moving it makes a lot of sense. Would clear up part of getRequestVar, too. Already done locally ;)
Feedback
|
1bdae5c
to
309b7ce
Compare
@@ -464,13 +365,15 @@ public static function getRequestVar($varName, $varDefault = null, $varType = nu | |||
} | |||
} | |||
|
|||
// return isset($parametersRequest[$name]) && (is_array($parametersRequest[$name]) || strlen($parametersRequest[$name]) !== 0); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commented code
Would it make sense to have Regarding putting the "blacklist" in the DI config, it works, but it's a bit "disconnected" from the controllers (i.e. I work on a controller, it's not easy to see if request parameters are escaped or not, which is a little critical). What about using an annotation for that? e.g. class Controller
{
/**
* @Escape(false)
*/
public function saveReport($name, $description, ...)
...
} (could be The other good thing that comes with that is that the Sanitizer would be entirely ignorant of Request/HTTP stuff and be a simple standalone service. |
* @return string | ||
* @throws \Exception If the value is null or a boolean. | ||
*/ | ||
public function sanitizeValue($value, $alreadyStripslashed = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about simpler names, sanitize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make this change. Note: this class should be removed completely for 3.0, so the names shouldn't matter too much.
The point of this class (which may not be mentioned in the issue description or class docs, the PR is still a WIP) is to centralize the logic so we can remove it easily in 3.0. We can call it Sanitizer, it's not really important to me what the name is. I went w/ RequestSanitizer since it is only used to sanitize query parameters in an HTTP request.
It is disconnected, and it is supposed to be disconnected. Using this class & setup, we can change one part of Piwik at a time to be "store unsanitized, sanitize on output" by adding the info to DI (maintaining existing behavior in other places and thus preventing unwanted xss vulnerabilities). Then, when we change the default handling of parameters to be unsanitized in 3.0, we can just remove the class and DI entries. If we use annotations or something else coupled with the plugins, we'd have to modify every API method and remove the useless annotation. Also using DI means the code isn't a publicly available API, which is better in this case. Finally, a quick note, this only affects APIs, so not Controllers (who don't actually get called w/ request parameters). |
8b9f845
to
4d62a83
Compare
OK I see your point (it's all only temporary). FYI I am not very confident we will be able to fix this whole sanitization issue in 3.0 (and thus remove this class), I mean it's a huge-major thing with high risk of creating security issues (XSS) while refactoring… Somehow I think it won't be as fast as we would like to. That's why I was pushing with spending more time making this more "durable", but if you are confident then I don't want to slow you down :) |
The ultimate goals is to do no sanitization on query parameters in API requests. That's a HUGE breaking change, so If we can't get to that point by 3.0, it will probably never happen. To make the process faster, maybe I can create an automated XSS test. The UI tests do this already to some extent, but it's certainly not complete. And maybe I can come up w/ a set of patterns for JS coding to avoid XSS (for example, some code builds HTML directly w/ the result of AJAX queries, this is not necessary w/ angularjs). For an XSS test, we could potentially add I'm not sure how to test having untrusted input returned from Piwik (either because it is in the DB, or due to a man in the middle attack). We could fake XHR responses... but triggering specific code paths is impossible w/ the jquery soup much of the frontend consists of. |
4d62a83
to
7fbc56c
Compare
Note to merger: This PR could potentially introduce XSS, so it would be a good idea to do manual UI tests before merging. |
Unfortunately automated tests for XSS is not realistic nor practical for now... we will need to manually check every instance of variable in all templates, and html generators, and manually sanitise the output for the context (js, html attribute, etc.). it's tedious work but needs to be done. Don't think we can do it by 3.0.0 as it's already huge scope https://github.com/piwik/piwik/milestones/3.0.0 but we shall discuss (#4231 is tentatively in 3.0.0)
I don't think we should rely on the reviewer to catch XSS but rather the original code author. You don't write in PR that you checked instances of the use of variables and check they are sanitised... did you, or is this PR really risky in this regard? |
I checked where the description property was used and where exception messages were used. Perhaps I checked every possible place, perhaps I missed something. Perhaps someone could double check. |
Ok thanks for more details, good to know 👍 |
0883260
to
7de6acc
Compare
@diosmosis as you suspected, because this PR is quite complex, it didn't get merged in time. I guess it be a possibility to fix all sanitisations problems at once in #4231 |
return $value; | ||
/** @var RequestSanitizer $requestSanitizer */ | ||
$requestSanitizer = StaticContainer::get('Piwik\Http\RequestSanitizer'); | ||
return $requestSanitizer->sanitizeValues($value, $alreadyStripslashed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check this for performance stuff and profile it? Eg we sometimes call this during bulk request hundred thousands of times. I presume StaticContainer::get
should add too much overhead but just to be sure. Same for following. Maybe we could profile a bulk request with 600 tracking requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking 600 requests takes 10.5s for me on master and ~11.5s on this branch. Investigating the cause, though likely due to StaticContainer::get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can't really fix this now. It's due to StaticContainer, but I can't fix it w/o removing StaticContainer. Which I can't do w/o removing static functions like getRequestVar. It's not possible w/o a large commit.
I suggest, before we work on 3.0, we should create some xhprof profiles for specific parts of Piwik that we don't want performance regressions in (like tracking, archiving, all websites/multisites). We post the PHP benchmark code + xhprof profiles online somewhere. Then while working on 3.0, we can re-run the benchmarks periodically and make sure that we're not regressing, or at least how to eventually fix the regression, before the 3.0 release. What do you think?
If you agree, could you list the places we shouldn't regress? You've done most of the performance tuning, I think.
…tion logic in Common.php methods to this class.
…r() so we can specify through DI which API request parameters to not sanitize.
…rough API/Request in Fixture.php.
…ports.updateReport and ScheduledReports.addReport.
…eter was in the wrong place.
…ns in the DB. Includes version bump to core (to next beta).
…sages were escaped for XML/HTML regardless of the format, so JSON output would include escaped HTML. When alerted, or displayed on the screen, this results in double encoded messages. Fix involves formatting only for XML/HTML API renderers. More tests included.
…ed in Piwik_Popover.showError and old notification.js JavaScript methods.
7de6acc
to
dde2704
Compare
Realistically I think this won't get merged. Instead we shall try to fix the issue at once everywhere, which should make things clearer and easier in the mid term. |
Fixes the double encoding error in #7987 by storing scheduled reports descriptions unsanitized and sanitizing only on outputting to HTML.
Also includes the following changes which will allow us to fix #4231 one change at a time:
Common::getRequestVar()
and instead just accessed the value manually. Moved the sanitization logic for this to a method in RequestSanitizer.Eventually, after all the input sanitization is removed in favor of output sanitization, the RequestSanitizer class should be completely removed. Should be done for 3.0, since it will change the API.
Extra Changes
$sanitize
service.Sanitization Change Workflow
With this approach, you can change one part of Piwik to use the sanitize on output approach by doing the following:
Find the API method that creates/updates the entity in question. For the API method parameter that should be stored unsanitized, blacklist it through DI config. This will remove sanitization for that specific parameter.
To blacklist an API parameter, add the method and parameter name to the
'RequestSanitizer.requestParameterSanitizeBlacklist'
array in DI config (see ScheduledReports/config/config.php for an example).Make sure the property is stored w/o sanitization. Normally, you won't have to do anything in this regard.
Make sure the property is properly escaped when outputting HTML/XML.
Test all places where the property can be displayed in the UI for XSS. (including exception messages)
Once all parts of Piwik are changed, we can make the breaking change of not sanitizing API method parameters before calling the method in 3.0.
TODO:
Refs #7987 and #4231