-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new regex config parameter version to blacklisted files and excluded directories #36360
Add new regex config parameter version to blacklisted files and excluded directories #36360
Conversation
cbbe5bc
to
b8af495
Compare
Codecov Report
@@ Coverage Diff @@
## master #36360 +/- ##
============================================
+ Coverage 64.68% 64.68% +<.01%
- Complexity 19106 19122 +16
============================================
Files 1269 1269
Lines 74728 74778 +50
Branches 1320 1320
============================================
+ Hits 48337 48372 +35
- Misses 26003 26018 +15
Partials 388 388
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An error is located in each preg_match call, regex must be surrounded by \
. Plus, as regex match is case insensitive an i
must be added to the end of the string.
In lib/private/Files/Filesystem.php:
It might be: \preg_match('/'.$item.'/i', $path_part)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You add a comment for regex variables :
- INFO: Compare pairs are converted to lowercase by default.
I'm ok for regular variables, for regex the comment might be adapted as things are not simple as a lowercase string.
Maybe :
- INFO: regex matching is forced case insensitive.
b8af495
to
fc7d14a
Compare
@tdaget thanks for the comments, both fixed |
fc7d14a
to
c5fccd8
Compare
@phil-davis can you do me a favour and take a look why CI fails. |
The unit tests fail:
Run them yourself locally: |
61ea873
to
d93880f
Compare
Tests fixed (locally tested), rebased and pushed, CI is green 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nits about spelling etc.
Somebody in product management needs to decide if this feature should be added.
d93880f
to
8c69cca
Compare
@micbar @pmaier1 adding you according the suggestion of @phil-davis |
c94d216
to
27b2d91
Compare
@micbar I have completed acceptance tests for this. The new regex features work. Somebody could give a technical review of the source code changes. @individual-it can do a review of the acceptance tests. If this is approved, then we could also cherry-pick this back into release-10.4 branch so it can get released. In making tests for the existing blacklist and excluded directories features, I discovered #36645 - in "old" "v1" chunking uploads the blocked uploads return a 507 (rather than 403). The uploads are blocked - so that existing issue is not so important. Similar behaviour happens in the new regex features. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests look good
we could move them all over to a separate feature file though
I created issue #36654 for the reorganisation of feature files, because that impacts more than just the scenarios in this PR |
475020f
to
d4ac6e3
Compare
Rebased and resolved conflict |
d4ac6e3
to
001c231
Compare
upload tests for blacklisted_files_regex and excluded_directories_regex rename tests for blacklisted_files_regex and excluded_directories_regex move async tests for blacklisted_files_regex and excluded_directories_regex move folder tests for blacklisted_files_regex and excluded_directories_regex webUI tests for blacklisted_files_regex and excluded_directories_regex uploadFileAsyncUsingNewChunking tests for blacklisted_files_regex and excluded_directories_regex uploadFileUsingNew-and-OldChunking tests for blacklisted_files_regex and excluded_directories_regex
21717a0
to
339c2df
Compare
* @return string | ||
* @see https://www.php.net/manual/function.preg-last-error.php | ||
*/ | ||
private static function preg_error_text() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using camel case in method names. I do not think we need an exception in here.
} | ||
@\preg_match($newItem, null); | ||
if (self::regexValidityCheck($item, $callerMessage)) { | ||
foreach ($path as $path_part) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use camel case for variable names.
// important, because if you define an empty config value, '.htaccess' will not be added... | ||
// prevents misuse with a fake config value | ||
$blacklist_files[] = '.htaccess'; | ||
$blacklist_files = \array_unique($blacklist_files); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To prevent code duplication, we can move these assignments to a private function. But, I am not sure is it worth. It is up to writer.
$match = \array_intersect($path_parts, $excluded); | ||
if ($match) { | ||
|
||
// force that the last element (possibly the filename) is an array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is confusing, the last element of $path_parts
is string, not an array. As far as understand, what is meant here, the last element of $path_parts
should be holded on an array to easy comparison or I am misunderstanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rewritten the comment a bit. PR incoming.
* | ||
* WARNING: USE THIS ONLY IF YOU KNOW WHAT YOU ARE DOING. | ||
*/ | ||
'excluded_directories_regex' => [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not this only apply to directories? As far as I see, we are also applying it to files.
// check if the first and last character is a '/' and add one if not | ||
// terminate with i == case insensitive | ||
$newItem = $item; | ||
if (\substr($item, 0, 1) !== '/') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we use $item[0]
instead of calling substr
. Guess accessing index directly, when we know is much faster?
if (\substr($item, 0, 1) !== '/') { | ||
$newItem = '/' . $item; | ||
} | ||
if (\substr($item, -1) !== '/') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here? How about using $item[-1]
} else { | ||
$newItem = $newItem . 'i'; | ||
} | ||
@\preg_match($newItem, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we ar suppressing error using @
. Any reason to add it here?
$exclude_folders = $ed; | ||
$exclude_folders_regex = $ed; | ||
$blacklist_files = $ed; | ||
$blacklist_files_regex = $ed; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about writing it in a line say:
$exclude_folders = $exclude_folders_regex = $blacklist_files = blacklist_files_regex = $ed
. Or any other suggestion?
} | ||
|
||
// second, check if the file is blacklisted | ||
if (isset($blacklist_files)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a bad idea to set blacklist_files
to null
in the beginning and then try to check if its non null here?
return true; | ||
} | ||
} | ||
if (isset($blacklist_files_regex)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Couldn't we assign blacklist_files_regex
to null in the beginning of the method?
$blacklist_array[] = \end($path_parts); | ||
|
||
// first, check if the folder is excluded | ||
if (isset($exclude_folders)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to set the exclude_folders
to null in the beginning of the method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm - I am not even sure why this test is here - there is already code above that always sets $exclude_folders
- so the code inside this if
will always be run.
I will remove the unneeded if in all 4 places.
$match = \array_intersect($path_parts, $blacklist); | ||
if ($match) { | ||
return true; | ||
if (isset($exclude_folders_regex)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to set the exclude_folders_regex
to null in the beginning of the method?
This PR is based on the work of @tdaget at PR #36331 (closed)
The referenced PR is now closed because of rebasing issues and CI will never get green there.
I improved the original work a bit.
Description
This PR enhances the existing config parameters
blacklisted_files
andexcluded_directories
by a new regex parameter versionblacklisted_files_regex
andexcluded_directories_regex
.Related Issue
Motivation and Context
Sometimes it is necessary to exclude/blacklist files or directories to be written/renamed/scanned on a more complex parameter than a static version as we have right now. This PR allows eg. to prevent uploading Microsoft Outlook .pst files or creating / renaming / scanning directories based on a pattern defined by a regex. The existing parameters are not touched.
How Has This Been Tested?
Running local with various scenarios successfully.
Screenshots (if appropriate):
Types of changes
Checklist: