Server side checksum verification #26655 #27097

Merged
merged 31 commits into from
Mar 10, 2017

Conversation

IljaN
Contributor

@IljaN IljaN commented Feb 7, 2017

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@mention-bot

@IljaN, thanks for your PR! By analyzing the history of the files in this pull request, we identified @DeepDiver1975, @PVince81 and @butonic to be potential reviewers.

Contributor

@PVince81 PVince81 left a comment

Great progress.

Some additional things came to mind while reviewing.

*/
public function fopen($path, $mode) {
$stream = $this->getWrapperStorage()->fopen($path, $mode);
if (!self::requiresChecksum($path)) {
Contributor

I think you also don't need to compute the checksum if there is already a checksum value in oc_filecache and $mode is a read operation. Could save a bit of computing power on download.

$memoryStream = fopen('php://memory', 'r+');
$checksumStream = \OC\Files\Stream\Checksum::wrap($memoryStream, $path);

fwrite($checksumStream, $data);
Contributor

Does that really work? Does it write all of the data?
This needs to be tested with some external storages. I suspect there are cases where fwrite would write only 8k blocks or so, but I might be wrong.

Contributor Author

According to the fwrite docs, it will write the whole string. You can limit it with an optional length parameter. Which external storages should I test?

Contributor

Best would be to test with

  • SFTP
  • SMB
  • one of GDrive or Dropbox

In the past we had issues with GDrive because the fread from GDrive didn't always return 8k per read and it messed up the encryption wrapper which was expecting 8k blocks...

Contributor Author

@IljaN IljaN Feb 8, 2017

According to the docblocks, file_put_contents of the Wrapper interface (unlike the native version) only expects strings, so this shouldn't be a problem because we will always get a full string, right?

Contributor

Good
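
For readers following the partial-write concern above: fwrite() returns the number of bytes it actually wrote, and on some stream types that can be fewer than requested. A minimal defensive sketch, assuming a hypothetical helper named writeAll() that is not part of this PR, could loop until everything is written:

```php
<?php
/**
 * Write $data to $stream completely, retrying after short writes.
 * writeAll() is a hypothetical helper for illustration, not code from this PR.
 *
 * @param resource $stream
 * @param string $data
 * @return bool true once every byte has been written, false on failure
 */
function writeAll($stream, $data) {
	$total = strlen($data);
	$written = 0;
	while ($written < $total) {
		// fwrite() returns the number of bytes written, or false on error
		$count = fwrite($stream, substr($data, $written));
		if ($count === false || $count === 0) {
			return false;
		}
		$written += $count;
	}
	return true;
}

// usage with an in-memory stream, as in the snippet under review
$memoryStream = fopen('php://memory', 'r+');
$ok = writeAll($memoryStream, str_repeat('x', 20000));
rewind($memoryStream);
$roundTrip = stream_get_contents($memoryStream);
```

With php://memory a single fwrite() is normally enough, so the loop only matters if the same code path ever writes to a network-backed stream.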

return null;
}

return self::$checksums[$path];
Contributor

Now thinking of it, something should probably clear all the checksum values from that static array.

In the case of single PHP requests this isn't a problem. But if someone runs an occ command that reads a million files (maybe something like transfer ownership or decrypt-all), it could happen that old files still have their checksum stored in there.

One way to compensate for this is to use our CappedMemoryCache class which will auto-remove entries when full.
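
To illustrate the idea of a size-capped cache, here is a small sketch. BoundedChecksumCache is a hypothetical stand-in written for this example; it is not ownCloud's CappedMemoryCache, and the eviction policy (drop the oldest entry) is an assumption.

```php
<?php
/**
 * Tiny size-capped cache: once $capacity entries are stored, the oldest
 * entry is dropped to make room. BoundedChecksumCache is a hypothetical
 * stand-in for illustration; it is not ownCloud's CappedMemoryCache.
 */
class BoundedChecksumCache {
	/** @var int */
	private $capacity;
	/** @var array path => checksum, in insertion order */
	private $entries = [];

	public function __construct($capacity = 512) {
		$this->capacity = $capacity;
	}

	public function set($path, $checksum) {
		// re-inserting moves the path to the newest position
		unset($this->entries[$path]);
		$this->entries[$path] = $checksum;
		while (count($this->entries) > $this->capacity) {
			// PHP arrays keep insertion order, so the first key is the oldest
			reset($this->entries);
			unset($this->entries[key($this->entries)]);
		}
	}

	public function get($path) {
		return isset($this->entries[$path]) ? $this->entries[$path] : null;
	}
}

$cache = new BoundedChecksumCache(2);
$cache->set('files/a.txt', 'SHA1:aaa');
$cache->set('files/b.txt', 'SHA1:bbb');
$cache->set('files/c.txt', 'SHA1:ccc'); // evicts files/a.txt
```

With a cap like this, a long-running occ command that touches a million files never holds more than $capacity checksums in memory.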

@@ -175,6 +175,12 @@ public static function setupFS($user = '') {
return $storage;
});

// install storage checksum wrapper
\OC\Files\Filesystem::addStorageWrapper('oc_checksum', function ($mountPoint, $storage) {
return new \OC\Files\Storage\Wrapper\Checksum(['storage' => $storage]);
Contributor

I think we shouldn't install the checksum wrapper if $storage is a shared storage. See below for a check to exclude it.

The reason is that the SharedStorage for a share recipient is itself just a wrapper around the original Storage of the owner, which will already have a checksum storage applied.

Contributor Author

Is a !$storage->isLocal() check also required? (like in the other wrapper registrations)

Contributor

No. The isLocal was specific to the encoding wrapper which only makes sense for external storages.

In the case of checksums, we want it both on local and external storages.

@PVince81 PVince81 added this to the 10.0 milestone Feb 14, 2017
Contributor

@PVince81 PVince81 left a comment

Great stuff!

$expectedChecksum = trim($request->server['HTTP_OC_CHECKSUM']);
$computedChecksums = $meta['checksum'];

return strpos($computedChecksums, $expectedChecksum) !== false;
Contributor

Why not use direct equality?
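
If the stored value holds more than one checksum, direct equality against the whole string would fail, which may be why a substring test was used. A stricter middle ground is to compare against whole entries. The sketch below assumes (the thread does not confirm this) that the stored value is a space-separated list of "ALGO:hash" entries; hasExactChecksum() is a hypothetical name.

```php
<?php
/**
 * Return true only if $expected equals one complete "ALGO:hash" entry.
 * Assumes (not confirmed by the thread) that the stored value is a
 * space-separated list such as "SHA1:... MD5:... ADLER32:...".
 * hasExactChecksum() is a hypothetical name.
 */
function hasExactChecksum($stored, $expected) {
	$entries = preg_split('/\s+/', trim($stored), -1, PREG_SPLIT_NO_EMPTY);
	return in_array(trim($expected), $entries, true);
}

$stored = 'SHA1:f005ba11 MD5:0123abcd';
$loose = strpos($stored, 'SHA1:f005') !== false;   // substring test accepts a truncated hash
$strict = hasExactChecksum($stored, 'SHA1:f005');  // whole-entry test rejects it
$exact = hasExactChecksum($stored, 'SHA1:f005ba11');
```

The difference only shows up for truncated or partial values in the header, but those are exactly the inputs a verification step should reject.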

}
$this->fileView->putFileInfo(
$this->path,
['checksum' => $partStorage->getMetaData($internalPartPath)['checksum']]
Contributor

Is there a mechanism to disable checksums ?

Maybe better be safe and make sure getMetaData actually does contain a checksum before putting it ?


foreach ($checksums as $checksum) {
// starts with $algo
if (substr($checksum, 0, strlen($algo)) === $algo) {
Contributor

I thought we had another separator on which we could explode here ?

What if the checksum had both algos "ABC" and "ABC1" ?
When querying "ABC" it might match "ABC1" too and deliver a piece of the wrong checksum.

Contributor Author

I think exploding on ":" again would make the code harder to follow. I will just compare up to the ":" to be safe.
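
A sketch of the "compare up to the separator" approach, for illustration only. findChecksum() is a hypothetical helper, and the uppercase normalization is an assumption based on the protocol using uppercase algorithm names.

```php
<?php
/**
 * Pick the checksum for $algo out of a list of "ALGO:hash" strings.
 * Comparing against "$algo:" (separator included) keeps "ABC" from
 * matching an "ABC1:..." entry. findChecksum() is a hypothetical helper.
 */
function findChecksum(array $checksums, $algo) {
	$prefix = strtoupper($algo) . ':';
	foreach ($checksums as $checksum) {
		// compare the leading characters up to and including the ":"
		if (strncmp($checksum, $prefix, strlen($prefix)) === 0) {
			return $checksum;
		}
	}
	return null;
}

$checksums = ['ABC1:1111', 'ABC:2222'];
// the bare prefix test from the snippet under review also hits "ABC1:1111"
$bare = substr($checksums[0], 0, strlen('ABC')) === 'ABC';
$found = findChecksum($checksums, 'abc');
$missing = findChecksum($checksums, 'XYZ');
```

Including the separator in the prefix avoids a second explode while still ruling out the "ABC" versus "ABC1" collision.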

@@ -335,7 +335,7 @@ public function handleGetProperties(PropFind $propFind, \Sabre\DAV\INode $node)
});

$propFind->handle(self::CHECKSUMS_PROPERTYNAME, function() use ($node) {
-$checksum = $node->getChecksum();
+$checksum = $node->getChecksum('SHA1');
Contributor

"sha1" up there and "SHA1" here ? maybe decide on one casing.

By the way: are checksum algo names automatically capitalized ? If not then you likely have a bug here.

Contributor Author

@PVince81 PHP uses lowercase for algo everywhere but our protocol uses uppercase, the getChecksum method internally uppercases every input anyway to do the internal comparisons.

@@ -268,6 +226,52 @@ public function testExpireOldFilesShared() {
}

/**
* test expiration of files older then the max storage time defined for the trash
*/
public function testExpireOldFiles() {
Contributor

What a weird coincidence. One of my tests was affected by this one in another location.

The reason was that a test disabled the trashbin permanently which made other tests fail. Ref: #27042 (comment)

Anyway, I'm fine leaving this as is to not waste more time with old cruddy tests.

Given user "user0" exists
And file "/chksumtst.txt" does not exist for user "user0"
And user "user0" uploads file with checksum "SHA1:f005ba11" and content "Some Text" to "/chksumtst.txt"
Then the HTTP status code should be "400"
Contributor

Also need such tests with chunking.

Pick one of the chunking tests and add the header.

* @return bool
*/
private static function requiresChecksum($path, $mode) {
return substr($path, 0, 6) === 'files/' && $mode !== 'r';
Contributor

&& $mode !== 'rb' (if any app is still using that...)
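
One way to cover both read modes without chaining comparisons is a small list of read-only modes. This is a standalone sketch, not the code that was merged; the 'files/' prefix check is taken from the snippet under review.

```php
<?php
/**
 * Decide whether a stream opened on $path with $mode needs hashing.
 * Treats both 'r' and 'rb' as pure reads. The 'files/' prefix check is
 * taken from the snippet under review; this standalone function is a sketch.
 */
function requiresChecksum($path, $mode) {
	$readOnlyModes = ['r', 'rb'];
	return substr($path, 0, 6) === 'files/' && !in_array($mode, $readOnlyModes, true);
}

$onWrite = requiresChecksum('files/a.txt', 'w');
$onRead = requiresChecksum('files/a.txt', 'r');
$onBinaryRead = requiresChecksum('files/a.txt', 'rb');
$outsideFiles = requiresChecksum('cache/a.txt', 'w');
```

Modes such as 'r+' still count as writes here, since they allow the content to change.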

$memoryStream = fopen('php://memory', 'r+');
$checksumStream = \OC\Files\Stream\Checksum::wrap($memoryStream, $path);

fwrite($checksumStream, $data);
Contributor

Good

private static $checksums;


public function __construct(array $algos = ['sha1', 'md5', 'adler32']) {
Contributor

Weird, this shows up red. Does PHP 5.6 support this syntax ?

Contributor Author

Yes it does: https://3v4l.org/cJnUO, apparently it is a github bug

class Checksum extends Wrapper {

/** @var resource[] */
private $hashingContexts;
Contributor

Add PHPDoc, what's a hashing context here ? (I know, but explain to others)
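
For readers unfamiliar with the term: a hashing context is the handle returned by PHP's hash_init(). It carries the running state of one hash, so data can be fed in with hash_update() in pieces of any size and finished with hash_final(). The sketch below is illustrative only; hashInChunks() is a hypothetical helper, and the default algorithm list is copied from the constructor shown above. Note that since PHP 7.2 hash_init() returns a HashContext object rather than a resource.

```php
<?php
/**
 * Hash $data piece by piece with one hashing context per algorithm.
 * A context (from hash_init) carries the running state of a hash, so data
 * can be fed in chunks of any size. hashInChunks() is a hypothetical helper.
 */
function hashInChunks($data, $chunkSize, array $algos = ['sha1', 'md5', 'adler32']) {
	$contexts = [];
	foreach ($algos as $algo) {
		$contexts[$algo] = hash_init($algo);
	}
	foreach (str_split($data, $chunkSize) as $chunk) {
		// every chunk is fed to every context exactly once
		foreach ($contexts as $context) {
			hash_update($context, $chunk);
		}
	}
	$result = [];
	foreach ($contexts as $algo => $context) {
		$result[$algo] = hash_final($context);
	}
	return $result;
}

$data = str_repeat('Some Text', 3000);
$small = hashInChunks($data, 100);
$large = hashInChunks($data, 8192);
```

Because the context accumulates state, the result does not depend on the chunk size, which is why a stream wrapper can hash whatever block sizes the underlying storage happens to deliver.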

@PVince81
Contributor

We also need a test for the checksum from external storage. The integration tests have a local storage, but it contains no default data. We might need to change the setup a bit; can you help here @SergioBertolinSG ? Either have an example file already on local_storage at start, or better: add a scenario method that puts a file directly there, bypassing OC. That will be useful for future update detection tests.

@SergioBertolinSG
Contributor

@@ -112,10 +110,9 @@ private function getChecksumRequirement($path, $mode) {
*/
public function onClose() {
$cache = $this->getCache();
-foreach ($this->pathsInCacheWithoutChecksum as $path) {
-$entry = $cache->get($path);
+foreach ($this->pathsInCacheWithoutChecksum as $cacheId => $path) {
Contributor Author

@PVince81 I removed all checks here because that saves a query when getting the cache entry. If we have a race condition, the worst thing that will happen is an update which affects 0 rows.

@PVince81
Contributor

PVince81 commented Feb 23, 2017

@SergioBertolinSG no, because using the Webdav upload will already write a checksum into the table.
What we want is a way to have a file stored/created/existing on the external storage before ownCloud had a chance to scan it to simulate remote changes on external storage (remote as in "not through ownCloud"). Such files would not have a checksum value yet and this PR would compute one during download.

@PVince81
Contributor

PVince81 commented Mar 1, 2017

@IljaN please rebase then use #27227 (comment) to create a file directly on the local_storage, that file will not have a checksum since we bypassed ownCloud.
Then download it. Then check that the second download has the checksum (or use propfind to read it).

@SergioBertolinSG SergioBertolinSG force-pushed the 26655_ComputeChecksum2 branch from e4bb97d to 7734f0b Compare March 2, 2017 13:27
@PVince81
Contributor

PVince81 commented Mar 3, 2017

Bad luck or real error ? (sqlite)

16:05:36 1) Test\Files\ViewTest::testLockBasicOperation with data set #4 ('fopen', array('test.txt', 'r'), 'test.txt', 'read', 1, 1, null)
16:05:36 Failed asserting that 1 matches expected null.

@SergioBertolinSG SergioBertolinSG force-pushed the 26655_ComputeChecksum2 branch from eb22802 to b5617a3 Compare March 6, 2017 14:01
@SergioBertolinSG
Contributor

@PVince81 that one seems to be a real error.

@PVince81
Contributor

PVince81 commented Mar 7, 2017

One last rebase to get past the JS bug ?

Then this is good for merge 👍
Great work, guys!

@SergioBertolinSG SergioBertolinSG force-pushed the 26655_ComputeChecksum2 branch from e2c80e9 to ff6b908 Compare March 7, 2017 15:30
@IljaN
Contributor Author

IljaN commented Mar 7, 2017

@PVince81 I think I need to look at this lock_error. It seems to happen on every Jenkins run, so it is probably not bad luck...

@PVince81
Contributor

PVince81 commented Mar 9, 2017

16:33:33 1) Test\Files\ViewTest::testLockBasicOperation with data set #4 ('fopen', array('test.txt', 'r'), 'test.txt', 'read', 1, 1, null)
16:33:33 Failed asserting that 1 matches expected null.
16:33:33 
16:33:33 /var/lib/jenkins/workspace/owncloud-core_core_PR-27097-3J4F6P6NQZBTIIMSRG746A75T7KO5JXZVBVLSAJJMR5VJCNJZFKQ/tests/lib/Files/ViewTest.php:1822

@PVince81
Contributor

PVince81 commented Mar 9, 2017

Seems you have a stray lock after fopen which is not supposed to be there.

@PVince81
Contributor

PVince81 commented Mar 9, 2017

Disabling all tests except the fopen one solves the problem.

So one of the 3 data provider cases before it is causing a side effect.

@PVince81
Contributor

PVince81 commented Mar 9, 2017

Nope. I disabled the wrong fopen one. Having only dataset 4 enabled fails too. At least no cross-test side effects.

@PVince81
Contributor

PVince81 commented Mar 9, 2017

Removing the checksum stream wrapper solves the issue.

Diving deeper...

* @return false|resource
*/
public function fopen($path, $mode) {
$stream = $this->getWrapperStorage()->fopen($path, $mode);
Contributor

Please check whether $stream is a resource; if it is not, return it as is.
Add a unit test for the case where fopen failed. Maybe simply call $view->fopen() on a non-existing file and check the result; that might be enough to cover all possible registered wrappers already.

@PVince81
Contributor

PVince81 commented Mar 9, 2017

Fixed locking issue, it was a hidden testing issue that got revealed by checksum wrapper bug.

Fixes here: fdd2478 9c4a065

@IljaN please review/approve my two commits

@@ -55,6 +55,11 @@ class Checksum extends Wrapper {
*/
public function fopen($path, $mode) {
$stream = $this->getWrapperStorage()->fopen($path, $mode);
if (!is_resource($stream)) {
// don't wrap on error
return $stream;
Contributor Author

@IljaN IljaN Mar 9, 2017

Maybe we should fail more loudly here (Exception?) Or else we might have silent bugs, what do you think?

The rest looks good to me.

Contributor

No... silent bugs are the "norm" so far for these APIs. If we throw exceptions, it risks breaking other things. One day we'll replace all these PHP-like APIs with proper APIs that throw exceptions: #13792
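
The pass-through behaviour discussed here can be shown in isolation. This is a sketch only: wrapIfOpen() and the $wrap callback are hypothetical and stand in for the checksum stream wrapping done in the PR.

```php
<?php
/**
 * Apply $wrap to a stream only when the open succeeded.
 * fopen() returns false on failure, and false is passed through unchanged
 * so the caller sees the usual PHP-style error value.
 * wrapIfOpen() and the $wrap callback are hypothetical.
 */
function wrapIfOpen($stream, callable $wrap) {
	if (!is_resource($stream)) {
		// don't wrap on error
		return $stream;
	}
	return $wrap($stream);
}

// a path that should not exist; @ suppresses the expected warning
$missingPath = sys_get_temp_dir() . '/no-such-dir-checksum-demo/no-such-file.txt';
$missing = @fopen($missingPath, 'r');
$resultOnError = wrapIfOpen($missing, function ($s) { return $s; });

$opened = fopen('php://memory', 'r+');
$resultOnSuccess = wrapIfOpen($opened, function ($s) { return $s; });
```

Callers that already check for false keep working, which matches the point above about not changing the error contract of these APIs.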

@IljaN
Contributor Author

IljaN commented Mar 9, 2017

The checksum behat tests all pass locally :-/

@PVince81 PVince81 force-pushed the 26655_ComputeChecksum2 branch from 9c4a065 to eeb8f07 Compare March 9, 2017 19:03
@PVince81
Contributor

PVince81 commented Mar 9, 2017

Rebased onto master. Bring a bag full of luck !

@PVince81
Contributor

@IljaN @SergioBertolinSG legit failures:

22:05:29     /var/lib/jenkins/workspace/owncloud-core_core_PR-27097-3J4F6P6NQZBTIIMSRG746A75T7KO5JXZVBVLSAJJMR5VJCNJZFKQ/tests/integration/features/webdav-related.feature:410
22:05:29     /var/lib/jenkins/workspace/owncloud-core_core_PR-27097-3J4F6P6NQZBTIIMSRG746A75T7KO5JXZVBVLSAJJMR5VJCNJZFKQ/tests/integration/features/webdav-related.feature:420

@IljaN IljaN self-assigned this Mar 10, 2017
@PVince81 PVince81 merged commit c9f5fce into master Mar 10, 2017
@PVince81 PVince81 deleted the 26655_ComputeChecksum2 branch March 10, 2017 11:34
@lock

lock bot commented Aug 3, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 3, 2019