-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize join for sharedWith by moving logic to DB #27436
Conversation
@mrow4a, thanks for your PR! By analyzing the history of the files in this pull request, we identified @PVince81, @DeepDiver1975 and @phisch to be potential reviewers. |
88ed99a
to
a20bdf9
Compare
a20bdf9
to
12b8b8b
Compare
lib/public/Share/IShareProvider.php
Outdated
@@ -136,6 +136,18 @@ public function getShareById($id, $recipientId = null); | |||
*/ | |||
public function getSharesByPath(Node $path); | |||
|
|||
|
|||
/** | |||
* Get shared with the given user for shares of all available types for this share provider within all nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean with "within" ? do you mean "for all given nodes including their subnodes" ? (and not strictly for the given nodes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm and $node
is singular, not plural but your PHPDoc talks about "all nodes".
Please clarify and adjust the parameter name if applicable.
@@ -68,8 +68,9 @@ public function __construct(IConfig $config, IManager $shareManager, ILogger $lo | |||
* @return \OCP\Files\Mount\IMountPoint[] | |||
*/ | |||
public function getMountsForUser(IUser $user, IStorageFactory $storageFactory) { | |||
$shares = $this->shareManager->getSharedWith($user->getUID(), \OCP\Share::SHARE_TYPE_USER, null, -1); | |||
$shares = array_merge($shares, $this->shareManager->getSharedWith($user->getUID(), \OCP\Share::SHARE_TYPE_GROUP, null, -1)); | |||
$requiredShareTypes = [\OCP\Share::SHARE_TYPE_USER, \OCP\Share::SHARE_TYPE_GROUP]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, indeed that makes sense
$qb = $this->dbConn->getQueryBuilder(); | ||
// Create array of sharedWith objects (target user -> $userId or group of which user is a member | ||
$user = $this->userManager->get($userId); | ||
$allGroups = $this->groupManager->getUserGroups($user, 'sharing'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if passing "sharing" here is correct.
@IljaN could mess up with the guest users ?
Some context: this function is called when retrieving the list of shares for which a SharedMount needs to be setup for the recipient. From my understanding passing "sharing" here risks excluding guest users and those users wouldn't see their received shares mounted.
Needs testing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah never mind, it's late... this was copied from the old code that already works https://github.com/owncloud/core/pull/27436/files#diff-83377463cf0d718950be360f7fcabd50L710
); | ||
|
||
$qb->andWhere($qb->expr()->in('share_with', $qb->createNamedParameter( | ||
$sharedWithUserOrGroupChunk, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will boom boom 💥 💥 if there are users and groups with the same names, say a group called "meow" and also a user called "meow" (yes, I've seen such setups before (well, setups with both user and group called the same, not setups with "meow"))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make sure to add a test for this, test first FTW 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, why, these are two separate shares of different types, isnt it? I will try it out though in unit test, good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they might be different types.
But say if there is a crazy scenario where user "user1" is in the group "meow".
With this query you'd also select the incoming shares for the user "meow" which is wrong.
The way I see the query like: and ((share_type = GROUP and share_with in ($userGroups) or (share_type = USER and share_with = $userId))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this will slow down the query, will check if this scenario really does not work with the query I implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, boom booms, will write unit test protecting this case and apply a bit slower query if there are any groups
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if the user is not member of any groups you could just discard the group filter bit completely.
Apart from the "meow" issue (see comments), the code looks good overall. |
$user = $this->userManager->get($userId); | ||
$allGroups = $this->groupManager->getUserGroups($user, 'sharing'); | ||
|
||
$sharedWithUserOrGroup = array_map(function(IGroup $group) { return $group->getGID(); }, $allGroups); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better rewrite this:
$sharedWithUserOrGroup = array_map(function(IGroup $group) {
return $group->getGID();
}, $allGroups);
->leftJoin('s', 'filecache', 'f', $qb->expr()->eq('s.file_source', 'f.fileid')) | ||
->leftJoin('f', 'storages', 'st', $qb->expr()->eq('f.storage', 'st.numeric_id')); | ||
|
||
$qb->where($qb->expr()->in( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to not keep the chained calls? I mean:
$qb->select(....)
->selectAlias(....)
->from(...)
->leftJoin(...)
->leftJoin(...)
->where(....)
->andWhere(...)
unless there is a conditional clause that needs to be inserted in the middle, I think it's better this way. If the condition can be moved to the end without harming the performance, I'd move it. If the conditional clause have to be inserted in the middle, then there is no other choice.
/** | ||
* @inheritdoc | ||
*/ | ||
public function getAllSharedWith($userId, $shareTypes, $node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment about why $sharetypes
is being ignored here is needed. I pass some share types and not only it ignores them, but also uses another type. My first thought is "this is wrong"
lib/public/Share/IManager.php
Outdated
* Filter by $node if provided | ||
* | ||
* @param string $userId | ||
* @param int[] $shareTypes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might need a better explanation... If some constants are expected, either list them or point the user to where the constants are.
da11834
to
20bc46b
Compare
* @inheritdoc | ||
*/ | ||
public function getAllSharedWith($userId, $node) { | ||
$this->getSharedWith($userId, self::SHARE_TYPE_REMOTE, $node, -1, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing return? there is also no unit test on this I assume
lib/private/Share20/Manager.php
Outdated
* @inheritdoc | ||
*/ | ||
public function getAllSharedWith($userId, $shareTypes, $node = null) { | ||
$shares = array(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[]
lib/public/Share/IManager.php
Outdated
@@ -90,7 +90,7 @@ public function moveShare(IShare $share, $recipientId); | |||
* Get all shares shared by (initiated) by the provided user for specific node IDs. | |||
* | |||
* @param string $userId | |||
* @param int[] $shareTypes | |||
* @param \OC\Share\Constants[] $shareTypes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OC namespace in an OCP interface? NO
lib/public/Share/IManager.php
Outdated
* | ||
* @param string $userId | ||
* @param \OC\Share\Constants[] $shareTypes | ||
* @param Node|null $node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add this to the argument declaration
lib/public/Share/IShareProvider.php
Outdated
@@ -94,7 +94,7 @@ public function move(\OCP\Share\IShare $share, $recipient); | |||
* Get all shares by the given user | |||
* | |||
* @param string $userId | |||
* @param int $shareType | |||
* @param \OC\Share\Constants $shareType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👎 you cannot change the definition of a public api
tests/lib/Share20/ManagerTest.php
Outdated
} | ||
|
||
public function testGetAllSharedWith() { | ||
$user = $this->createMock('\OCP\IUser'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IUser::class
conflicts and comments to be resolved |
20bc46b
to
c6d3ed8
Compare
Ready for review if tests are passing. |
IQueryBuilder::PARAM_STR_ARRAY | ||
)) | ||
), | ||
$qb->expr()->andX( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could have stored these in variables $usersCondition
and $groupCondition
to avoid duplicating the code here and in the block below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed queries a little and not applicable now
c6d3ed8
to
d92c4eb
Compare
$groupShares = []; | ||
|
||
// Check if user is member of some groups and chunk them | ||
$sharedWithGroup = array_map(function(IGroup $group) { return $group->getGID(); }, $allGroups); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjust code style here:
$sharedWithGroup = array_map(function(IGroup $group) {
return $group->getGID();
}, $allGroups);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I adjusted to the one which is in the "original function getSharedWith" :> Question means which is "code style" :D
|
||
for ($chunkNo = 0; $chunkNo < $chunkNoRequired; $chunkNo++) { | ||
// Query for user/group shares | ||
$qb = $this->dbConn->getQueryBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move all the DB logic to another function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
lib/private/Share20/Manager.php
Outdated
if (!isset($providerIdMap[$providerId])) { | ||
$providerIdMap[$providerId] = array(); | ||
} | ||
array_push($providerIdMap[$providerId], $shareType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think unless you're pushing several items, $providerIdMap[$providerId][] = $shareType
might perform better, and also is consistent with the rest of the code.
d92c4eb
to
3b93a61
Compare
83e4c2b
to
3058fd4
Compare
* @param Node|null $node | ||
* @return DB\QueryBuilder\IQueryBuilder $qb | ||
*/ | ||
public function getSharedWithUserQuery($userId, $node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created separate function to create query for user shares
* @param Node|null $node | ||
* @return DB\QueryBuilder\IQueryBuilder $qb | ||
*/ | ||
public function getSharedWithGroupQuery($groups, $node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created separate function to create query for group shares
* @param Node|null $node | ||
* @return DB\QueryBuilder\IQueryBuilder $qb | ||
*/ | ||
public function getSharedWithUserGroupQuery($groups, $userId, $node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created separate function to create query for both group and user shares
$groups = $sharedWithGroupChunks[$chunkNo]; | ||
if ($chunkNo === 0) { | ||
// First chunk, so we need to obtain user | ||
$qb = $this->getSharedWithUserGroupQuery($groups, $userId, $node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this. Why the first chunk always calls this one function but other chunks the other one ? What happens if there are more than 100 in a chunk ?
What magic guarantees that the first chunk is only about users ?
Either change the code to be more readable or adjust PHPDoc to explain why and avoid future confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or don't cheat and use two arrays, one for users and one for groups instead of putting both in one and using the first chunk with a special meaning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is that if there are both user sharess and group shares, dont do 2 queries, do one as above.
In case of chunks, you only to get user shares only once, you dont need to do that for each chunk, this is why with each next chunk you dont fetch user, but only groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these two scenarios this covered by unit tests / integration tests ? if yes then I'm fine...
75fb6d0
to
8000bab
Compare
8000bab
to
9279ed3
Compare
$cursor->closeCursor(); | ||
} else { | ||
// There are groups, query both for user and for groups | ||
$userSharesRetrieved = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PVince81 is it clearer now?
@jvillafanez @PVince81 @DeepDiver1975 more ready it wont be :> |
I have nothing against 👍 |
Thanks! Please backport to stable10. |
Never backported in my life, so how to do that properly? I need to patch stable10 with this commit? Cherry-pick ? |
@mrow4a branch off stable10, cherry-pick, retest, then send a PR with target stable10 |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
In best case it gets rid of doubled queries
$shares, $this->shareManager->getSharedWith
for both GROUP and USER shares, as in the PR body. Currently, if there are both user and group shares, we need to do 2 queries. With this PR, if we have both group and user shares, we need to do 1 query. If we have only user shares, old and new implementation will do 1 query per user.Anyways, it is big win, since this query does... 2 Left Joins! These are very expensive operations in DBMS for enterprise scale tables.