Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW Allow manipulating eagerloading queries #11140

Merged

Conversation

GuySartorelli
Copy link
Member

@GuySartorelli GuySartorelli commented Feb 14, 2024

Description

Adds a backwards-compatible way to pre-filter, pre-sort, and apply basically any other type of transformation you want to eagerloaded data.

Note that I have opted to keep the existing public API, since the alternative suggested in the issue would not be usable in templates. I tried consulting with other devs on slack per the acceptance criteria but there's really not anyone who's familiar with eagerloading yet.

Applying a limit in the callback is explicitly disallowed because it wouldn't be applied per relation list, it'd be applied to the whole eager query for reach relation.

Manual testing steps

There's benchmarking code in the linked issue.

Issues

Pull request checklist

  • The target branch is correct
  • All commits are relevant to the purpose of the PR (e.g. no debug statements, unrelated refactoring, or arbitrary linting)
    • Small amounts of additional linting are usually okay, but if it makes it hard to concentrate on the relevant changes, ask for the unrelated changes to be reverted, and submitted as a separate PR.
  • The commit messages follow our commit message guidelines
  • The PR follows our contribution guidelines
  • Code changes follow our coding conventions
  • This change is covered with tests (or tests aren't necessary for this change)
  • Any relevant User Help/Developer documentation is updated; for impactful changes, information is added to the changelog for the intended release
  • CI is green

src/ORM/DataList.php Outdated Show resolved Hide resolved
Comment on lines -1483 to +1563
$list->eagerLoadAllRelations[$item] = $item;
$list->eagerLoadAllRelations[$item] ??= null;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does two things:

  1. Default the value to null, so we can do a simply nullable check to see if there was a callback set for that chain or not
  2. Doesn't override existing values - if a callback was explicitly set, we don't want to remove it here.

Comment on lines +1550 to +1566
// Set the callback for this chain
$list->eagerLoadAllRelations[$relationChain] = $callback;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this will override a callback with null if someone adds the same chain twice, the second time without a callback. For example:

$list = $list->eagerLoad(['MyRelation.AnotherRelation' => $someCallback]);
$list = $list->eagerLoad('MyRelation.AnotherRelation');

This allows complex branching logic, where some branches can remove the filters/sorting/etc that other branches might apply.

Comment on lines 1329 to 1331
// Get the join records so we can correctly identify which children belong to which parents
// This also holds extra fields data
$joinRows = DB::query('SELECT * FROM "' . $joinTable . '" WHERE "' . $parentIDField . '" IN (' . implode(',', $parentIDs) . ') AND ' . $childIDField . ' IN (' . implode(',', $fetchedIDs) . ')');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note we're now only getting the join records that are applicable to the potentially-filtered-down set of relations.

src/ORM/DataList.php Outdated Show resolved Hide resolved
@GuySartorelli GuySartorelli force-pushed the pulls/5/eagerloading-prefilter branch from 77851f8 to befd2ad Compare February 14, 2024 20:52
@GuySartorelli GuySartorelli changed the title NEW Allow pre-filtering and pre-sorting eagerloaded data NEW Allow manipulating eagerloading queries Feb 14, 2024
@GuySartorelli GuySartorelli force-pushed the pulls/5/eagerloading-prefilter branch from befd2ad to 342f3bf Compare February 19, 2024 01:30
@GuySartorelli GuySartorelli marked this pull request as ready for review February 19, 2024 03:16
@GuySartorelli GuySartorelli force-pushed the pulls/5/eagerloading-prefilter branch from 342f3bf to 3df0768 Compare February 19, 2024 03:19
Comment on lines +1118 to +1121
// Throw exception if developers try to manipulate a has_one relation as a list
if ($this->eagerLoadAllRelations[$relationChain] !== null) {
throw new LogicException("Cannot manipulate eagerloading query for $relationType relation $relationName");
}
Copy link
Member Author

@GuySartorelli GuySartorelli Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't go $myRecord->MyHasOne()->filter(), for example. This would be the eagerloading equivalent of that if we allowed it.

Comment on lines -1281 to -1283
// Get the join records so we can correctly identify which children belong to which parents
// This also holds extra fields data
$joinRows = DB::query('SELECT * FROM "' . $joinTable . '" WHERE "' . $parentIDField . '" IN (' . implode(',', $parentIDs) . ')');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to be after fetching the records we care about, and is now filtered and sorted based on that.

Comment on lines +1337 to +1338
// Respect sort order of fetched items
. ' ORDER BY FIELD(' . $childIDField . ', ' . $fetchedIDsAsString . ')'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sorting is important here because the join rows are what we're looping through - so the order of these join rows determines the order of the records in the resultant EagerLoadedList

Comment on lines +1424 to +1432
/**
* NOTE: Do not change `DataList` to `static` in this method signature.
* Subclasses of DataList must still accept DataList arguments and return DataList!
*/
private function manipulateEagerLoadingQuery(
DataList $fetchList,
string $relationChain,
string $relationType
): DataList {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with Steve - he prefers DataList here over self.
I'm agnostic - I think the comment needs to be there for our future selves either way, and both IDEs and static analysis will ultimately treat them the same.

Comment on lines +1501 to +1503
* <code>
* $myDataList->eagerLoad('MyRelation.NestedRelation.EvenMoreNestedRelation', 'DifferentRelation')
* </code>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding <code> tags here is an unrelated improvement for API docs since I'm adding another example here.

Comment on lines +1521 to +1524
// If an array is passed in directly, treat it as though $relationChains wasn't spread.
if (count($relationChains) === 1 && is_array($relationChains[array_key_first($relationChains)])) {
$relationChains = $relationChains[array_key_first($relationChains)];
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handles the difference between passing strings (original API, useful for templates) and passing an array (enhanced API, necessary if applying manipulations)

Comment on lines +1533 to +1539
// Reject non-callable in associative array
if ($callback !== null && !is_callable($callback)) {
throw new LogicException(
'Value of associative array must be a callable.'
. 'If you don\'t want to pre-filter the list, use an indexed array.'
);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of added type safety - it would cause an error later on, but throwing out a clear message about what's wrong here will improve the debugging experience

Comment on lines 845 to 851
/**
* Get the current limit of this query.
*/
public function getLimit(): array
{
return $this->query->getLimit();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new API is necessary for us to explicitly disallow calling limit() in an eagerloading manipulation, since at best allowing that would provide unexpected results.

Comment on lines -900 to +901
$dataList = EagerLoadObject::get()->filter($filter)->eagerLoad(...$eagerLoad);
$dataList = EagerLoadObject::get()->filter($filter)->eagerLoad($eagerLoad);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've swapped a few existing tests to pass the array directly, so we have coverage for non-associative arrays. I've left others alone to retain coverage for passing strings.

Comment on lines +1292 to +1293
$this->assertGreaterThan(1, $loop1Count);
$this->assertGreaterThan(1, $loop2Count);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change - the test was meant to have these originally which is why those variables were there.

EagerLoadObject::get()->eagerLoad(['HasManyEagerLoadObjects' => 'HasManyEagerLoadObjects']);
}

public function provideNoLimitEagerloadingQuery(): array
Copy link
Member

@emteknetnz emteknetnz Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to test/implement belongs_many_many?

many_many_through?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No harm in it, will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - all of the many_many relation types are built the same so was pretty easy to refactor for the extra types.

EagerLoadObject::get()->eagerLoad(['HasManyEagerLoadObjects' => 'HasManyEagerLoadObjects']);
}

public function provideNoLimitEagerloadingQuery(): array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public function provideNoLimitEagerloadingQuery(): array
public function provideNoLimitEagerLoadingQuery(): array

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 1336 to 1338
* @dataProvider provideNoLimitEagerloadingQuery
*/
public function testNoLimitEagerloadingQuery(string $relation, string $relationType, callable $callback): void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @dataProvider provideNoLimitEagerloadingQuery
*/
public function testNoLimitEagerloadingQuery(string $relation, string $relationType, callable $callback): void
* @dataProvider provideNoLimitEagerLoadingQuery
*/
public function testNoLimitEagerLoadingQuery(string $relation, string $relationType, callable $callback): void

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

src/ORM/DataQuery.php Outdated Show resolved Hide resolved
@@ -879,4 +879,38 @@ public function testWithUsingDataQueryAppliesRelations()
// This will throw an exception if it fails - it passes if there's no exception.
$dataQuery->execute();
}

public function provideGetLimit(): array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd get rid of this test - see my comment on DataQuery

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@emteknetnz emteknetnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the following error when testing this:

// WITH EAGERLOADING, FILTER/SORT IN DB
// 0.1115, 0.1077, 0.1019, 0.1110 -- 0.1080s (avg) (~29% faster than without eagerloading and ~85% faster than filtering in PHP)
foreach (MyDataObject::get()->eagerLoad(['MySubDataObjects' => fn (DataList $list) => $list->filter('Title:PartialMatch', 1)->Sort('Title', 'DESC')]) as $do) {
    $do->MySubDataObjects()->toArray();
}
ERROR [Emergency]: Uncaught TypeError: {closure}(): Argument #1 ($list) must be of type DataList, SilverStripe\ORM\DataList given, called in /var/www/vendor/silverstripe/framework/src/ORM/DataList.php on line 1435
IN GET dev/build

@GuySartorelli
Copy link
Member Author

I got the following error when testing this:

The code I copied into the issue is missing a use statement - I must have been overly aggressive when tidying it up.

@GuySartorelli GuySartorelli force-pushed the pulls/5/eagerloading-prefilter branch from 3df0768 to 95c9f33 Compare February 19, 2024 21:09
src/ORM/DataQuery.php Outdated Show resolved Hide resolved
src/ORM/DataList.php Outdated Show resolved Hide resolved
@GuySartorelli GuySartorelli force-pushed the pulls/5/eagerloading-prefilter branch from 95c9f33 to 6ed0d21 Compare February 19, 2024 22:45
@emteknetnz emteknetnz merged commit 528344d into silverstripe:5 Feb 20, 2024
15 checks passed
@emteknetnz emteknetnz deleted the pulls/5/eagerloading-prefilter branch February 20, 2024 03:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants