-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHPORM-28 Add Scout engine to index into MongoDB Search #3205
base: 5.x
Are you sure you want to change the base?
Conversation
fa3e0a9
to
12f3323
Compare
composer.json
Outdated
@@ -70,7 +71,8 @@ | |||
"laravel": { | |||
"providers": [ | |||
"MongoDB\\Laravel\\MongoDBServiceProvider", | |||
"MongoDB\\Laravel\\MongoDBBusServiceProvider" | |||
"MongoDB\\Laravel\\MongoDBBusServiceProvider", | |||
"MongoDB\\Laravel\\Scout\\ScoutServiceProvider" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class could be placed at the root of the package and given a name consistent with the other classes.
"MongoDB\\Laravel\\Scout\\ScoutServiceProvider" | |
"MongoDB\\Laravel\\MongoDBScoutServiceProvider" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would "ScoutServiceProvider" likely conflict with other short class names? There looks to be redundancy in both suggestions here, but I'll defer to whatever you think is most consistent.
It's been a couple of weeks since I picked up the Telescope PR but IIRC I opted to use a namespace there as there would be multiple classes. Seemed preferable than keeping a service provider in a separate namespace from other classes associated with the feature.
tests/Scout/ScoutEngineTest.php
Outdated
'fuzzy' => [ | ||
'maxEdits' => 2, | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fuzzy
is required to return results for partial matches (search for "lar" must return "Laravel Framework").
tests/Scout/ScoutEngineTest.php
Outdated
'path' => [ | ||
'wildcard' => '*', | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default we don't know on which fields to search.
src/Scout/ScoutEngine.php
Outdated
|
||
return $model->getScoutModelsByIds($builder, $ids) | ||
->filter(fn (Model $model) => in_array($model->getScoutKey(), $ids)) | ||
->sortBy(fn (Model $model) => array_search($model->getScoutKey(), $ids)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is suspicious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I didn't understood why the index was retrieved from an array, but now I understand that it's intended to sort in the initial order of the results. It's clever. That's copilot that refactored this code copied from MeilisearchEngine: https://github.com/laravel/scout/blob/13cb6460eb9802ff95a28dc11e178b1abebdcc02/src/Engines/MeilisearchEngine.php#L290
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't flipping the array (i.e. each $id
becomes a key) be more efficient to derive a sort order? That would replace repetitive array_search()
calls with HashTable lookups.
*/ | ||
public function delete($models): void | ||
{ | ||
assert($models instanceof Collection, new TypeError(sprintf('Argument #1 ($models) must be of type %s, %s given', Collection::class, get_debug_type($models)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert
calls compensate for the lack of native types in the parent class.
13ff6ec
to
4647cf0
Compare
Integration tests require an atlas or local-atlas cluster to run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving an incomplete round of feedback. We'll need to schedule some time to actually go through this PR together if you need a meaningful review, as a lot of it is over my head.
} | ||
|
||
/** | ||
* This method must be overridden when the `getScoutKey` method is also overridden, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If getScoutKey()
is overridden, isn't it possible that it no longer calls getScoutKeyName()
at all? It's not clear to me why this was broken out to a separate method if it's only called from getScoutKey()
, and a user could just as easily override that on its own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getScoutKeyName
is necessary for searching original model objects from the results of the search engine.
Example: if in a SQL database, the "Scout Key" is scout_key
, a search query will return a list of objects with the scout_key
property and the following query is executed: SELECT * FROM model WHERE scout_key IN (<ids>)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That answer is over my head. My question was more basic, as I was just looking for what might actually cal getScoutKeyName()
. I think I found my answer in the Searchable trait, though.
It'd have been helpful if any overridden methods mentioned that they were overriding something from the trait (maybe using @see
with a reference to the base method).
|
||
return $builder; | ||
}, | ||
array_replace_recursive($defaultPipeline, [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but in this case you're actually just replacing $defaultPipeline[0]
since there's an equality matching between key 0
in $defaultPipeline
and the lone key in the replacement array.
Basing this on my read of array_replace_recursive
.
Does this actually need to use the recursive variant if you're really just replacing an aggregation pipeline stage in its entirety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm using array_replace_recursive
to replace the numeric key (instead of append in cas of array_merge
), and this adds the filter
key-value to $defaultPipeline['$search']['compound']
], $results->pluck('name', 'id')->all()); | ||
} | ||
|
||
public function testItCanUseBasicSearchToFetchKeys() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test seems redundant in light of testItCanUseBasicSearch()
?
*/ | ||
private function wait(Closure $callback): void | ||
{ | ||
$timeout = hrtime()[0] + self::WAIT_TIMEOUT_SEC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is likely a very minor edge case, but hrtime()
could be unsupported and return false
:
- https://github.com/php/php-src/blob/php-8.4.1/ext/standard/basic_functions.stub.php#L2186
- https://github.com/php/php-src/blob/php-8.4.1/ext/standard/hrtime.c#L49
- https://github.com/php/php-src/blob/php-8.4.1/Zend/zend_hrtime.h#L57
But if static analysis doesn't care about this then perhaps you shouldn't either.
src/Scout/ScoutEngine.php
Outdated
$mapping['fields']['__soft_deleted'] ??= [ | ||
'type' => 'number', | ||
'representation' => 'int64', | ||
'indexDoubles' => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted this pertains to https://www.mongodb.com/docs/atlas/atlas-search/field-types/number-type/#configure-fts-field-type-field-properties
Consider including a comment that links to some docs on this, as it's not clear what this mapping means otherwise.
1b39753
to
95343b1
Compare
} | ||
|
||
/** | ||
* This method must be overridden when the `getScoutKey` method is also overridden, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That answer is over my head. My question was more basic, as I was just looking for what might actually cal getScoutKeyName()
. I think I found my answer in the Searchable trait, though.
It'd have been helpful if any overridden methods mentioned that they were overriding something from the trait (maybe using @see
with a reference to the base method).
tests/Models/User.php
Outdated
public function toSearchableArray(): array | ||
{ | ||
return [ | ||
'id' => (int) $this->id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the $id
property documented as a string in the class docblock, but cast to an integer here? Is it expected to be an ObjectId within the database that's represented as a string in a hydrated model instance?
$collection = $this->getIndexableCollection($models); | ||
|
||
if ($this->softDelete && $this->usesSoftDelete($models)) { | ||
$models->each->pushSoftDeleteMetadata(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted that this calls the method in the Searchable trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and there is no interface for "simplicity" I guess. I don't even see how I could add an assert(...)
with this sequence of magic function calls.
private function usesSoftDelete(Model|EloquentCollection $model): bool | ||
{ | ||
if ($model instanceof EloquentCollection) { | ||
$model = $model->first(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to assume that SoftDeletes is used simply because the first model has it? I assume the trait is applied on the model class and an EloquentCollection can be assumed to always contain the same class. I dunno if Eloquent supports anything like Doctrine's multi-collection inheritance, but that might make it possible for there to be various classes here (considering an edge case where only some children utilize the trait).
But even if that were the case, would a single Scout collection cover all of the classes in that hierarchy? Perhaps not.
/** | ||
* Delete a "search index", i.e. a MongoDB collection. | ||
*/ | ||
public function deleteIndex($name): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where this would be called, but should there be some validation that it's actually an index name and not just an arbitrary MongoDB collection?
One sanity check would be to ensure it starts with $this->prefix
, although that's still a bit risky per my comment below on deleteAllIndexes()
.
ee6edd2
to
3d030ba
Compare
Fix PHPORM-28
Checklist