Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW Enable ArrayList and EagerLoadedList to use search filters #10910

Conversation

GuySartorelli
Copy link
Member

@GuySartorelli GuySartorelli commented Aug 9, 2023

Comparison logic is based on the way MySQL handles comparisons and coerces various types, for parity with DataList. This means the comparisons may behaviour slightly different for projects with other database drivers, but I think that's unavoidable without either:
a) abstracting this out so that we have something like DB::matches() which then calls some method on the DBConnector instance which means either a breaking change by introducing new methods on the interface or else checking if the method exists on the DBConnector implementation and if not, falling back to some default, OR
b) making DB queries which would be silly since that's what we're trying to avoid in the first place.

See https://dev.mysql.com/doc/refman/8.1/en/type-conversion.html for MySQL type conversion rules and https://dev.mysql.com/doc/refman/8.1/en/comparison-operators.html for MySQL comparison operator functionality - but it's not all documented and a lot of this has been found through trial and error. There are even more edge cases than what's covered here that I haven't even tried to cover, and you can see some failed CI where I'm still working on matching some weird MySQL behaviours.

Options going forward

Now that I've been bashing my head against this for a bit, I'm thinking it's probably best to not try matching the MySQL behaviour this closely. We can choose not to, because:

  • This is new functionality, both for ArrayList and for EagerLoadedList, so we won't actually be making any breaking changes by not explicitly matching the results of these filters when used in DataList.
  • Matching MySQL behaviour specifically means we're likely to not be matching behaviour for other database connectors like postgresql

Before I blow away a huge chunk of this PR though, I'd like to get at least one more opinion of it - the last thing I want is to change to a simpler implementation and then be told "no, go do what MySQL does"

Option One: Don't try to match MySQL

We can provide the behaviour that seems most intuitive, and doesn't require all sorts of weird edge case handling.
For example, in exact match filter we can simply say: If it's not an exact match according to either == or === PHP comparison logic, then it isn't the same value (this is essentially what ArrayList currently does, so at least for that class we probably have to retain this behaviour).

However, not MySQL behaviour means that filters on EagerLoadedList may not provide the same results as if it was a regular DataList in some scenarios. Personally I think that's fine provided we clearly document that limitation - and it's only likely to happen in strange edge-cases anyway.
There's a decent likelyhood we'll provide the ability to pre-filter eager loaded lists anyway, which will further mitigate that limitation.

Option Two: Try to match MySQL as much as we possibly can

This is what the PR is doing so far. However, there's a lot of stuff that MySQL does when matching stuff that doesn't seem intuitive - and it's resulting in a very complex implementation. It's hard to maintain, and if MySQL changes how it does things, we have to change how we do things.

Matching MySQL behaviour specifically means we're likely to not be matching behaviour for other database connectors like postgresql, so it's going to be obtuse to someone either way.

However, matching MySQL behaviour (as closely as we can) means that filters on EagerLoadedList will provide the same results as if it was a regular DataList.

Option Three: Try to match what whatever the current database connector is does

This is basically Option Two above, but abstracted further so that it would be easy for maintainers of the postgres module to implement any weird postgres logic for this matching functionality.
To do this in a BC compatible way it means we'd need to have a method_exists() check on DB::get_conn(), and if the method doesn't exist we'd have to have a fallback, which would probably have to be the MySQL implementation.

Personally this option is where it starts getting way too messy for my liking.

Issue

Comment on lines +680 to +684
public function excludeAny(): static
{
$filters = call_user_func_array([$this, 'normaliseFilterArgs'], func_get_args());
return $this->filterOrExclude($filters, false, true);
}
Copy link
Member Author

@GuySartorelli GuySartorelli Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this isn't necessary by any means - but it's a very low effort, low risk change that helps align the ArrayList API with other lists. I think it's well worth adding.

@GuySartorelli GuySartorelli force-pushed the pulls/5/searchfilter-static-lists branch 2 times, most recently from f98ddc5 to d64ed16 Compare August 15, 2023 05:28
@GuySartorelli GuySartorelli force-pushed the pulls/5/searchfilter-static-lists branch from d64ed16 to 5cc8b9a Compare August 16, 2023 05:07
@emteknetnz
Copy link
Member

emteknetnz commented Aug 16, 2023

Alternative idea - what happens if just convert everything that goes into a search filter to a string and then feed that to mysql? That kind of resolve edge cases such as searching a string column with the int 0 (real world an invalid use-case) because now we're just using the string 0. How much stuff breaks? i.e. does mysql care if we send it an string value of an int when searching on an int column? Does the ORM already resolve this sort of thing for us?

@GuySartorelli
Copy link
Member Author

While you're right that running this in the database is technically an option, the whole point of EagerLoadedList is that you've already got the data, and we're reducing the number of database calls because of some clear performance bottleneck. I don't want to then start running database queries any time you filter - especially since the queries would either be one query per comparison (for n queries per filter operation), or else a really complex query since we'd have to feed it all the (parameterised, so it won't be super optimal) values and somehow get an identifiable set of "records" out as a result, which especially in ArrayList will be difficult.
And then there's the fact that ArrayList would now be performing database queries which just feels very wrong.

I'd much rather just not implementing this feature instead of performing these comparisons on the database.

Is one of the options in the PR description, or some other option which doesn't require querying the database, a suitable option do you think? I'm strongly leaning towards option one (just do intuitive comparisons and document the difference (which will be great for ArrayList and sometimes a little different than what DataList would give for EagerLoadedList).

@emteknetnz
Copy link
Member

Oh sorry I wasn't clear enough

My idea is in a similar vain to yours of "let's not catering to mysql weirdness" - my idea was instead of trying to "fix" ArrayList to be like mysql, instead make the interaction that DataList has with mysql more like how we'd like ArrayList to behave - e.g. never actually query mysql with the int 0 - does that make sense?

@GuySartorelli
Copy link
Member Author

GuySartorelli commented Aug 17, 2023

Ohhh right. That does make sense - but that sounds like it has potential for at least some weird regressions if not outright breaking changes. I don't think I want to do that so close to a release.
That does sound like a variation on option one though - i.e. do the intuitive thing. Would you be okay with me doing option one for this PR, and we can discuss whether-or-not we look at changing the way we interact with MySQL for DataList as a separate discussion?

If you think this warrants further discussion then lets chat in person on Monday and we can summarise the discussion here afterward.

@emteknetnz
Copy link
Member

Yeah just do option one. If you're of the opinion that we simply shouldn't do the other options then it's really the only way forward here.

@GuySartorelli
Copy link
Member Author

Closing this PR as this approach is no longer what I'll be doing. I'll link to this from the new PR once I have created one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants