-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Translate SkipWhile and TakeWhile #14104
Comments
After almost creating another issue I've ended up here and thought I'd share my situation and get some more weight behind this proposal. I'm building a GraphQL service and trying to support cursor-based pagination.
What I'm finding is that those articles make it sound really easy to implement, as do a lot of blog posts, but in reality that's rarely going to work. The examples I see always assume: "You sort by CreatedAt" or "You sort by BookingID" but that fails to address two potential issues:
In my situation right now, I'm facing both issues. I'm working on a set of Bookings. A booking has:
The client would like to be able to sort by booking date (issue 1), so this would therefore be the logical cursor to use, but when the client is an admin user, the result set (across shops) will have duplicate booking dates (issue 2) as it is essentially an discrete list of arrival times for a date. So, that means booking date alone can't be used as the cursor as getting a page after a given cursor would effectively skip all the remaining pages of bookings that had that same booking date. So, the search for a cursor continues. I first though it could be a combination of booking date and booking id, but although that makes sense for the ORDER BY clause (ORDER BY BookingDate, BookingID), booking ids don't progress in line with booking dates, so the WHERE clause (WHERE BookingDate > Cursor.BookingDate AND BookingID > Cursor.BookingID) would effectively skip results if a booking with a later BookingDate has a lower BookingID than a previous page. Finally, it would seem the best cursor to use would be a ROWNUMBER generated by SQL Server. That way I can do my ORDER BY clause across multiple fields and then in the WHERE clause filter by ROWNUMBER/cursor. Ideal solution OFFSET solution Is it possible now to re-use the IQueryable I've already built up with various filters and other logic earlier, and tack on this OFFSET clause with a subquery? Is there an elegant way to do that now in EF Core pending his SkipWhile suggestion above? PK lookup first, then query results solution
But similarly to above, I want to re-use my IQueryable and tack on the ROWNUMBER into my SELECT, how can I do that elegantly? Painful solution |
The biggest issue is that I did not succeed in materializing the I'll write up another blogpost soon showing the details about my current solution, and maybe put it in a handy NuGet package. I'm currently running around on a exhibition so it might take a few days though. |
Thanks for the response. It definitely should be easier than it is, hopefully the EF Core team can prioritise it in the backlog. Would love to see your solution. I may need to do the same. I found a .ToSql() extension method that gets the raw SQL but like you said it's a bit of a nightmare to be dealing with something so raw. Theoretically I could just pop on the OFFSET with subquery in there and see what happens, but it's pretty loose haha. Update: scratch that. Few people have said it's super slow to generate the SQL, which makes sense given it's using reflection. I wish I could just generate the WHERE clause somehow though, to pop it into a raw query for the PKs and ROWNUMBER()... cc: @RahenSaeed , this will be useful once it's in place as far as EF Core goes for implementing cursor-based pagination effectively in real-world scenarios. |
@benmccallum I'll elaborate it more expansively later, but this is the gist of it: /// <summary>
/// A ggeneric method for applying cursor based pagination
/// </summary>
/// <param name="query">The base query to apply pagination to</param>
/// <param name="queryFactory">A queryfactory instance</param>
/// <param name="column">The column to use for cursor based pagination</param>
/// <param name="cursor">The cursor value</param>
/// <param name="count">The number of items to retrieve after the cursor</param>
/// <returns></returns>
public static Query CursoredOffset(
this Query query,
QueryFactory queryFactory,
string column,
string cursor,
int count)
{
var compiler = new SqlServerCompiler();
var ctx = new SqlResult
{
Query = query
};
var order = compiler.CompileOrders(ctx);
query.Clauses.RemoveAll(q => q.Component == "order");
if (string.IsNullOrWhiteSpace(order))
{
throw new Exception($"{nameof(query)} does not have an order by clause");
}
var internalQuery = queryFactory.Query()
.With("d", query)
.From("d")
.CombineRaw(order)
.CombineRaw($@"OFFSET (
SELECT TOP 1 rn
FROM(
SELECT [d].{column}, ROW_NUMBER() OVER({ order }) AS rn FROM [d]
) T
WHERE T.{column.Split('.').LastOrDefault() ?? column} = '{cursor}'
) rows
FETCH NEXT {count} ROWS ONLY");
return internalQuery;
} So few things to note here; it uses the query factory and makes life easier for me. The SqlKata query is required to have at least one order specified. If you want to retrieve the resultcount you can grab the count from the query you provided the method above. |
Thx for sharing. I've taken a slightly different approach for now as an example to our team and we're just deciding whether we want to roll that out around the place or try wait for EF Core to support something and avoid another dependency. I almost wonder how hard it would be (without any EF Core repo experience) for us to have a crack at this ourselves. My fear is that the combination of standard For my POC to my team I build the filtering query in SqlKata, then hand off to a function (that I could probably make generic if I parameterised it a bit more) that fires off the necessary queries to get the page of records' PKs with the SqlKata query and then feeding them into the EF Core query. It's super roundabout but at least I can support the whole Connection (hasNextPage, hasPreviousPage, totalCount, edges) in a way that I could make generic with a few more parameters into the query. |
Having a dig into the EF Core code, I'm thinking we could take inspiration from how they do @corstian, how would you suggest supporting reverse direction paging with your proposal. Would it just be a matter of reversing the ORDER BY expression and taking LIMIT? I guess so. I'd be keen to hear whether @smitpatel or @ajcvickers think we can support your suggested syntax or whether we'd use a specific cursor-paging extension methods, like: I might give it a crack if we have some guidance from the core team first as to how they'd like it to look. |
According to the docs it should be possible to use I think @benmccallum makes a valid point when it comes to specific cursor paging extension methods. While technically it would be totally possible to support backwards pagination using the
|
Ah, yea should've realised there was a FETCH PRIOR, didn't look at the time. I like your LINQ method names more than mine! I wonder if |
Been digging into the EF Core source code and it's a little overwhelming haha. Came unstuck on the Getting And Building the Code guide running If I can get past this then I'll be able to navigate the codebase a bit better. I want to try and find some tests that would be a good starting point to copy, paste, manipulate for what we want and then TDD this thing. Update: Update 2: I think the issue is that I can't run |
@benmccallum the docs are indeed out of date with regards to The latest version of VS 2019 is a requirement to build the source. After If you're still having issues please post back. |
Thanks @roji , I haven't tried this yet as in the end I realised it might be better for me to try implement this in 2.2 and port forward to 3.0. Will see how I go! |
I finally got to finish that post on cursor based pagination. And I found ways to paginate both forwards and backwards, with reasonable speed. TL;DR: Put the SQL statement in a CTE, select |
So I've been trying to introduce two new extension methods: PR draft linked above for where I've got to atm. So far I've been able to:
Questions:
I think next time I have a crack I need to take reference from how the Take/Skip or OrderBy implementations work, or maybe even the Where one. It seems like Re-linq Remotion is all about building a @corstian, great blog post btw! |
@benmccallum - With Relinq going away, all that processing would be unnecessary. We need to add additional methods to translate newly added methods. Currently the PR I have is much in flux. I will let you know once it starts stabilizing but it would be certainly easier to implement this than with Relinq. |
Just wondering if there is any update to this or any idea when we may see this implemented in EF Core? |
I've recently done some research into pagination, and I don't think SkipWhile is the right building block for an efficient "cursor-oriented" pagination mechanism. In a nutshell, offset or rownumber-based approaches do not use indexes, making them inherently inefficient. In addition, in many cases these approaches also lack robustness in the face of concurrent updates, as rows are skipped or duplicated (although the approach detailed above doesn't suffer from this, since the ID is used as the cursor, rather than a raw offset). For efficient cursor pagination, simple keyset pagination seems to be the best fit. In other words, assuming we're ording by an Id, pagination would simply use a WHERE clause to get rows after the last Id fetched from the previous page; compared to the above approach of an OFFSET subquery with ROW_NUMBER, this is simple and uses indexes. If more than one pagination key is being used, row value comparison can be used Assuming we've excluded the use of SkipWhile for pagination purposes, there's still the question of non-pagination uses. There are some characteristics of relational databases which limit the usefulness of SkipWhile:
So I'm leaving this issue open in the backlog to track translating SkipWhile and TakeWhile for non-pagination purposes, given non-contrived real-world scenarios which would benefit from it. |
I'm having a scenario where I think TakeWhile() would be useful which is: I'm trying to count unsuccessful user logins since last successful user login for some protection mechanism and don't see how I could achieve it without making 2 queries and applying some logic to it. Possible way would be
Without TakeWhile I would need to retrieve the date (or Id) of last successful login and then count the unsuccessful attempts in the second query. Am I missing some way to make it in one query? |
Feature Request
In order to facilitate my use case of implementing cursor based pagination I would find the LINQ
.SkipWhile()
method useful.I would imagine this expression to result in something like the following SQL statement:
Until now I have been unable to find a reusable and composable way to achieve this behaviour by using the
FromSql()
extension method for multiple models.Related Issues
#12046
The text was updated successfully, but these errors were encountered: