Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GROUP BY translated to ORDER BY? #29757

Closed
SuRGeoNix opened this issue Dec 4, 2022 · 9 comments
Closed

GROUP BY translated to ORDER BY? #29757

SuRGeoNix opened this issue Dec 4, 2022 · 9 comments
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@SuRGeoNix
Copy link

Hi I'm trying to translate the following SQLite query to EF Core (using TPH with eager loading):-

(I create uniqueId to avoid group by collumns with NULL values for MovieCollectionId)

SELECT *,IFNULL(mc.Id,random()) as uniqueId FROM Media as m
LEFT JOIN MovieExts me ON m.MovieExtId = me.Id
LEFT JOIN MovieCollections mc ON me.MovieCollectionId = mc.Id
WHERE m.Discriminator = 'Movie'
GROUP BY uniqueId

The below attempt failed:-

ctx.Movies
	.Select(m =>
	new
	{
		Movie = new Movie()
		{
			Id = m.Id,
			Duration = m.Duration,
			Title = m.Title,
			MovieExt = new MovieExt()
			{
				Id = m.MovieExt.Id,
				Title = m.MovieExt.Title,
				TitleEN = m.MovieExt.TitleEN,
				MovieCollection = m.MovieExt.MovieCollection
			}
		},
		UniqueHelper = m.MovieExt.MovieCollection != null ? m.MovieExt.MovieCollection.Id : EF.Functions.Random(),
	})
	.GroupBy(x => x.UniqueHelper)
	.ToList();

Produces the below raw SQL:-

SELECT CASE
  WHEN "t"."Id1" IS NOT NULL THEN CAST("t"."Id1" AS REAL)
  ELSE abs(random() / 9.2233720368547799E+18)
END, "t"."Id", "t"."Duration", "t"."Title", "t"."Id0" AS "Id", "t"."Title0" AS "Title", "t"."TitleEN", "t"."Id1", "t"."BackdropId", "t"."LandscapeId", "t"."PortraitId", "t"."TMDBId", "t"."Title1", "t"."TitleEN0", "t"."UniqueHelper"
FROM (
  SELECT "m"."Id", "m"."Duration", "m"."Title", "m0"."Id" AS "Id0", "m0"."Title" AS "Title0", "m0"."TitleEN", "m1"."Id" AS "Id1", "m1"."BackdropId", "m1"."LandscapeId", "m1"."PortraitId", "m1"."TMDBId", "m1"."Title" AS "Title1", "m1"."TitleEN" AS "TitleEN0", CASE
	  WHEN "m1"."Id" IS NOT NULL THEN CAST("m1"."Id" AS REAL)
	  ELSE abs(random() / 9.2233720368547799E+18)
  END AS "UniqueHelper"
  FROM "Media" AS "m"
  LEFT JOIN "MovieExts" AS "m0" ON "m"."MovieExtId" = "m0"."Id"
  LEFT JOIN "MovieCollections" AS "m1" ON "m0"."MovieCollectionId" = "m1"."Id"
  WHERE "m"."Discriminator" = 'Movie'
) AS "t"
ORDER BY CASE
  WHEN "t"."Id1" IS NOT NULL THEN CAST("t"."Id1" AS REAL)
  ELSE abs(random() / 9.2233720368547799E+18)
END

If this is not supported by EF Core is it possible to pass all the objects with raw SQL?

Include provider and version information
EF Core version: 7
Database provider: Microsoft.EntityFrameworkCore.Sqlite
Target framework: .NET 7.0
Operating system: Windows 10
IDE: Visual Studio 2022 17.4

@roji
Copy link
Member

roji commented Dec 4, 2022

@SuRGeoNix you are using GroupBy as the final operator in the query (see docs). Such a GroupBy cannot be translated to a SQL GROUP BY, since in SQL you need an aggregate function (e.g. Sum, Count) to "reduce" the rows back to scalar values. For more information on that, see our GroupBy docs.

Since your GroupBy is the final operator and there's not aggregate function after that, EF orders and fetchs all the rows, and then groups them on the client side.

@SuRGeoNix
Copy link
Author

Thanks for the quick replay @roji, still straggling to understand from the docs how I will execute the above query. I'm not familiar with Linq and I'm not sure what scalar and aggregate functions are :) What I've noticed though that even if Group by is consider as final operator then this GroupBy pass to the SQL (replacing the groupby from initial query) :-

.GroupBy(
x => x.UniqueHelper,
y => y,
(x, y) => x
)

And produces this SQL:-

SELECT "t"."Key"
FROM (
  SELECT CASE
	  WHEN "m1"."Id" IS NOT NULL THEN CAST("m1"."Id" AS REAL)
	  ELSE abs(random() / 9.2233720368547799E+18)
  END AS "Key"
  FROM "Media" AS "m"
  LEFT JOIN "MovieExts" AS "m0" ON "m"."MovieExtId" = "m0"."Id"
  LEFT JOIN "MovieCollections" AS "m1" ON "m0"."MovieCollectionId" = "m1"."Id"
  WHERE "m"."Discriminator" = 'Movie'
) AS "t"
GROUP BY "t"."Key"

Which is closer to what I want but not exactly. Does UniqueHelper is consider scalar?

@roji
Copy link
Member

roji commented Dec 4, 2022

That may be a specific case where final GroupBy can be translated to SQL GROUP BY, since you're only selecting out the key. That's also not a very useful query: you can express this by projecting out that key (UniqueHelper) and doing Distinct() on it.

I suggest giving our GroupBy docs a good read; that should familiarize you with how SQL GROUP BY works and what aggregate functions are.

@SuRGeoNix
Copy link
Author

SuRGeoNix commented Dec 4, 2022

Well my understanding is that EF Core doesn't support GroupBy projection without an Aggregate function which makes my case not possible to implement this way. So, do I have an alternative way with raw SQL? I was not able to find a way to fallback to raw SQL either and fill (eager loading) the objects.

(See #23439)

@roji
Copy link
Member

roji commented Dec 4, 2022

Well my understanding is that EF Core doesn't support GroupBy projection without an Aggregate function which makes my case not possible to implement this way. So, do I have an alternative way with raw SQL?

It isn't EF Core - it's just how SQL works; SQL generally requires an aggregate function after GROUP BY.

If you have actual SQL which does what you want, feel free to post it here and I can attempt to help in how to express it via LINQ with EF.

Otherwise, you can simply perform whatever projection you want at the client:

ctx.Movies
	.Select(...)
        .AsEnumerable()
	.GroupBy(x => x.UniqueHelper)
        .Select(/* anything at all */)
	.ToList();

At the point where you put the AsEnumerable, the results will be pulled back from the database, and all subsequent LINQ operators will be performed in-memory at the client. Depending on where you put it, it could cause large amounts of data to be pulled back (e.g. if you put it before a Where operator); but since your query only seems to change the shape of the data (and not filter), that seems OK.

This is covered here in our documentation.

@SuRGeoNix
Copy link
Author

I've posted my SQL (very very first thing in my first post). Retrieving all the data from the database it's not a solution (performance wise). And to be honest (no offence), I've spend a lot of time trying to find how to do things with EF core rather than doing them, I really think to just go to SQL raw directly as I just need to support SQLite for now. Thanks again for your help!

@roji
Copy link
Member

roji commented Dec 4, 2022

Apologies, I didn't see that you were using SQLite; SQLite is considerably more lax in various ways - your SQL query above doesn't work on most relational databases.

SELECT *,IFNULL(mc.Id,random()) as uniqueId FROM Media as m
LEFT JOIN MovieExts me ON m.MovieExtId = me.Id
LEFT JOIN MovieCollections mc ON me.MovieCollectionId = mc.Id
WHERE m.Discriminator = 'Movie'
GROUP BY uniqueId

However, I'm still not clear on what exactly you're trying to achieve with this query... Presumably for null IDs, the random() would ensure that uniqueId really is unique across all rows. In that case, what does the GROUP BY achieve here - what happens when you remove it?

I'd be happy to help with SQL (or anything else), but I'm guessing there's a much simpler way to achieve what you want without going through that specific SQL.

@SuRGeoNix
Copy link
Author

No problem @roji, the query brings all the movies and if they belong to a movie collection brings only one record of the collection (groups movie collections together). I will close this as you already helped!

@roji
Copy link
Member

roji commented Dec 4, 2022

Sure thing. The more typical way to write this (and a way which would work in other databases) would be to start with the movie collection, and simply use a filtered include to get only one movie for each one. Something like the following:

_ = ctx.MovieCollections
    .Include(mc => mc.Movies.Take(1))
    .ToList();

@ajcvickers ajcvickers added the closed-no-further-action The issue is closed and no further action is planned. label Dec 6, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

3 participants