HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4890

sebersole · 2022-03-19T13:23:50Z

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections

https://hibernate.atlassian.net/browse/HHH-15133

…y results with implicit selections

sebersole · 2022-03-19T13:29:20Z

2 things of note:

HqlInterpretation refs in QueryInterpretationCache are now expected-result-type sensitive in terms of array v. not-array
@beikov this reverted some changes you made in AbstractSharedSessionContract#createQuery(jakarta.persistence.criteria.CriteriaQuery<T>) for the TCK - at least according to the commit messages. This was another "massive commit" and I was not about to dig through it all. Pretty confident the changes I made follow the spec, but we should verify this does not cause TCK failures.

gavinking · 2022-03-19T14:36:26Z

Oh nice! Thanks!

sebersole · 2022-03-19T15:26:01Z

Another case where I cannot reproduce these failures locally. Locally, the entire build passes with H2 specifically

sebersole · 2022-03-20T18:22:48Z

Hm, actually I must have been on the wrong branch. I am able to reproduce it. Though as of yet I have no explanation why.

The only thing I can think of is that my change somehow messed up the ListResultsConsumer.UniqueSemantic used. The result of the failing query returns a duplicated row whereas the assertions expect it to be "filtered". The SQL is exactly the same.

I'll look further next week.

sebersole · 2022-03-20T19:11:19Z

So this is actually an interesting discussion. Consider this (the failing) query:

			List<Object[]> phonePayments = entityManager.createQuery(
				"select p " +
				"from Payment p, Phone ph " +
				"where p.person = ph.person ",
				Object[].class)
			.getResultList();

This comes from documentation. The outcome is wrong even though the test actually passed. The reason it is wrong is that an Object[] was not returned. It passes because it never attempts to access the results. Adding this causes a ClassCastException:

			final Object[] firstRow = phonePayments.get( 0 );

Basically, this was never supported in 6.0 at the moment. My work here actually does add support for it. So within this PR, accessing that first row as above is fine.

However the test now fails because there is currently a mismatch in terms of trying to ascertain whether the row is "duplicate". The query selects Payment. So we ultimately has a single assembler. In the ListConsumer, we ask the RowReader for the result type[1] which is the selection, here Payment. While checking for duplicates it is this JavaType descriptor for Payment used to perform the check. However, because the row is now an array the rows are not considered duplicate.

Most likely, the correct solution is to use the JavaType descriptor for the expected return type[1] rather than the result type[1] when performing the comparisons - assuming the array JavaType descriptor does a proper "deep" comparison which I have not yet verified. Wanted to allow a discussion of this situation before I play more with this.

Essentially, what is the correct outcome here considering an array per-row is being returned in terms of how we check for duplicates? I think the "the filtered result" assertion in the test (the number of results) is correct; it just did not properly check previously for what is actually returned. What say y'all? Everyone agree with that?

[1] There is a distinction between the "result type" as determined strictly based on the query compared to the "expected return type" specified when creating the Query.

sebersole · 2022-03-21T00:47:54Z

Ultimately this is the code block where this breaks down (ListResultsConsumer#consume):

			if ( uniqueSemantic != UniqueSemantic.NONE ) {
				final Class<R> resultJavaType = rowReader.getResultJavaType();
				if ( resultJavaType != null && !resultJavaType.isArray() ) {
					final EntityPersister entityDescriptor = session.getFactory()
							.getRuntimeMetamodels()
							.getMappingMetamodel()
							.findEntityDescriptor( resultJavaType );
					if ( entityDescriptor != null ) {
						uniqueRows = true;
					}
				}
			}

The only time we perform "unique-ing" is when a single entity is selected.

beikov · 2022-03-21T10:56:04Z

IMO when Tuple or Object[] is requested as query result type, we shouldn't do any deduplication/filtering, as the result type suggests that the resulting tuple is what people are interested in. If the select clause had one more item, no deduplication/filtering would happen, so not doing it in case of a single select item would be consistent.

sebersole · 2022-03-21T11:22:23Z

List<TheEntity> entities = session.createQuery( "select e from TheEntity ..." ).list();
List<Integer> ids = session.createQuery( "select e.id from TheEntity ..." ).list();

assert entities.size() != ids.size();

That does not feel strange to you? Sure feels strange to me.

dreab8 · 2022-03-21T12:33:53Z

yeap it looks strange to me!

beikov · 2022-03-21T14:30:30Z

Strange because (1) you think we shouldn't de-duplicate single SqmFrom select items or (2) because you think we should de-duplicate all single select items? IMO (2) might be surprising to many users and would be a change in behavior. De-duplicating single select item SqmFrom i.e. (1) was only introduce in 6.0 btw.

sebersole · 2022-03-21T20:51:45Z

Strange because the same query will return different number of results based on whether I select the entity or its identifier e.g.

De-duplicating single select item SqmFrom i.e. (1) was only introduce in 6.0 btw.

Since I added it, yes I am aware ;)

beikov · 2022-03-22T09:08:12Z

Strange because the same query will return different number of results based on whether I select the entity or its identifier e.g.

Are you suggesting to remove the by default de-duplication for single entity select queries or to change to de-duplicating every single select item query?

sebersole · 2022-03-22T12:59:33Z

I'm questioning whether I want to make these consistent.

I like to think of telling a user about this and how easy it is to explain a feature. Not sure this is easy to explain/understand. Let's talk about it tomorrow when we meet up

sebersole · 2022-03-28T15:32:54Z

#4917

…y results with implicit selections

beikov · 2022-03-29T17:16:36Z

...te-core/src/main/java/org/hibernate/query/internal/QueryInterpretationCacheStandardImpl.java

 			Function<String, SqmStatement<?>> creator) {
 		log.tracef( "QueryPlan#resolveHqlInterpretation( `%s` )", queryString );
 		final StatisticsImplementor statistics = statisticsSupplier.get();
 		final boolean stats = statistics.isStatisticsEnabled();
 		final long startTime = ( stats ) ? System.nanoTime() : 0L;

+		final String cacheKey;
+		if ( expectedResultType != null && expectedResultType.isArray() ) {
+			cacheKey = queryString + "_array";


As discussed, prepending "array_" would be better to avoid possible collisions.

beikov

Other than the cache key determination I think this looks good

gavinking · 2022-03-29T18:48:03Z

I'm questioning whether I want to make these consistent.

@sebersole I think what you're seeing is fine and perfectly consistent, as long as this in-memory dupe removal is only done to counteract the effect of a join fetch. Which is, I believe, the case.

OTOH, if dupe removal ever happens for anything other than a fetch join, I think we're really in trouble here. (But I don't think it does.)

sebersole · 2022-03-29T21:34:40Z

It technically does it for all entity results, regardless of whether there are collection join fetches which is of course the only kind of join fetch we care about here.

So here's the question then... Changing this up causes a lot of test-suite failures. Which at this point will affect early adopters of 6.0.

As a case-in-point, I'll work on a version of this that only does de-dup with entity results with collection join fetches. I won't change any tests. Then we can see the impact

gavinking · 2022-03-29T21:36:17Z

I'll work on a version of this that only does de-dup with entity results with collection join fetches. Then we can see the impact

OK, perfect.

…y results with implicit selections

sebersole · 2022-03-29T21:57:43Z

So this PR is finished as far as the outlined behavior. See org.hibernate.orm.test.sql.results.ResultsShapeTests for assertions regarding de-dup.

I'll start on the other promised one shortly. I plan on releasing 6.0 Final tomorrow, so let's get this sorted. Final was supposed to be today, so not stopping that train again.

sebersole · 2022-03-29T22:32:48Z

Here you go - #4924

dreab8 · 2022-03-30T10:28:43Z

hibernate-core/src/main/java/org/hibernate/query/internal/QueryOptionsImpl.java

+	public void setDeDuplicationEnabled(boolean enabled) {
+		this.deDupEnabled = enabled;
+	}
+


who is calling this method?

At the moment, no one; so we could drop it.

I added that as part of initial work for supporting https://hibernate.atlassian.net/browse/HHH-15149. But decided to delay that work until after 6.0 Final to better percolate.

HHH-15133 - Use specified result-type to better infer "shape" of quer…

015de51

…y results with implicit selections

sebersole requested review from beikov, gavinking and dreab8 March 19, 2022 13:23

HHH-15133 - Use specified result-type to better infer "shape" of quer…

8458f6e

…y results with implicit selections

beikov reviewed Mar 29, 2022

View reviewed changes

HHH-15133 - Use specified result-type to better infer "shape" of quer…

79de369

…y results with implicit selections

sebersole mentioned this pull request Mar 29, 2022

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4924

Merged

dreab8 reviewed Mar 30, 2022

View reviewed changes

sebersole closed this Mar 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4890

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4890

sebersole commented Mar 19, 2022

sebersole commented Mar 19, 2022

gavinking commented Mar 19, 2022

sebersole commented Mar 19, 2022

sebersole commented Mar 20, 2022

sebersole commented Mar 20, 2022

sebersole commented Mar 21, 2022

beikov commented Mar 21, 2022

sebersole commented Mar 21, 2022

dreab8 commented Mar 21, 2022

beikov commented Mar 21, 2022

sebersole commented Mar 21, 2022

beikov commented Mar 22, 2022

sebersole commented Mar 22, 2022

sebersole commented Mar 28, 2022

beikov Mar 29, 2022

beikov left a comment

gavinking commented Mar 29, 2022

sebersole commented Mar 29, 2022

gavinking commented Mar 29, 2022

sebersole commented Mar 29, 2022

sebersole commented Mar 29, 2022

dreab8 Mar 30, 2022

sebersole Mar 30, 2022

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4890

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4890

Conversation

sebersole commented Mar 19, 2022

sebersole commented Mar 19, 2022

gavinking commented Mar 19, 2022

sebersole commented Mar 19, 2022

sebersole commented Mar 20, 2022

sebersole commented Mar 20, 2022

sebersole commented Mar 21, 2022

beikov commented Mar 21, 2022

sebersole commented Mar 21, 2022

dreab8 commented Mar 21, 2022

beikov commented Mar 21, 2022

sebersole commented Mar 21, 2022

beikov commented Mar 22, 2022

sebersole commented Mar 22, 2022

sebersole commented Mar 28, 2022

beikov Mar 29, 2022

Choose a reason for hiding this comment

beikov left a comment

Choose a reason for hiding this comment

gavinking commented Mar 29, 2022

sebersole commented Mar 29, 2022

gavinking commented Mar 29, 2022

sebersole commented Mar 29, 2022

sebersole commented Mar 29, 2022

dreab8 Mar 30, 2022

Choose a reason for hiding this comment

sebersole Mar 30, 2022

Choose a reason for hiding this comment