Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections #4890

Closed
wants to merge 3 commits into from

Conversation

sebersole
Copy link
Member

HHH-15133 - Use specified result-type to better infer "shape" of query results with implicit selections

https://hibernate.atlassian.net/browse/HHH-15133

@sebersole
Copy link
Member Author

2 things of note:

  1. HqlInterpretation refs in QueryInterpretationCache are now expected-result-type sensitive in terms of array v. not-array
  2. @beikov this reverted some changes you made in AbstractSharedSessionContract#createQuery(jakarta.persistence.criteria.CriteriaQuery<T>) for the TCK - at least according to the commit messages. This was another "massive commit" and I was not about to dig through it all. Pretty confident the changes I made follow the spec, but we should verify this does not cause TCK failures.

@gavinking
Copy link
Member

Oh nice! Thanks!

@sebersole
Copy link
Member Author

Another case where I cannot reproduce these failures locally. Locally, the entire build passes with H2 specifically

@sebersole
Copy link
Member Author

Hm, actually I must have been on the wrong branch. I am able to reproduce it. Though as of yet I have no explanation why.

The only thing I can think of is that my change somehow messed up the ListResultsConsumer.UniqueSemantic used. The result of the failing query returns a duplicated row whereas the assertions expect it to be "filtered". The SQL is exactly the same.

I'll look further next week.

@sebersole
Copy link
Member Author

So this is actually an interesting discussion. Consider this (the failing) query:

			List<Object[]> phonePayments = entityManager.createQuery(
				"select p " +
				"from Payment p, Phone ph " +
				"where p.person = ph.person ",
				Object[].class)
			.getResultList();

This comes from documentation. The outcome is wrong even though the test actually passed. The reason it is wrong is that an Object[] was not returned. It passes because it never attempts to access the results. Adding this causes a ClassCastException:

			final Object[] firstRow = phonePayments.get( 0 );

Basically, this was never supported in 6.0 at the moment. My work here actually does add support for it. So within this PR, accessing that first row as above is fine.

However the test now fails because there is currently a mismatch in terms of trying to ascertain whether the row is "duplicate". The query selects Payment. So we ultimately has a single assembler. In the ListConsumer, we ask the RowReader for the result type[1] which is the selection, here Payment. While checking for duplicates it is this JavaType descriptor for Payment used to perform the check. However, because the row is now an array the rows are not considered duplicate.

Most likely, the correct solution is to use the JavaType descriptor for the expected return type[1] rather than the result type[1] when performing the comparisons - assuming the array JavaType descriptor does a proper "deep" comparison which I have not yet verified. Wanted to allow a discussion of this situation before I play more with this.

Essentially, what is the correct outcome here considering an array per-row is being returned in terms of how we check for duplicates? I think the "the filtered result" assertion in the test (the number of results) is correct; it just did not properly check previously for what is actually returned. What say y'all? Everyone agree with that?

[1] There is a distinction between the "result type" as determined strictly based on the query compared to the "expected return type" specified when creating the Query.

@sebersole
Copy link
Member Author

Ultimately this is the code block where this breaks down (ListResultsConsumer#consume):

			if ( uniqueSemantic != UniqueSemantic.NONE ) {
				final Class<R> resultJavaType = rowReader.getResultJavaType();
				if ( resultJavaType != null && !resultJavaType.isArray() ) {
					final EntityPersister entityDescriptor = session.getFactory()
							.getRuntimeMetamodels()
							.getMappingMetamodel()
							.findEntityDescriptor( resultJavaType );
					if ( entityDescriptor != null ) {
						uniqueRows = true;
					}
				}
			}

The only time we perform "unique-ing" is when a single entity is selected.

@beikov
Copy link
Contributor

beikov commented Mar 21, 2022

IMO when Tuple or Object[] is requested as query result type, we shouldn't do any deduplication/filtering, as the result type suggests that the resulting tuple is what people are interested in. If the select clause had one more item, no deduplication/filtering would happen, so not doing it in case of a single select item would be consistent.

@sebersole
Copy link
Member Author

List<TheEntity> entities = session.createQuery( "select e from TheEntity ..." ).list();
List<Integer> ids = session.createQuery( "select e.id from TheEntity ..." ).list();

assert entities.size() != ids.size();

That does not feel strange to you? Sure feels strange to me.

@dreab8
Copy link
Contributor

dreab8 commented Mar 21, 2022

yeap it looks strange to me!

@beikov
Copy link
Contributor

beikov commented Mar 21, 2022

Strange because (1) you think we shouldn't de-duplicate single SqmFrom select items or (2) because you think we should de-duplicate all single select items? IMO (2) might be surprising to many users and would be a change in behavior. De-duplicating single select item SqmFrom i.e. (1) was only introduce in 6.0 btw.

@sebersole
Copy link
Member Author

Strange because the same query will return different number of results based on whether I select the entity or its identifier e.g.

De-duplicating single select item SqmFrom i.e. (1) was only introduce in 6.0 btw.

Since I added it, yes I am aware ;)

@beikov
Copy link
Contributor

beikov commented Mar 22, 2022

Strange because the same query will return different number of results based on whether I select the entity or its identifier e.g.

Are you suggesting to remove the by default de-duplication for single entity select queries or to change to de-duplicating every single select item query?

@sebersole
Copy link
Member Author

I'm questioning whether I want to make these consistent.

I like to think of telling a user about this and how easy it is to explain a feature. Not sure this is easy to explain/understand. Let's talk about it tomorrow when we meet up

@sebersole
Copy link
Member Author

#4917

Function<String, SqmStatement<?>> creator) {
log.tracef( "QueryPlan#resolveHqlInterpretation( `%s` )", queryString );
final StatisticsImplementor statistics = statisticsSupplier.get();
final boolean stats = statistics.isStatisticsEnabled();
final long startTime = ( stats ) ? System.nanoTime() : 0L;

final String cacheKey;
if ( expectedResultType != null && expectedResultType.isArray() ) {
cacheKey = queryString + "_array";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, prepending "array_" would be better to avoid possible collisions.

Copy link
Contributor

@beikov beikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the cache key determination I think this looks good

@gavinking
Copy link
Member

I'm questioning whether I want to make these consistent.

@sebersole I think what you're seeing is fine and perfectly consistent, as long as this in-memory dupe removal is only done to counteract the effect of a join fetch. Which is, I believe, the case.

OTOH, if dupe removal ever happens for anything other than a fetch join, I think we're really in trouble here. (But I don't think it does.)

@sebersole
Copy link
Member Author

It technically does it for all entity results, regardless of whether there are collection join fetches which is of course the only kind of join fetch we care about here.

So here's the question then... Changing this up causes a lot of test-suite failures. Which at this point will affect early adopters of 6.0.

As a case-in-point, I'll work on a version of this that only does de-dup with entity results with collection join fetches. I won't change any tests. Then we can see the impact

@gavinking
Copy link
Member

I'll work on a version of this that only does de-dup with entity results with collection join fetches. Then we can see the impact

OK, perfect.

@sebersole
Copy link
Member Author

So this PR is finished as far as the outlined behavior. See org.hibernate.orm.test.sql.results.ResultsShapeTests for assertions regarding de-dup.

I'll start on the other promised one shortly. I plan on releasing 6.0 Final tomorrow, so let's get this sorted. Final was supposed to be today, so not stopping that train again.

@sebersole
Copy link
Member Author

Here you go - #4924

public void setDeDuplicationEnabled(boolean enabled) {
this.deDupEnabled = enabled;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who is calling this method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, no one; so we could drop it.

I added that as part of initial work for supporting https://hibernate.atlassian.net/browse/HHH-15149. But decided to delay that work until after 6.0 Final to better percolate.

@sebersole sebersole closed this Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants