Use deterministic map in equality inference #16440

sopel39 · 2023-03-08T16:02:47Z

No description provided.

martint · 2023-03-08T16:06:48Z

core/trino-main/src/main/java/io/trino/sql/planner/EqualityInference.java

@@ -211,7 +212,7 @@ public static EqualityInference newInstance(Metadata metadata, Collection<Expres
        Collection<Set<Expression>> equivalentClasses = equalities.getEquivalentClasses();

        // Map every expression to the set of equivalent expressions
-        Map<Expression, Set<Expression>> byExpression = new HashMap<>();
+        Map<Expression, Set<Expression>> byExpression = new LinkedHashMap<>();


We don't want plan generation to be non-deterministic. Other collections in this class are already deterministic.
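
For reference, a minimal sketch of the guarantee this change relies on: `HashMap` makes no promise about iteration order, while `LinkedHashMap` iterates keys in insertion order. Plain strings stand in for `Expression` here, and the class and method names are hypothetical, not part of Trino.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Trino code.
class InsertionOrderSketch
{
    // Inserts the keys into a LinkedHashMap and returns them in iteration order.
    static List<String> iterationOrder(List<String> keys)
    {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (String key : keys) {
            map.put(key, key.length());
        }
        // LinkedHashMap iterates in insertion order, so this equals `keys`
        // (assuming no duplicate keys).
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args)
    {
        System.out.println(iterationOrder(List.of("cos(x)", "x", "y", "cos(y)")));
    }
}
```

With a plain `HashMap` the returned order would depend on the keys' hash codes, so it could differ from the insertion order.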

I guess my question is why would this cause the plan to be non-deterministic? Where does the iteration order matter in this method?

As far as I can tell, none of the plan tests are flaky because of this.

There is enumeration below:

        // For every non-derived expression, extract the sub-expressions and see if they can be rewritten as other expressions. If so,
        // use this new information to update the known equalities.
        Set<Expression> derivedExpressions = new LinkedHashSet<>();
        for (Expression expression : byExpression.keySet()) {

If you have cos(x) = 0, x = y, and cos(y) = 0, then the visit order might affect which expressions end up in derivedExpressions.
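
A toy sketch of that concern, under simplifying assumptions: strings stand in for expressions, a hypothetical `swapXY` helper stands in for rewriting sub-expressions via x = y, and whichever of two equivalent expressions is visited first marks the other as derived. This is not the actual `EqualityInference` logic, only an illustration of how visit order can change the result.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Trino code.
class VisitOrderSketch
{
    // Toy stand-in for rewriting a sub-expression using the equality x = y.
    static String swapXY(String expression)
    {
        if (expression.contains("x")) {
            return expression.replace("x", "y");
        }
        return expression.replace("y", "x");
    }

    // Visits expressions in the given order; the rewrite of each non-derived
    // expression is marked as derived if it is also a known expression.
    static Set<String> derived(List<String> visitOrder)
    {
        Set<String> known = new LinkedHashSet<>(visitOrder);
        Set<String> derived = new LinkedHashSet<>();
        for (String expression : visitOrder) {
            if (derived.contains(expression)) {
                continue;
            }
            String rewritten = swapXY(expression);
            if (!rewritten.equals(expression) && known.contains(rewritten)) {
                derived.add(rewritten);
            }
        }
        return derived;
    }

    public static void main(String[] args)
    {
        System.out.println(derived(List.of("cos(x)", "cos(y)")));
        System.out.println(derived(List.of("cos(y)", "cos(x)")));
    }
}
```

Visiting cos(x) first marks cos(y) as derived, and visiting cos(y) first marks cos(x) as derived, so an unordered key set could give different results between runs.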

Can you add a test that would fail (sporadically) due to this?

I was thinking about something like:

    @Test(invocationCount = 100)
    public void testDerivedExpressionDeterminism()
    {
        EqualityInference inference = EqualityInference.newInstance(
                metadata,
                equals(nameReference("a"), add("b", "z")),
                equals(nameReference("b"), nameReference("d")),
                equals(nameReference("a"), add("d", "z")));
        // generated equalities should be deterministic
        assertEquals(
                inference.generateEqualitiesPartitionedBy(symbols("a", "b", "z", "d")).getScopeEqualities(),
                ImmutableList.of(equals(nameReference("a"), add("b", "z")), equals(nameReference("b"), nameReference("d"))));
    }

but generateEqualitiesPartitionedBy doesn't filter derived expressions when rewriting, so it kind of works.

It might work deterministically with the current code, but I'm not 100% sure. Hence I would land it anyway.

Use deterministic map in equality inference

a79fd68

sopel39 requested a review from findepi March 8, 2023 16:02

cla-bot bot added the cla-signed label Mar 8, 2023

martint reviewed Mar 8, 2023

martint approved these changes Mar 8, 2023

sopel39 merged commit f113d8c into trinodb:master Mar 9, 2023

sopel39 deleted the ks/determ branch March 9, 2023 09:43

github-actions bot added this to the 411 milestone Mar 9, 2023

colebow mentioned this pull request Mar 14, 2023

Add Trino 411 release notes #16552

Merged

colebow added the no-release-notes label Mar 14, 2023
