Use deterministic map in equality inference #16440

Merged 1 commit on Mar 9, 2023

@sopel39 sopel39 commented Mar 8, 2023

No description provided.

@sopel39 sopel39 requested a review from findepi March 8, 2023 16:02
@cla-bot cla-bot bot added the cla-signed label Mar 8, 2023
@@ -211,7 +212,7 @@ public static EqualityInference newInstance(Metadata metadata, Collection<Expres
         Collection<Set<Expression>> equivalentClasses = equalities.getEquivalentClasses();

         // Map every expression to the set of equivalent expressions
-        Map<Expression, Set<Expression>> byExpression = new HashMap<>();
+        Map<Expression, Set<Expression>> byExpression = new LinkedHashMap<>();
Member

Why?

Member Author

@sopel39 sopel39 Mar 8, 2023

We don't want plan generation to be non-deterministic. The other collections in this class already have a deterministic iteration order.
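
To illustrate the difference between the two map types, here is a minimal sketch. The `Key` class is hypothetical and deliberately does not override `hashCode()`, so `HashMap` falls back to identity hashes. Whether the hash codes of Trino's `Expression` actually vary between runs is not established in this thread, so treat this as a general Java illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IterationOrderSketch
{
    // Hypothetical key type: it does not override hashCode(), so it uses the
    // identity hash, which can differ from one JVM run to the next.
    static final class Key
    {
        final String name;

        Key(String name)
        {
            this.name = name;
        }

        @Override
        public String toString()
        {
            return name;
        }
    }

    // Inserts keys a..e into the given map and returns the order in which
    // keySet() yields them.
    static List<String> keyOrder(Map<Key, Integer> map)
    {
        for (String name : List.of("a", "b", "c", "d", "e")) {
            map.put(new Key(name), name.length());
        }
        List<String> order = new ArrayList<>();
        for (Key key : map.keySet()) {
            order.add(key.name);
        }
        return order;
    }

    public static void main(String[] args)
    {
        // LinkedHashMap iterates in insertion order: [a, b, c, d, e]
        System.out.println(keyOrder(new LinkedHashMap<>()));
        // HashMap iterates in bucket order, which depends on the hash codes
        // and is not guaranteed to match insertion order
        System.out.println(keyOrder(new HashMap<>()));
    }
}
```

The `LinkedHashMap` line prints the same order on every run. The `HashMap` line carries no such guarantee, which is the property the change relies on.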

Member

I guess my question is why would this cause the plan to be non-deterministic? Where does the iteration order matter in this method?

As far as I can tell, none of the plan tests are flaky because of this.

Member Author

The map's keys are enumerated further down in the same method:

        // For every non-derived expression, extract the sub-expressions and see if they can be rewritten as other expressions. If so,
        // use this new information to update the known equalities.
        Set<Expression> derivedExpressions = new LinkedHashSet<>();
        for (Expression expression : byExpression.keySet()) {

If you have cos(x)=0, x=y, cos(y)=0, then the visit order might determine which expressions end up in derivedExpressions.
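
The cos(x) / cos(y) case can be sketched with a toy model. This is not Trino's implementation: expressions are plain strings, the only known sub-expression equivalence is x = y, and the `derived` helper is a hypothetical stand-in for the loop quoted above. It assumes that a rewritten expression is marked as derived and that derived expressions are skipped when visited later.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class VisitOrderSketch
{
    // Toy stand-in for the enumeration over byExpression.keySet(): visits
    // expressions in the given order and collects the derived ones.
    static Set<String> derived(List<String> visitOrder)
    {
        // Assumed equivalence between sub-expressions: x = y
        Map<String, String> equivalent = Map.of("x", "y", "y", "x");
        Set<String> derivedExpressions = new LinkedHashSet<>();
        for (String expression : visitOrder) {
            if (derivedExpressions.contains(expression)) {
                continue; // already derived from an expression visited earlier
            }
            for (Map.Entry<String, String> entry : equivalent.entrySet()) {
                // Rewrite the function argument using the known equivalence
                String rewritten = expression.replace("(" + entry.getKey() + ")", "(" + entry.getValue() + ")");
                if (!rewritten.equals(expression)) {
                    derivedExpressions.add(rewritten);
                }
            }
        }
        return derivedExpressions;
    }

    public static void main(String[] args)
    {
        // cos(x) visited first, so cos(y) is the derived one: [cos(y)]
        System.out.println(derived(List.of("cos(x)", "x", "y", "cos(y)")));
        // cos(y) visited first, so cos(x) is the derived one: [cos(x)]
        System.out.println(derived(List.of("cos(y)", "x", "y", "cos(x)")));
    }
}
```

In this toy model the same set of equalities yields a different derived set depending only on which key is visited first, which is why a stable iteration order over the map keys matters.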

Member

Can you add a test that would fail (sporadically) due to this?

Member Author

I was thinking about something like:

    @Test(invocationCount = 100)
    public void testDerivedExpressionDeterminism()
    {
        EqualityInference inference = EqualityInference.newInstance(
                metadata,
                equals(nameReference("a"), add("b", "z")),
                equals(nameReference("b"), nameReference("d")),
                equals(nameReference("a"), add("d", "z")));
        // generated equalities should be deterministic
        assertEquals(
                inference.generateEqualitiesPartitionedBy(symbols("a", "b", "z", "d")).getScopeEqualities(),
                ImmutableList.of(equals(nameReference("a"), add("b", "z")), equals(nameReference("b"), nameReference("d"))));
    }

but generateEqualitiesPartitionedBy doesn't filter derived expressions when rewriting, so it kind of works.

It might work deterministically with the current code, but I'm not 100% sure. Hence I would land it anyway.

@sopel39 sopel39 merged commit f113d8c into trinodb:master Mar 9, 2023
@sopel39 sopel39 deleted the ks/determ branch March 9, 2023 09:43
@github-actions github-actions bot added this to the 411 milestone Mar 9, 2023
@colebow colebow added the no-release-notes This pull request does not require release notes entry label Mar 14, 2023
Labels: cla-signed, no-release-notes (This pull request does not require release notes entry)
3 participants