-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use MarkDistinct only in limited cases #15927
Use MarkDistinct only in limited cases #15927
Conversation
8a1a763
to
e113af3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments. I think the approach is fine (although it might hurt from NDV misestimation, but it's life). Please run benchmarks. I'm curious if we will see some gains
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/planner/OptimizerConfig.java
Outdated
Show resolved
Hide resolved
...main/java/io/trino/sql/planner/iterative/rule/MultipleDistinctAggregationToMarkDistinct.java
Outdated
Show resolved
Hide resolved
...main/java/io/trino/sql/planner/iterative/rule/MultipleDistinctAggregationToMarkDistinct.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/planner/optimizations/TestAddExchangesPlans.java
Show resolved
Hide resolved
} | ||
int numberOfThreadsInACluster = taskCountEstimator.estimateSourceDistributedTaskCount(context.getSession()) * getTaskConcurrency(context.getSession()); | ||
|
||
if (numberOfDistinctValues <= MARK_DISTINCT_MAX_OUTPUT_ROW_COUNT_MULTIPLIER * numberOfThreadsInACluster) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One disadvantage of not using MarkDistinct
is that without MarkDistinct
partial aggregations are not possible. Maybe it's not a problem, but I could imagine it might be if data is already pre-partitioned as MarkDistinct
requires
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you think of a case where this can happen (query)? I could then test it and add this case to the rule.
Can partial aggregation be pushed through MarkDistinct
? if not, the biggest advantage of partial aggregation goes away as the data is shuffled across the network anyway, right?
...main/java/io/trino/sql/planner/iterative/rule/MultipleDistinctAggregationToMarkDistinct.java
Show resolved
Hide resolved
import static io.trino.sql.planner.iterative.rule.test.PlanBuilder.expression; | ||
import static io.trino.sql.planner.plan.AggregationNode.Step.SINGLE; | ||
|
||
public class TestMultipleDistinctAggregationToMarkDistinct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We could also disable MarkDistinct
when number of stages would exceed limit, but this can be another PR and it's not that simple...
e113af3
to
b3bb96d
Compare
tpch/tpcds orc part sf1k benchamrks look good (4 to 5 % CPU + similar fo duration) but this may be variability as queries that don't have distinct aggregation also speed up (e.g. tpch q17, 18, q21 ). |
...main/java/io/trino/sql/planner/iterative/rule/MultipleDistinctAggregationToMarkDistinct.java
Outdated
Show resolved
Hide resolved
...main/java/io/trino/sql/planner/iterative/rule/MultipleDistinctAggregationToMarkDistinct.java
Show resolved
Hide resolved
The network didn't go down so I'm not sure that rule triggered |
@sopel39 I checked, there are no |
b3bb96d
to
2f8f3dc
Compare
Following the discussion, I changed the approach to |
2f8f3dc
to
2afa052
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
core/trino-main/src/main/java/io/trino/sql/planner/OptimizerConfig.java
Outdated
Show resolved
Hide resolved
...main/java/io/trino/sql/planner/iterative/rule/MultipleDistinctAggregationToMarkDistinct.java
Outdated
Show resolved
Hide resolved
2afa052
to
bd83187
Compare
bd83187
to
021a4a0
Compare
MarkDistinct is beneficial only when there is a limited parallelism in query (e.g. global distinct aggregation is handled by a single thread). For cases where there is enough parallelism, MarkDistinct hurts performance due to excessive data shuffle and introducing multiple stages in case of multiple distinct aggregations.
021a4a0
to
e245caf
Compare
|
Description
MarkDistinct is beneficial only when there is
a limited parallelism in query (e.g. global distinct aggregation is handled by a single thread).
For cases where there is enough parallelism, MarkDistinct hurts performance due to excessive data shuffle
and introducing multiple stages in case of
multiple distinct aggregations.
This PR also changes the type of
use_mark_distinct
session property andoptimizer.use-mark-distinct
config property fromBoolean
toVarchar
. The new property still acceptstrue
andfalse
values to keep backwards compatibility but those have to be specified as Varchars (e.g. single quotes in the cli)Additional context and related issues
To test different possible approaches I used queries that use different count of distinct values in a group by and 1 or 2 distinct aggregations + one non distinct aggregation.
I used three different cardinalities:
small: 502 unique values
medium: 10K unique values
high 240M unique values
The tests show that for small NDV mark distinct works better and for high NDV distinct aggregation is better.
In the medium case,
optimize_mixed_distinct_aggregations
is better for the duration at a significant CPU cost.Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(X) Release notes are required, with the following suggested text: