Improve the performance of Min/Max/Count utilizing Iceberg table metrics #10974
Comments
This shouldn't be implemented as an optimizer rule. Instead, the Iceberg connector should support
See also #10964 which could benefit from a similar concept.
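For readers unfamiliar with what connector-level support would mean in code, here is a minimal sketch, assuming Trino's ConnectorMetadata.applyAggregation SPI hook (the class name is hypothetical, the signature is reproduced from memory and may differ between SPI versions, and the real work is only described in comments):

```java
import io.trino.spi.connector.AggregateFunction;
import io.trino.spi.connector.AggregationApplicationResult;
import io.trino.spi.connector.ColumnHandle;
import io.trino.spi.connector.ConnectorMetadata;
import io.trino.spi.connector.ConnectorSession;
import io.trino.spi.connector.ConnectorTableHandle;

import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of where metrics-based aggregation pushdown would live.
// All other ConnectorMetadata methods are omitted; a real connector implements them.
public class MetricsAwareMetadata implements ConnectorMetadata
{
    @Override
    public Optional<AggregationApplicationResult<ConnectorTableHandle>> applyAggregation(
            ConnectorSession session,
            ConnectorTableHandle handle,
            List<AggregateFunction> aggregates,
            Map<String, ColumnHandle> assignments,
            List<List<ColumnHandle>> groupingSets)
    {
        // A real implementation would:
        //  1. accept only a global aggregation (no GROUP BY sets),
        //  2. verify every requested aggregate is count/min/max over a column whose
        //     table metrics are complete and exact,
        //  3. return an AggregationApplicationResult that answers the aggregates from
        //     metadata instead of scanning data files.
        // Returning empty makes Trino fall back to the normal scan-and-aggregate plan.
        return Optional.empty();
    }
}
```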
@findepi Thanks for your reply. Yes, indeed, what you mentioned is the key point; I couldn't agree more, and that's why I said there are two possible ways to do the optimization, I thought:
But the two approaches are both based on the assumption: Considering the implementation complexity, we simply implemented the easier approach. After all, the rule approach is the cheapest, while the mix-mode approach is the ideal one, but it has many more nuts to crack. :)
Oh, you mean base the logic on per #18, the API to use for aggregation pushdown is the
Yep, this optimization is only for queries, based on
Thanks for the tips, but I think the mix-mode implementation involves far more than this one interface. So you guys prefer the mix-mode implementation? :)
@findepi can you please share any update on this? Are we planning to use the
I would probably start with an implementation for
These must not be different, otherwise predicate pushdown would be totally wrong. But min/max can be inaccurate (varchars truncated, timestamps rounded).
When some files have a row count in the manifest but some do not, microplans could be useful -- #13534
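To make that distinction concrete, here is a small self-contained illustration (plain Java with made-up values, not Iceberg code): a truncated upper bound is still a valid bound for pruning, but returning it as the result of max() would be wrong.

```java
public class TruncatedBoundsExample
{
    public static void main(String[] args)
    {
        // Hypothetical: the real maximum value stored in a data file.
        String actualMax = "zebra-stripes";

        // The manifest may only keep a truncated, widened bound (still >= every value).
        String upperBound = "zebra-t";

        // Safe for predicate pushdown: a filter x > 'zz' can skip the file,
        // because the upper bound proves no value exceeds "zebra-t".
        System.out.println("can prune x > 'zz': " + ("zz".compareTo(upperBound) > 0));

        // Not safe as an aggregation result: max(x) is "zebra-stripes", not "zebra-t".
        System.out.println("bound equals real max: " + upperBound.equals(actualMax));
    }
}
```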
IIRC in Delta we had to skip using min/max values for pushdown of Double types in certain situations.
You mean this? trino/plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java, lines 2543 to 2584 (at f2b557f)
it's not strictly because of ordering (as in
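As general background on why floating-point bounds are delicate (a hedged aside in plain Java, not the Delta-specific check referenced above): NaN does not order under the ordinary comparison operators, so min/max bounds computed naively can silently omit it.

```java
public class NanBoundsExample
{
    public static void main(String[] args)
    {
        double nan = Double.NaN;

        // NaN is neither less than, greater than, nor equal to anything
        // under the ordinary comparison operators...
        System.out.println(nan < 5.0);   // false
        System.out.println(nan > 5.0);   // false
        System.out.println(nan == nan);  // false

        // ...yet Double.compare sorts NaN above positive infinity.
        System.out.println(Double.compare(nan, Double.POSITIVE_INFINITY) > 0); // true

        // So a writer computing bounds with "<" / ">" can drop NaN from the recorded
        // min/max, making max(col) answered from those bounds wrong when NaN rows exist.
    }
}
```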
@findepi @alexjo2144 even if the aggregation like So, shall we start with supporting aggregate pushdown for Iceberg and handle the case for
@alexjo2144 I have started working on a PR for the
Hi. I am new to Iceberg and I was also thinking along similar lines, though from a different perspective.
Quick update: |
@ahshahid thanks
@osscm Well, what I meant was that if this PR is able to provide max/min support at the Iceberg level using the stats (at least in simple scenarios), then it may be possible to leverage it to make Spark's Dynamic Partition Pruning (DPP) mechanism work for non-partition columns too. Right now, when I tried to make use of DPP for a non-partition column (by modifying Iceberg code), the performance degraded because the cost of the DPP query is too high. But if the max/min gets evaluated using the stats of manifest files, then the cost of the DPP query for non-partition columns could possibly be brought down.
@ahshahid I'm afraid not, I think you are talking about the output of Right now the stats output looks like this (after running
Started an issue to target
Anyway, I will post a rough implementation from my branch soon, for
Just FYI,
I have started a PR for count agg pushdown.
Regards,
Manish
On Fri, Feb 17, 2023 at 2:09 AM fengguangyuan wrote:
Quick update: Refining some of the work locally, then will start the WIP, and will continue to add test cases etc.
Thanks for your first sub-task of this BIG issue.
Anyway, I will post a rough implementation from my branch soon, for min/max/count mixed queries, and hope it helps us to work through the possibilities and impossibilities of this idea.
:)
Thanks, that's great. It's good to know that you have been working on
@osscm What is the status of Iceberg aggregate pushdown? Are you still working on count pushdown? Are other aggregate functions like min/max also being worked on by someone?
OK, got it. I am aware of that issue. Thanks.
FWIW #19303 should improve performance for
Purpose

This issue aims to add a basic optimizer rule for min/max/count queries on connectors that have accurate table/partition/column statistics, like Iceberg tables composed of Orc/Parquet files.

Reason

Nowadays, most storage engines and self-describing file formats store table-level/partition-level/column-level statistics to make data retrieval more effective, e.g. Iceberg and Hive.

Iceberg currently supports Orc/Parquet files, whose table metrics are aggregated from each data file, so its table metrics are trustworthy for calculating min(T)/max(T)/count(T)/count(*), no matter whether the stored data was written by Trino or Spark. Hence, for queries containing only min/max/count aggregations, we can construct the results directly from metadata.

For example, for the query select count(x) from test, if column x has precomputed statistics of 2 total rows, 0 null values and a [0, 9] range, the query could be rewritten to select 2, where 2 is the total row count minus the null count.
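To spell out that arithmetic, here is a self-contained sketch (the FileMetrics record and the second file's numbers are made up for illustration; this is not Iceberg's API): count(x) and count(*) come from row and null counts, while min(x)/max(x) are only valid when the stored bounds are exact.

```java
import java.util.List;

public class MetricsOnlyAggregation
{
    // Hypothetical per-file metrics, mirroring what a manifest tracks for one column:
    // row count, null count, and lower/upper bounds.
    record FileMetrics(long recordCount, long nullCount, long lowerBound, long upperBound) {}

    public static void main(String[] args)
    {
        List<FileMetrics> files = List.of(
                new FileMetrics(2, 0, 0, 9),    // the file from the example above
                new FileMetrics(5, 1, 3, 42));  // a second, made-up file

        // count(x): non-null rows per file, summed across files.
        long countX = files.stream().mapToLong(f -> f.recordCount() - f.nullCount()).sum();

        // count(*): total rows, nulls included.
        long countStar = files.stream().mapToLong(FileMetrics::recordCount).sum();

        // min(x)/max(x): valid only if every file's bounds are exact (not truncated or rounded).
        long minX = files.stream().mapToLong(FileMetrics::lowerBound).min().orElseThrow();
        long maxX = files.stream().mapToLong(FileMetrics::upperBound).max().orElseThrow();

        System.out.printf("count(x)=%d count(*)=%d min(x)=%d max(x)=%d%n",
                countX, countStar, minX, maxX);
    }
}
```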
Conclusion

Trino should supply an optimizer rule that rewrites such queries from metadata, doing the same kind of thing as hive-2847. Obviously, this rule only applies to simple queries without complex syntax such as group by, distinct or join.

We now have a basic implementation of this and have tested it on the Iceberg connector (rather than on all connectors, since other connectors' statistics may be inaccurate). If Trino wants this improvement, please let me know; it would be my pleasure to open a PR.