Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for time_floor function #8814

Open
jerryleooo opened this issue Aug 6, 2021 · 3 comments
Open

Support for time_floor function #8814

jerryleooo opened this issue Aug 6, 2021 · 3 comments
Labels

Comments

@jerryleooo
Copy link
Member

Druid has several query types and one of them is timeseries, in the timeseries query we need to write GROUP BY FLOOR(__time TO <unit>) or TIME_FLOOR(__time, period) to set the time granularity of aggregation.

I haven't found a good way to do such queries via Trino. #7994 seems helpful, but even it works, we may still need a similar concept/syntax in Trino so the SQL will not have a parsing error. So I am not sure which term would best describe such concept in Trino, truncation, interal, duration, or, we can have a new function just named time_floor?

@jerryleooo
Copy link
Member Author

After some research, I feel this belongs to the aggregation pushdown scope, currently the GROUP BY is represented with groupingSets, while currently in

Optional<AggregationApplicationResult<TableHandle>> applyAggregation(

They are represented with ColumnHandle, so do we have a plan to extend this?

@jerryleooo
Copy link
Member Author

jerryleooo commented Aug 12, 2021

It looks like the truncation is the correct concept, then a similar SQL would be:

select sum(view_cnt), date_trunc('day', __time) from page_view_table group by date_trunc('day', __time);

If so, then the corresponding optimization rule would be PushProjectionIntoTableScan?

@willzgw
Copy link
Contributor

willzgw commented Aug 12, 2021

It looks like the truncation is the correct concept, then a similar SQL would be:

select sum(view_cnt), date_trunc('day', __time) from page_view_table group by date_trunc('day', __time);

If so, then the corresponding optimization rule would be PushProjectionIntoTableScan?

According to these:
https://trinodb.slack.com/archives/CP1MUNEUX/p1615285437212600
https://stackoverflow.com/questions/47066024/how-to-group-time-column-into-5-second-intervals-and-count-rows-using-presto/47741138#47741138
It is a functionally feasible solution.

@findepi findepi added enhancement New feature or request syntax-needs-review labels Aug 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants