lag()/at()/lead() should return the offset'th row, independent of the window frame bound
#1554
The problem might be complex. I just looked at the output for this SQL: select
`row_id` as row_id_1,
`row_id` as t1_row_id_original_0,
`val1` as val1,
lag(`val1`, 0) over t1_group1_ts_1s_5s_10 as t1_val1_window_count_2
from `t1` WINDOW
t1_group1_ts_1s_5s_10 as (partition by `group1` order by `ts` rows_range between 5s preceding and 1s preceding MAXSIZE 10); is:
here
SparkSQL does not support lag over a window frame, so there is no reference behavior from Spark. But for getting the first or nth row in a window, there are other references:
previous definition for |
The documentation does not seem correct or accurate: OpenMLDB/hybridse/src/udf/default_defs/window_functions_def.cc, lines 97 to 115 in 06ccd67
The YAML case is not correct either. The actual result should be:
@jingchen2222 Can you help take a look, and perhaps give some suggestions on where we should look into this issue?
@aceforeverd You're right. lag should return the value of the expression at the row n offsets before the current row. The current implementation of the I think it would be a tough job.
select id, ts, `group`, lag(val, 1) over (partition by `group` order by ts rows between 5 preceding and current row) as l from t1;
Secondly, fix the
select id, ts, `group`, lag(val, 1) over (partition by `group` order by ts rows between 5 preceding and 2 preceding) as l from t1;
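To make the intended semantics concrete, here is a minimal Python sketch. It is not OpenMLDB code: the helper name `lag_over_partition` and the sample rows are hypothetical, chosen only to mirror the `id`, `ts`, `group`, `val` columns in the queries above. It looks up the row `offset` positions before the current row in partition order and never consults a window frame.

```python
from collections import defaultdict

def lag_over_partition(rows, offset, key="group", order="ts", value="val"):
    """Map each row id to the value found `offset` rows earlier in its partition.

    Hypothetical helper for illustration only. The lookup depends on the
    partition and the ordering, never on a window frame.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)

    result = {}
    for part in partitions.values():
        ordered = sorted(part, key=lambda r: r[order])
        for i, row in enumerate(ordered):
            j = i - offset
            # Outside the partition there is no such row, so the result is NULL (None).
            result[row["id"]] = ordered[j][value] if 0 <= j < len(ordered) else None
    return result

# Made-up sample data, two partitions.
rows = [
    {"id": 1, "ts": 10, "group": "a", "val": 100},
    {"id": 2, "ts": 20, "group": "a", "val": 200},
    {"id": 3, "ts": 30, "group": "a", "val": 300},
    {"id": 4, "ts": 15, "group": "b", "val": 7},
    {"id": 5, "ts": 25, "group": "b", "val": 8},
]

lag1 = lag_over_partition(rows, 1)
lag0 = lag_over_partition(rows, 0)
```

Under this reading, both queries above would produce the same `l` column, because the two frames (`5 preceding and current row` versus `5 preceding and 2 preceding`) play no part in the lookup.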
Things get worse if we try to deal with an expression like the following:
@jingchen2222 Thank you very much. @aceforeverd Can you first help evaluate the workload? I will call a meeting to discuss the solution.
sure |
Thank you ~. Since the problem is the It is somewhat misleading from the SQL point of view as
The first thing I'd like to do is in the Planner, when transforming from Meanwhile, the SQL engine should support the syntax without a window frame, or even without partition by, e.g.
@jingchen2222 I can't understand the third point though
|
…rent row - logic plan: for lag/at project, it will create a new `ProjectListNode` with window frame bound to [unbound, current row] - the fix may not work in batch-request or cluster environment
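The idea in this commit message can be sketched in Python. This is an illustration under a stated assumption, not the engine's code: it assumes a frame-relative lookup that counts back from the last row of the frame, and the helper names `frame_rows` and `at_in_frame` are hypothetical. With that assumption, a frame ending before the current row shifts the result, while a frame of [unbound, current row] makes the same lookup behave like lag.

```python
def frame_rows(part, i, start_preceding, end_preceding):
    """Rows of `part` inside a ROWS frame relative to the current index `i`.

    start_preceding=None stands for an unbounded start.
    Hypothetical helper for illustration only.
    """
    lo = 0 if start_preceding is None else max(0, i - start_preceding)
    hi = i - end_preceding  # inclusive index of the last row in the frame
    return part[lo:hi + 1] if hi >= lo else []

def at_in_frame(frame, offset):
    """Assumed frame-relative lookup: count `offset` rows back from the frame's last row."""
    idx = len(frame) - 1 - offset
    return frame[idx] if 0 <= idx < len(frame) else None

vals = [10, 20, 30, 40, 50, 60]  # one partition, already ordered by ts
i = 5                            # current row holds 60

# Frame "5 preceding and 2 preceding": the frame ends two rows before the current row.
narrow = at_in_frame(frame_rows(vals, i, 5, 2), 1)

# Frame rewritten to [unbound, current row]: the frame ends at the current row.
rewritten = at_in_frame(frame_rows(vals, i, None, 0), 1)
```

Here `narrow` is 30 while `rewritten` is 50, and 50 is the row directly before the current one, which is what `lag(val, 1)` is expected to return.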
Giving all lag/at functions the same window frame, UNBOUND AND CURRENT ROW, might introduce another problem.
Yeah. We do not seem to have a restriction on the type of
Yeah, it is runnable, e.g.
…rent row - logic plan: for lag/at project, it will create a new `ProjectListNode` with window frame bound to [unbound, current row] - the fix may not work in batch-request or cluster environment
it will create a new rows window for lag like functions
* add a few debug logs for sql parser
* fix(#1554): lag results always evaluated with respect to current row
Bug Description
Expected Behavior
The case should pass. Currently it does not:
Work List
lag functions
at/lag/first_value udf function over merged window report different result compared to those not merged (#1587)