Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement IGNORE NULLS for window functions #9055

Closed
6 tasks done
comphead opened this issue Jan 29, 2024 · 8 comments
Closed
6 tasks done

Implement IGNORE NULLS for window functions #9055

comphead opened this issue Jan 29, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@comphead
Copy link
Contributor

comphead commented Jan 29, 2024

Is your feature request related to a problem or challenge?

Now DataFusion lacks the IGNORE NULLS support for Window functions

DataFusion CLI v35.0.0
❯ select lag(a) ignore nulls over (order by a) from (select 'a' a union all select 'b' a);
This feature is not implemented: Null treatment in aggregate functions is not supported: IGNORE NULLS

Describe the solution you'd like

Add IGNORE NULLS support for:

Describe alternatives you've considered

No response

Additional context

No response

@comphead comphead added the enhancement New feature or request label Jan 29, 2024
@comphead
Copy link
Contributor Author

@sunchao @viirya cc

@comphead
Copy link
Contributor Author

@alamb are we okay to support such syntax in DF?

select first_value(a) ignore nulls over (partition by id order by id) from (select 1 id, 'a' a union all select 1 id, 'b' a union all select 1 id, null); 

Spark/Oracle supports it fully
PG doesn't support such syntax at all.
DuckDB supports slightly different syntax

select first_value(a ignore nulls) over (partition by id order by id) from (select 1 id, 'a' a union all select 1 id, 'b' a union all select 1 id, null); 

@viirya
Copy link
Member

viirya commented Jan 30, 2024

Note that looks like Spark syntax follows SQL 2016 standard:

https://github.com/ronsavage/SQL/blob/19215adc8639d031a44acad7873c209444b71f1f/sql-2016.ebnf#L1397

lead_or_lag_function ::=
  lead_or_lag left_paren lead_or_lag_extent
      ( comma offset ( comma default_expression )? )? right_paren
      null_treatment?
...
null_treatment ::=
  RESPECT NULLS | IGNORE NULLS

I don't find this syntax in SQL 2003.

@comphead
Copy link
Contributor Author

Thanks @viirya
Looks like currently the DF is able to parse the SQL 2016 standard syntax for this specific case, so it is probably worth to try implementing IGNORE NULLS support

@mustafasrepo cc
if you have any opinions on that?

@comphead
Copy link
Contributor Author

comphead commented Feb 2, 2024

It seems there is no defined opinion yet, I'll create a PR and we can discuss in details

@alamb
Copy link
Contributor

alamb commented Feb 2, 2024

It makes sense to me -- maybe @ozankabak or @mustafasrepo have some thoughts in this area as well

@mustafasrepo
Copy link
Contributor

mustafasrepo commented Feb 5, 2024

Thanks @viirya Looks like currently the DF is able to parse the SQL 2016 standard syntax for this specific case, so it is probably worth to try implementing IGNORE NULLS support

@mustafasrepo cc if you have any opinions on that?

I would be nice to have this feature given that we have parsing support already, and it is a standart feature. Thanks @comphead for bringing this up.

@comphead
Copy link
Contributor Author

Nice, everything is done, so closing it.
thanks @mustafasrepo @huaxingao @viirya @alamb @ozankabak

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants