Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

window functions with NULL literals in partition by and order by do not work: Internal("Sort operation is not applicable to scalar value NULL") #1194

Closed
Tracked by #1184
alamb opened this issue Oct 28, 2021 · 4 comments
Labels
bug Something isn't working datafusion Changes in the datafusion crate

Comments

@alamb
Copy link
Contributor

alamb commented Oct 28, 2021

Reproducer:

echo "1" > /tmp/foo.csv
cargo run -p datafusion-cli
CREATE EXTERNAL TABLE foo(x int)
STORED AS CSV
LOCATION '/tmp/foo.csv';
> select percent_rank() OVER (partition by x) from foo;
+----------------+
| PERCENT_RANK() |
+----------------+
| 0              |
+----------------+
1 row in set. Query took 0.013 seconds.
> select percent_rank() OVER (partition by null) from foo;
ArrowError(ExternalError(ArrowError(ExternalError(Internal("Sort operation is not applicable to scalar value NULL")))))


> select percent_rank() OVER (partition by x order by null) from foo;
ArrowError(ExternalError(ArrowError(ExternalError(Internal("Sort operation is not applicable to scalar value NULL")))))

The correct output is null

See #1184 for more details

@alamb alamb changed the title window functions with NULL literals in partition by and order by do not work window functions with NULL literals in partition by and order by do not work: Internal("Sort operation is not applicable to scalar value NULL") Oct 28, 2021
@alamb alamb added datafusion Changes in the datafusion crate bug Something isn't working labels Oct 28, 2021
@alamb
Copy link
Contributor Author

alamb commented Oct 28, 2021

#1195 may be the same underlying cause

@xudong963
Copy link
Member

#1195 may be the same underlying cause

@alamb Nice! It's the same cause.

> CREATE EXTERNAL TABLE foo(x int)
STORED AS CSV
LOCATION '/tmp/foo.csv';
0 rows in set. Query took 0.002 seconds.
> select percent_rank() OVER (partition by x) from foo;
+----------------+
| PERCENT_RANK() |
+----------------+
| 0              |
+----------------+
1 row in set. Query took 0.019 seconds.
> select percent_rank() OVER (partition by null) from foo;
+----------------+
| PERCENT_RANK() |
+----------------+
| 0              |
+----------------+
1 row in set. Query took 0.014 seconds.
> select percent_rank() OVER (partition by x order by null) from foo;
+----------------+
| PERCENT_RANK() |
+----------------+
| 0              |
+----------------+
1 row in set. Query took 0.014 seconds.
>

@xudong963
Copy link
Member

Maybe the issue can close too.

@alamb
Copy link
Contributor Author

alamb commented Oct 29, 2021

As @xudong963 mentions, closed in #1197

@alamb alamb closed this as completed Oct 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working datafusion Changes in the datafusion crate
Projects
None yet
Development

No branches or pull requests

2 participants