Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SELECT on column with uppercase column name fails with FieldNotFound error #2978

Closed
mcassels opened this issue Jul 27, 2022 · 5 comments
Closed
Labels
bug Something isn't working

Comments

@mcassels
Copy link
Contributor

Describe the bug

select UPPERCASE_NAME from t; fails with a FieldNotFound SchemaError like this:

SchemaError(FieldNotFound { qualifier: None, name: "uppercase_name", valid_fields: Some(["t.UPPERCASE_NAME"]) })

The name in the FieldNotFound error is the lower-case version of the selected column name.

To Reproduce

Using datafusion-cli on the attached test parquet file:

❯ CREATE EXTERNAL TABLE t STORED AS PARQUET LOCATION 'test.parquet';
0 rows in set. Query took 0.004 seconds.
❯ select * from t;
+----------------+--------------------+
| UPPERCASE_NAME | not_uppercase_name |
+----------------+--------------------+
| 3              | 3                  |
| 2              | 2                  |
| 1              | 1                  |
| 0              | 0                  |
+----------------+--------------------+
4 rows in set. Query took 0.008 seconds.
❯ select not_uppercase_name from t;
+--------------------+
| not_uppercase_name |
+--------------------+
| 3                  |
| 2                  |
| 1                  |
| 0                  |
+--------------------+
4 rows in set. Query took 0.007 seconds.
❯ select UPPERCASE_NAME from t;
SchemaError(FieldNotFound { qualifier: None, name: "uppercase_name", valid_fields: Some(["t.UPPERCASE_NAME", "t.not_uppercase_name"]) })
❯ 

Expected behavior

Expected that SELECTs on columns with uppercase names would behave the same as SELECTs on columns with lowercase names.

Additional context

Test parquet to reproduce the issue:
test.parquet.zip

@mcassels mcassels added the bug Something isn't working label Jul 27, 2022
@alamb
Copy link
Contributor

alamb commented Jul 27, 2022

I think you may need to use " to surround the identifier. Does this work?

select "UPPERCASE_NAME" from t;

@mcassels
Copy link
Contributor Author

Thanks @alamb ! That does work and it looks like this behaviour is consistent with postgres as well. I'm closing this issue.

@alamb
Copy link
Contributor

alamb commented Jul 27, 2022

Thanks @alamb ! That does work and it looks like this behaviour is consistent with postgres as well. I'm closing this issue.

Yeah, it is part of the SQL standard which is somewhat archaic 🤷

@ray-andrew
Copy link

I have this problem as well. It's works to use this "fileld_name" as a quoted identifier. Is there a plan to support backtick (`) for "Quoted identifier"?

@alamb
Copy link
Contributor

alamb commented Sep 8, 2022

Is there a plan to support backtick (`) for "Quoted identifier"?

I don't know of any plans @ray-andrew -- I think the backtick is a mysql syntax where the postgres dialect, that DataFusion follows, uses "

Here is mysql:

mysql> select * from bar;
+------+
| x    |
+------+
|    1 |
| NULL |
+------+
2 rows in set (0.01 sec)

mysql> select `x` from bar;
+------+
| x    |
+------+
|    1 |
| NULL |
+------+
2 rows in set (0.00 sec)

mysql> select "x" from bar;
+---+
| x |
+---+
| x |
| x |
+---+
2 rows in set (0.00 sec)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants