Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary and brittle tests #16765

Merged
merged 1 commit into from
Mar 28, 2023
Merged

Conversation

martint
Copy link
Member

@martint martint commented Mar 28, 2023

Column names coming out of the query are not necessarily related to the column names in the table function. These tests are testing behavior that is not necessarily expected or guaranteed, so they are brittle and can break at any time.

Release notes

(x) This is not user-visible or docs only and no release notes are required.

@cla-bot cla-bot bot added the cla-signed label Mar 28, 2023
@kasiafi
Copy link
Member

kasiafi commented Mar 28, 2023

It seems that the method hasColumnNames() is only used in tests for table functions. Are you going to remove it?

@hashhar
Copy link
Member

hashhar commented Mar 28, 2023

Column names coming out of the query are not necessarily related to the column names in the table function

It might not be true for all table functions but if it doesn't remain true for query passthrough functions then the usability of those will become questionable since the expectation is that if you select some columns within the PTF the output includes columns of the same name so that you can access them in the outer query.

Or do I misunderstand your statement?

@kasiafi
Copy link
Member

kasiafi commented Mar 28, 2023

Column names coming out of the query are not necessarily related to the column names in the table function

@hashhar it is not about table functions but about usefulness of testing table function's output names by inspecting the query output. There are two reasons why we shouldn't do it I can think of:

  1. we don't follow ansi identifier semantics. The column names might change between the output of the TF, and the query output
  2. at the query output all columns have names. Within the query they might not. A TF can produce an anonymous column, but the test will see "_col0".

@hashhar
Copy link
Member

hashhar commented Mar 28, 2023

at the query output all columns have names. Within the query they might not. A TF can produce an anonymous column, but the test will see "_col0".

Good point. I hadn't considered this case.

Also the first point you mention already happens for example for H2 where column names differ in case.

Thanks for explaining the issues.

@hashhar
Copy link
Member

hashhar commented Mar 28, 2023

@martint do you think it'd be good to include the two possible cases as examples in commit message. I'm sure others would wonder as well when looking at commit in future.

@github-actions github-actions bot added the bigquery BigQuery connector label Mar 28, 2023
Column names coming out of the query are not necessarily
related to the column names in the table function. These
tests are testing behavior that is not necessarily expected
or guaranteed, so they are brittle and can break at any time.

A couple of reasons why it's problematic:
* Trino doesn't (yet) follow standard SQL identifier semantics. The
  column names might change between the output of the table function
  and the query output
* At the query output all columns have names. Within the query they
  might not. A table function can produce an anonymous column, but
  the test will see "_col0".
@martint martint merged commit 4d34ed9 into trinodb:master Mar 28, 2023
@martint martint deleted the remove-tests branch March 28, 2023 23:09
@github-actions github-actions bot added this to the 411 milestone Mar 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bigquery BigQuery connector cla-signed
Development

Successfully merging this pull request may close these issues.

3 participants