-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix tests for underestimating size of partitioned Glue hive tables #17782
Conversation
@raunaqmorarka please run this with secrets |
LGTM, thanks @Dith3r |
What was broken? I.e. what problem is this fix for? |
/test-with-secrets sha=8b8c7988a356de42e8250c3258b61a661d236ea4 |
// When the table has partitions, but row count statistics are set to zero, we treat this case as empty | ||
// statistics to avoid underestimation in the CBO. This scenario may be caused when other engines are | ||
// used to ingest data into partitioned hive tables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test override is in Glue tests, but comment wording doesn't sound Glue-specific.
Please explain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR which introduced this regression was about altering hive metastore response for partitioned tables when metastore returned zero row count and is consistent across all impacted implementations (excluding in memory, file). Provided link to PR in the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see similar overrides in other tests added in #17677.
shouldn't we rather change default test behavior to be what-we-want-most, and have the overrides for metastores that lag behind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per offline discussion. override is needed only for those (broken) metastores that incorrectly return rowCount=0 when table/partition has not been analyzed.
It is fix for https://github.com/trinodb/trino/actions/runs/5191815702/jobs/9360151132 which was introduced by 6028dec. Test for Glue were not triggered by PR. |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/5200127902 |
/test-with-secrets sha=692cad81648309aa5b07ca064d8be5319679dd65 |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/5203972811 |
Description
Fix tests for underestimating size of partitioned Glue hive tables introduced by #17677.
Release notes
(X) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: