Prevent Hive from returning table handles for Hudi tables #16605

Conversation

homar
Member

@homar homar commented Mar 17, 2023

Description

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Mar 17, 2023
@github-actions github-actions bot added hive Hive connector tests:hive labels Mar 17, 2023
@findepi
Member

findepi commented Mar 17, 2023

/test-with-secrets sha=3fbb7a66c17124359eb5bb392ac84066372a445f

@homar
Member Author

homar commented Mar 17, 2023

@findepi I guess we can merge, can't we?

@findepi findepi merged commit 3b16921 into trinodb:master Mar 17, 2023
@github-actions github-actions bot added this to the 411 milestone Mar 17, 2023
@colebow
Member

colebow commented Mar 29, 2023

Does this need release notes? cc @findepi

@findepi
Member

findepi commented Mar 31, 2023

@colebow nope

@rahil-c

rahil-c commented May 9, 2023

@homar @findepi

Hi all, could you share some context for this change? I want to confirm my understanding: if customers/users want to use Trino with Hudi via the trino-hive connector, will this change break them in newer releases, leaving the trino-hudi connector as their only option?

cc @codope

@findepi
Member

findepi commented May 10, 2023

I don't think the trino-hive connector can be used to read Hudi tables. For instance, I believe it lacks the necessary libraries on its classpath.

@codope
Contributor

codope commented May 10, 2023

@findepi Hudi tables are queryable using the Hive connector, provided users have the Hudi presto/trino bundle under the Hive plugin directory. This is done through the custom input format integration, which has been there since before Trino was forked off Presto.
This patch will be a breaking change for those who use the Hive connector to query Hudi tables. Is it possible to revert this change?

@vinothchandar

Landing this without the necessary migration path to the hudi-connector will cause pain and break production queries for users out there. @bvaradar this is going to break production at Robinhood.

We are still waiting on and working through PR reviews there to bring the hudi-connector to parity.

@vinothchandar

cc @electrum Can we please be notified about such breaking changes in the future, so we can be part of the conversation?

@vburenin

@homar The trino-hudi connector is not efficient enough: it can't trim the search scope efficiently using metadata or the metastore, which results in basically a full table scan. We have been using the hive-connector for a long time to read Hudi tables and it just works for us. I had to undo this change on our side to get things back to normal.

Labels
cla-signed hive Hive connector