-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to redirect table reads from Hive to Iceberg #8340
Conversation
4d6f0f9
to
a303baa
Compare
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveMetadata.java
Outdated
Show resolved
Hide resolved
throw new TrinoException(NOT_SUPPORTED, "Hive Connector doesn't support modification operations (write, DDL, comment, statistics collection, set authorization) " + | ||
"on Iceberg tables when redirection is enabled"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean that redirects on modification operations are a bad idea?
then why do we support them at all?
if they are a good idea in general, why would we want to opt out here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean that redirects on modification operations are a bad idea?
then why do we support them at all?
does it look better with Fail modifications for redirected tables in engine
commit? If so, I can put it as the first commit. Note that Hive connector check is still required because of procedures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it look better with
Fail modifications for redirected tables in engine
commit?
I am not very fluent with this code, but if this implements @electrum 's thinking #7606 (comment), it should go in separate PR and to be debated separately. There hopefully is nothing special about redirects in Iceberg or Hive connector to warrant DDL-specific checks in the connector code (except, maybe, procedures -- which ones?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to do it in a separate PR, was trying to gather feedback if exceptions thrown in hive redirection tests feel more natural with this change. and yes, that commit doesn't have anything specific to hive connector.
hive connector checks become sort of "illegal state checks" in non-procedure calls, but wouldn't hurt to still keep proper error message there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but wouldn't hurt to still keep proper error message there.
except that it suggests -- to the future reader -- that we're coding some connector-specific behavior.
(and I do care about future readers quite a lot:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. How about changing all connector checks to checkState
, and adding special handling for procedures to throw TrinoException?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or TrinoException here, but with a comment. up to you
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveSessionProperties.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveSessionProperties.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/test/java/io/trino/plugin/hive/AbstractTestHive.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/test/java/io/trino/plugin/hive/AbstractTestHive.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/test/java/io/trino/plugin/hive/AbstractTestHive.java
Outdated
Show resolved
Hide resolved
...no-product-tests/src/main/java/io/trino/tests/product/hive/TestHiveRedirectionToIceberg.java
Outdated
Show resolved
Hide resolved
a303baa
to
4cd0b46
Compare
Queries on information_schema.tables failed when the filters pointed to a specific table, and the table was redirected.
Hive Connector redirects Iceberg table reads to the configured Iceberg catalog. Co-authored by: Xingyuan Lin <[email protected]>
4cd0b46
to
bb368d1
Compare
bb368d1
to
0d7d0b6
Compare
@findepi @losipiuk @raunaqmorarka @electrum this is ready for review. |
@phd3 can you please rebase? |
@@ -72,7 +72,7 @@ public String getName() | |||
{ | |||
Session session = stateMachine.getSession(); | |||
QualifiedObjectName tableName = createQualifiedObjectName(session, statement, statement.getName()); | |||
Optional<TableHandle> tableHandle = metadata.getTableHandle(session, tableName); | |||
Optional<TableHandle> tableHandle = metadata.getOriginalTableHandle(session, tableName, Optional.of(getName())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail modifications for redirected tables in engine
I am not convinced we should to that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here "modifications" is bad wording, meant DDLs. we already added support for DMLs in #8683
The last discussion that we had on this was #7606 (comment)
DDL operations are problematic since users need to be aware of which connector they are using, since they have different data types, partitioning, bucketing, etc. Our thought was that these are relatively rare operations by more advanced users and that trying to have hidden redirections would end up causing more confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you also feel otherwise for DDLs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users sooner or later will ask why ALTER ... ADD COLUMN (something varchar)
does not work, and I won't be able to explain to them why not. Yes, from engineer perspective, we may have harder time supporting things like column properties (maybe, or maybe not), but i would assume there are not always used, so users who do not intend to use column properties, will not accept this as a rational explanation for the limitation.
TL;DR yes, DDLs like ALTER .. ADD/DROP COLUMN should be routed as well.
@@ -2996,6 +3021,37 @@ public boolean delegateMaterializedViewRefreshToConnector(ConnectorSession sessi | |||
return hiveMaterializedViewMetadata.refreshMaterializedView(session, name); | |||
} | |||
|
|||
@Override | |||
public Optional<CatalogSchemaTableName> redirectTable(ConnectorSession session, SchemaTableName tableName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please update this to #9870?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These credits belong to @MiguelWeezardo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These credits belong to @MiguelWeezardo
thanks @ssheikin for the comment. added @MiguelWeezardo as co-author in the other PR
Seems to be superseded by #10173 |
Hive plugin changes on top of #7606
Fixes #4442