Fetch all table/view names across schemas with one Hive metastore call (when Hive views enabled) #17127
@findepi Could you take a look?
Discussed offline.
I feel I need to run
WDYT? @kokosing @Praveen2112 @findepi
Discussed offline.
Looks good. Need to run another round of review on TestHiveMetastoreMetadataQueriesAccessOperations
/**
 * @return List of table names from all schemas or Optional.empty if operation is not supported
 */
Optional<List<SchemaTableName>> getAllTables();
Is there any specific reason for placing this below `List<String> getAllTables(String databaseName);`? Can we swap the places?
@@ -1117,6 +1145,70 @@ public void testDropTable()
        assertEquals(mockClient.getAccessCount(), 5);
    }

    @Test
    public void testAllDatabases()
Can we extract it as a separate commit?
List<String> getTablesWithParameter(String databaseName, String parameterKey, String parameterValue);

List<String> getAllViews(String databaseName);

/**
 * @return List of view names from all schemas or Optional.empty if operation is not supported
Can this include materialized views? Let's document it.
Updated
// Without translateHiveViews, Hive views are represented as tables in Trino,
// and they should not be returned from ThriftHiveMetastore.getAllViews() call
if (!translateHiveViews) {
    return Optional.empty();
}
We could remove this `if` and adjust the code elsewhere.
We should follow up on this.
The fact that TestHiveMetastoreMetadataQueriesAccessOperations.java is refactored in the commit introducing the improvement makes it hard to see the effect of the improvement on the invocation counts. Let's maybe introduce the data provider in TestHiveMetastoreMetadataQueriesAccessOperations.java as a prep commit (with some TODO / commit comment saying this is just a preparation step).
So that they don't clash with new variables in subsequent commits.
Co-authored-by: skrzypo987 <[email protected]>
Introduce trueFalse data provider so improvement will be easy to see.
Big calls listing metadata, like `SELECT * FROM information_schema.tables`, are common in BI tools, especially at startup. Trino then fetches all schemas and iterates over them to list tables and views. With n schemas it has to make 2*n calls to the underlying metastore. For a large number of schemas this may take several minutes if metastore communication overhead is high.

This commit introduces a way to fetch all this metadata in one call, using the `getTableMeta` thrift call. Other metastore implementations do not support this feature. This thrift call is hidden behind an alternative call. The `HiveMetastore` interface now features two new methods, for fetching all tables and all views. The `CachingHiveMetastore` has two additional singleton cache objects containing the full lists of tables and views.

Co-authored-by: skrzypo987 <[email protected]>
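The fallback described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's code: `ToyMetastore`, `PerSchemaOnlyMetastore`, `BulkMetastore`, `BulkListingSketch` and `listAllTables` are hypothetical names, a plain `"schema.table"` string stands in for `SchemaTableName`, and only tables (not views) are listed. The one thing taken from the PR is the shape of `Optional<List<...>> getAllTables()`, where `Optional.empty()` means the operation is not supported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical stand-in for a metastore; counts every simulated remote call.
abstract class ToyMetastore
{
    protected final Map<String, List<String>> tablesBySchema;
    int calls;

    ToyMetastore(Map<String, List<String>> tablesBySchema)
    {
        // TreeMap keeps schema order deterministic for the example
        this.tablesBySchema = new TreeMap<>(tablesBySchema);
    }

    List<String> getAllDatabases()
    {
        calls++;
        return new ArrayList<>(tablesBySchema.keySet());
    }

    List<String> getAllTables(String databaseName)
    {
        calls++;
        return tablesBySchema.getOrDefault(databaseName, List.of());
    }

    // Optional.empty() signals that the bulk operation is not supported.
    abstract Optional<List<String>> getAllTables();
}

// Models a metastore implementation without a bulk listing call.
class PerSchemaOnlyMetastore extends ToyMetastore
{
    PerSchemaOnlyMetastore(Map<String, List<String>> tablesBySchema)
    {
        super(tablesBySchema);
    }

    @Override
    Optional<List<String>> getAllTables()
    {
        return Optional.empty();
    }
}

// Models a metastore that can return everything in a single call.
class BulkMetastore extends ToyMetastore
{
    BulkMetastore(Map<String, List<String>> tablesBySchema)
    {
        super(tablesBySchema);
    }

    @Override
    Optional<List<String>> getAllTables()
    {
        calls++;
        List<String> result = new ArrayList<>();
        tablesBySchema.forEach((schema, tables) ->
                tables.forEach(table -> result.add(schema + "." + table)));
        return Optional.of(result);
    }
}

class BulkListingSketch
{
    // Try the bulk call first; fall back to one call per schema otherwise.
    static List<String> listAllTables(ToyMetastore metastore)
    {
        Optional<List<String>> bulk = metastore.getAllTables();
        if (bulk.isPresent()) {
            return bulk.get();
        }
        List<String> result = new ArrayList<>();
        for (String schema : metastore.getAllDatabases()) {
            for (String table : metastore.getAllTables(schema)) {
                result.add(schema + "." + table);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> data = Map.of(
                "sales", List.of("orders", "items"),
                "hr", List.of("people"));
        ToyMetastore perSchema = new PerSchemaOnlyMetastore(data);
        ToyMetastore bulk = new BulkMetastore(data);
        System.out.println(listAllTables(perSchema) + " calls=" + perSchema.calls);
        System.out.println(listAllTables(bulk) + " calls=" + bulk.calls);
    }
}
```

With the two schemas in this toy data set, the fallback path makes three simulated calls (one to list schemas, then one per schema), while the bulk path makes one. Both return the same list.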
@huberty89 and @findepi we had to invent our own release notes entry since none was suggested in the PR description or the release notes ticket or the release notes PR. Please provide more clarity in future PRs. We ended up with
However, we are not clear whether this applies to all other connectors that use the Hive metastore, and whether we therefore need to add this entry to the Delta Lake, Iceberg and Hudi connector sections. We are also not sure about your mention of "when Hive views enabled". Does that limit the scope to only the Hive connector after all?
This optimization applies to Hive with HMS. For the other connectors that use HMS, I need to verify whether further work is needed.
// Without translateHiveViews, Hive views are represented as tables in Trino,
// and they should not be returned from ThriftHiveMetastore.getAllViews() call
if (!translateHiveViews) {
    return Optional.empty();
}
The impl of this method differs from `getAllViews(String databaseName)`. That one seems to return tables with `PRESTO_VIEW_FLAG` set to true when translation is disabled, instead of an empty result.
Was this intentional? Or am I misreading the code?
I think the initial implementation was supposed to work only when translateHiveViews or when !translateHiveViews (not sure which one).
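To make the question above concrete, here is a toy model of the two behaviors as the reviewer describes them. It is not the `ThriftHiveMetastore` implementation: `ToyViewEntry`, `ToyViewMetastore` and `ViewListingSketch` are hypothetical names, and the boolean `prestoViewFlag` merely stands in for the `PRESTO_VIEW_FLAG` table parameter. The per-database filter reflects the reviewer's reading of the code ("seems to return"), so treat it as an assumption.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical metastore entry; prestoViewFlag stands in for the
// PRESTO_VIEW_FLAG table parameter mentioned in the review comment.
final class ToyViewEntry
{
    final String schema;
    final String name;
    final boolean prestoViewFlag;

    ToyViewEntry(String schema, String name, boolean prestoViewFlag)
    {
        this.schema = schema;
        this.name = name;
        this.prestoViewFlag = prestoViewFlag;
    }
}

// Toy model only; not the ThriftHiveMetastore implementation.
final class ToyViewMetastore
{
    private final List<ToyViewEntry> entries;
    private final boolean translateHiveViews;

    ToyViewMetastore(List<ToyViewEntry> entries, boolean translateHiveViews)
    {
        this.entries = List.copyOf(entries);
        this.translateHiveViews = translateHiveViews;
    }

    // Per-database listing as the reviewer reads it: with translation
    // disabled, only entries carrying the flag are returned.
    List<String> getAllViews(String databaseName)
    {
        return entries.stream()
                .filter(entry -> entry.schema.equals(databaseName))
                .filter(entry -> translateHiveViews || entry.prestoViewFlag)
                .map(entry -> entry.name)
                .collect(Collectors.toList());
    }

    // Bulk listing as in the snippet under review: with translation
    // disabled, report "not supported" rather than a filtered list.
    Optional<List<String>> getAllViews()
    {
        if (!translateHiveViews) {
            return Optional.empty();
        }
        return Optional.of(entries.stream()
                .map(entry -> entry.schema + "." + entry.name)
                .collect(Collectors.toList()));
    }
}

final class ViewListingSketch
{
    public static void main(String[] args)
    {
        List<ToyViewEntry> entries = List.of(
                new ToyViewEntry("db", "trino_view", true),
                new ToyViewEntry("db", "hive_view", false));
        ToyViewMetastore disabled = new ToyViewMetastore(entries, false);
        // Per-database call still yields the flagged entry...
        System.out.println(disabled.getAllViews("db"));
        // ...while the bulk call reports that it is not supported.
        System.out.println(disabled.getAllViews().isPresent());
    }
}
```

In this model the two methods only disagree in form, not in outcome, as long as callers treat `Optional.empty()` as "fall back to per-database listing". Whether the real callers do so is exactly what the thread leaves open.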
Description
This PR replaces #16847 but drops last commits because of a regression - I will try to reintroduce that with a fix in separate PR.
Additional context and related issues
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: