Proposal to Optimize Trino Hive Metastore Query Latency by Caching createMetastoreClient() #21671

yohengyang · 2024-04-23T12:55:47Z

While analyzing Hive and Iceberg queries in Trino, we observed significant latency during the analyze phase in io.trino.plugin.hive.metastore.cache.CachingHiveMetastore#loadTable. The time is spent in the underlying getTable call shown below:

@Override
public Optional<Table> getTable(String databaseName, String tableName)
{
    try {
        return retry()
                .stopOn(NoSuchObjectException.class)
                .stopOnIllegalExceptions()
                .run("getTable", stats.getGetTable().wrap(() -> {
                    try (ThriftMetastoreClient client = createMetastoreClient()) {  // Takes 167 ms
                        return Optional.of(client.getTable(databaseName, tableName)); // Takes 20+ ms
                    }
                }));
    }
    ......
}

As the annotations in the code above show, most of the time is spent in ThriftHiveMetastore:createMetastoreClient() (about 167 ms), whereas the actual request to retrieve the table via ThriftMetastoreClient:getTable() takes far less (20+ ms).

Looking at the trace logs:

`---ts=2023-12-22 17:52:34;thread_name=Query-20231222_095234_00058_j8b4z-55744;id=d9c0;is_daemon=true;priority=5;TCCL=io.trino.server.PluginClassLoader@1a2d02d8
    `---[145.533198ms] io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastore:createMetastoreClient()
        `---[145.454194ms] io.trino.plugin.hive.metastore.thrift.IdentityAwareMetastoreClientFactory:createMetastoreClientFor() #2126

We also noticed that every transaction creates a new CachingHiveMetastore, so this cache is only effective within a single query and provides no speed-up across queries, which is the scenario here.

Given the bottleneck at createMetastoreClient(), we're considering an optimization: caching the result of createMetastoreClient(), to avoid recreating the ThriftMetastoreClient for each request. We are interested in your thoughts on this proposed solution.
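To make the idea concrete, here is a minimal sketch of one possible shape for such a cache. It is not Trino code: `MetastoreClientPoolSketch`, `Client`, `Lease`, `isHealthy()` and `disconnect()` are hypothetical names invented for illustration. It assumes that a Thrift client should not be shared by concurrent callers, so it keeps a pool of idle clients rather than a single cached instance, and it assumes clients are created per identity (as the IdentityAwareMetastoreClientFactory name in the trace suggests), so the pool is keyed by an identity string.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: none of these types exist in Trino.
final class MetastoreClientPoolSketch
{
    /** Stand-in for a metastore client (assumption; the real ThriftMetastoreClient differs). */
    interface Client
    {
        String getTable(String databaseName, String tableName);

        boolean isHealthy();

        void disconnect();
    }

    /** A borrowed client; close() hands it back to the pool instead of disconnecting. */
    final class Lease
            implements AutoCloseable
    {
        private final String identity;
        private final Client client;

        private Lease(String identity, Client client)
        {
            this.identity = identity;
            this.client = client;
        }

        Client client()
        {
            return client;
        }

        @Override
        public void close()
        {
            release(identity, client);
        }
    }

    private final Function<String, Client> factory;
    private final int maxIdlePerIdentity;
    private final Map<String, Deque<Client>> idle = new HashMap<>();

    MetastoreClientPoolSketch(Function<String, Client> factory, int maxIdlePerIdentity)
    {
        this.factory = factory;
        this.maxIdlePerIdentity = maxIdlePerIdentity;
    }

    Lease borrow(String identity)
    {
        Client client = pollIdle(identity);
        if (client == null) {
            // Slow path: pay the connection cost, outside the pool lock
            client = factory.apply(identity);
        }
        return new Lease(identity, client);
    }

    private synchronized Client pollIdle(String identity)
    {
        Deque<Client> queue = idle.get(identity);
        while (queue != null && !queue.isEmpty()) {
            Client candidate = queue.pop();
            if (candidate.isHealthy()) {
                return candidate;
            }
            candidate.disconnect(); // drop stale connections
        }
        return null;
    }

    private synchronized void release(String identity, Client client)
    {
        Deque<Client> queue = idle.computeIfAbsent(identity, key -> new ArrayDeque<>());
        if (client.isHealthy() && queue.size() < maxIdlePerIdentity) {
            queue.push(client);
        }
        else {
            client.disconnect();
        }
    }

    /** Fake client that only counts how many "connections" were created. */
    static Client fakeClient(AtomicInteger created)
    {
        created.incrementAndGet();
        return new Client()
        {
            private boolean open = true;

            public String getTable(String databaseName, String tableName)
            {
                return databaseName + "." + tableName;
            }

            public boolean isHealthy()
            {
                return open;
            }

            public void disconnect()
            {
                open = false;
            }
        };
    }

    public static void main(String[] args)
    {
        AtomicInteger created = new AtomicInteger();
        MetastoreClientPoolSketch pool = new MetastoreClientPoolSketch(identity -> fakeClient(created), 4);
        for (int i = 0; i < 3; i++) {
            try (Lease lease = pool.borrow("alice")) {
                lease.client().getTable("db", "t");
            }
        }
        System.out.println("connections created: " + created.get());
    }
}
```

With sequential requests for one identity, only the first borrow pays the creation cost. A real implementation would also need an idle timeout, a health check that actually probes the connection, and handling for authentication state tied to the connection, none of which is covered here.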


findepi · 2024-05-10T19:57:42Z

cc @electrum @dain @raunaqmorarka

electrum · 2024-05-10T22:27:24Z

Are you able to tell why creation of the client takes so much time?

yohengyang · 2024-05-11T06:48:11Z

> Are you able to tell why creation of the client takes so much time?

@electrum I haven't delved into the reasons yet.

We use Waggledance (version 3.7.0) as a router in front of the Metastore (version 3.1.2), so Trino actually connects to Waggledance rather than to the Metastore directly.
The Trino cluster, Waggledance, and Metastore are all in the same AZ, and there is no network bandwidth bottleneck.

The following is the time trace for Trino creating a connection to Waggledance. It shows that the delay is almost entirely on the Waggledance server side. I will continue to track the connection time of Waggledance.
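One rough way to narrow this down further is to time a bare TCP connect to the same host and port and compare it with the createMetastoreClient() time. If the bare connect is fast, the remaining time is presumably spent after the connection is established rather than on the network. The sketch below is only an illustration: `ConnectTimingSketch` and `timeTcpConnectNanos` are made-up names, and the demo connects to a local listener standing in for the real endpoint, whose host and port would have to be substituted.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, not part of Trino or Waggledance.
final class ConnectTimingSketch
{
    /** Returns the elapsed time, in nanoseconds, of a plain TCP connect to host:port. */
    static long timeTcpConnectNanos(String host, int port, int timeoutMillis)
            throws IOException
    {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args)
            throws IOException
    {
        // A local listener stands in for the real endpoint (assumption for the demo)
        try (ServerSocket server = new ServerSocket(0)) {
            long nanos = timeTcpConnectNanos("127.0.0.1", server.getLocalPort(), 1000);
            System.out.printf("TCP connect: %.3f ms%n", nanos / 1_000_000.0);
        }
    }
}
```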

yohengyang mentioned this issue Apr 23, 2024

Caching createMetastoreClient to Speed Up Hive/Iceberg Analyze Duration #21672

Closed

findepi added the enhancement (New feature or request) and performance labels May 10, 2024
