Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Hive Metastore API for listing partitions that allows range (and other) predicates on the partition keys #611

Open
Parth-Brahmbhatt opened this issue Apr 9, 2019 · 18 comments
Assignees
Labels
enhancement New feature or request

Comments

@Parth-Brahmbhatt
Copy link
Member

See thread https://prestosql.slack.com/archives/CFLB9AMBN/p1554842107205400.

Currently in HivePartitionManager we use getPartitionNamesByParts API from metastore which only allows a single value per partition predicate. This means any non equality partition predicate can not be pushed down to metastore which results in 2 problems:

  • Even if the user specifies a predicate his query can fail with TOO_MANY_PARTITONS error
  • Undue load on metastore service for very large tables.

In the newer version of Hive new APIs are introduced to avoid this issue. We should look into moving to get_partitions_by_expr or get_partitions_by_filter for types that are supported by these APIs.

@martint martint added the enhancement New feature or request label Apr 9, 2019
@electrum
Copy link
Member

electrum commented Apr 9, 2019

This is a good idea. Do you know what Hive version these were added in? Hopefully they’re old enough to be available everywhere, otherwise we’ll need some fallback code (catch remote “no such method” error and call the old one).

@Parth-Brahmbhatt
Copy link
Member Author

If I just go by when the API was added, then that was 9 years ago.

@luohao
Copy link
Member

luohao commented Apr 9, 2019

We have explored this a bit, but I think we should probably avoid using get_partitions_by_expr(PartitionsByExprRequest req). It's very very bad API design. It includes a Kryo serialized object in the request:

struct PartitionsByExprRequest {
...
 3: required binary expr,
...
} 

There are a few potential issues:

  • Kyro doesn't guarantee serialization format. From our experience, if the client(e.g, Presto) and server(e.g., HMS) use different versions of Kyro library, you may get deserialization error. Using Kyro to serialize objects is a bit weired to me.

  • Having a blob in the interface makes it hard to integrate with non-HMS systems. Presto only uses a few HMS APIs which makes it very easy to integrate with other schema registry. We have tried to implement a equivilant to get_partitions_by_expr in our own data discovery service but it turns out to be non-trivial.

We should probably try get_partitions_by_filter. It uses a filter string, which is more standard.

@electrum
Copy link
Member

electrum commented Apr 10, 2019

The problem with get_partitions_by_expr, get_partitions_by_filter, and get_part_specs_by_filter is that they return the full partition metadata objects, which can be huge and take too long to fetch.

With the existing get_partition_names_ps call, we see issues with timeouts and Thrift response size limits on tables with many partitions. And that's just fetching the names, which is a single string, not the huge metadata object.

Presto fetches partition names while planning, then fetches the full partition metadata iteratively during split generation as the query executes. The engine provides the connector with a Constraint containing a black box Predicate that is used to further filter the name list, which allows expressions or functions over partition columns which cannot be expressed via TupleDomain.

I now remember looking into this in the past and coming to the above conclusion that there was no better API that does what we need (list partition names with a filter). However, they recently added get_partition_values in HIVE-17466 which seems to do exactly what we need. The PartitionValuesRequest struct has a filter field.

We should check if this API is available in CDH 5 and HDP 2.

@Parth-Brahmbhatt
Copy link
Member Author

CDH 5 seems to be on Hive 1.1.0 + patches and the Jira you have linked does not seem to be in the list of patches

HDP-2.6.5 which is the most latest release of HDP-2 also only ships HIVE-2.1.0

No matter when we decide to move to a newer API I believe we will have to make it config enabled to support organization that are running on older version of hive servers so I do not see a reason not to implement this right now and make it config enabled.

@electrum
Copy link
Member

I agree, there's no reason not to implement it now. We can do it without config if we make the fallback transparent. Have a boolean flag defaulting to true to indicate it is supported, then set it to false if we get a "no such method" error from the remote Thrift server. This means the first request after a server restart will take slightly longer, but this shouldn't cause any problems.

@luohao
Copy link
Member

luohao commented Apr 11, 2019

Agreed with @electrum.

As long as we have a fallback mechanism it should not break the compatibility. Hive adopts a similar strategy where a MetaException from HMS will make it falls back to the implementation of client-side filtering.

    430         try {
    431           hasUnknownPartitions = Hive.get().getPartitionsByExpr(
    432               tab, compactExpr, conf, partitions);
    433         } catch (IMetaStoreClient.IncompatibleMetastoreException ime) {
    434           // TODO: backward compat for Hive <= 0.12. Can be removed later.
    435           LOG.warn("Metastore doesn't support getPartitionsByExpr", ime);
    436           doEvalClientSide = true;
    437         } finally {
    438           perfLogger.PerfLogEnd(CLASS_NAME, PerfLogger.PARTITION_RETRIEVING);
    439         }

@findepi
Copy link
Member

findepi commented Jun 11, 2019

get_partition_values was added in Hive 2.4, but this version is not downloadable (https://archive.apache.org/dist/hive/) and, AFAIK, we're not ready to require Hive 3.1 just yet.
I created enhancement proposal to backport get_partition_values to Hive 2.3 branch (https://issues.apache.org/jira/browse/HIVE-21859).

@findepi findepi changed the title Use get_partitions_by_expr or get_partitions_by_filter API of hive metastore. Use Hive Metastore API for listing partitions that allows range (and other) predicates on the partition keys Jun 22, 2019
@findepi
Copy link
Member

findepi commented Jun 22, 2019

get_partition_values has been merged to Hive 2.3 branch (https://issues.apache.org/jira/browse/HIVE-21859).

@dannylinden
Copy link

Here also @tooptoop4

@rash67
Copy link
Member

rash67 commented Jul 25, 2020

I'm taking a look at this right now
thanks

@rash67
Copy link
Member

rash67 commented Aug 7, 2020

After a discussion with @electrum we decided to break this into pieces

  1. Refactor HiveMetastore interface to only have a getPartitionNamesByFilter() method, including the callers and implementations.
  2. Update GlueHiveMetastore to take advantage of the new API.
  3. Update ThriftHiveMetastore to conditionally use the old or new Thrift call. (edited)

I'm tidying up a PR for #1 and will attach today for feedback (consider WIP, still testing). It can be reviewed and committed separately, or bundled with the next steps. However, if the Glue implementation is the desired feature, I would recommend implementing and testing that without #3 first.

@rash67
Copy link
Member

rash67 commented Aug 8, 2020

I'm debugging some CI test failures with this commit, but if anyone wants start providing feedback on the refactor, that would be helpful. Once I've sorted out any test issues, I'll begin work on the HiveGlueMetastore implementation that translates TupleDomain -> a filter string. I'll do the BridgingHiveMetastore/ThriftMetastore translation

the first commit after the refactor will push any Domain -> wildcard translation into each HMS impl, likely.

@rash67
Copy link
Member

rash67 commented Aug 11, 2020

I'm pushing the conversion to List into implementations now and will begin work on the Glue impl + test cases after that.

@rash67
Copy link
Member

rash67 commented Aug 12, 2020

pushed conversion of TupleDomain -> List into each HiveMetastore impl

next GlueHiveMetastore implementation + tests

@rash67
Copy link
Member

rash67 commented Aug 13, 2020

see comments in the PR for progress. The PR/diff is around 1k, but includes some copy & pasted files (serDe related code)

@rash67
Copy link
Member

rash67 commented Aug 27, 2020

I'm working on testing out the GlueHiveMetastore implementation now. I'm adding a large number of test cases for the supported types and a variety of queries (equals, in clause, ranges, etc)

@tooptoop4
Copy link
Contributor

tooptoop4 commented Oct 14, 2020

has this improvement been implemented for non-glue hive ? @rash67

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

8 participants