Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HUDI-2456] support 'show partitions' sql #3693

Merged
merged 6 commits into from
Oct 6, 2021

Conversation

YannByron
Copy link
Contributor

No description provided.

@hudi-bot
Copy link

hudi-bot commented Sep 20, 2021

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run travis re-run the last Travis build
  • @hudi-bot run azure re-run the last Azure build

@YannByron
Copy link
Contributor Author

@leesf can you have time to review this?

Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great contribution. Few comments.
@pengzhiwei2018 are you able to review this and probably all follow ups from the spark SQL work. I am seeing adoption really picking up.

@@ -40,7 +40,7 @@
protected static final String NULL_RECORDKEY_PLACEHOLDER = "__null__";
protected static final String EMPTY_RECORDKEY_PLACEHOLDER = "__empty__";

protected static final String DEFAULT_PARTITION_PATH = "default";
protected static final String DEFAULT_PARTITION_PATH = PartitionPathEncodeUtils.HUDI_DEFAULT_PARTITION_PATH;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename: HOODIE_DEFAULT_PARTITION_PATH

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -25,6 +25,8 @@
*/
public class PartitionPathEncodeUtils {

public static final String HUDI_DEFAULT_PARTITION_PATH = "default";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually lets just do DEFAULT_PARTITION_PATH

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -71,7 +73,7 @@ public static String escapePathName(String path, String defaultPath) {
if (defaultPath == null) {
//previously, when path is empty or null and no default path is specified,
// __HIVE_DEFAULT_PARTITION__ was the return value for escapePathName
return "__HIVE_DEFAULT_PARTITION__";
return HUDI_DEFAULT_PARTITION_PATH;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this change needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the partition's value is null, hudi use "default" as the part of path, rather than "HIVE_DEFAULT_PARTITION".
And I review the implement when need to encode partition path before #2645, like getRecordPartitionPath and getPartitionPath in KeyGenUtils. I think this just encode path, and use "default" when partitionPath is null or empty.
So i guess maybe this is forgotten to change.

/**
* Helper trait for all hoodie commands.
*/
trait HoodieCommand {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the interest of keeping class hierarchies tenable, can this become a helper method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is already HoodieSqlUtils class, can we move to HoodieSqlUtils?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, i'll amend it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
* @param checkToRDD whether to verify deserialization to an RDD. This runs the query twice.
*/
def checkAnswer(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is already checkAnswer method in TestHoodieSqlBase, can we reuse that and avoid copying from Spark 2.X (org.apache.spark.sql.QueryTest) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkAnswer in TestHoodieSqlBase doesn't work for this unit tests. If two sequences have the same elements, but not in the same order, it is judged to be different.
checkAnswer method in QueryTest will sort the two sequences and then compare them.

And The implements's inconsistent between Spark2 and Spark3 is the reason to copy codes rather than extend QueryTest. So I copy some codes needed here.

@YannByron
Copy link
Contributor Author

@vinothchandar @leesf did i make myself clear for PartitionPathEncodeUtils and HoodieSparkFunSuite?
I'll continue to amend other codes if these are accepted.

@@ -71,7 +73,7 @@ public static String escapePathName(String path, String defaultPath) {
if (defaultPath == null) {
//previously, when path is empty or null and no default path is specified,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the description should be updated accordingly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@leesf leesf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @vinothchandar do you have any other concern?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants