[HUDI-3336][HUDI-FLINK] Support custom hadoop config options for flink #4699
@hudi-bot run azure
8b1a75a to 8da29b1
8da29b1 to 6e9036e
@hudi-bot run azure
@danny0405 please review, thanks.
Thanks, can you explain a little why we need these?
In the same application, the same core/hdfs-site is used
@hudi-bot run azure
4cc376c to ec269f1
You may need to change the commit title, for example: Support custom hadoop config options for flink
OK, thanks for the review.
@hudi-bot run azure
5085945 to 33f419e
@hudi-bot run azure
@danny0405 can you approve the workflow and merge the PR? Thanks.
33f419e to 099a340
@hudi-bot run azure
@danny0405 can you merge the PR? Thanks.
  /**
   * Collects the config options that start with specified prefix {@code prefix} into a 'key'='value' list.
   */
-  public static Map<String, String> getHoodiePropertiesWithPrefix(Map<String, String> options, String prefix) {
+  public static Map<String, String> getPropertiesWithPrefix(Map<String, String> options, String subprefix) {
    final Map<String, String> hoodieProperties = new HashMap<>();
Can we use the prefix directly? There is no need to prefix the option with `properties.`; prefixing the option with `hadoop.` directly is okay.
And can we also add a tool method named `getHadoopOptions(Configuration conf)` here?
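To make the suggestion concrete, here is a minimal sketch of what prefix-based collection could look like. It is not the code from this PR: the class name `HadoopOptionsSketch` is hypothetical, the helper takes a plain `Map` instead of Flink's `Configuration` so the example has no dependencies, and stripping the prefix from the returned keys is an assumption about the intended behavior.

```java
import java.util.HashMap;
import java.util.Map;

public class HadoopOptionsSketch {
  // Assumed prefix, taken from the review suggestion; not confirmed as the final choice.
  static final String HADOOP_PREFIX = "hadoop.";

  /**
   * Collects the options whose keys start with {@code prefix}
   * and returns them with the prefix stripped from each key.
   */
  public static Map<String, String> getPropertiesWithPrefix(Map<String, String> options, String prefix) {
    final Map<String, String> result = new HashMap<>();
    for (Map.Entry<String, String> entry : options.entrySet()) {
      if (entry.getKey().startsWith(prefix)) {
        result.put(entry.getKey().substring(prefix.length()), entry.getValue());
      }
    }
    return result;
  }

  /** Hypothetical counterpart of the suggested getHadoopOptions, using a Map for simplicity. */
  public static Map<String, String> getHadoopOptions(Map<String, String> options) {
    return getPropertiesWithPrefix(options, HADOOP_PREFIX);
  }
}
```

With this shape, an option such as `hadoop.dfs.replication` would come out as `dfs.replication`, while options without the prefix are left out.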
Parquet also uses `getHoodiePropertiesWithPrefix`, but `getHoodiePropertiesWithPrefix` only uses the `properties.` prefix and does not append `parquet.`. To keep things consistent, both hadoop and parquet options add the `properties.` prefix.
`properties.parquet.*` or `parquet.*`: which one do you think is better?
Hi @danny0405, please review :)
I replaced `getPropertiesWithPrefix` with Flink's `DelegatingConfiguration#toMap` and added validation to `HoodieTableFactory`.
hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java
hudi-flink/src/main/java/org/apache/hudi/util/ViewStorageProperties.java
hudi-flink/src/test/java/org/apache/hudi/utils/TestStreamerUtil.java
hudi-flink/src/test/java/org/apache/hudi/utils/TestViewStorageProperties.java
ccfcae2 to 1cb3039
1cb3039 to 0c496e6
Sorry, I don't understand.
Yes, the
Got it, thanks @danny0405, I will update the PR :)
d83f5bd to eb0b3c9
@danny0405 hi, please review :)
@danny0405 can you merge the PR?
I thought about it a little, and I think we should hold off on this patch. People usually do not pass hadoop config options through SQL options. Can you describe your use cases again? What kind of options do you want to pass around at the per-job level?
1. Multiple SQL jobs in the same process have different hadoop configs in our production environment; we can configure different jobs by
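As an illustration of the per-job use case described above, the sketch below layers each job's prefixed options on top of settings shared by the whole process (for example, values loaded from core-site.xml or hdfs-site.xml). Everything here is an assumption for illustration: the class `PerJobHadoopConfSketch`, the method `effectiveConf`, and the `hadoop.` prefix are hypothetical and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class PerJobHadoopConfSketch {
  // Assumed option prefix; the thread discusses "hadoop." as one candidate.
  private static final String PREFIX = "hadoop.";

  /**
   * Returns a copy of the shared settings with the job's prefixed options applied on top.
   * The shared map itself is never modified, so jobs in the same process stay isolated.
   */
  public static Map<String, String> effectiveConf(Map<String, String> shared, Map<String, String> jobOptions) {
    final Map<String, String> conf = new HashMap<>(shared);
    for (Map.Entry<String, String> entry : jobOptions.entrySet()) {
      if (entry.getKey().startsWith(PREFIX)) {
        // Strip the prefix so the key matches the plain hadoop key it overrides.
        conf.put(entry.getKey().substring(PREFIX.length()), entry.getValue());
      }
    }
    return conf;
  }
}
```

Under these assumptions, one job could set `hadoop.dfs.replication` in its SQL options to override the shared value, while another job with no such option would keep the shared value.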
Hi @cuibo01, can you rebase this PR?
…gtoHdfsConfig2
Conflicts:
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java
…t take effect