[SUPPORT]Same primary key with different _hoodie_record_key #5812

Closed
hbgstc123 opened this issue Jun 9, 2022 · 2 comments
@hbgstc123
Contributor

I have a table with the column video_id as its primary key, and I found that records with the same primary key have different _hoodie_record_key values, as shown in the picture below.

image

Steps to reproduce the behavior:
1. Create a table with Spark SQL, using these tblproperties:
tblproperties (
  type = 'mor',
  primaryKey = 'video_id'
)
2. Insert historical data with Spark SQL.
3. Ingest real-time incremental data with Flink.

Config in the Flink DDL:
image

The _hoodie_record_key of records written with Spark contains the prefix "video_id:", while the key of records written with Flink does not.
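
To make the mismatch concrete, here is a minimal Python sketch of the two key formats described above. It is not Hudi code: the function names `simple_key` and `complex_key`, the sample record, and the value `v123` are hypothetical, and the comma used to join multiple fields is an assumption about the complex format rather than something shown in this report.

```python
# Illustrative sketch only; these helpers mimic the two observed key formats.

def simple_key(record, field):
    """Key as observed from the Flink writer: the raw field value."""
    return str(record[field])


def complex_key(record, fields):
    """Key as observed from the Spark writer: 'field:value' pairs.

    Joining several pairs with ',' is an assumption for the multi-field case.
    """
    return ",".join(f"{name}:{record[name]}" for name in fields)


# Hypothetical record with video_id as the primary key.
record = {"video_id": "v123", "title": "demo"}

flink_key = simple_key(record, "video_id")
spark_key = complex_key(record, ["video_id"])

print(flink_key)               # v123
print(spark_key)               # video_id:v123
print(flink_key == spark_key)  # False
```

Because the two strings differ, an upsert keyed on _hoodie_record_key treats them as two distinct records even though video_id is the same.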

  • Hudi version : 0.11.0
  • Spark version : 3.1
  • Flink version : 1.13
  • Storage (HDFS/S3/GCS..) : hdfs
  • Running on Docker? (yes/no) : no

danny0405 added the flink label on Jun 9, 2022
danny0405 self-assigned this on Jun 9, 2022
@danny0405
Contributor

This is a known problem. Spark uses the ComplexAvroKeyGenerator by default even when the primary key has only one field, while Flink uses the SimpleAvroKeyGenerator when the primary key is a single field. A temporary workaround is to manually set the key generator for Spark to SimpleAvroKeyGenerator. I will fire a fix for Spark soon ~
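
As a rough sketch of that workaround, the options below pin the key generator for a Spark write. The option keys are standard Hudi write configs, and the class name is the one mentioned in this comment under the package `org.apache.hudi.keygen`. The helper `spark_write_options` and the table name `videos` are hypothetical. Whether the Spark SQL insert path in 0.11.0 honors this setting is not confirmed in this thread, so treat it as something to verify on a test table first.

```python
# Sketch under stated assumptions: builds Hudi write options as a plain dict.
# It does not start Spark; pass the dict to a writer via .options(**opts).

SIMPLE_KEYGEN = "org.apache.hudi.keygen.SimpleAvroKeyGenerator"


def spark_write_options(table_name, record_key_field):
    """Return write options that pin the key generator to the simple one."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key_field,
        # Override the default so the key is the raw field value,
        # matching the keys the Flink writer produces.
        "hoodie.datasource.write.keygenerator.class": SIMPLE_KEYGEN,
    }


# "videos" is a hypothetical table name; video_id is the key from the report.
opts = spark_write_options("videos", "video_id")
for key, value in sorted(opts.items()):
    print(f"{key}={value}")
```

Records already written with the "video_id:" prefix keep their old keys, so this only affects writes made after the setting is applied.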

@danny0405
Contributor

I have fired a fix here: #5815
