Take MySQL Column Type Into Account in VStreamer #9331
Merged
mattlord force-pushed the NullBytePaddingFinalFinal branch 10 times, most recently from 1882a7c to 99dc068 (December 8, 2021 05:19)
mattlord changed the title from "Null byte padding final final" to "Take MySQL Column Type Into Account in VStreamer and RBR Binlog Event Processing" (Dec 8, 2021)
mattlord changed the title from "Take MySQL Column Type Into Account in VStreamer and RBR Binlog Event Processing" to "Take MySQL Column Type Into Account in VReplication" (Dec 8, 2021)
mattlord force-pushed the NullBytePaddingFinalFinal branch from 99dc068 to ffd97a4 (December 8, 2021 05:40)
mattlord force-pushed the NullBytePaddingFinalFinal branch from ffd97a4 to 348ae55 (December 8, 2021 06:09)
mattlord changed the title from "Take MySQL Column Type Into Account in VReplication" to "Take MySQL Column Type Into Account in VStreamer" (Dec 8, 2021)
mattlord force-pushed the NullBytePaddingFinalFinal branch 4 times, most recently from f79ea0f to 529da1b (December 8, 2021 16:12)
This is required when we need to match MySQL behavior for data that requires column type information as well. For example, the binlog event metadata makes no distinction between events for a BINARY(4) column and events for a CHAR(4) column with a binary collation like utf8mb4_bin. So we need to know the underlying MySQL column type in order to handle them distinctly: MySQL pads (fixed-length) BINARY columns on the right side with null bytes, but it does NOT do that for (fixed-length) CHAR columns, regardless of the collation. Signed-off-by: Matt Lord <[email protected]>
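To illustrate the padding rule this commit message describes, here is a minimal Go sketch. It is not the code from this PR: the function name `padIfBinary` and its parameters are hypothetical, and it assumes the caller already knows whether the target column is a fixed-length BINARY column and what its declared length is.

```go
package main

import "fmt"

// padIfBinary is a hypothetical helper, not the actual Vitess code.
// It right-pads val with null bytes up to length, but only when the
// caller says the underlying MySQL column is a fixed-length BINARY column.
// Values for any other column type (for example CHAR with a binary
// collation) are returned unchanged.
func padIfBinary(val []byte, isBinaryColumn bool, length int) []byte {
	if !isBinaryColumn || len(val) >= length {
		return val
	}
	// make zero-fills the slice, so the tail is already null bytes.
	padded := make([]byte, length)
	copy(padded, val)
	return padded
}

func main() {
	// BINARY(4): the value is padded on the right with null bytes.
	fmt.Printf("%q\n", padIfBinary([]byte("ab"), true, 4)) // "ab\x00\x00"
	// CHAR(4) with a binary collation: no padding is added.
	fmt.Printf("%q\n", padIfBinary([]byte("ab"), false, 4)) // "ab"
}
```

The sketch treats the column type as an input because, as noted above, the binlog event metadata alone cannot tell the two column types apart.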
mattlord force-pushed the NullBytePaddingFinalFinal branch 3 times, most recently from 83048d6 to 13ea4ca (December 8, 2021 21:52)
mattlord force-pushed the NullBytePaddingFinalFinal branch 7 times, most recently from 8fd8f3f to 421f4d1 (December 9, 2021 02:27)
And use ToLower when looking for BINARY types to be safe. Signed-off-by: Matt Lord <[email protected]>
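A case-insensitive check of this kind could look roughly like the following Go sketch. The function name `isBinaryColumnType` is hypothetical, and the sketch assumes the column type arrives as a string such as "binary(4)" or "BINARY(4)"; the matching logic in the PR itself may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isBinaryColumnType is a hypothetical helper, not the actual Vitess code.
// It reports whether a MySQL column type string names a fixed-length
// BINARY column. The input is lowercased first so that "BINARY(4)" and
// "binary(4)" are treated the same. A prefix match is used so that
// "varbinary(4)" does not count as BINARY.
func isBinaryColumnType(columnType string) bool {
	t := strings.ToLower(strings.TrimSpace(columnType))
	return strings.HasPrefix(t, "binary")
}

func main() {
	for _, t := range []string{"BINARY(4)", "binary(16)", "varbinary(4)", "char(4)"} {
		fmt.Printf("%-14s %v\n", t, isBinaryColumnType(t))
	}
}
```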
mattlord force-pushed the NullBytePaddingFinalFinal branch from 421f4d1 to 3ff9fbd (December 9, 2021 03:39)
mattlord requested review from harshit-gangal, shlomi-noach and systay as code owners (December 9, 2021 04:03)
mattlord requested review from deepthi and removed request for systay, shlomi-noach and harshit-gangal (December 9, 2021 04:03)
rohit-nayak-ps approved these changes (Dec 9, 2021)
lgtm.
Very nice! An elegant solution to another binlog parser edge case.
deepthi approved these changes (Dec 9, 2021)
Very thorough code comments and nice test case. 💯
This was referenced Dec 10, 2021
Description
In #7969 we added right-side null-byte padding to BINARY columns after processing the binlog event in order to match the MySQL behavior and value, so that we correctly calculate the keyspace IDs for BINARY columns in vindex functions like binary_md5 and correctly apply vreplication filters using those columns.
It turns out, however, that row-based binlog events make no distinction between a BINARY(4) column and a CHAR(4) column with a binary collation like utf8mb4_bin (you can see a detailed discussion here). So after #7969 we were also, incorrectly, adding right-side null-byte padding to CHAR columns with binary collations.
Although we ensured in #8730 that we did not add more padding than the actual column would hold, we really shouldn't be adding any padding at all to CHAR columns in order to match the MySQL behavior (otherwise we have discrepancies as described here). But in order to do this we needed to thread the target MySQL column type info through the vstreamer and RBR binlog event processing components so that we ONLY add the padding to values for actual BINARY columns and nothing else (CHAR columns being the known problematic case today).
Now we only add the padding if the underlying MySQL column on the target is BINARY, and then it's just bytes, not chars made up of N bytes, so the subsequent pad trimming based on the charset was removed.
The manual test case here also now passes in this branch:
Related Issue(s)
Fixes: #9207
Checklist