-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#4134 drbbdup: Avoid 2nd scratch via max_case_encoding #5322
Conversation
Adds a new drbbdup option max_case_encoding. When the value is small, we can avoid a 2nd scratch register for aarchxx, since we can fit all encodings in immediates inside compare instructions. Adds a sanity check. Confirmed visually in the debug logs that the 2nd scratch is gone: Before: -------------------------------------------------- after instrumentation: TAG 0x0000ffff868340c0 +0 m4 @0x0000fffd428950e8 f900b781 str %x1 -> +0x0168(%x28)[8byte] +4 m4 @0x0000fffd42894da0 d53b4200 mrs %nzcv -> %x0 +8 m4 @0x0000fffd42894cd8 f900af80 str %x0 -> +0x0158(%x28)[8byte] +12 m4 @0x0000fffd42894c58 d28c1000 movz $0x6080 lsl $0x00 -> %x0 +16 m4 @0x0000fffd42894bd8 f2a85000 movk %x0 $0x4280 lsl $0x10 -> %x0 +20 m4 @0x0000fffd42894b10 f2dfffe0 movk %x0 $0xffff lsl $0x20 -> %x0 +24 m4 @0x0000fffd42894a48 f9400000 ldr (%x0)[8byte] -> %x0 +28 m4 @0x0000fffd42894e20 f9400000 <label> +28 m4 @0x0000fffd42894980 f100041f subs %x0 $0x0000000000000001 lsl $0x0000000000000000 -> %xzr +32 m4 @0x0000fffd42894900 54000001 b.ne @0x0000fffd42894fa0[8byte] -------------------------------------------------- After: -------------------------------------------------- after instrumentation: TAG 0x0000ffff9aedf0c0 +0 m4 @0x0000fffd56f400e8 d53b4200 mrs %nzcv -> %x0 +4 m4 @0x0000fffd56f3fda0 f900af80 str %x0 -> +0x0158(%x28)[8byte] +8 m4 @0x0000fffd56f3fcd8 d2821000 movz $0x1080 lsl $0x00 -> %x0 +12 m4 @0x0000fffd56f3fc58 f2aadd60 movk %x0 $0x56eb lsl $0x10 -> %x0 +16 m4 @0x0000fffd56f3fbd8 f2dfffe0 movk %x0 $0xffff lsl $0x20 -> %x0 +20 m4 @0x0000fffd56f3fb10 f9400000 ldr (%x0)[8byte] -> %x0 +24 m4 @0x0000fffd56f3fe20 f9400000 <label> +24 m4 @0x0000fffd56f3fa48 f100041f subs %x0 $0x0000000000000001 lsl $0x0000000000000000 -> %xzr +28 m4 @0x0000fffd56f3f980 54000001 b.ne @0x0000fffd56f3ffa0[8byte] -------------------------------------------------- Issue: #4134
@derekbruening Is this ready for review? |
Yes, forgot to request it. Thanks for noticing. |
You can punt the x86_64 optimisation for |
You mean adding the use of an immediate instead of a load? OK. Doesn't it not depend on the max encoding: it only depends on the current case value, which never changes? |
Yeah. Okay, max encoding is not needed for that optimisation; we can just check the size of each encoding. |
LGTM. One thing which I noticed is that |
Good catch. Done. Note that you didn't actually mark "Approve" (the "LGTM" string doesn't do it in this system). |
Adds a new drbbdup option max_case_encoding. When the value is small,
we can avoid a 2nd scratch register for aarchxx, since we can fit all
encodings in immediates inside compare instructions.
Adds a sanity check.
Confirmed visually in the debug logs that the 2nd scratch is gone:
Before:
After:
Issue: #4134