Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zignature includes bytes which are a pointer to data #23857

Open
edeca opened this issue Jan 6, 2025 · 8 comments
Open

zignature includes bytes which are a pointer to data #23857

edeca opened this issue Jan 6, 2025 · 8 comments

Comments

@edeca
Copy link
Contributor

edeca commented Jan 6, 2025

Environment

radare2 5.9.8 1 @ windows-x86-64
birth: git.5.9.8 Tue 11/19/2024__11:46:03.42
commit: 4eb49d5ad8c99eaecc8850a2f10bad407067c898
options: gpl -O? cs:5 cl:1 meson

Description

I have been generating signatures for some executables using rasign and then using them with YARA.

I noticed with some samples that certain bytes aren't masked, even though they point to data.

Test

For example, loading this sample1 and creating a signature for the following function (aaaa; s 0x4070A0; zaf):

[0x004070a0]> pd 10
/ 84: fcn.004070a0 ();
|           0x004070a0      55             push ebp
|           0x004070a1      8bec           mov ebp, esp
|           0x004070a3      33c0           xor eax, eax
|           0x004070a5      55             push ebp
|           0x004070a6      68f4704000     push 0x4070f4
|           0x004070ab      64ff30         push dword fs:[eax]
|           0x004070ae      648920         mov dword fs:[eax], esp
|           0x004070b1      ff05b8454900   inc dword [0x4945b8]    <- instruction points to data
|       ,=< 0x004070b7      752d           jne 0x4070e6
|       |   0x004070b9      b850404900     mov eax, 0x494050           ; 'P@I'

In the signature the corresponding bytes for the address 0x4945b8 are not masked:

fcn.004070a0:
  bytes: 558bec33c05568f470400064ff30648920ff05b8454900752db850404900e835c6ffffb81c42 [ snip ]
  mask:  ffffffffffffff00000000ffffffffffffffffffffffffff00ff00000000ff00000000ff0000 [ snip ]
                                               ^^^^^^^^ these bytes should be masked

I believe the instruction is inc dword ptr ds:0x4945b8, so it should be possible to detect this is a pointer to data that might change if the data moves in a subsequent compilation.

Footnotes

  1. 8632a8578ffb6350d07244460f1386c9

@edeca
Copy link
Contributor Author

edeca commented Jan 7, 2025

This can be "fixed" by calling op0_memimmhandle(..) for instructions X86_INS_INC / X86_INS_DEC.

I am not certain this is a correct fix, but ao and signatures generated with zaf now display the correct mask and bytes.

[0x004070b1]> ao
address: 0x4070b1
mask: ff0000000000
bytes: ff05b8454900

// Generate function signature

[0x004070a0]> z
fcn.004070a0:
  bytes: 558bec33c05568f470400064ff30648920ff05b8454900752db850404900e835c6ffffb81c424900e82bc6ffffb8e8434900e821c6ffffe8d8bdffffb83cf04800e856deffff33c05a595964891068fb704000c3
  mask:  ffffffffffffff00000000ffffffffffffff0000000000ff00ff00000000ff00000000ff00000000ff00000000ff00000000ff00000000ff00000000ff00000000ff00000000ffffffffffffffffff00000000ff

A quick patch is below. I can submit a pull request if this is the correct fix, or update and submit a new one.

diff --git a/libr/arch/p/x86/plugin_cs.c b/libr/arch/p/x86/plugin_cs.c
index c1fb3a744f..6a84f20528 100644
--- a/libr/arch/p/x86/plugin_cs.c
+++ b/libr/arch/p/x86/plugin_cs.c
@@ -3644,12 +3644,14 @@ static void anop(RArchSession *a, RAnalOp *op, ut64 addr, const ut8 *buf, int le
                // The CF flag is not affected. The OF, SF, ZF, AF, and PF flags
                // are set according to the result.
                op->type = R_ANAL_OP_TYPE_ADD;
+               op0_memimmhandle (op, insn, addr, regsz);
                op->val = 1;
                break;
        case X86_INS_DEC:
                // The CF flag is not affected. The OF, SF, ZF, AF, and PF flags
                // are set according to the result.
                op->type = R_ANAL_OP_TYPE_SUB;
+               op0_memimmhandle (op, insn, addr, regsz);
                op->val = 1;
                break;
        case X86_INS_NEG:

@trufae
Copy link
Collaborator

trufae commented Jan 7, 2025

Can you please submit a pr with the fix and a test checking for the mask line in ao? Maybe other instructions are affected too?

@trufae
Copy link
Collaborator

trufae commented Jan 7, 2025

i can confirm the fix. but this op0_.. call must be done in more optypes, and this can be probably generalized

@trufae
Copy link
Collaborator

trufae commented Jan 7, 2025

like this. please submit a pr with the patch and add a test using the z and ao commands to ensure this feature wont break in the future. thank you!

diff --git a/libr/arch/p/x86/plugin_cs.c b/libr/arch/p/x86/plugin_cs.c
index c1fb3a744f..f5236bf922 100644
--- a/libr/arch/p/x86/plugin_cs.c
+++ b/libr/arch/p/x86/plugin_cs.c
@@ -3645,20 +3645,24 @@ static void anop(RArchSession *a, RAnalOp *op, ut64 addr, const ut8 *buf, int le
 		// are set according to the result.
 		op->type = R_ANAL_OP_TYPE_ADD;
 		op->val = 1;
+		op0_memimmhandle (op, insn, addr, regsz);
 		break;
 	case X86_INS_DEC:
 		// The CF flag is not affected. The OF, SF, ZF, AF, and PF flags
 		// are set according to the result.
 		op->type = R_ANAL_OP_TYPE_SUB;
 		op->val = 1;
+		op0_memimmhandle (op, insn, addr, regsz);
 		break;
 	case X86_INS_NEG:
 		op->type = R_ANAL_OP_TYPE_SUB;
 		op->family = R_ANAL_OP_FAMILY_CPU;
+		op0_memimmhandle (op, insn, addr, regsz);
 		break;
 	case X86_INS_NOT:
 		op->type = R_ANAL_OP_TYPE_NOT;
 		op->family = R_ANAL_OP_FAMILY_CPU;
+		op0_memimmhandle (op, insn, addr, regsz);
 		break;
 	case X86_INS_PSUBB:
 	case X86_INS_PSUBW:
@@ -3730,8 +3734,10 @@ static void anop(RArchSession *a, RAnalOp *op, ut64 addr, const ut8 *buf, int le
 		break;
 	case X86_INS_DIV:
 		op->type = R_ANAL_OP_TYPE_DIV;
+		op0_memimmhandle (op, insn, addr, regsz);
 		break;
 	case X86_INS_IMUL:
+		op0_memimmhandle (op, insn, addr, regsz);
 		op->type = R_ANAL_OP_TYPE_MUL;
 		op->sign = true;
 		break;

@trufae
Copy link
Collaborator

trufae commented Jan 9, 2025

ping

edeca added a commit to edeca/radare2 that referenced this issue Jan 9, 2025
This sets ->ptr and other values when the operand points to memory.

Partial fix for issue radareorg#23857.
@edeca
Copy link
Contributor Author

edeca commented Jan 9, 2025

I have been through the current instructions and checked x86 reference material for all the standard ones (not MMX / SSE / AVX).

The instructions that can reference memory from operand 0/1 should now have the correct calls (where relevant).

I need to understand the tests a little better before writing one.

@trufae
Copy link
Collaborator

trufae commented Jan 9, 2025

I’m happy to merge the pr as is, we can have the tests later. But i’m happy to get more coverage and verifications on all this logic.

trufae pushed a commit that referenced this issue Jan 9, 2025
This sets ->ptr and other values when the operand points to memory.

Partial fix for issue #23857.
@edeca
Copy link
Contributor Author

edeca commented Jan 10, 2025

Some more testing:

  • x87 FPU instructions are not parsed correctly. Updates in Fixes for x86 operand parsing for x87 FPU instructions #23877
  • LEA is not parsed correctly for the case 8D 1C C5 7C C0 F7 lea ebx, unk_51F7C07C[eax*8]
  • CMPXCHG is not parsed correctly for the case F0 0F B0 25 28 07 lock cmpxchg ds:byte_51F80728, ah
  • MOV is not parsed correctly for the case 8B 0C 85 38 07 F8 mov ecx, ds:dword_51F80738[eax*4] (8B 3D 30 07 F8 51 mov edi, ds:dword_51F80730 and A2 28 07 F8 51 mov ds:byte_51F80728, al with no register is OK)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants