Reencoded result doesnt match source code #151

Open
expl01txx opened this issue Jan 19, 2025 · 11 comments

Comments

@expl01txx

expl01txx commented Jan 19, 2025

When I decode the code and then re-encode it unchanged, the result differs from the original (for example, the encoded size of the `sub` instruction).

Code:

    // Load the PE file and grab the raw bytes of its .text section.
    auto pe_binary = LIEF::PE::Parser::parse("crackme.exe");
    auto text_section = pe_binary->get_section(".text");

    auto code = text_section->content();
    std::cout << std::hex << code.size() << std::endl;

    const uint64_t baseAddr = text_section->virtual_address();

    Program program(MachineMode::AMD64);
    Decoder decoder(program.getMode());
    x86::Assembler assembler(program);

    // Decode the section linearly and emit every instruction unchanged.
    size_t bytesDecoded = 0;
    while (bytesDecoded < code.size_bytes())
    {
        const auto curAddress = baseAddr + bytesDecoded;

        const auto decoderRes = decoder.decode(code.data() + bytesDecoded, code.size() - bytesDecoded, curAddress);
        if (!decoderRes)
        {
            std::cout << "Failed to decode at " << std::hex << curAddress << ", " << decoderRes.error().getErrorName() << "\n";
            return EXIT_FAILURE;
        }

        const auto& instr = decoderRes.value();
        if (auto res = assembler.emit(instr); res != zasm::ErrorCode::None)
        {
            std::cout << "Failed to emit instruction " << std::hex << curAddress << ", " << res.getErrorName() << "\n";
        }

        bytesDecoded += instr.getLength();
    }

    // Re-encode the whole program at the original base address.
    Serializer serializer;
    serializer.serialize(program, baseAddr);

    const auto codeDump = zutils::getDisassemblyDump(serializer, program.getMode());
    std::cout << codeDump << "\n";

    // Replace the section contents with the re-encoded bytes.
    auto new_code = serializer.getCode();
    auto new_code_array = std::vector<uint8_t>(new_code, new_code + serializer.getCodeSize());

    text_section->content(new_code_array);

    std::cout << std::hex << serializer.getCodeSize() << std::endl;

    // Set the entry point and write the modified PE.
    pe_binary->optional_header().addressof_entrypoint(0x10d3);
    pe_binary->write("out.exe");

Image

Image

@hxm-cpp
Contributor

hxm-cpp commented Jan 19, 2025

This is normal: re-encoding may add or remove prefix bytes. That has nothing to do with zasm; it depends on Zydis.

@ZehMatt
Collaborator

ZehMatt commented Jan 19, 2025

Is it actually making the instruction bigger or smaller? The encoder typically chooses the smaller encoding. If it's the opposite, it's best to provide code that reproduces it easily so I can look into it and add it to the tests, but I imagine it's most likely getting smaller.
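
To illustrate why a re-encoded `sub` can shrink, here is a minimal sketch, not zasm or Zydis code. `encodeSubRsp` and its `preferShort` flag are hypothetical names made up for this example. x86-64 has two valid encodings of `sub rsp, imm`: a 4-byte form with a sign-extended 8-bit immediate and a 7-byte form with a 32-bit immediate. An encoder that prefers the short form will shrink any input that used the long one.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, for illustration only: encodes `sub rsp, imm`.
// ModRM 0xEC = mod 11, reg 5 (the /5 opcode extension for SUB), rm 4 (rsp).
std::vector<std::uint8_t> encodeSubRsp(std::int32_t imm, bool preferShort)
{
    if (preferShort && imm >= -128 && imm <= 127)
    {
        // REX.W 83 /5 ib: 8-bit immediate, sign-extended by the CPU.
        return { 0x48, 0x83, 0xEC, static_cast<std::uint8_t>(imm) };
    }

    // REX.W 81 /5 id: 32-bit immediate stored little-endian.
    const auto u = static_cast<std::uint32_t>(imm);
    return { 0x48, 0x81, 0xEC,
             static_cast<std::uint8_t>(u & 0xFF),
             static_cast<std::uint8_t>((u >> 8) & 0xFF),
             static_cast<std::uint8_t>((u >> 16) & 0xFF),
             static_cast<std::uint8_t>((u >> 24) & 0xFF) };
}
```

With this sketch, `encodeSubRsp(0x28, false)` gives the 7-byte form `48 81 EC 28 00 00 00`, while `encodeSubRsp(0x28, true)` gives the 4-byte form `48 83 EC 28`. Both decode to the same instruction, so every following instruction moves up by 3 bytes.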

@expl01txx
Author

It's making the code smaller, and that's okay, but relative jumps break on re-encoding.

@ZehMatt
Collaborator

ZehMatt commented Jan 20, 2025

> It's making the code smaller, and that's okay, but relative jumps break on re-encoding.

What is broken?

@expl01txx
Author

The addresses of relative jumps and calls: the code just jumps to an incorrect address.

@ZehMatt
Collaborator

ZehMatt commented Jan 20, 2025

Can you provide an example to reproduce this? There are various tests for this already so I need a bit more information to reproduce the issue.

@expl01txx
Author

The issue is with the program on the left:
Image

@ZehMatt
Collaborator

ZehMatt commented Jan 20, 2025

> The issue is with the program on the left: [Image]

I'm not going to try to make out which addresses are used here from screenshots alone. You need to create a minimal code sample that reproduces the issue, which I can run and look into.

@expl01txx
Author

I added the code to the first message.

@es3n1n
Contributor

es3n1n commented Jan 20, 2025

This is pure guessing, as there aren't many details about the issue, but I think what's happening is this: while decoding the original text section's code, you're not replacing raw references to addresses with zasm's labels. When the code changes size, the referenced parts of the image aren't relocated, so those references keep pointing at the old addresses.
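
A minimal sketch of that effect, assuming a made-up three-instruction layout. It uses no zasm or LIEF API, and `Instr`, `buildAddressMap`, and `relDisplacement` are hypothetical names for this example. It lays the instructions out again with their new lengths and compares a branch whose target is kept as a raw old address with one whose target is remapped, which is roughly what binding the target to a label would achieve.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record for illustration; not a zasm or LIEF type.
struct Instr
{
    std::uint64_t oldAddress; // address in the original section
    std::uint32_t oldLength;  // encoded length in the original section
    std::uint32_t newLength;  // encoded length after re-encoding
};

// Lays the instructions out back to back from baseAddr using the new lengths
// and returns a map from each old address to its new address.
std::map<std::uint64_t, std::uint64_t> buildAddressMap(const std::vector<Instr>& instrs,
                                                       std::uint64_t baseAddr)
{
    std::map<std::uint64_t, std::uint64_t> oldToNew;
    std::uint64_t cur = baseAddr;
    for (const auto& instr : instrs)
    {
        oldToNew[instr.oldAddress] = cur;
        cur += instr.newLength;
    }
    return oldToNew;
}

// Displacement of a relative branch: target minus the address of the next instruction.
std::int64_t relDisplacement(std::uint64_t instrAddress, std::uint32_t instrLength,
                             std::uint64_t target)
{
    return static_cast<std::int64_t>(target)
         - static_cast<std::int64_t>(instrAddress + instrLength);
}
```

Take a 7-byte instruction at `0x1000` that shrinks to 4 bytes, followed by a 5-byte `call` at `0x1007` whose target is the instruction at `0x100C`. After re-encoding, the `call` sits at `0x1004` and its intended target at `0x1009`. If the target is still the raw value `0x100C`, the branch lands 3 bytes past the instruction it was meant to reach. Remapping the target through the map, or referring to it by a label, keeps the branch on the right instruction.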

@ZehMatt
Collaborator

ZehMatt commented Jan 24, 2025

I've made some changes to the encoder; check if they help. I can't really do much with your code without also having the input data. Please provide the raw bytes or, better, copy the disassembly directly from x64dbg so I can try to reproduce the issue.
