lld: append ELF program header option #72386

Open
matheusmoreira opened this issue Nov 15, 2023 · 5 comments

@matheusmoreira

matheusmoreira commented Nov 15, 2023

It would be useful to have a command line option or plugin for the linker that appends an empty `PT_LOAD` program header table entry to ELF executables. This would greatly facilitate patching executables with new data after linking.

The Linux kernel automatically loads those segments into memory and passes a pointer to the program header table via the auxiliary vector. This would be the perfect mechanism to allow executables to easily and efficiently access data embedded in the executable itself, even data patched in after the binary has been compiled.
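To make the auxiliary vector part concrete, here is a minimal sketch of decoding it. It assumes a little-endian 64-bit target, where the vector is a flat array of (type, value) word pairs terminated by `AT_NULL`; the `parse_auxv` helper is hypothetical and exists only for illustration.

```python
import struct

# Entry types as defined in <elf.h>.
AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM = 0, 3, 4, 5


def parse_auxv(raw, word="Q"):
    """Decode an auxiliary vector made of (type, value) pairs of words.

    Assumption: little-endian words; "Q" means 8-byte words (64-bit target).
    """
    pair = struct.Struct("<" + word * 2)
    usable = len(raw) - len(raw) % pair.size  # ignore a trailing partial pair
    entries = {}
    for a_type, a_val in pair.iter_unpack(raw[:usable]):
        if a_type == AT_NULL:  # terminator entry
            break
        entries[a_type] = a_val
    return entries
```

On Linux the raw vector of the current process can usually be read from `/proc/self/auxv`. `AT_PHDR` is the in-memory address of the program header table, `AT_PHENT` the size of one entry, and `AT_PHNUM` the number of entries, so a native program would walk the table directly from that pointer.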

Current solutions are insufficient. objcopy can add new sections, but the kernel does not load them unless a PT_LOAD segment covers them, and such segments can only be created at link time because adding new program headers would change all offsets in the file. Linker scripts support a PHDRS command, but using it disables the linker's default behavior and forces users to specify every segment and map every section to one instead of letting the linker do it.

A simple --append-program-header option that just adds an empty program header to the end of the table would be ideal. With that feature in place, custom tools can be written to copy arbitrary data into the ELF file and then edit the placeholder's offset and size to match.
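A rough sketch of what such a custom tool might look like, under several assumptions: the image is little-endian ELF64, the placeholder is a `PT_NULL` entry, pages are 4 KiB, and the new segment is mapped read-only just above the highest existing `PT_LOAD`. The `embed_data` function is hypothetical, not part of any existing tool.

```python
import struct

PHDR = struct.Struct("<IIQQQQQQ")  # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align
PT_NULL, PT_LOAD, PF_R = 0, 1, 4
PAGE = 0x1000  # assumed page size, used as the segment alignment


def align_up(value, alignment):
    return (value + alignment - 1) // alignment * alignment


def embed_data(elf, data):
    """Append data to the image and turn the first PT_NULL header into a PT_LOAD for it."""
    image = bytearray(elf)
    if image[:6] != b"\x7fELF\x02\x01":
        raise ValueError("not a little-endian ELF64 image")
    # Elf64_Ehdr: e_phoff at byte 32, e_phentsize at byte 54, e_phnum at byte 56.
    (phoff,) = struct.unpack_from("<Q", image, 32)
    phentsize, phnum = struct.unpack_from("<HH", image, 54)

    placeholder = None
    next_vaddr = 0
    for index in range(phnum):
        pos = phoff + index * phentsize
        p_type, _, _, vaddr, _, _, memsz, _ = PHDR.unpack_from(image, pos)
        if p_type == PT_LOAD:
            next_vaddr = max(next_vaddr, vaddr + memsz)
        elif p_type == PT_NULL and placeholder is None:
            placeholder = pos
    if placeholder is None:
        raise ValueError("no spare PT_NULL program header")

    # Page-align both the file offset and the address so they stay congruent.
    offset = align_up(len(image), PAGE)
    vaddr = align_up(next_vaddr, PAGE)
    image.extend(bytes(offset - len(image)))  # zero padding up to the offset
    image.extend(data)
    PHDR.pack_into(image, placeholder, PT_LOAD, PF_R,
                   offset, vaddr, vaddr, len(data), len(data), PAGE)
    return bytes(image)
```

Because the placeholder already exists in the table, nothing else in the file has to move: the data is appended and a single header is rewritten in place. A real tool would also need to handle 32-bit and big-endian files and check that the chosen address range is actually free.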

Links:

Related StackOverflow question
Binutils mailing list discussion
Equivalent mold issue

@llvmbot
Member

llvmbot commented Nov 15, 2023

@llvm/issue-subscribers-lld-elf

Author: Matheus Moreira (matheusmoreira)

@matheusmoreira
Author

The mold linker has gained support for a --spare-program-headers N option, which adds N PT_NULL program headers to the linked program. I was able to easily implement an ELF data embedding feature on top of it. Standardization of this feature among all linkers would be great.
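One way to check that a linked binary really carries the spare entries is to count its `PT_NULL` headers. This is a small sketch that assumes a little-endian ELF64 file; `count_spare_headers` is a hypothetical helper, not something mold provides.

```python
import struct

PT_NULL = 0


def count_spare_headers(elf):
    """Return the number of PT_NULL entries in the program header table."""
    if elf[:6] != b"\x7fELF\x02\x01":
        raise ValueError("not a little-endian ELF64 image")
    # Elf64_Ehdr: e_phoff at byte 32, e_phentsize at byte 54, e_phnum at byte 56.
    (phoff,) = struct.unpack_from("<Q", elf, 32)
    phentsize, phnum = struct.unpack_from("<HH", elf, 54)
    count = 0
    for index in range(phnum):
        # p_type is the first 32-bit field of every program header.
        (p_type,) = struct.unpack_from("<I", elf, phoff + index * phentsize)
        if p_type == PT_NULL:
            count += 1
    return count
```

A post-link tool could call this first and refuse to run, or fall back to another strategy, when the count is zero.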

@MaskRay
Member

MaskRay commented Nov 28, 2023

I think Nick's reply on the Binutils list suggests some uncertainty about whether such a feature would be accepted in GNU ld. I share a similar view that this should perhaps be the responsibility of the post-link tool:
https://sourceware.org/pipermail/binutils/2023-November/130646.html

patchelf implements program header relocation, from which you can take inspiration. As a general post-link tool, it should probably work for every link unit, not only those linked with specific linker options.

@matheusmoreira
Author

matheusmoreira commented Nov 28, 2023

@MaskRay Yes, it certainly seems that way...

I've created the relevant issue and sent a contribution to patchelf as well.

NixOS/patchelf#533
NixOS/patchelf#534

I've also written an article about the real use case this feature enables:

I used the PT_NULL headers feature to add custom embedded code segments to a programming language interpreter executable. The code is automatically loaded by the operating system, the interpreter finds it and runs it.

The mold maintainer also suggested moving the program header table to the end of the file so that it can freely grow in size. I tried implementing it this way in my own tools but couldn't get it to work unfortunately. I will try again in the future but for now I've chosen to depend on mold.

Even in that case I think a linker implementation would be beneficial. If the linker takes care of it, there will be no need to move the table to the end of the file. The result is a smaller, more compact executable.

@MaskRay
Member

MaskRay commented Nov 29, 2023

> Even in that case I think a linker implementation would be beneficial. If the linker takes care of it, there will be no need to move the table to the end of the file. The result is a smaller, more compact executable.

This is a case where the feature really belongs in the post-link tool. And the argument for integrating it into mold is kinda weak: it is easy and there is already a --spare-* option for dynamic tags. And you can see that the argument does not apply to GNU ld...

For your use case, you may control how the output is linked, so you are fine with a linker option. However, in a lot of cases, the user of a post-link tool has limited control over the link process, so a smart post-link tool is naturally required.
