`lld`: append ELF program header option #72386

matheusmoreira · 2023-11-15T13:42:20Z

It would be useful to have a command line option or plugin for the linker that appends an empty `PT_LOAD` program header table entry to ELF executables. This would greatly facilitate patching executables with new data after linking.

The Linux kernel automatically loads those segments into memory and passes a pointer to the program header table via the auxiliary vector. This would be the perfect mechanism to allow executables to easily and efficiently access data embedded into the executable itself, even data patched in after the binary has been compiled.
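To illustrate the mechanism described above, here is a minimal sketch of how a running program might locate an entry in its own program header table through the auxiliary vector. It assumes Linux with a libc that provides `getauxval` and `<link.h>` (glibc or musl); the function name `find_program_header` is made up for this example.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>      /* ElfW() picks the 32- or 64-bit ELF types */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval() */

/* Hypothetical helper: return the first entry of the given type in the
 * running program's own program header table, or NULL if there is none.
 * The table is located through AT_PHDR, AT_PHENT and AT_PHNUM, which the
 * kernel places in the auxiliary vector. */
static const ElfW(Phdr) *find_program_header(ElfW(Word) type)
{
    const unsigned char *table = (const unsigned char *)getauxval(AT_PHDR);
    size_t entry_size = getauxval(AT_PHENT);
    size_t entry_count = getauxval(AT_PHNUM);

    if (table == NULL)
        return NULL;

    /* Entries must be at least as large as the structure we read. */
    assert(entry_size >= sizeof(ElfW(Phdr)));

    for (size_t i = 0; i < entry_count; ++i) {
        const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(table + i * entry_size);
        if (ph->p_type == type)
            return ph;
    }
    return NULL;
}
```

A program could call `find_program_header(PT_LOAD)` to confirm the lookup works, then search for whatever segment type its patching tool writes. Note that in a position-independent executable `p_vaddr` is relative to the load base, so the caller still has to add that base to get a usable pointer.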

Current solutions are insufficient. `objcopy` can add new sections, but the kernel does not load them without a `PT_LOAD` segment, and those can only be created at link time since adding new program headers would change all offsets in the file. Linker scripts support a `PHDRS` command, but using it disables the linker's default behavior and forces users to specify all the segments and map all the sections to them instead of letting the linker do it.

A simple `--append-program-header` option that just adds an empty program header to the end of the table would be ideal. With that feature in place, custom tools can be written to copy arbitrary data into the ELF and then edit the placeholder's offset and size to match.

Links:

Related StackOverflow question
Binutils mailing list discussion
Equivalent mold issue

llvmbot · 2023-11-15T14:20:57Z

@llvm/issue-subscribers-lld-elf

matheusmoreira · 2023-11-24T13:56:36Z

The mold linker has gained support for a `--spare-program-headers N` option, which adds N `PT_NULL` program headers to the linked program. I was able to easily implement an ELF data embedding feature on top of it. Standardization of this feature among all linkers would be great.
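For readers wondering what a tool built on spare headers might do, here is a rough sketch, not the actual implementation mentioned above. It turns the first `PT_NULL` entry of an in-memory ELF64 image into a read-only `PT_LOAD` entry. The function name `claim_spare_header`, the fixed 4 KiB alignment and the assumption that the file uses the host's byte order are all choices made for this illustration only.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rewrite the first PT_NULL program header of an
 * in-memory ELF64 image as a read-only PT_LOAD entry covering `size`
 * bytes at file offset `offset`, mapped at virtual address `vaddr`.
 * Assumes the image uses the host byte order.
 * Returns 0 on success, -1 if the image is unusable or has no spare entry. */
static int claim_spare_header(unsigned char *image, size_t image_size,
                              Elf64_Off offset, Elf64_Addr vaddr,
                              Elf64_Xword size)
{
    Elf64_Ehdr eh;

    assert(image != NULL);
    if (image_size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* The whole program header table must lie inside the image. */
    if (eh.e_phoff > image_size ||
        (uint64_t)eh.e_phnum * sizeof(Elf64_Phdr) > image_size - eh.e_phoff)
        return -1;

    for (size_t i = 0; i < eh.e_phnum; ++i) {
        unsigned char *slot = image + eh.e_phoff + i * sizeof(Elf64_Phdr);
        Elf64_Phdr ph;

        memcpy(&ph, slot, sizeof ph);
        if (ph.p_type != PT_NULL)
            continue;

        memset(&ph, 0, sizeof ph);
        ph.p_type = PT_LOAD;
        ph.p_flags = PF_R;
        ph.p_offset = offset;
        ph.p_vaddr = vaddr;
        ph.p_paddr = vaddr;
        ph.p_filesz = size;
        ph.p_memsz = size;
        ph.p_align = 0x1000;  /* assumed page size */
        memcpy(slot, &ph, sizeof ph);
        return 0;
    }
    return -1;
}
```

The caller remains responsible for appending the data to the file, for choosing an `offset` and `vaddr` that are congruent modulo the alignment, and for picking a `vaddr` that does not overlap existing segments. The ELF specification also expects `PT_LOAD` entries to appear in ascending `p_vaddr` order, so an address above every existing segment is the safe choice.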

MaskRay · 2023-11-28T05:59:33Z

I think Nick's reply on the Binutils list signals some uncertainty about whether such a feature would be accepted in GNU ld. I share a similar thought: this should perhaps be the responsibility of a post-link tool:
https://sourceware.org/pipermail/binutils/2023-November/130646.html

patchelf implements program header relocation, from which you can take inspiration. As a general post-link tool, it should probably work for every link unit, not just those linked with specific linker options.

matheusmoreira · 2023-11-28T11:27:20Z

@MaskRay Yes, it certainly seems that way...

I've created the relevant issue and sent a contribution to patchelf as well.

NixOS/patchelf#533
NixOS/patchelf#534

I've also written an article about the real use case this feature enables:

Self-contained Linux applications with lone lisp

I used the PT_NULL headers feature to add custom embedded code segments to a programming language interpreter executable. The code is automatically loaded by the operating system, the interpreter finds it and runs it.

The mold maintainer also suggested moving the program header table to the end of the file so that it can freely grow in size. I tried implementing it this way in my own tools but couldn't get it to work unfortunately. I will try again in the future but for now I've chosen to depend on mold.

Even in that case I think a linker implementation would be beneficial. If the linker takes care of it, there will be no need to move the table to the end of the file. The result is a smaller, more compact executable.

MaskRay · 2023-11-29T05:51:15Z

> Even in that case I think a linker implementation would be beneficial. If the linker takes care of it, there will be no need to move the table to the end of the file. The result is a smaller, more compact executable.

This is a case where the feature really belongs in the post-link tool. And the argument for integrating it into mold is kinda weak: it is easy and there is already a `--spare-*` option for dynamic tags. And you can see that the argument does not apply to GNU ld...

For your use case, you may control how the linker output is linked, so you are fine with a linker option. However, in a lot of cases, the user of a post-link tool has limited control over the link process, so a smart post-link tool is naturally required.

github-actions bot added the lld label Nov 15, 2023

matheusmoreira mentioned this issue Nov 15, 2023

Option to append PT_LOAD segments to ELFs rui314/mold#1148

Closed

EugeneZelenko added lld:ELF and removed lld labels Nov 15, 2023

guevara mentioned this issue Jan 25, 2024

Self-contained Linux applications with lone lisp guevara/read-it-later#10702

Open
