Generating executable files from scratch

Julianne Swinoga edited this page Nov 5, 2023 · 18 revisions

Introduction

The first step in getting the operating system to execute arbitrary assembly is to figure out which executable file formats your operating system supports. Since I would be developing my compiler on Linux, I chose the Executable and Linkable Format (ELF). There is a wealth of information on the composition of ELF files, but the ELF standard is large and wide-ranging, so it is not easy to pick out just what is needed to get a bare-minimum example working. I am writing this as a compendium of the research and piecing together that I did to be able to write YABFC.

Starting at the top

Looking through the documentation for the system header elf.h, there are a few structures that we can use to set up the executable file. For certain reasons I will be using the 64-bit version of an ELF executable rather than the 32-bit version. The first few lines of setup are straightforward and rigorously defined:

Elf64_Ehdr ELFHeader; // Initialize the ELF header

ELFHeader.e_ident[EI_MAG0]       = 0x7f; // Magic numbers
ELFHeader.e_ident[EI_MAG1]       = 'E';
ELFHeader.e_ident[EI_MAG2]       = 'L';
ELFHeader.e_ident[EI_MAG3]       = 'F';
ELFHeader.e_ident[EI_CLASS]      = ELFCLASS64;    // 64 bit ELF
ELFHeader.e_ident[EI_DATA]       = ELFDATA2LSB;   // little-endian
ELFHeader.e_ident[EI_VERSION]    = EV_CURRENT;    // Current version
ELFHeader.e_ident[EI_OSABI]      = ELFOSABI_SYSV; // UNIX System V ABI
ELFHeader.e_ident[EI_ABIVERSION] = 0x0;           // ABI version needs to be 0

for (int i = EI_PAD; i < EI_NIDENT; i++) ELFHeader.e_ident[i] = 0x0; // Zero padding

ELFHeader.e_type    = ET_EXEC;            // Executable file
ELFHeader.e_machine = EM_X86_64;          // AMD x86-64
ELFHeader.e_version = EV_CURRENT;         // Current version
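One thing the listing above leaves implicit is that a plain `Elf64_Ehdr ELFHeader;` declared inside a function starts with indeterminate contents. A minimal sketch, assuming a helper named `init_elf_header` (a hypothetical name, not taken from YABFC), zeroes the whole struct first so that any field not assigned explicitly is 0. `ELFMAG` and `SELFMAG` are the magic string and its length as defined in glibc's elf.h:

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/* Hypothetical helper: zero the header, then fill in the identification
 * bytes and the three fixed fields shown above. */
static void init_elf_header(Elf64_Ehdr *h)
{
    assert(h != NULL);
    memset(h, 0, sizeof *h);               /* also zeroes the EI_PAD bytes */
    memcpy(h->e_ident, ELFMAG, SELFMAG);   /* 0x7f 'E' 'L' 'F' */
    h->e_ident[EI_CLASS]   = ELFCLASS64;   /* 64 bit ELF */
    h->e_ident[EI_DATA]    = ELFDATA2LSB;  /* little-endian */
    h->e_ident[EI_VERSION] = EV_CURRENT;
    h->e_ident[EI_OSABI]   = ELFOSABI_SYSV;
    h->e_type    = ET_EXEC;
    h->e_machine = EM_X86_64;
    h->e_version = EV_CURRENT;
}
```

With this approach the explicit padding loop and the `EI_ABIVERSION` assignment become redundant, since `memset` has already cleared those bytes.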

After this, things start to get a little more complicated. We need to configure the entry point of the program, the program and section header table offsets, and the header sizes. The ELF specification does not dictate where the different sections are placed in the file, as long as the offsets recorded in the headers point at the correct data. An important distinction here is the difference between memory on file and program runtime memory, hereafter referred to as file location (_FILE_LOC) and memory location (_MEM_LOC).

File location is an offset in bytes from the start of the file that you are creating (an offset because the operating system abstracts away where the file ACTUALLY lives on disk). It tells the operating system where to look in your file in order to read the correct data. This data is then loaded into the program runtime memory (virtual address space), where the running program can read it. With that said, we need to start by picking a virtual address at which to base the program.

Arbitrary numbers and where to find them

I was initially planning to make a 32-bit ELF, so poking around on various threads led me to the magical "somewhere above 0x8048000" number, which is roughly 128 MiB. In 64-bit land, the 0x4000000 address seemed to be used, so this is the origin memory address used going forward. A thread that explains some of the magic behind these numbers is here.

The memory structure of our ELF file will be as follows:

#define ORG (0x4000000) // Origin memory address

#define PGM_HEADER_TBL_LOC (sizeof(Elf64_Ehdr)) // Program header location
#define PGM_HEADER_SIZE (sizeof(Elf64_Phdr))    // Program header size
#define SEC_HEADER_SIZE (sizeof(Elf64_Shdr))    // Section header size
#define PGM_HEADER_NUM (2)                      // Number of program headers
#define SEC_HEADER_NUM (4)                      // Number of section headers

#define TEXT_FILE_LOC (PGM_HEADER_TBL_LOC + (PGM_HEADER_NUM * PGM_HEADER_SIZE)) // .text file location
#define TEXT_MEM_LOC (ORG + TEXT_FILE_LOC)                                      // .text memory location

#define ENTRY_POINT TEXT_MEM_LOC // Executable entry point
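To make the macros concrete: on x86-64 Linux, `sizeof(Elf64_Ehdr)` is 64 bytes and `sizeof(Elf64_Phdr)` is 56 bytes, so .text begins at file offset 64 + 2 × 56 = 176 (0xB0) and is mapped at 0x40000B0. The sketch below repeats the relevant macros and wraps them in small functions so the values can be checked; the function names are hypothetical and only there for illustration:

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

#define ORG (0x4000000) /* Origin memory address */
#define PGM_HEADER_TBL_LOC (sizeof(Elf64_Ehdr))
#define PGM_HEADER_SIZE (sizeof(Elf64_Phdr))
#define PGM_HEADER_NUM (2)
#define TEXT_FILE_LOC (PGM_HEADER_TBL_LOC + (PGM_HEADER_NUM * PGM_HEADER_SIZE))
#define TEXT_MEM_LOC (ORG + TEXT_FILE_LOC)

/* The arithmetic above relies on these struct sizes */
static_assert(sizeof(Elf64_Ehdr) == 64, "unexpected Elf64_Ehdr size");
static_assert(sizeof(Elf64_Phdr) == 56, "unexpected Elf64_Phdr size");

static uint64_t text_file_loc(void) { return TEXT_FILE_LOC; }
static uint64_t text_mem_loc(void)  { return TEXT_MEM_LOC; }

/* Map a file offset to the address it is loaded at under this layout */
static uint64_t file_to_mem(uint64_t file_off) { return ORG + file_off; }
```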

File layout:

| Section | Notes |
| --- | --- |
| ELF Header | How the file is laid out |
| Program Header Table: Program Header 1 | For the .text section |
| Program Header Table: Program Header 2 | For the .data section |
| .text section | x86 Assembly code |
| .data section | General data storage |
| .shrtrab section | String table |
| Section Header table: Section Header 1 | Mandatory null section |
| Section Header table: Section Header 2 | For the .text section |
| Section Header table: Section Header 3 | For the .data section |
| Section Header table: Section Header 4 | For the .shrtrab section |

We can now start to plug some values into the ELF header on where data is located.

ELFHeader.e_entry   = ENTRY_POINT;        // Entry point of program
ELFHeader.e_phoff   = PGM_HEADER_TBL_LOC; // Program header table offset
ELFHeader.e_shoff   = 0x0;                // Section header table offset
ELFHeader.e_flags   = 0x0;                // Processor specific flags
ELFHeader.e_ehsize  = sizeof(Elf64_Ehdr); // ELF Header size

As the section header table is at the end of the file, and we don't actually have any data in the file yet, we hold off setting the section header table offset for now.

Section Sizing

Next we tell the ELF header how many program and section headers we have and how big each one is.

ELFHeader.e_phentsize = PGM_HEADER_SIZE; // Size of each program header
ELFHeader.e_phnum     = PGM_HEADER_NUM;  // Number of entries in program header table
ELFHeader.e_shentsize = SEC_HEADER_SIZE; // Section header size, in bytes
ELFHeader.e_shnum     = SEC_HEADER_NUM;  // Number of entries in section header

// Section header table index of the entry associated with the section name string table, to be set later
ELFHeader.e_shstrndx = SHN_UNDEF;
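For reference, both deferred fields follow directly from the file layout listed earlier. This is a sketch only, assuming a hypothetical helper `finish_elf_header` and that the byte lengths of the three section bodies are known by then. The section header table sits right after the .shrtrab data, and .shrtrab is the fourth section header, so its index is 3:

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define TEXT_FILE_LOC (sizeof(Elf64_Ehdr) + 2 * sizeof(Elf64_Phdr))

/* Hypothetical helper: fill in the two fields that depend on the final
 * sizes of .text, .data and the string table. */
static void finish_elf_header(Elf64_Ehdr *h, uint64_t text_len,
                              uint64_t data_len, uint64_t strtab_len)
{
    assert(h != NULL);
    /* Section header table follows .text, .data and .shrtrab in the file */
    h->e_shoff = TEXT_FILE_LOC + text_len + data_len + strtab_len;
    /* Section headers are ordered null, .text, .data, .shrtrab */
    h->e_shstrndx = 3;
}
```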

Program Header Table

Let's assume that we already have an array of bytes that represents our assembly code (I will talk about some gotchas with this later). We now need a program header to describe this data. As we have two segments of data (.text and .data), we need two program headers. Fortunately, this is relatively easy to set up.

Elf64_Phdr programHeaderText;
programHeaderText.p_type   = PT_LOAD;
programHeaderText.p_flags  = PF_R | PF_X;      // Segment permissions: read and execute
programHeaderText.p_offset = TEXT_FILE_LOC;    // File offset for the contents of the segment
programHeaderText.p_vaddr  = TEXT_MEM_LOC;     // Virtual address where the segment will be loaded
programHeaderText.p_paddr  = TEXT_MEM_LOC;     // Physical address, ignored by System V for programs, so mirror p_vaddr
programHeaderText.p_filesz = lengthoftext;     // Length of segment in the file, in bytes
programHeaderText.p_memsz  = lengthoftext;     // Length of segment in memory, in bytes
programHeaderText.p_align  = 0x0;              // No alignment

Elf64_Phdr programHeaderData;
programHeaderData.p_type   = PT_LOAD;
programHeaderData.p_flags  = PF_R | PF_W | PF_X;           // Segment permissions: read, write and execute
programHeaderData.p_offset = TEXT_FILE_LOC + lengthoftext; // File offset for the contents of the segment
programHeaderData.p_vaddr  = TEXT_MEM_LOC + lengthoftext;  // Virtual address where the segment will be loaded
programHeaderData.p_paddr  = TEXT_MEM_LOC + lengthoftext;  // Physical address, ignored by System V for programs, so mirror p_vaddr
programHeaderData.p_filesz = lengthofdata;                 // Length of segment in the file, in bytes
programHeaderData.p_memsz  = lengthofdata;                 // Length of segment in memory, in bytes
programHeaderData.p_align  = 0x0;                          // No alignment

Notice that the .data segment is placed directly after the .text segment, both in the file and in memory, and note the permissions on each segment. You cannot write to the .text segment, but you can read, write, and execute data from the .data segment.
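Two details of these headers can be sketched in code. Since the permission flags are distinct bit masks, they are normally combined with `|`; adding them gives the same value only because no bit is repeated. The ELF specification also asks that loadable segments have `p_offset` and `p_vaddr` congruent modulo the page size, which this layout satisfies because ORG is page-aligned. The helper names and the 4096-byte page size below are assumptions made for illustration:

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u /* assumption: typical x86-64 page size */

/* Hypothetical helper: build a PT_LOAD header whose file and memory sizes
 * are equal. flags is an OR of PF_R, PF_W and PF_X. */
static Elf64_Phdr make_load_segment(uint32_t flags, uint64_t file_off,
                                    uint64_t vaddr, uint64_t len)
{
    Elf64_Phdr p = {0};
    p.p_type   = PT_LOAD;
    p.p_flags  = flags;
    p.p_offset = file_off;
    p.p_vaddr  = vaddr;
    p.p_paddr  = vaddr;
    p.p_filesz = len;
    p.p_memsz  = len;
    p.p_align  = 0;
    return p;
}

/* True when the file offset and virtual address agree modulo the page size */
static bool offsets_congruent(const Elf64_Phdr *p)
{
    assert(p != NULL);
    return (p->p_offset % PAGE_SIZE_ASSUMED) == (p->p_vaddr % PAGE_SIZE_ASSUMED);
}
```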

Section Header Table

The section headers are much the same as the program headers: they fill in the 'where' and 'how much' of the data in our file. One (semi-)important piece of information is the sh_name value. This is the byte offset into the string table at which the section's name begins. For example, if the string data is composed of

"\0.text\0.data\0.shrtrab\0"

The sh_name values will be as follows:

| Section | sh_name |
| --- | --- |
| Null | 0 |
| .text | 1 |
| .data | 7 |
| .shrtrab | 13 |
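Rather than counting characters by hand, the offsets can be looked up by walking the table one NUL-terminated entry at a time. This is a minimal sketch; `strtab_offset` is a hypothetical helper name, and it assumes the table ends with a NUL byte as in the example string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the byte offset of `name` within a NUL-separated string table of
 * `tab_len` bytes, or -1 if the name is not present. */
static long strtab_offset(const char *tab, size_t tab_len, const char *name)
{
    assert(tab != NULL && name != NULL);
    size_t off = 0;
    while (off < tab_len) {
        if (strcmp(tab + off, name) == 0)
            return (long)off;
        off += strlen(tab + off) + 1; /* skip this entry and its NUL */
    }
    return -1;
}
```

Called on the example string with a length of 22 bytes, this returns 1 for ".text", 7 for ".data" and 13 for ".shrtrab", matching the values listed above.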

Since there are four section headers, I will only highlight the differences.

Same for all:

sectionHeaderNull.sh_link      = 0; // Currently unused
sectionHeaderNull.sh_info      = 0;
sectionHeaderNull.sh_addralign = 0;
sectionHeaderNull.sh_entsize   = 0;

Null Header

Elf64_Shdr sectionHeaderNull;
sectionHeaderNull.sh_name      = 0;
sectionHeaderNull.sh_type      = SHT_NULL;
sectionHeaderNull.sh_flags     = 0;
sectionHeaderNull.sh_addr      = 0;
sectionHeaderNull.sh_offset    = 0;
sectionHeaderNull.sh_size      = 0;

.text Header

Elf64_Shdr sectionHeaderText;
sectionHeaderText.sh_name      = 1;
sectionHeaderText.sh_type      = SHT_PROGBITS;              // Type of the section
sectionHeaderText.sh_flags     = SHF_ALLOC | SHF_EXECINSTR; // Section flags
sectionHeaderText.sh_addr      = TEXT_MEM_LOC;              // Section memory location
sectionHeaderText.sh_offset    = TEXT_FILE_LOC;             // Section file location
sectionHeaderText.sh_size      = lengthoftext;              // Section size

.data Header

Elf64_Shdr sectionHeaderData;
sectionHeaderData.sh_name      = 7;
sectionHeaderData.sh_type      = SHT_PROGBITS;
sectionHeaderData.sh_flags     = SHF_ALLOC | SHF_WRITE;
sectionHeaderData.sh_addr      = TEXT_MEM_LOC + lengthoftext;
sectionHeaderData.sh_offset    = TEXT_FILE_LOC + lengthoftext;
sectionHeaderData.sh_size      = lengthofdata;

.shrtrab Header

Elf64_Shdr sectionHeaderShrtrab;
sectionHeaderShrtrab.sh_name      = 13;
sectionHeaderShrtrab.sh_type      = SHT_STRTAB;
sectionHeaderShrtrab.sh_flags     = 0;
sectionHeaderShrtrab.sh_addr      = 0;
sectionHeaderShrtrab.sh_offset    = TEXT_FILE_LOC + lengthoftext + lengthofdata; // Follows .data in the file
sectionHeaderShrtrab.sh_size      = lengthofstringtable;

Assembly is fun!

(No it's not)

Assembly is fast, but the tradeoff is that assembly is very low level (obviously). By building an executable from scratch, we are even more limited than with normal assembly. We are not linking any libraries into this file, so we don't have any standard functions available. In order to do any input or output, we need to use the system calls directly.

Generating bytes from assembly instructions is outside the scope of this project, so I used an online assembler to do the translation. Here is a list of snippets which I found very useful.

Initialization code

xor rbp, rbp
mov r9, rdx
pop rsi
mov rdx, rsp
and rsp, 0xfffffffffffffff0
sub rsp, 8
uint8_t machCode[] = {0x48, 0x31, 0xED, 0x49, 0x89, 0xD1, 0x5E, 0x48, 0x89, 0xE2, 0x48, 0x83, 0xE4, 0xF0, 0x48, 0x83, 0xEC, 0x08};

This snippet sets up the stack. I found it somewhere but can't remember where now.

Exit code

mov rax, 1
mov rbx, 0
int 0x80
uint8_t machCode[] = {0x48, 0xC7, 0xC0, 0x01, 0x00, 0x00, 0x00, 0x48, 0xC7, 0xC3, 0x00, 0x00, 0x00, 0x00, 0xCD, 0x80};

This is a snippet to exit with exit code 0 (as controlled by rbx). It goes through the 32-bit int 0x80 interface, where system call 1 is exit.

Printing

mov rdx, 1   # 1 char
mov rsi, rsp # char *buf
mov rdi, 1   # fd = stdout
mov rax, 1   # sys_write
syscall
uint8_t machCode[] = {0x48, 0xC7, 0xC2, 0x01, 0x00, 0x00, 0x00, 0x48, 0x89, 0xE6, 0x48, 0xC7, 0xC7, 0x01, 0x00, 0x00, 0x00, 0x48, 0xC7, 0xC0, 0x01, 0x00, 0x00, 0x00, 0x0F, 0x05};

This snippet caused me a lot of trouble. Until I got it working, I was using the 32-bit int 0x80 system interrupt. After much research into why system call #1 was not printing anything, I concluded that on x86_64, int 0x80 and syscall are different interfaces with different system call numbers and argument registers: number 1 is exit through int 0x80 but write through syscall. Currently this snippet just prints the 1 character that is located at the stack pointer.
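For comparison, here is a sketch of the exit snippet rewritten for the native 64-bit syscall interface, where exit is system call 60 and the exit status goes in rdi. The bytes follow the same `mov r64, imm32` encoding pattern as the snippets above; the array name is hypothetical and this variant is not taken from YABFC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* mov rax, 60   # sys_exit in the x86-64 syscall table
 * mov rdi, 0    # exit status
 * syscall */
static const uint8_t exitSyscall[] = {
    0x48, 0xC7, 0xC0, 0x3C, 0x00, 0x00, 0x00, /* mov rax, 60 */
    0x48, 0xC7, 0xC7, 0x00, 0x00, 0x00, 0x00, /* mov rdi, 0  */
    0x0F, 0x05                                /* syscall     */
};

/* Two 7-byte moves plus the 2-byte syscall instruction */
static_assert(sizeof exitSyscall == 16, "unexpected snippet length");

static const size_t exitSyscallLen = sizeof exitSyscall;
```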

Input

mov rdx, 1   # 1 char
mov rsi, rsp # char *buf
mov rdi, 0   # fd = stdin
mov rax, 0   # sys_read
syscall
uint8_t machCode[] = {0x48, 0xC7, 0xC2, 0x01, 0x00, 0x00, 0x00, 0x48, 0x89, 0xE6, 0x48, 0xC7, 0xC7, 0x00, 0x00, 0x00, 0x00, 0x48, 0xC7, 0xC0, 0x00, 0x00, 0x00, 0x00, 0x0F, 0x05};

Input is almost exactly the same as printing, only using sys_read instead of sys_write. As such, it places a character into the memory location pointed to by the stack pointer.
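The snippets above are meant to be concatenated into one array of bytes, whose final length is what the headers call lengthoftext. A minimal sketch of that step, assuming a fixed-size buffer and a hypothetical `emit` helper (the buffer capacity and the names `textBuf`, `initCode` and `exitCode` are chosen for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEXT_MAX 4096 /* arbitrary capacity chosen for this sketch */

static uint8_t textBuf[TEXT_MAX];
static size_t lengthoftext = 0;

/* Initialization and exit snippets, copied from the listings above */
static const uint8_t initCode[] = {0x48, 0x31, 0xED, 0x49, 0x89, 0xD1, 0x5E,
                                   0x48, 0x89, 0xE2, 0x48, 0x83, 0xE4, 0xF0,
                                   0x48, 0x83, 0xEC, 0x08};
static const uint8_t exitCode[] = {0x48, 0xC7, 0xC0, 0x01, 0x00, 0x00, 0x00,
                                   0x48, 0xC7, 0xC3, 0x00, 0x00, 0x00, 0x00,
                                   0xCD, 0x80};

/* Append one machine-code snippet to the .text buffer.
 * Returns 0 on success, -1 if the snippet does not fit. */
static int emit(const uint8_t *code, size_t len)
{
    assert(code != NULL);
    if (len > TEXT_MAX - lengthoftext)
        return -1;
    memcpy(textBuf + lengthoftext, code, len);
    lengthoftext += len;
    return 0;
}
```

Calling `emit(initCode, sizeof initCode)` followed by `emit(exitCode, sizeof exitCode)` leaves a 34-byte program in `textBuf` that sets up the stack and exits immediately.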

Conclusion

Putting this all together will give you a bare-minimum executable file that the operating system will accept. It contains no debugging symbols and barely any metadata for a debugger to work with. All of these code snippets are slightly modified examples from my YABFC repository. If you notice anything wrong or need any clarification, just ask! This is GitHub after all 😄
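As a rough sketch of the final step, the pieces can be written out in the same order as the file layout table. `write_elf_image` is a hypothetical helper, not the code YABFC uses, and it assumes the two program headers and four section headers are passed as arrays:

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdio.h>

/* Write every piece in file-layout order. Returns the number of bytes
 * written, or -1 on a short write. */
static long write_elf_image(FILE *out, const Elf64_Ehdr *eh,
                            const Elf64_Phdr ph[2],
                            const void *text, size_t text_len,
                            const void *data, size_t data_len,
                            const char *strtab, size_t strtab_len,
                            const Elf64_Shdr sh[4])
{
    assert(out != NULL && eh != NULL && ph != NULL && sh != NULL);
    const struct { const void *p; size_t n; } parts[] = {
        { eh,     sizeof *eh },        /* ELF header           */
        { ph,     2 * sizeof ph[0] },  /* program header table */
        { text,   text_len },          /* .text                */
        { data,   data_len },          /* .data                */
        { strtab, strtab_len },        /* .shrtrab             */
        { sh,     4 * sizeof sh[0] },  /* section header table */
    };
    long total = 0;
    for (size_t i = 0; i < sizeof parts / sizeof parts[0]; i++) {
        /* Skip empty parts so a NULL pointer with length 0 is allowed */
        if (parts[i].n != 0 &&
            fwrite(parts[i].p, 1, parts[i].n, out) != parts[i].n)
            return -1;
        total += (long)parts[i].n;
    }
    return total;
}
```

The resulting file also needs its executable bit set (for example with `chmod +x`) before it can be run.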

List of resources