2022Q1
Date of Issue: 01st April 2022
This document describes the System V Application Binary Interface (ABI) for the Arm 64-bit architecture.
Code Model
Please check Application Binary Interface for the Arm® Architecture for the latest release of this document.
Please report defects in this specification to the issue tracker page on GitHub.
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
Grant of Patent License. Subject to the terms and conditions of this license (both the Public License and this Patent License), each Licensor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Licensed Material, where such license applies only to those patent claims licensable by such Licensor that are necessarily infringed by their contribution(s) alone or by combination of their contribution(s) with the Licensed Material to which such contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Licensed Material or a contribution incorporated within the Licensed Material constitutes direct or contributory patent infringement, then any licenses granted to You under this license for that Licensed Material shall terminate as of the date such litigation is filed.
As identified more fully in the Licence section, this project is licensed under CC-BY-SA-4.0 along with an additional patent license. The language in the additional patent license is largely identical to that in Apache-2.0 (specifically, Section 3 of Apache-2.0 as reflected at https://www.apache.org/licenses/LICENSE-2.0) with two exceptions.
First, several changes were made related to the defined terms so as to reflect the fact that such defined terms need to align with the terminology in CC-BY-SA-4.0 rather than Apache-2.0 (e.g., changing “Work” to “Licensed Material”).
Second, the defensive termination clause was changed such that the scope of defensive termination applies to “any licenses granted to You” (rather than “any patent licenses granted to You”). This change is intended to help maintain a healthy ecosystem by providing additional protection to the community against patent litigation claims.
Contributions to this project are licensed under an inbound=outbound model such that any such contributions are licensed by the contributor under the same terms as those in the Licence section.
The text of and illustrations in this document are licensed by Arm under a Creative Commons Attribution–Share Alike 4.0 International license ("CC-BY-SA-4.0”), with an additional clause on patents. The Arm trademarks featured here are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. Please visit https://www.arm.com/company/policies/trademarks for more information about Arm’s trademarks.
Copyright (c) 2021-2022, Arm Limited and its affiliates. All rights reserved.
Contents
- 1 Preface
- 2 About this document
- 3 Scope
- 4 Software Installation
- 5 Low Level Information
- 6 Programming / Coding Examples
- 7 Code Models
- 8 Object Files
- 9 Program Linking and Dynamic Linking
- 10 Libraries
- 11 Development Environment
- 12 Fortran
- 13 C++
- 14 Linux Implementation Notes
The following support level definitions are used by the Arm ABI specifications:
- Release
- Arm considers this specification to have enough implementations, which have received sufficient testing, to verify that it is correct. The details of these criteria are dependent on the scale and complexity of the change over previous versions: small, simple changes might only require one implementation, but more complex changes require multiple independent implementations, which have been rigorously tested for cross-compatibility. Arm anticipates that future changes to this specification will be limited to typographical corrections, clarifications and compatible extensions.
- Beta
- Arm considers this specification to be complete, but existing implementations do not meet the requirements for confidence in its release quality. Arm may need to make incompatible changes if issues emerge from its implementation.
- Alpha
- The content of this specification is a draft, and Arm considers the likelihood of future incompatible changes to be significant.
This document is at Alpha release quality.
Issue | Date | Change |
---|---|---|
00Alpha | 1st November 2021 | Alpha release containing Code Model only |
This document refers to, or is referred to by, the following documents.
Ref | External reference or URL | Title |
---|---|---|
SYSVABI | Source for this document | System V Application Binary Interface (ABI) for the Arm 64-bit architecture |
ARMARM | DDI 0487 | Arm Architecture Reference Manual Armv8 for Armv8-A architecture profile |
AAPCS64 | IHI 0055 | Procedure Call Standard for the Arm 64-bit Architecture |
AAELF64 | IHI 0056 | ELF for the Arm 64-bit Architecture (AArch64). |
CPPABI64 | IHI 0059 | C++ ABI for the Arm 64-bit Architecture |
GCABI | https://itanium-cxx-abi.github.io/cxx-abi/abi.html | Generic C++ ABI |
- A32
- The instruction set named Arm in the Armv7 architecture; A32 uses 32-bit fixed-length instructions.
- A64
- The instruction set available when in AArch64 state.
- AAPCS64
- Procedure Call Standard for the Arm 64-bit Architecture (AArch64)
- AArch32
- The 32-bit general-purpose register width state of the Armv8 architecture, broadly compatible with the Armv7-A architecture.
- AArch64
- The 64-bit general-purpose register width state of the Armv8 architecture.
- ABI
Application Binary Interface:
- The specifications to which an executable must conform in order to execute in a specific execution environment. For example, the Linux ABI for the Arm Architecture.
- A particular aspect of the specifications to which independently produced relocatable files must conform in order to be statically linkable and executable. For example, the CPPABI64
- Arm-based
- ... based on the Arm architecture ...
- Floating point
- Depending on context floating point means or qualifies: (a) floating-point arithmetic conforming to IEEE 754 2008; (b) the Armv8 floating point instruction set; (c) the register set shared by (b) and the Armv8 SIMD instruction set.
- Q-o-I
- Quality of Implementation – a quality, behavior, functionality, or mechanism not required by this standard, but which might be provided by systems conforming to it. Q-o-I is often used to describe the tool-chain-specific means by which a standard requirement is met.
- SIMD
- Single Instruction Multiple Data – A term denoting or qualifying: (a) processing several data items in parallel under the control of one instruction; (b) the Arm v8 SIMD instruction set: (c) the register set shared by (b) and the Armv8 floating point instruction set.
- SIMD and floating point
- The Arm architecture’s SIMD and Floating Point architecture comprising the floating point instruction set, the SIMD instruction set and the register set shared by them.
- SVE
- The Arm architecture's Scalable Vector Extension.
- T32
- The instruction set named Thumb in the Armv7 architecture; T32 uses 16-bit and 32-bit instructions.
- VG
- The number of 64-bit “vector granules” in an SVE vector; in other words, the number of bits in an SVE vector register divided by 64.
- ILP32
- SysV-like data model where int, long int and pointer are 32-bit
- LP64
- SysV-like data model where int is 32-bit, but long int and pointer are 64-bit.
- LLP64
- Windows-like data model where int and long int are 32-bit, but long long int and pointer are 64-bit.
- SysV
- Unix System V. A variant of the Unix Operating System. Although this specification refers to SysV, many other operating systems, such as Linux or BSD use similar conventions.
- Platform
- A program execution environment such as that defined by an operating system or run- time environment. A platform defines the specific variant of the ABI and may impose additional constraints. Linux is a platform in this sense.
More specific terminology is defined when it is first used.
Except where otherwise stated the AArch64 System V ABI follows the base ABI documents in ABI for the Arm 64 bit architecture. The AArch64 System V ABI documents the places where divergence exists with respect to the base ABI and attempts to act as a unifying document to cover information in a variety of places that is of relevance to a System V implementation.
This document does not describe the software installation on an AArch64 system.
Not available for this Alpha release.
The AArch64 architecture does not allow for instructions to encode arbitrary 64 bit constants in a single instructions. Immediates or constants need to be constructed by a sequence of instructions that are defined in the Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile [ARMARM]. Most instructions accept restricted immediate forms, the details of which are beyond the scope of this document. Given the range of immediates and offsets accepted by various instructions, programming on this architecture lends itself to a set of code models that define a set of constraints to allow an efficient mapping of a program to a set of machine instructions. In the following section we document the code sequences for memory addressing; and in effect document the various memory models produced for the architecture.
Absolute addressing means that the virtual addresses of instructions and statically allocated data are known at static link time. To execute properly the object must be loaded at the virtual address specified by the static linker. Critically this means that the static linker can embed these fixed, absolute addresses into the read-only, shareable code, rather than requiring run-time relocation via a Global Offset Table (GOT). Absolute addressing does not mean that PC-relative addressing cannot be used, if that is the most efficient way to generate an absolute address within the limits supported by the model.
Absolute addressing is suitable for most bare-metal code, including the Linux kernel, as well as for normal GNU/Linux executables which – while dynamically linked – are loaded at a fixed address.
In position independent code (PIC) the virtual addresses of instructions and static data are not known until dynamic link time.
The PIC model depends on an offset, known at static link time between a read-only dynamic relocation free executable segment and a read-write segment containing dynamic relocations. Code in the executable segment accesses data in the read-write segment via PC-relative addressing modes. PIC is typically used when building dynamic shared objects, where references to external variables must use indirect references via a static linker created global offset table (GOT).
A GOT generating relocation is used to inform the static linker to create the GOT entry. The address of symbol definitions that cannot be pre-empted at dynamic link time can have their address taken so no GOT generating relocation is required.
PIC can also be used to build position independent executables. A variant of PIC called PIE (position independent executable) can be used to build an executable. PIE assumes that global symbols cannot be pre-empted, which means that an indirection via the GOT is not needed.
The assembler examples in this document make use of operators to modify the addressing mode used to form the immediate value of the instruction.
The tables below describe the assembler operators that can be used to
alter the relocation directive emitted by the assembler. The typical
syntax is of the form #:<operator>:<symbol name>
Operator | Instruction | Relocation |
---|---|---|
lo12 |
add |
R_AARCH64_ADD_ABS_LO12_NC |
lo12 |
ldr,str |
R_AARCH64_LDST_ABS_LO12_NC |
abs_g0 |
mov[nz] |
R_AARCH64_MOVW_UABS_G0 |
abs_g0_s |
mov[nz] |
R_AARCH64_MOVW_SABS_G0 |
abs_g0_nc |
movk |
R_AARCH64_MOVW_UABS_G0_NC |
abs_g1 |
mov[nz] |
R_AARCH64_MOVW_UABS_G1 |
abs_g1_s |
mov[nz] |
R_AARCH64_MOVW_SABS_G1 |
abs_g1_nc |
movk |
R_AARCH64_MOVW_UABS_G1_NC |
abs_g2 |
mov[nz] |
R_AARCH64_MOVW_UABS_G2 |
abs_g2_s |
mov[nz] |
R_AARCH64_MOVW_SABS_G2 |
abs_g2_nc |
movk |
R_AARCH64_MOVW_UABS_G2_NC |
abs_g3 |
mov[nz] |
R_AARCH64_MOVW_UABS_G3 |
Operator | Instruction | Relocation |
---|---|---|
(no operator) | adrp |
R_AARCH64_ADR_PREL_PG_HI21 |
pg_hi21 |
adrp |
R_AARCH64_ADR_PREL_PG_HI21 |
pg_hi21_nc |
adrp |
R_AARCH64_ADR_PREL_PG_HI21_NC |
(no operator) | adr |
R_AARCH64_ADR_PREL_LO21 |
prel_g0 |
mov[nz] |
R_AARCH64_MOVW_PREL_G0 |
prel_g0_nc |
movk |
R_AARCH64_MOVW_PREL_G0_NC |
prel_g1 |
mov[nz] |
R_AARCH64_MOVW_PREL_G1 |
prel_g1_nc |
movk |
R_AARCH64_MOVW_PREL_G1_NC |
prel_g2 |
mov[nz] |
R_AARCH64_MOVW_PREL_G2 |
prel_g2_nc |
movk |
R_AARCH64_MOVW_PREL_G2_NC |
prel_g3 |
mov[nz] |
R_AARCH64_MOVW_PREL_G3 |
Operator | Instruction | Relocation |
---|---|---|
got |
adrp |
R_AARCH64_ADR_GOT_PAGE |
got_lo12 |
ldr |
R_AARCH64_LD64_GOT_LO12_NC |
gotoff_g0_nc |
movk |
R_AARCH64_MOVW_GOTOFF_G0_NC |
gotoff_g1 |
mov[nz] |
R_AARCH64_MOVW_GOTOFF_G1 |
gotoff_lo15 |
ldr |
R_AARCH64_LD64_GOTOFF_LO15 |
gotpage_lo15 |
ldr |
R_AARCH64_LD64_GOTPAGE_LO15 |
Note
#:got:src
refers to the GOT slot for the symbol src
#:got_lo12:src
refers to the lower 12 bits of the
address of the GOT slot for the symbol src
.
The symbol``__GLOBAL_OFFSET_TABLE__`` refers to the start of the
global offset table (GOT) in the ELF module. The assembler
instruction adrp xn, __GLOBAL_OFFSET_TABLE__
finds the address of the 4KiB page
containing the start of the GOT.
#:gotpage_lo15:src
is a 15-bit offset into the page containing
the GOT entry for src
.
#:gotoff_lo15:src
- This refers to the 15 bit offset from the
start of the GOT.
Operator | Instruction | Relocation |
---|---|---|
gottprel_g0_nc |
movk |
R_AARCH64_TLSIE_MOVW_GOTTPREL_G0_NC |
gottprel_g1 |
mov[nz] |
R_AARCH64_TLSIE_MOVW_GOTTPREL_G1 |
tlsgd |
adrp |
R_AARCH64_TLSGD_ADR_PAGE21 |
tlsgd |
adr |
R_AARCH64_TLSGD_ADR_PREL21 |
tlsgd_lo12 |
add |
R_AARCH64_TLSGD_ADD_LO12_NC |
tlsgd_g0_nc |
movk |
R_AARCH64_TLSGD_MOVW_G0_NC |
tlsgd_g1 |
mov[nz] |
R_AARCH64_TLSGD_MOVW_G1 |
tlsdesc |
adrp |
R_AARCH64_TLSDESC_ADR_PAGE21 |
tlsdesc |
adr |
R_AARCH64_TLSDESC_ADR_PREL21 |
tlsdesc_lo12 |
ldr |
R_AARCH64_TLSDESC_LD64_LO12 |
tlsldm |
adrp |
R_AARCH64_TLSLD_ADR_PAGE21 |
tlsldm |
adr |
R_AARCH64_TLSLD_ADR_PREL21 |
tlsldm_lo12_nc |
add |
R_AARCH64_TLSLD_ADD_LO12_NC |
dtprel_lo12 |
add |
R_AARCH64_TLSLD_ADD_DTPREL_LO12 |
dtprel_lo12 |
ldr |
R_AARCH64_TLSLD_LDST64_DTPREL_LO12 |
dtprel_lo12_nc |
add |
R_AARCH64_TLSLD_ADD_DTPREL_LO12_NC |
dtprel_lo12_nc |
ldr |
R_AARCH64_TLSLD_LDST64_DTPREL_LO12_NC |
dtprel_g0 |
mov[nz] |
R_AARCH64_TLSLD_MOVW_DTPREL_G0 |
dtprel_g0_nc |
movk |
R_AARCH64_TLSLD_MOVW_DTPREL_G0_NC |
dtprel_g1 |
mov[nz] |
R_AARCH64_TLSLD_MOVW_DTPREL_G1 |
dtprel_g1_nc |
movk |
R_AARCH64_TLSLD_MOVW_DTPREL_G1_NC |
dtprel_g2 |
mov[nz] |
R_AARCH64_TLSLD_MOVW_DTPREL_G2 |
tlsdesc_off_g0_nc |
movk |
R_AARCH64_TLSDESC_OFF_G0_NC |
tlsdesc_off_g1 |
mov[nz] |
R_AARCH64_TLSDESC_OFF_G1 |
gottprel |
adrp |
R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21 |
gottprel_lo12 |
ldr |
R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC |
tprel |
add |
R_AARCH64_TLSLE_ADD_TPREL_LO12 |
tprel_lo12 |
add |
R_AARCH64_TLSLE_ADD_TPREL_LO12 |
tprel_lo12 |
ldr |
R_AARCH64_TLSLE_LDST64_TPREL_LO12 |
tprel_hi12 |
add |
R_AARCH64_TLSLE_ADD_TPREL_HI12 |
tprel_lo12_nc |
add |
R_AARCH64_TLSLE_ADD_TPREL_LO12_NC |
tprel_lo12_nc |
ldr |
R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC |
tprel_g2 |
mov[nz] |
R_AARCH64_TLSLE_MOVW_TPREL_G2 |
tprel_g1 |
mov[nz] |
R_AARCH64_TLSLE_MOVW_TPREL_G1 |
tprel_g1_nc |
movk |
R_AARCH64_TLSLE_MOVW_TPREL_G1_NC |
tprel_g0 |
mov[nz] |
R_AARCH64_TLSLE_MOVW_TPREL_G0 |
tprel_g0_nc |
movk |
R_AARCH64_TLSLE_MOVW_TPREL_G0_NC |
Note
Relocations are defined in AAELF64.
The AArch64 A64 instruction set has a number of features and constraints which make it desirable to use different code models for different sizes of executable or dynamic shared object, to improve performance and reduce static code size. The relevant constraints are:
- The LDR (literal) and ADR instructions generate a PC-relative address with a range of +/- 1MiB.
- The ADRP instruction generates a PC-relative, page aligned address with a range of +/- 4GiB (where page = 4KiB).
- The LDR and STR instructions accept a 12-bit unsigned immediate offset, scaled by the access size.
- The B and BL instructiond have a range of +/-128MiB. This is typically used for function / procedure calls.
- A series of MOVZ and up to 3 MOVK instructions can be used to construct a 64-bit value.
- Static linkers may insert a veneer (a sequence of instructions) to implement a relocated B or BL to a destination further away than the +/-128MiB range. The size of executable sections must be limited to 127 MiB to leave space for veneers to be inserted after the section.
- The relative data relocations R_AARCH64_PLT32 and R_AARCH64_PREL32 have a range of +/-2GiB.
The code models assume that an executable or shared library use an ELF file layout similar to the diagram below.
Illustrative ELF file layout
The table below identifies the code models that have been defined, along with the assumptions that the code model may make.
Code Model | Max text segment size | Max combined span of text and data segments | Additional GOT restrictions |
---|---|---|---|
tiny | 1 MiB | 1 Mib | none |
small | 2GiB | 4 GiB | pic: got size < 32 KiB |
PIC: none | |||
medium | 2GiB | no restriction | pic: got size < 32 KiB |
PIC: none | |||
large | 2GiB | no restriction | max distance from text to GOT < 4 GiB |
Note
1. The max segment size columns describes the total span of the statically allocated content. It says nothing about the base address of the program or shared object, which may be located anywhere within the AArch64 virtual address space.
2. The definition of the text segment includes the shareable PLT, code and read-only data sections. If the components of text segment are in separate consecutive ELF segments then the text segment is the maximum combined span of the ELF segments.
3. The data segment contains the statically defined, writeable, per-process data sections. In all models dynamically allocated data and stack can make use of the full virtual address space, dependent on operating system addressing limits. If the components are in separate ELF segments, the data segment is the maximum combined span of the ELF segments.
4. The code models assume the text and data segments are consecutive in virtual memory with only padding for segment alignment between them. Programs that have significant separation between the code and data segments must take this extra distance into account in the max combined span of text and data segments.
5. The text segment maximum size is limited to 2GiB by R_AARCH64_PLT32 relocations from .ehframe sections.
6. While designing the code models it was estimated that only 2.6% of load modules (executables and dynamic shared objects) have a max text segment size greater than 1MiB; the rest would all fit into the tiny model. However to avoid build option changes, it is recommended that the small model should be the default, with an explicit option to select the tiny model.
7. Executables and shared objects may be linked dynamically with other shared objects which use a different code model.
8. Linking of relocatable objects of different code models is possible as is linking of PIC/PIE and non-PIC relocatable objects. The result of the combination is always the most limited model. For example the combination of a tiny code model PIC object and small code model non-PIC object is a tiny non-PIC executable.
9. The large code model is aimed at programs with large amounts of read-write data, not large amounts or sparsely placed code.
The convention for command-line option to select code model is
-mcmodel=<model>
where code model is one of tiny
, small
,
medium
or large
.
The convention for command-line option to select position-independent
code is -fpic
for pic -fPIC
for PIC. When compiling for an
executable -fpie
and -fPIE
have the same GOT size limitations
as -fpic
and -fPIC
respectively.
Not all compilers will implement all of the code models. The table below describes the implementation status of code models for two open-source compilers.
Code Model | GCC 11 | Clang 12.0 |
---|---|---|
tiny | yes | yes |
small | pic and PIC implemented | pic and PIC accepted but PIC used |
medium | pic and PIC proposed | pic and PIC proposed |
large | no-pic support | no-pic support |
The medium code model is suitable for programs that do not have large amounts of executable code, but do contain large static data objects. The medium code model separates data into small data and large data. Small data is addressed in the same way as the small code model, with the same maximum size restrictions as the small code model. Large data is addressed via the GOT and does not have a maximum size limit.
Data is defined as large data if any of the following are true:
- The data is >= 64 KiB in size.
- The data is of unknown size.
- The data will be represented by an unallocated common block, described by a
SHN_COMMON
symbol.
All other data is small data.
Large data is placed in sections with an l
prefix, for example .lbss
,
.ldata
and .lrodata
. Both small and large RELRO data are placed in
.data.rel.ro
as many loaders only support one PT_GNU_RELRO
program
header. At static link time the large data sections must be placed after all
small data sections. If it is the last small data section, the static linker
allocates storage for SHN_COMMON
symbols at the end of the bss
section. Otherwise they are allocated in lbss
.
The following section provide some sample code sequences for addressing static data. The samples are provided to illustrate the effects on code-generation of the code-models.
Code that is not position independent may use the absolute address of the symbol. Code that is position independent may use a pc-relative offset to the symbol if the definition of the symbol is not pre-emptible. If the symbol is pre-emptible the address must be loaded from the GOT.
PC-relative offset of +/- 1 MiB. Suitable for tiny code model.
adr x0, var
PC-relative offset of +/- 4 GiB. Suitable for small/medium code model.
adrp x0, var
add x0, x0, #:lo12: var
PC-relative offset not within +/- 4 GiB. Suitable for large symbols in the medium code model.
adrp x0, :got: var
ldr x0, [x0, :got_lo12: var]
Absolute load of a literal, where the literal is within 1 MiB. Suitable for large code model if the literal is defined within the same section as the code.
ldr x0, .L0
.L0
.xword var
Absolute using a load of a literal where the literal is within 4 GiB. Suitable for the large code model.
adrp x0, .L0
ldr x0, [x0, #:lo12: .L0]
.L0
.xword var
Absolute using 4 instructions. Suitable for the large code model.
movz x0, # :abs_g0_nc: var
movk x1, # :abs_g1_nc: var
movk x2, # :abs_g2_nc: var
movk x3, # :abs_g3: var
In the general case one of the sequences above is used to get the address. A load instruction can then be used to obtain the value. In some cases the add can be folded into the load.
PC-relative offset of +/- 4 GiB. Suitable for the small/medium code model.
adrp x0, var
ldr x0, [x0, #:lo12:var]
The GOT is a linker generated table of addresses. Every address in the
GOT is at least 8 byte aligned and the base of the GOT can be obtained
by the linker defined __GLOBAL_OFFSET_TABLE__
symbol.
All of GOT is within 1 MiB. Suitable for the tiny code model.
ldr x0, :got: var
Base of GOT is within 4 GiB, GOT size is < 32 KiB. Suitable for pic.
adrp x0, :got: __GLOBAL_OFFSET_TABLE__
ldr x0, [x0, # :gotpage_lo15: var]
All of GOT is within 4 GiB, GOT size is >= 32 KiB. Suitable for PIC.
adrp x0, :got: var
ldr x0, [x0, # :got_lo12: var]
An undefined weak reference resolves to 0. In the general case it is not possible to give an offset that when added to the PC will result in 0. To get the address of a weak reference the compiler can use a load from literal or acccess the address via a GOT entry, which will evaluate to 0 if the symbol is undefined.
These follow the AAELF64 base definition from Arm.
Not available for this Alpha release.
Not applicable
Not applicable
Not applicable
Refer to CPPABI64
Not available for this Alpha release.