Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

invalid instruction adrl in arch/arm/{mach-omap2/sleep34xx|crypto/{sha512-core|sha256-core}|boot/compressed/head}.S #430

Closed
agners opened this issue Mar 23, 2019 · 15 comments
Labels
[ARCH] arm32 This bug impacts ARCH=arm [BUG] linux A bug that should be fixed in the mainline kernel. [FIXED][LINUX] 5.11 This bug was fixed in Linux 5.11 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler [WORKAROUND] Applied This bug has an applied workaround

Comments

@agners
Copy link

agners commented Mar 23, 2019

Assembling the kernel with the integrated assembler leads to the following error:

arch/arm/mach-omap2/sleep34xx.S:92:2: error: invalid instruction, did you mean: adr?
 adrl r3, l2dis_3630_offset @ may be too distant for plain adr
 ^

There are probably other cases e.g. in arch/arm/crypto/sha512-core.S.

@agners agners added [BUG] Untriaged Something isn't working [TOOL] integrated-as The issue is relevant to LLVM integrated assembler [ARCH] arm32 This bug impacts ARCH=arm labels Mar 23, 2019
@agners
Copy link
Author

agners commented Mar 23, 2019

The file arch/arm/crypto/sha512-core.S can be assembled in Thumb2 mode as well, it then use a preprocessor directive to convert the adrl instruction into a adr instruction... So I assume in this case it is safe to use adr also for LLVM, e.g.

.syntax unified
#if defined(__thumb2__) || defined(__clang__)
#  define adrl adr
...

(FWIW, I also came across the same issue in the ChaCha20 implementation proposed by @zx2c4, see https://lore.kernel.org/r/[email protected]/)

However, the arch/arm/mach-omap2/sleep34xx.S case is different since it always gets assembled in ARM mode and the comment after the instruction explicitly says the label might be too distant for a plain adr... Do we need proper adrl support in LLVM here?

@smithp35 do you have thoughts to this case here?

@smithp35
Copy link

smithp35 commented Mar 25, 2019

ADRL is a difficult pseudo-instruction to implement in LLVM in its full generality due to the way the assembler is implemented. Details of why can be found at https://llvm.org/pr24350 ironically this was my first task in LLVM. I did find a couple of macros that could be used in many of the places that ADRL can be used. The 2 deficiencies being:

  • Need to choose a macro depending on whether the label has been defined or not.
  • The rotations are fixed, which limits the range. With a suitably aligned destination the assembler could do better.
.macro MYADRL reg:req, label:req
add \reg, pc, #((\label - .) & 0xff00) 
add \reg, \reg, #((\label - .) - ((\label - .) & 0xff00)) - 4        
.endm

.macro MYADRLSUB reg:req, label:req
sub \reg, pc, #((. - \label) & 0xff00)
sub \reg, \reg, #((. - \label) - ((. - \label) & 0xff00)) + 4
.endm

@agners
Copy link
Author

agners commented Mar 29, 2020

It seems that particular instance does not really need a adrl instruction as the label is within reach of the regular adr instruction. Will send a patch replacing adrl with adr.

@agners agners added [PATCH] Submitted A patch has been submitted for review [BUG] linux A bug that should be fixed in the mainline kernel. and removed [BUG] Untriaged Something isn't working labels Mar 29, 2020
fengguang pushed a commit to 0day-ci/linux that referenced this issue Mar 29, 2020
The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux#430
Signed-off-by: Stefan Agner <[email protected]>
@agners
Copy link
Author

agners commented Mar 29, 2020

Note that this is not technically a bug in Linux, but we can work around in Linux using adr.

Patch:
https://lore.kernel.org/linux-arm-kernel/5a6807f19fd69f2de6622c794639cc5d70b9563a.1585513949.git.stefan@agner.ch/

@nickdesaulniers nickdesaulniers changed the title invalid instruction adrl in arch/arm/mach-omap2/sleep34xx.S invalid instruction adrl in arch/arm/{mach-omap2/sleep34xx|crypto/sha512-core.S/sha256-core.S}.S Sep 2, 2020
@nickdesaulniers nickdesaulniers changed the title invalid instruction adrl in arch/arm/{mach-omap2/sleep34xx|crypto/sha512-core.S/sha256-core.S}.S invalid instruction adrl in arch/arm/{mach-omap2/sleep34xx|crypto/sha512-core.S/sha256-core|boot/compressed/head}.S Sep 2, 2020
@nickdesaulniers nickdesaulniers changed the title invalid instruction adrl in arch/arm/{mach-omap2/sleep34xx|crypto/sha512-core.S/sha256-core|boot/compressed/head}.S invalid instruction adrl in arch/arm/{mach-omap2/sleep34xx|crypto/{sha512-core|sha256-core}|boot/compressed/head}.S Sep 2, 2020
@nickdesaulniers
Copy link
Member

nickdesaulniers commented Sep 2, 2020

Following up on the lore thread from March. It looks like linker support for group relocations landed in lld. @ardbiesheuvel had a question about them, but I don't understand how he was envisioning using them?

It looks like @agners patch landed in d85d524, v5.8-rc1, but there are a couple more instances of adrl being used; at least for an ARCH=arm defconfig (v7), I see:

  • arch/arm/crypto/sha512-core.S
  • arch/arm/boot/compressed/head.S
  • arch/arm/crypto/sha256-core.S

@ardbiesheuvel what did you have in mind? Was this it or something else, like @smithp35 's suggestion?

@smithp35
Copy link

smithp35 commented Sep 2, 2020

Only the most basic of the group relocations landed in LLD unfortunately. I didn't add support for the larger group values. There is no way of generating them in llvm-mc either. In summary adrl r0, symbol is something like (I can never remember which way round g0 and g1 go) :
add r0, pc :pc_g0: sym
add r0, r0 :pc_g1: sym
The g1 operator masks off the bits calculated by g0 and then tries to encode the residual value using a shift.

It would probably be less disruptive to add support :pc_g0:, :pc_g1: to MC and LLD and using a macro for adrl when clang is used than it would be to try and add support ADRL.

From memory ADRL has several complications:

  • ARM backend doesn't support one pseudo instruction mapping to more than one instruction (can be fixed to return a list, but this has large code-churn).
  • adrl has to come out as two add instructions or two sub instructions depending on whether the offset is positive or negative and you don't know that till late on.
  • I seem to remember it being quite awkward to make a fixup transform an instruction in MC. I think it could by expanding into an adr instruction and a new pseudo instruction that was like adr but performed the :pc_g1: calculation, but doesn't itself have an assembler syntax.

@ardbiesheuvel
Copy link

ardbiesheuvel commented Sep 11, 2020

Yes, providing something like the following when using Clang and building in ARM mode should do the trick afaict.

        .macro adrl, reg:req, lbl:req
        .subsection 1
.L1_\@: .long \lbl - (.L2_\@ + 8)
        .previous
        ldr \reg, .L1_\@
.L2_\@: add \reg, \reg, pc
        .endm

although I suppose this may trigger another limitation of the assembler, due to the use of a subsection?

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Sep 11, 2020

The .subsection and \@ are new to me. From the docs on .subsection, it looks like that will move the contained sequence of instructions out of line from the current section (maybe placing it at the end of the current section)?

Then \@ is like the C preprocessors UNIQUE for defining unique local labels from a .macro?

due to the use of a subsection?

Seems to work:

.macro adrl, reg:req, lbl:req
.subsection 1
.L1_\@: .long \lbl - (.L2_\@ + 8)
.previous
ldr \reg, .L1_\@
.L2_\@: add \reg, \reg, pc
.endm

foo:
adrl r3, l2dis_3630_offset
bx lr

l2dis_3630_offset:
bx lr

same output from the disassembler between clang, gcc, and GNU as.

One thing I'm surprised from my example though; it seems it's not an error to define the macro that conflicts with a pseudo. No tools complain about that, so I guess everything is "working as expected," but I would have expected to have to surround the macro definition in a preprocessor guard to define it only for clang. Or maybe that's a feature request to warn when a macro definition would shadow an existing instruction?

Oh, if I remove the macro definition then reassemble with GNU as or GCC, the output is no longer the same.

No macro:

00000000 <foo>:
   0:   e28f3004        add     r3, pc, #4
   4:   e1a00000        nop                     ; (mov r0, r0)
   8:   e12fff1e        bx      lr

0000000c <l2dis_3630_offset>:
   c:   e12fff1e        bx      lr

(not sure what's up with that nop...would have expected maybe a relocation there? Oh, is that the assembler not performing relaxation? As in it's always returning 2 instructions from adrl, either a ldr+add for large offsets or add+nop for small offsets?)

vs macro:

00000000 <foo>:
   0:   e59f3008        ldr     r3, [pc, #8]    ; 10 <l2dis_3630_offset+0x4>
   4:   e083300f        add     r3, r3, pc
   8:   e12fff1e        bx      lr

0000000c <l2dis_3630_offset>:
   c:   e12fff1e        bx      lr

Ah, I guess those are the same, it't just that the psuedo can smartly be expanded into macro case only when the offset is too large to encode? In that case, we'd still probably want to wrap the macro definition in preprocessor guards, so that users of GNU as don't pay the penalty Clang users would (1 additional non-nop-instruction per adrl, which is not terrible...).

@ardbiesheuvel
Copy link

The adrl pseudo resolves to either add+nop or add+add (depending on the distance), so it is always two instructions.

The adrl macro resolves to a literal load + add, so it takes up two instructions, plus 4 bytes for the literal (plus alignment overhead).

So the macro should really only be used when it is needed.

@ardbiesheuvel
Copy link

(in response to @smithp35)

The following code works in GAS

	.macro	adrl, reg:req, lbl:req
	.reloc	., R_ARM_ALU_PC_G1, \lbl
	add	\reg, pc, #-8
	.reloc	., R_ARM_ALU_PC_G0_NC, \lbl
	add	\reg, \reg, #0
	.endm

	adrl	r0, 0f

	.skip 1028
0:	.long	0x0

and produces the following output after linking

00000190 <.text>:
 190:	e28f0004 	add	r0, pc, #4
 194:	e2800e40 	add	r0, r0, #64, 28	; 0x400

so would it be feasible for the Clang assembler to do the same? You will still need to implement the group relocation handling in the linker, of course, but the assembler will not have to deal with the group relocation arithmetic at all.

@smithp35
Copy link

(in response to @smithp35)

The following code works in GAS

	.macro	adrl, reg:req, lbl:req
	.reloc	., R_ARM_ALU_PC_G1, \lbl
	add	\reg, pc, #-8
	.reloc	., R_ARM_ALU_PC_G0_NC, \lbl
	add	\reg, \reg, #0
	.endm

	adrl	r0, 0f

	.skip 1028
0:	.long	0x0

and produces the following output after linking

00000190 <.text>:
 190:	e28f0004 	add	r0, pc, #4
 194:	e2800e40 	add	r0, r0, #64, 28	; 0x400

so would it be feasible for the Clang assembler to do the same? You will still need to implement the group relocation handling in the linker, of course, but the assembler will not have to deal with the group relocation arithmetic at all.

Yes that code sequence will assemble in the integrated assembler with a relatively recent version of clang. Support for .reloc in Arm and AArch64 went in about 6 Months ago. Possibly for clang 10 but definitely for clang 11.

Would definitely need support in LLD. If it would help resolve this issue then I think it would be worth a go at implementing them. I don't have any spare work time available at the moment, but I could probably take a look in my spare time although progress could be slow.

@nickdesaulniers
Copy link
Member

@ardbiesheuvel sent a series related to this: https://lore.kernel.org/linux-arm-kernel/[email protected]/T/#t. Looks like a solid cleanup on quick scan (interesting that adrl doesn't exist for THUMB). I hope to be able to test by EOW and will report on the mailing list thread. Thank you @ardbiesheuvel .

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Sep 15, 2020

@nickdesaulniers
Copy link
Member

5478193 landed in v5.10-rc1.
Looks like https://lore.kernel.org/linux-arm-kernel/[email protected]/ is on track to land in v5.11.

@nickdesaulniers nickdesaulniers added [PATCH] Accepted A submitted patch has been accepted upstream and removed [PATCH] Submitted A patch has been submitted for review labels Dec 1, 2020
@nickdesaulniers nickdesaulniers added [FIXED][LINUX] 5.11 This bug was fixed in Linux 5.11 [WORKAROUND] Applied This bug has an applied workaround and removed [PATCH] Accepted A submitted patch has been accepted upstream labels Feb 3, 2021
mrchapp pushed a commit to mrchapp/linux that referenced this issue Mar 15, 2021
commit d85d524 upstream.

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
KonstaT pushed a commit to lineage-rpi/android_kernel_brcm_rpi that referenced this issue Nov 4, 2021
commit d85d524 upstream.

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Bug: 141693040
Change-Id: I645604524b52b963c8223b82c29a0afd73e104d5
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this issue Jun 30, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this issue Jun 30, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-block that referenced this issue Jun 30, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-block that referenced this issue Jul 2, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-block that referenced this issue Jul 2, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
woodsts pushed a commit to woodsts/linux-stable that referenced this issue Jul 2, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-block that referenced this issue Jul 2, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
jpuhlman pushed a commit to MontaVista-OpenSourceTechnology/linux-mvista that referenced this issue Aug 5, 2022
Source: Kernel.org
MR: 120397
Type: Integration
Disposition: Backport from git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable linux-5.4.y
ChangeID: 9e00e5d195ed38bf577101100fa67bf02139623d
Description:

commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Armin Kuster <[email protected]>
oraclelinuxkernel pushed a commit to oracle/linux-uek that referenced this issue Aug 26, 2022
commit d85d524 upstream

The adrl instruction has been introduced with commit dd31394 ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 9e00e5d195ed38bf577101100fa67bf02139623d)
Signed-off-by: Sherry Yang <[email protected]>
yuurha12 pushed a commit to yuurha12/moonstone_personal that referenced this issue Jul 30, 2023
commit d85d5247885ef2e8192287b895c2e381fa931b0b upstream

The adrl instruction has been introduced with commit dd31394779aa ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: ClangBuiltLinux/linux#430
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[ARCH] arm32 This bug impacts ARCH=arm [BUG] linux A bug that should be fixed in the mainline kernel. [FIXED][LINUX] 5.11 This bug was fixed in Linux 5.11 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler [WORKAROUND] Applied This bug has an applied workaround
Projects
None yet
Development

No branches or pull requests

4 participants