Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support relocation kind 0003 and extend IMAGE_REL_ types #141

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jonahbeckford
Copy link
Contributor

@jonahbeckford jonahbeckford commented Jul 29, 2024

Fixes #29 .

Specifically adds support for:

  • IMAGE_REL_AMD64_ADDR32NB. (This is the "relocation kind 0003").

  • IMAGE_REL_I386_DIR32NB. (This is the x86 version of the above).

  • IMAGE_REL_AMD64_REL32_5. (This is documented but weirdly not implemented).

Tested with my ucrt branch https://github.com/jonahbeckford/flexdll/tree/0.43%2Bucrt where relocation kind 0003 occurs often. It especially occurs in "normal" C libraries (ucvrt; /MD) that are linked into an OCaml executable. None of the tested code used /GS-.

References:

Specifically adds support for:

- IMAGE_REL_AMD64_ADDR32NB. (This is the "relocation kind 0003").

- IMAGE_REL_I386_DIR32NB. (This is the x86 version of the above).

- IMAGE_REL_AMD64_REL32_5. (This is documented but weirdly not implemented).
flexdll.c Outdated
err->code = 3;
goto restore;
}
*((UINT32*) ptr->addr) = s;
Copy link
Collaborator

@nojb nojb Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own understanding:

  • In LLVM, this relocation is implemented by add32(off, s), so that off in LLVM corresponds to ptr->addr in Flexlink and s in LLVM corresponds to s in Flexlink.
  • The relocation IMAGE_REL_AMD64_ADDR64 in LLVM (RELOC_ABS in Flexlink) is implemented by add64(off, s + imageBase), so that s + imageBase in LLVM corresponds to s in Flexlink.

Do you understand why the s arguments do not seem to match between the two cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going one step further, those equations imply that imageBase = 0.

I spent a lot of time looking and I couldn't find out why IMAGE_REL_AMD64_ADDR64 seems to work. The math for all the other relocation kinds make sense.

However, two things ...

  1. The /base: base address link.exe option is fixed for OCaml executable to be 0x10000 at
    let base_addr = ref "0x10000"

    and also the same in
    let image_base = 0x10000l in

    for OCaml plugins / stubs.
  2. All of the relative relocations from this section of code are translated by:

    flexdll/reloc.ml

    Lines 470 to 475 in 80496b5

    Reloc.abs !machine sect (Int32.of_int (Buffer.length data)) strsym;
    int_to_buf data pos;
    Reloc.abs !machine sect (Int32.of_int (Buffer.length data))
    (Lazy.force secsym);
    int_to_buf data (Int32.to_int rel.addr);

    into absolute relocations by Reloc.abs:

    flexdll/coff.ml

    Lines 437 to 444 in 80496b5

    module Reloc = struct
    let abs machine sec addr sym =
    let rtype =
    match machine with
    | `x86 -> 0x06
    | `x64 -> 0x01
    in
    sec.relocs <- { addr = addr; symbol = sym; rtype = rtype } :: sec.relocs

So my suspicion was that the Reloc.abs absolute relocations were being translated at CreateProcess time by ntdll.LdrInitializeThunk (or whatever is reading the PE .reloc section) to complete the imageBase adjustment.

@dra27, any chance you can explain why base_addr (or image_base for DLLs) is not used in the calculations?

(These calculations are undocumented in flexdll and fairly complicated)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's far too long since I stared at all this, but are we comparing like with like here? Isn't that code in LLVM the part of a linker which is computing actual relocations to be carried out by the loader? In flexdll's, case, I don't think we have to worry about the image base - the symbol table as a data structure has already been had all the relocations resolved when the executable was loaded (by the Windows loader) - so s in flexdll is an absolute address for the symbol, where in lld it's still base-relative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your loader comment is what I had meant with "translated at CreateProcess time by ntdll.LdrInitializeThunk". But I am still puzzled why that only applies to IMAGE_REL_AMD64_ADDR64.

@jonahbeckford
Copy link
Contributor Author

I just noticed that I have not extended the DLL logic at

flexdll/create_dll.ml

Lines 235 to 253 in 80496b5

match !Cmdline.machine, r.rtype with
| `x86, 0x06 (* IMAGE_REL_I386_DIR32 *)
| `x64, 0x02 (* IMAGE_REL_AMD64_ADDR32 *) ->
(* 32-bit VA *)
relocs := (rel_rva, `R32) :: !relocs;
Buf.patch_lazy_int32 buf pos (lazy (Int32.add (Int32.add initial (Lazy.force rva)) image_base))
| `x64, 0x01 (* IMAGE_REL_AMD64_ADDR64 *) ->
(* 64-bit VA *)
assert(read_int32 sdata (pos + 4) = 0l);
relocs := (rel_rva, `R64) :: !relocs;
Buf.patch_lazy_int32 buf pos (lazy (Int32.add (Int32.add initial (Lazy.force rva)) image_base))
| `x86, 0x14 (* IMAGE_REL_I386_REL32 *)
| `x64, 0x04 (* IMAGE_REL_AMD64_REL32 *) ->
Buf.patch_lazy_int32 buf pos (lazy (Int32.sub (Int32.add initial (Lazy.force rva)) (Int32.add (Lazy.force rel_rva) 4l)))
| _, k ->
Printf.ksprintf failwith "Unsupport relocation kind %04x for %s"
k r.symbol.sym_name

That DLL logic does use the image_base for IMAGE_REL_AMD64_ADDR64 so at least in that code the IMAGE_REL_AMD64_ADDR64 math makes sense.

I don't really have a way to test that. The stubs/plugins generated by OCaml do not have those relocations (which is also why I didn't notice them).

flexdll.c Outdated Show resolved Hide resolved
flexdll.c Outdated Show resolved Hide resolved
Copy link
Member

@dra27 dra27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct to me - at least, it sounds as though it's reasonably clear from your testing that the RELOC_32NB implementation is working, and the RELOC_REL32_5 implementation is clearly correct w.r.t. the others already there.

Do you mind adding the last missing RELOC_REL32_3 while we here, and I think this can go straight in.

reloc.ml Outdated Show resolved Hide resolved
@jonahbeckford
Copy link
Contributor Author

jonahbeckford commented Nov 26, 2024

FIXED. This is an independent problem.

This PR needs to stay open ... I am using this PR much more extensively and I'm encountering:

Fatal error: cannot load shared library dllthreads
Reason: flexdll error: cannot relocate caml_stat_free RELOC_REL32, target is too far: FFFFFFFF43282860  0000000043282860
Fatal error: cannot load shared library dllcurl_stubs
Reason: flexdll error: cannot relocate Caml_state RELOC_REL32, target is too far: FFFFFFFF2DAA3206  000000002DAA3206

etc.

@MisterDA: Looks like casting of int64/uint64 to uint32 is not done correctly ... I think it was just luck before if it was working. Your Avoid conversion from 'INT_PTR' to 'UINT32' warnings review suggestion is very close to where I'm seeing the failures, so it sounds like the C compiler was giving us a good warning. If you know how to make those INT_PTR/INT32/UINT32 operations robust, reply back! I'll investigate more in the meantime.

EDIT: After looking at old emails, it sounds like the checks are to make sure that the relocations are within 2GB, which presumes all the DLLs are within 2GB (here and here). And that begs the question what magic mechanism is enforcing that 2GB bin-packing! I wonder if there is a MSVC option to force all these problematic address references to be 64-bit on 64-bit machines.

EDIT 2: Changing CAMLextern to be __declspec(dllimport) extern on MSVC (diskuv/dkml-compiler@b211c65...main) forces the address references to be 64-bit. And there will be packages like mirage/digestif#157 that will need adjustments if they don't use CAMLextern.

Jonah Beckford added 2 commits November 26, 2024 13:51
Also reorder match (OCaml) and case (C) for clarity and to ensure nothing is missed.
Copy link
Member

@dra27 dra27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit with the (very welcome) readme update, but otherwise good to go, thanks!

README.md Outdated Show resolved Hide resolved
@dra27
Copy link
Member

dra27 commented Nov 28, 2024

Indeed, the docs in flexdll are out of date w.r.t. __declspec(dllimport). FlexDLL is showing its age, in that when it was first written Windows was still overwhelmingly 32-bit! Originally, the magic was that by specifying a base address for the DLL by default simply caused the 64-bit to load the DLLs close - there was never a guarantee, it just worked reliably for quite some time.

Except that we lost Cygwin64 support in 2018 with the release of Cygwin's rebase 4.4.4 (see this thread for the explanation of why that change was made). That killed shared library / natdynlink support for Cygwin until 4.12 (and ocaml/ocaml#9927). That's when the __declspec(dllimport) mechanism was restored (and is when the README ought to have been updated).

IIRC mingw-w64 had never clamped the base address because it hadn't been necessary, but that started to become necessary with binutils 2.36 - at which point (ocaml/ocaml#10351) co-opted the same mechanism for mingw-w64. Hwoever, msvc64 still uses the base address trick.

@dra27
Copy link
Member

dra27 commented Nov 28, 2024

Are those failures with msvc64 or mingw-w64, @jonahbeckford? IIRC opam-repository-mingw "fixed" the binutils thing by forcing the base address for mingw-w64 instead, which means that this will be becoming a gradually bigger problem, requiring more uses of CAMLextern in libraries, indeed.

The systhreads failure is surprising, as I thought that was only in 5.1+ (and is fixed in 5.2.1 and 5.3.0) - which version of OCaml is that one with?

Co-authored-by: David Allsopp <[email protected]>
@jonahbeckford
Copy link
Contributor Author

Are those failures with msvc64 or mingw-w64, @jonahbeckford? IIRC opam-repository-mingw "fixed" the binutils thing by forcing the base address for mingw-w64 instead, which means that this will be becoming a gradually bigger problem, requiring more uses of CAMLextern in libraries, indeed.

msvc64

The systhreads failure is surprising, as I thought that was only in 5.1+ (and is fixed in 5.2.1 and 5.3.0) - which version of OCaml is that one with?

4.14.2 with the nine (9) ocaml-common-4_14-a* patches at https://github.com/diskuv/dkml-compiler/tree/main/src/p and two (2) non-DkML patches (see below) using the ucrt branch of flexdll (ucrt link in this PR's main comment).

ocaml-common-4_14-z02: This is relevant to systhreads.
https://gist.github.com/jonahbeckford/6cdbfb559b3f2c0b4eb143210bed0d3f. Of course, that is orthogonal to this PR but I'm curious what needed to be fixed with it. (I only backport a change when I see a problem, but I generally don't monitor the OCaml 5 changelog. I was expecting bugfixes to be backported to the 4.14 "LTS" version but that hasn't been happening.)

@jonahbeckford
Copy link
Contributor Author

.... FlexDLL is showing its age

Thanks for the context! I feel at some point I might explore getting rid of flexdll entirely. There is a lot of complexity in flexdll for a a) use case centered around resolving circular references for plugins which I think was not a great design, and b) to remove the need for __declspec(dll{im,ex}port) which is no longer true today for 64-bit. Combine that with the sad state of flexdll security like missing ASLR and the (huge) effort I needed to remove /GS- from

MSVC_FLAGS = /nologo /MD -D_CRT_SECURE_NO_DEPRECATE /GS- /W3

Basically, I think today there is little benefit and a lot of drawbacks where 10-20 years ago the balance was a lot more positive. But OCaml is deeply tied to the circular reference model for stubs and also (maybe related) deeply tied to a flat symbol namespace ... the two assumptions from ELF linking where DLL and dylib linking have better designs ... so I'm under no illusion that untangling the assumptions would be easy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unsupported relocation kind 0003 for __GSHandlerCheck in libcamlrun.lib(intern.obj)
4 participants