Float128 calling convention and alignment #4

simonbyrne · 2019-02-18T03:59:13Z

At the moment we fake the calling convention on Mac and Linux by reinterpreting as NTuple{2,VecElement{Float64}}. Unfortunately this doesn't seem to work on Windows.
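A minimal sketch of the reinterpretation idea, for readers unfamiliar with it: a 128-bit pattern is carried as two `VecElement{Float64}` lanes, which Julia passes like an LLVM `<2 x double>` vector. The names `Vec2F64`, `to_vec`, `from_vec`, and `ONE_BITS` are hypothetical and chosen for illustration; the package's own definitions may differ, and the low-lane-first order assumes a little-endian target.

```julia
# Hypothetical alias: two Float64 lanes, 16 bytes in total.
const Vec2F64 = NTuple{2,VecElement{Float64}}

# Split a 128-bit pattern into two 64-bit lanes (low lane first, little-endian assumption).
function to_vec(bits::UInt128)::Vec2F64
    lo = reinterpret(Float64, bits % UInt64)
    hi = reinterpret(Float64, (bits >> 64) % UInt64)
    return (VecElement(lo), VecElement(hi))
end

# Reassemble the 128-bit pattern from the two lanes.
function from_vec(v::Vec2F64)::UInt128
    lo = reinterpret(UInt64, v[1].value)
    hi = reinterpret(UInt64, v[2].value)
    return (UInt128(hi) << 64) | lo
end

# IEEE binary128 encoding of 1.0: biased exponent 0x3fff, zero fraction.
const ONE_BITS = UInt128(0x3fff) << 112

roundtrip = from_vec(to_vec(ONE_BITS))
```

The round trip only moves bits around; no 128-bit arithmetic happens here. Whether a C library actually receives and returns such a value in a vector register depends on the platform ABI, which is the question this issue is about.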

Upstream:

Also mentioned in: reinterpret segfault JuliaLang/julia#21216


simonbyrne · 2019-02-26T01:27:51Z

cc: @vchuravy @vjtnash @yuyichao

yuyichao · 2019-02-26T02:06:51Z

Just a thought, have you tried adding x86_vectorcallcc to the ccall? That seems to produce the correct calling convention, assuming you only have fp128.

simonbyrne · 2019-02-26T04:12:02Z

How do I do that? Is that an undocumented calling convention?

simonbyrne · 2019-02-26T04:18:42Z

Ah, we don't seem to support it. Should I open an issue?

yuyichao · 2019-02-26T04:19:39Z

I don't think our ccall supports it, and I'm not sure where it's documented in LLVM, but you can use this trick

yuyichao · 2019-02-26T04:21:00Z

Note that I think you won't be able to use it in ccall directly even if we support it since supporting it in ccall would probably need to come with automatic name mangling support...

simonbyrne · 2019-02-26T05:30:22Z

I see, thanks.

RalphAS · 2019-03-04T03:41:57Z

I tried some experiments following @yuyichao's suggestion, for example:

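# Call libgcc's __addtf3 through llvmcall so that x86_vectorcallcc can be requested explicitly.
# Float128, Cfloat128 and quadoplib are the definitions used inside Quadmath.
function baz(x::Float128, y::Float128)
    # %0 and %1 are the operands as <2 x double>; %2 is the function address passed as an integer.
    r = Base.llvmcall("""%f = inttoptr i64 %2 to <2 x double> (<2 x double>, <2 x double>)*
                    %vv = call x86_vectorcallcc <2 x double> %f(<2 x double> %0, <2 x double> %1)
                    ret <2 x double> %vv""",
                 Cfloat128, Tuple{Cfloat128,Cfloat128,Ptr{Cvoid}},
                 x.data, y.data, cglobal((:__addtf3,quadoplib)))
    return Float128(r)
end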

Replacing methods in Quadmath with patterns like this works fine for Linux and OSX. It runs on Windows (no segfaults) but produces incorrect answers, apparently because the Windows version of libgcc_s doesn't return Float128 results in xmm0 as expected from the ABI. (AFAICT the value of xmm0 is preserved across the call.)

yuyichao · 2019-03-04T03:59:53Z

Come to think of it, depending on whether LLVM and GCC agree on the calling convention, if you are already using llvmcall, you can probably use fp128 in it directly ;-p

Other than that, I have no idea what calling convention GCC uses. It is worth noting that GCC does NOT use the same vector calling convention as clang or MSVC. (I didn't realize you were calling a GCC-compiled library.) I just finished dealing with a similar issue with vector calls on Windows, so I'm pretty sure about that. You might have to check the assembly code to figure out what calling convention GCC is using. =(

simonbyrne · 2019-03-04T05:02:57Z

> (I didn't realize you are calling gcc compiled library....)

We're calling into libquadmath, which is bundled as part of the GCC runtime (which we apparently ship with Julia).

simonbyrne · 2019-03-04T05:06:05Z

> Replacing methods in Quadmath with patterns like this works fine for Linux and OSX. It runs on Windows (no segfaults) but produces incorrect answers, apparently because the Windows version of libgcc_s doesn't return Float128 results in xmm0 as expected from the ABI. (AFAICT the value of xmm0 is preserved across the call.)

Not sure if it makes a difference, but did you change how Cfloat128 is defined on Windows to be the same as on Linux and Mac?

RalphAS · 2019-03-04T05:49:56Z

Yes, I changed the Float128/Cfloat128 structure for Windows to resemble the others.

Note that the Quadmath conversions, comparisons, and arithmetic are passed to libgcc_s (not libquadmath) on Windows and Linux.

I was surprised to learn that LLVM fp128 instructions are also compiled into libgcc_s calls (according to code_native). My original thought was that "direct" fp128 IR might help the Windows issue, but that turned out to be pointless (other than helping me to learn LLVM).

FWIW, I found that IR using fp128 is rather fragile, in that it can cause LLVM to crash Julia or go into a coma (presumably a recurrent loop). That's why the above example just uses <2 x double>.

RalphAS · 2019-03-05T06:27:44Z

It seems that we have overthought this. The Windows libraries treat Float128 as a struct, so pointers should be used. I've started on this in PR #16. (Perhaps someone should notify the LLVM developers.)
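As a rough illustration of what "pointers should be used" can look like from the Julia side, here is a sketch that passes a 16-byte struct by reference through `ccall`. It does not call libgcc_s: `add_by_ptr` is a hypothetical stand-in (exposed via `@cfunction`) that does plain 128-bit integer addition, and `Quad128`, `add_via_pointers`, and the output-pointer signature are assumptions made for the example. The actual code in PR #16 and the real Windows ABI for the library routines may differ.

```julia
# Hypothetical 16-byte struct standing in for a Float128 treated as a plain struct.
struct Quad128
    lo::UInt64
    hi::UInt64
end

# Stand-in for a C routine that takes operands and result by pointer.
# It performs 128-bit integer addition, NOT floating-point addition.
function add_by_ptr(out::Ptr{Quad128}, a::Ptr{Quad128}, b::Ptr{Quad128})::Cvoid
    x = unsafe_load(a)
    y = unsafe_load(b)
    s = ((UInt128(x.hi) << 64) | x.lo) + ((UInt128(y.hi) << 64) | y.lo)
    unsafe_store!(out, Quad128(s % UInt64, (s >> 64) % UInt64))
    return nothing
end

# C-callable function pointer for the stand-in routine.
const add_ptr = @cfunction(add_by_ptr, Cvoid,
                           (Ptr{Quad128}, Ptr{Quad128}, Ptr{Quad128}))

# Pass every value by reference: Ref{T} arguments arrive as pointers on the C side.
function add_via_pointers(x::Quad128, y::Quad128)
    r = Ref{Quad128}()
    ccall(add_ptr, Cvoid, (Ref{Quad128}, Ref{Quad128}, Ref{Quad128}), r, x, y)
    return r[]
end

result = add_via_pointers(Quad128(1, 0), Quad128(2, 0))
```

The point of the sketch is only the argument-passing pattern: declaring arguments as `Ref{T}` makes `ccall` hand over addresses rather than trying to place the 16-byte value in registers.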

simonbyrne · 2019-04-04T16:23:45Z

Fixed by #16.

simonbyrne assigned vchuravy Feb 26, 2019

simonbyrne changed the title ~~Float128 alignment~~ Float128 calling convention and alignment Feb 26, 2019

simonbyrne mentioned this issue Feb 26, 2019

linux is ok, win crashes #11

Closed

simonbyrne unassigned vchuravy Feb 26, 2019

simonbyrne closed this as completed Apr 4, 2019

simonbyrne mentioned this issue May 9, 2019

Use primitive type JuliaMath/DecFP.jl#91

Merged
