Building luajit2 on Raspbian Stretch: Error: DASM error 11001109 #37

Closed
alecmuffett opened this issue Jan 21, 2019 · 35 comments

@alecmuffett

Hi! I am trying to build the latest repo on Raspbian Stretch (fully up to date) and encountering this:

$ pwd
/home/pi/eotk/opt.d/nginx-1.15.8/luajit2

$ git pull
Already up-to-date.

$ git status
On branch v2.1-agentzh
Your branch is up-to-date with 'origin/v2.1-agentzh'.
nothing to commit, working tree clean

$ make clean
make -C src clean
make[1]: Entering directory '/big/home/pi/eotk/opt.d/nginx-1.15.8/luajit2/src'
rm -f luajit libluajit.a libluajit.so host/minilua host/buildvm lj_vm.S lj_bcdef.h lj_ffdef.h lj_libdef.h lj_recdef.h lj_folddef.h host/buildvm_arch.h jit/vmdef.lua *.o host/*.o *.obj *.lib *.exp *.dll *.exe *.manifest *.pdb *.ilk
make[1]: Leaving directory '/big/home/pi/eotk/opt.d/nginx-1.15.8/luajit2/src'

$ make
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory '/big/home/pi/eotk/opt.d/nginx-1.15.8/luajit2/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
Error: DASM error 11001109
Makefile:650: recipe for target 'lj_vm.S' failed
make[1]: *** [lj_vm.S] Error 1
make[1]: Leaving directory '/big/home/pi/eotk/opt.d/nginx-1.15.8/luajit2/src'
Makefile:112: recipe for target 'default' failed
make: *** [default] Error 2

I'm not really sure where to go with this. I've dug into buildvm.c a bit, and 11001109 seems to be a complex status code returned by one of two possible functions?

@alecmuffett
Author

If it helps, I can/have built http://luajit.org/download/LuaJIT-2.0.5.tar.gz entirely successfully on the same machine.

@agentzh
Member

agentzh commented Jan 21, 2019

@alecmuffett Hmm, this is strange. Will you try the v2.1 branch of the following github repo on your side please?

https://github.com/LuaJIT/LuaJIT/tree/v2.1

Please use the v2.1 branch instead of the default "master" branch.

@alecmuffett
Author

As requested. It appears to function; log below.

$ git clone https://github.com/LuaJIT/LuaJIT.git
Cloning into 'LuaJIT'...
remote: Enumerating objects: 49, done.
remote: Counting objects: 100% (49/49), done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 14861 (delta 24), reused 38 (delta 24), pack-reused 14812
Receiving objects: 100% (14861/14861), 5.09 MiB | 4.40 MiB/s, done.
Resolving deltas: 100% (12127/12127), done.

$ cd LuaJIT/

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean

$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
  remotes/origin/v2.0
  remotes/origin/v2.1

$ git checkout remotes/origin/v2.1
Note: checking out 'remotes/origin/v2.1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at f0e865d... Improve luaL_addlstring().

$ git status
HEAD detached at origin/v2.1
nothing to commit, working tree clean

$ make
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory '/big/home/pi/LuaJIT/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
ASM       lj_vm.o
CC        lj_gc.o
BUILDVM   lj_ffdef.h
<...deletia...>
CC        lib_ffi.o
CC        lib_init.o
AR        libluajit.a
CC        luajit.o
BUILDVM   jit/vmdef.lua
DYNLINK   libluajit.so
LINK      luajit
OK        Successfully built LuaJIT
make[1]: Leaving directory '/big/home/pi/LuaJIT/src'
==== Successfully built LuaJIT 2.1.0-beta3 ====

@xsw9527

xsw9527 commented Jan 29, 2019

I also got this problem
luajit2-2.1-20190115$ make clean
make -C src clean
make[1]: Entering directory /home/xstrive/hisi/nginxrtmp_xstrivev100/luajit2-2.1-20190115/src'
rm -f luajit libluajit.a libluajit.so host/minilua host/buildvm lj_vm.S lj_bcdef.h lj_ffdef.h lj_libdef.h lj_recdef.h lj_folddef.h host/buildvm_arch.h jit/vmdef.lua *.o host/*.o *.obj *.lib *.exp *.dll *.exe *.manifest *.pdb *.ilk
make[1]: Leaving directory /home/xstrive/hisi/nginxrtmp_xstrivev100/luajit2-2.1-20190115/src'
xstrive@xstrive-virtual-machine:~/hisi/nginxrtmp_xstrivev100/luajit2-2.1-20190115$ make HOST_CC="gcc -m32" CROSS=arm-hisiv100nptl-linux- TARGET_CFLAGS="-mfloat-abi=soft"
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory /home/xstrive/hisi/nginxrtmp_xstrivev100/luajit2-2.1-20190115/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
Error: DASM error 1100104c
make[1]: *** [lj_vm.S] Error 1
make[1]: Leaving directory /home/xstrive/hisi/nginxrtmp_xstrivev100/luajit2-2.1-20190115/src'
make: *** [default] Error 2

If I use LuaJIT 2.1 from https://github.com/LuaJIT/LuaJIT/tree/v2.1, everything is OK:
xstrive@xstrive-virtual-machine:~/hisi/nginxrtmp_xstrivev100/LuaJIT-2.1$ make HOST_CC="gcc -m32" CROSS=arm-hisiv100nptl-linux- TARGET_CFLAGS="-mfloat-abi=soft"
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory /home/xstrive/hisi/nginxrtmp_xstrivev100/LuaJIT-2.1/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
ASM       lj_vm.o
CC        lj_gc.o
BUILDVM   lj_ffdef.h
CC        lj_err.o
CC        lj_char.o
BUILDVM   lj_bcdef.h
CC        lj_bc.o
CC        lj_obj.o
CC        lj_buf.o
CC        lj_str.o
CC        lj_tab.o
CC        lj_func.o
CC        lj_udata.o
CC        lj_meta.o
CC        lj_debug.o
CC        lj_state.o
CC        lj_dispatch.o
CC        lj_vmevent.o
CC        lj_vmmath.o
CC        lj_strscan.o
CC        lj_strfmt.o
CC        lj_strfmt_num.o
CC        lj_api.o
CC        lj_profile.o
CC        lj_lex.o
CC        lj_parse.o
CC        lj_bcread.o
CC        lj_bcwrite.o
CC        lj_load.o
CC        lj_ir.o
CC        lj_opt_mem.o
BUILDVM   lj_folddef.h
CC        lj_opt_fold.o
CC        lj_opt_narrow.o
CC        lj_opt_dce.o
CC        lj_opt_loop.o
CC        lj_opt_split.o
CC        lj_opt_sink.o
CC        lj_mcode.o
CC        lj_snap.o
CC        lj_record.o
CC        lj_crecord.o
BUILDVM   lj_recdef.h
CC        lj_ffrecord.o
CC        lj_asm.o
CC        lj_trace.o
CC        lj_gdbjit.o
CC        lj_ctype.o
CC        lj_cdata.o
CC        lj_cconv.o
CC        lj_ccall.o
CC        lj_ccallback.o
CC        lj_carith.o
CC        lj_clib.o
CC        lj_cparse.o
CC        lj_lib.o
CC        lj_alloc.o
CC        lib_aux.o
BUILDVM   lj_libdef.h
CC        lib_base.o
CC        lib_math.o
CC        lib_bit.o
CC        lib_string.o
CC        lib_table.o
CC        lib_io.o
CC        lib_os.o
CC        lib_package.o
CC        lib_debug.o
CC        lib_jit.o
CC        lib_ffi.o
CC        lib_init.o
AR        libluajit.a
CC        luajit.o
BUILDVM   jit/vmdef.lua
DYNLINK   libluajit.so
LINK      luajit
OK        Successfully built LuaJIT
make[1]: Leaving directory /home/xstrive/hisi/nginxrtmp_xstrivev100/LuaJIT-2.1/src'
==== Successfully built LuaJIT 2.1.0-beta3 ====

@agentzh
Member

agentzh commented Jan 29, 2019

@xsw9527 Are you also on Raspbian Stretch?

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett Thanks for your update.

@alecmuffett @xsw9527 What version of Raspberry Pi device are you using, respectively?

@alecmuffett
Author

alecmuffett commented Jan 29, 2019

Raspbian Stretch, latest patches for everything, on a RPi3B with plenty of resources/filesystem/swap.

I am not certain that @xsw9527 is using a RPi; his prompt has a system hostname of xstrive@xstrive-virtual-machine so may be using a VM or Container of some kind? (Edit: I just noticed HOST_CC="gcc -m32" CROSS=arm-hisiv100nptl-linux which may suggest cross-compilation for ARM?)

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett It'd be great if you could help pin down the guilty commit SHA1 in our repo by bisecting the commits. Maybe write a script to automate the bisecting process.

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett In the meantime, I'll try finding a RPi device so that I can actually reproduce and debug this.

@alecmuffett
Author

alecmuffett commented Jan 29, 2019

@agentzh No promises, but I will have a try tonight. I am familiar with the concept, but am a Git noob (previous: svn, mercurial) so I don't suppose you can recommend a webpage which walks through the Git workflow for bisecting?

Edit: https://git-scm.com/docs/git-bisect ?

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett The git log --oneline command would be very helpful in getting all the commits' SHA1. Then run git checkout SHA1, where SHA1 is the target commit's SHA1. The starting point can be any old version that works for you; the more recent, the better. It's essentially like that.

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett git bisect is indeed better.

@xsw9527

xsw9527 commented Jan 29, 2019

I am using ARM-based Linux. I use cross-compile tools to compile luajit in a virtual machine running Ubuntu 14.04.

@xsw9527

xsw9527 commented Jan 29, 2019

I can give you my cross-compile tools if you need them, so that you can reproduce this issue more easily.

@xsw9527

xsw9527 commented Jan 29, 2019

I looked into the code and found that this error comes from src/host/buildvm.c.

@alecmuffett
Author

Bisected:

$ git bisect good
c844a613df9f2a4e2102d0fa5de827e63ab19281 is the first bad commit
commit c844a613df9f2a4e2102d0fa5de827e63ab19281
Author: Yichun Zhang (agentzh) <[email protected]>
Date:   Thu Jun 7 22:27:29 2018 -0700

    feature: implemented the new Lua and C API functions for thread exdata.

    The Lua API can be used like below:

      local exdata = require "thread.exdata"
      exdata(0xdeadbeefLL)  -- set the exdata of the current Lua thread
      local ptr = exdata()  -- fetch the exdata of the current Lua thread

    The exdata value on the Lua land is represented as a cdata object of the
    ctype "void*".

    Right now the reading API, i.e., `exdata()` calls without any arguments,
    can be JIT compiled.

    Also exposed the following public C API functions for manipulating
    exdata on the C land:

      void lua_setexdata(lua_State *L, void *exdata);
      void *lua_getexdata(lua_State *L);

    The exdata pointer is initialized to NULL when the main thread is
    created. Any child Lua threads created will inherit the parent's exdata
    but still have their own exdata storage. So child Lua threads can always
    override the inherited parent exdata pointer values.

    This API is used internally by the OpenResty core so never ever mess
    with it yourself in the context of OpenResty.

    Thanks Zexuan Luo for preparing the final version of the patch.

    Signed-off-by: Yichun Zhang (agentzh) <[email protected]>

:100644 100644 7864b7f0e3e45e120edf0707b20a87e493de141e 159ee994a46a11313c60909db90720e290084e07 M      README
:040000 040000 5a309b395c635bbe283b8f3439a7aea59554fe74 65a142a981f045251886372472230a124a590a8e M      src
:000000 040000 0000000000000000000000000000000000000000 f8852ff9e83ef4a7a88fce9b5c37e94ab7c33366 A      t

@alecmuffett
Author

Hypothesis: this rings a bell from another issue that I logged in Nginx:

openresty/lua-nginx-module#1423

...maybe there's a similar (but incorrect) blind assumption of 64-bit ARM behaviour in the C API code?

@alecmuffett
Author

For convenience/confirmation, this was the last bad bisect error:

Bisecting: 1 revision left to test after this (roughly 1 step)
[c844a613df9f2a4e2102d0fa5de827e63ab19281] feature: implemented the new Lua and C API functions for thread exdata.
13:33:03 invalid:luajit2 $ make clean && make
make -C src clean
make[1]: Entering directory '/big/home/pi/luajit2/src'
rm -f luajit libluajit.a libluajit.so host/minilua host/buildvm lj_vm.S lj_bcdef.h lj_ffdef.h lj_libdef.h lj_recdef.h lj_folddef.h host/buildvm_arch.h jit/vmdef.lua *.o host/*.o *.obj *.lib *.exp *.dll *.exe *.manifest *.pdb *.ilk
make[1]: Leaving directory '/big/home/pi/luajit2/src'
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory '/big/home/pi/luajit2/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
Error: DASM error 11001109
Makefile:646: recipe for target 'lj_vm.S' failed
make[1]: *** [lj_vm.S] Error 1
make[1]: Leaving directory '/big/home/pi/luajit2/src'
Makefile:112: recipe for target 'default' failed
make: *** [default] Error 2
13:33:31 invalid:luajit2 $ git bisect bad

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett Thanks for digging it up! I can now reproduce the error on my side with arm cross-compiling. We'll investigate it.

@sparvu

sparvu commented Jan 29, 2019

This works fine on FreeBSD 12.0 on an RBPI3B+. Must be something Linux related.

  • platform
FreeBSD k50dev 12.0-RELEASE FreeBSD 12.0-RELEASE r341666 GENERIC  arm64
% gmake
==== Building LuaJIT 2.1.0-beta3 ====
gmake -C src
gmake[1]: Entering directory '/home/krmx/luajit2/src'
HOSTCC    host/minilua.o
...
LINK      luajit
OK        Successfully built LuaJIT
gmake[1]: Leaving directory '/home/krmx/luajit2/src'
==== Successfully built LuaJIT 2.1.0-beta3 ====
% gmake
==== Building LuaJIT 2.1.0-beta3 ====
gmake -C src
gmake[1]: Entering directory '/home/krmx/LuaJIT/src'
HOSTCC    host/minilua.o
...
OK        Successfully built LuaJIT
gmake[1]: Leaving directory '/home/krmx/LuaJIT/src'
==== Successfully built LuaJIT 2.1.0-beta3 ====

@agentzh
Member

agentzh commented Jan 29, 2019

Guys, will you try the following patch on your side? It seems to work fine on my side with cross-compilation (at least it builds successfully):

diff --git a/src/lj_obj.h b/src/lj_obj.h
index 08612d7912..94ac9beee9 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -662,7 +662,11 @@ struct lua_State {
   void *cframe;                /* End of C stack frame chain. */
   MSize stacksize;     /* True stack size (incl. LJ_STACK_EXTRA). */
   void *exdata;                /* user extra data pointer. added by OpenResty */
-};
+}
+#if LJ_TARGET_ARM
+__attribute__ ((aligned (16)))
+#endif
+;

 #define G(L)                   (mref(L->glref, global_State))
 #define registry(L)            (&G(L)->registrytv)

@agentzh
Member

agentzh commented Jan 29, 2019

@sparvu Maybe it's specific to 32-bit arm instead of arm64...

@agentzh
Member

agentzh commented Jan 29, 2019

@alecmuffett Show your uname -a output please?

@sparvu

sparvu commented Jan 29, 2019

@agentzh: Ahh, right. I was thinking we were talking only about 64-bit ARM boards, like the RBPI3B+.

@alecmuffett
Author

@sparvu - the RPi3b is a 64-bit chip, but Raspbian is a 32-bit OS because 1GB of ram is limited when running in 64-bit mode

@alecmuffett
Author

alecmuffett commented Jan 29, 2019

$ uname -a
Linux invalid 4.14.79-v7+ #1159 SMP Sun Nov 4 17:50:20 GMT 2018 armv7l GNU/Linux

$ getconf LONG_BIT
32

$ lscpu
Architecture:          armv7l
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
Model:                 4
Model name:            ARMv7 Processor rev 4 (v7l)
CPU max MHz:           1200.0000
CPU min MHz:           600.0000
BogoMIPS:              38.40
Flags:                 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32

$ cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 76.80
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4
[...deletia for the other 3 cores...]
Hardware        : BCM2835
Revision        : a02082
Serial          : 000000002f24d5b7

$ git status
On branch v2.1-agentzh
Your branch is up-to-date with 'origin/v2.1-agentzh'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   src/lj_obj.h

no changes added to commit (use "git add" and/or "git commit -a")
$ git diff
diff --git a/src/lj_obj.h b/src/lj_obj.h
index 08612d7..e4c6f8f 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -662,7 +662,12 @@ struct lua_State {
   void *cframe;                /* End of C stack frame chain. */
   MSize stacksize;     /* True stack size (incl. LJ_STACK_EXTRA). */
   void *exdata;                /* user extra data pointer. added by OpenResty */
-};
+}
+#if LJ_TARGET_ARM
+__attribute__ ((aligned (16)))
+#endif
+;
+

 #define G(L)                   (mref(L->glref, global_State))
 #define registry(L)            (&G(L)->registrytv)

...it compiles to completion; I can't find a test suite? (edit: though there's a curious Perl module)

@sparvu

sparvu commented Jan 29, 2019 via email

@alecmuffett
Author

@sparvu I thought I recognised the name! :-)

@agentzh
Member

agentzh commented Jan 30, 2019

@alecmuffett Good to know it compiles fine on your side with my patch. You can try running the test suite here:

https://github.com/openresty/luajit2-test-suite

BTW, the memory limit is 128TB (hence actually no limit on most, if not all systems) when building LuaJIT on arm64 because the GC64 mode is enforced on that architecture.

@alecmuffett
Author

These are headless machines, is there a way to run tests without a GTK dependency?

$ ./run-tests /home/pi/foo/
Package gtk+-2.0 was not found in the pkg-config search path.
Perhaps you should add the directory containing `gtk+-2.0.pc'
to the PKG_CONFIG_PATH environment variable
No package 'gtk+-2.0' found
failed to run command pkg-config --cflags --libs gtk+-2.0: 256 at ./run-tests line 73.

@agentzh
Member

agentzh commented Jan 30, 2019

@alecmuffett Some of the tests just manipulate the gtk 2 libraries. They do not require a desktop environment or an X window server. I always run the tests on a headless server myself. And that github repo also runs the test suite in Travis CI, which is also a headless environment.

@agentzh
Member

agentzh commented Jan 30, 2019

Just for the record, here are some details for this issue:

The original Error: DASM error 11001109 error message is the DASM_S_RANGE_I error code, defined as 0x11000000 (the lower bits are the location of the offending DynASM instruction). It's usually set by the CK macro on the C level.
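
To make the split concrete, here is a minimal C sketch of how such a status word can be taken apart. The `TOY_` names and the assumption that the status class occupies exactly the top byte are mine, not taken from the DynASM headers; only the value 0x11000000 comes from the comment above.

```c
#include <stdint.h>

/* Value quoted above for DASM_S_RANGE_I. */
#define TOY_DASM_S_RANGE_I  0x11000000u
/* Assumption for illustration: the status class lives in the top byte. */
#define TOY_STATUS_MASK     0xff000000u

/* Status class, e.g. 0x11000000 for a RANGE_I error. */
uint32_t toy_status_class(uint32_t status)
{
  return status & TOY_STATUS_MASK;
}

/* Remaining low bits: the location of the offending DynASM instruction. */
uint32_t toy_status_position(uint32_t status)
{
  return status & ~TOY_STATUS_MASK;
}
```

Under that assumption, both codes reported in this thread (11001109 and 1100104c) share the RANGE_I class and differ only in the position part (0x1109 and 0x104c).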

There were many offending DynASM instructions that set this RANGE_I error. All of them were caused by ill-formed immediate values to be encoded into the ARM machine instructions. The first one's C-land backtrace is like this:

#0 0x08048ee2 in dasm_imm12 (n=2888) at ./../dynasm/dasm_arm.h:175
#1 0x080495ec in dasm_put (ctx=0xffffb140, start=110) at ./../dynasm/dasm_arm.h:268
#2 0x08049fe2 in build_subroutines (ctx=0xffffb140) at host/buildvm_arch.h:6601
#3 0x0804c7bc in build_backend (ctx=0xffffb140) at host/buildvm_arch.h:11312
#4 0x0804cdc7 in build_code (ctx=0xffffb140) at host/buildvm.c:195
#5 0x0804d78f in main (argc=5, argv=0xffffbbb4) at host/buildvm.c:447

The DynASM instruction encoding statement looks like this (at host/buildvm_arch.h:6601):

  dasm_put(Dst, 110, Dt2(->vmstate), ~CFRAME_RAWMASK, Dt1(->base), Dt1(->glref), ~LJ_TFALSE, GG_G2DISP, LJ_VMST_INTERP, DISPATCH_GL(vmstate));

The offending instruction operand is the value of GG_G2DISP. This operand value, which is 2888 according to the backtrace above (as the n argument of the dasm_imm12 call), cannot be encoded directly as a 12-bit immediate value as per the ARM instruction set encoding rules. In the official LuaJIT v2.1 branch, by contrast, the value is 2896, which is fine.
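
The difference between 2888 and 2896 can be checked with a small C sketch of the ARM data-processing immediate rule (an 8-bit value rotated right by an even amount, stored as a 4-bit rotation plus 8 bits). This is an illustration written for this thread, not DynASM's own dasm_imm12 code, and the function names are hypothetical.

```c
#include <stdint.h>

/* Rotate left by r bits (0..31); r == 0 is handled separately to avoid
 * an undefined shift by 32. */
static uint32_t toy_rotl32(uint32_t v, unsigned r)
{
  return r ? (v << r) | (v >> (32u - r)) : v;
}

/* Try to express n as imm8 rotated right by 2*rot, with rot in 0..15.
 * Returns the 12-bit field (rot << 8) | imm8, or -1 if n does not fit. */
int toy_arm_encode_imm(uint32_t n)
{
  unsigned rot;
  for (rot = 0; rot < 16; rot++) {
    /* Undo a right rotation by 2*rot with a left rotation. */
    uint32_t imm8 = toy_rotl32(n, 2u * rot);
    if (imm8 <= 0xffu)
      return (int)((rot << 8) | imm8);
  }
  return -1;
}
```

With this sketch, 2896 (0xB50) fits as 0xB5 rotated right by 28 bits, giving the field 0xEB5, while 2888 (0xB48) has set bits spanning nine positions (bit 3 to bit 11), so no rotation squeezes it into 8 bits and the function returns -1.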

This GG_G2DISP macro is defined as

#define GG_G2DISP       (GG_OFS(dispatch) - GG_OFS(g))

where GG_OFS() is defined as

#define GG_OFS(field)   ((int)offsetof(GG_State, field))

where GG_State has a lua_State field as its first field like this:

typedef struct GG_State {
  lua_State L;				/* Main thread. */
  global_State g;			/* Global state. */
  ...

Thus, GG_OFS() returns different values based on the size of lua_State. In the guilty git commit found by @alecmuffett above, we added a new void *exdata field to the end of lua_State, changing the size of the struct and thus the final (constant) value of GG_G2DISP (and things like that). We just need to adjust the size of lua_State such that the final value can be directly encoded as an ARM immediate value (an 8-bit value rotated right by an even number of bits).
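
As a rough illustration of how a size change in the first member can shift a difference of offsets, here is a toy C11 model. The struct names, sizes and the 16-byte aligned `dispatch` member are all assumptions chosen to make the effect visible; they are not the real GG_State layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins: a 48-byte "state" and the same state with one extra
 * 4-byte field appended (52 bytes). Sizes assume uint32_t is 4 bytes
 * with 4-byte alignment, as on common ABIs. */
typedef struct { uint32_t words[12]; } ToyStateOld;
typedef struct { uint32_t words[13]; } ToyStateNew;
typedef struct { uint32_t words[10]; } ToyGlobal;  /* 40 bytes */

typedef struct {
  ToyStateOld L;
  ToyGlobal g;
  _Alignas(16) uint32_t dispatch[4];  /* hypothetical aligned member */
} ToyGGOld;

typedef struct {
  ToyStateNew L;
  ToyGlobal g;
  _Alignas(16) uint32_t dispatch[4];
} ToyGGNew;

/* Same shape as GG_G2DISP: offset of dispatch minus offset of g. */
int toy_g2disp_old(void)
{
  return (int)(offsetof(ToyGGOld, dispatch) - offsetof(ToyGGOld, g));
}

int toy_g2disp_new(void)
{
  return (int)(offsetof(ToyGGNew, dispatch) - offsetof(ToyGGNew, g));
}
```

On such ABIs the first layout puts g at offset 48 and dispatch at 96 (difference 48), while the second puts g at 52 and dispatch still at 96 (difference 44). The alignment padding absorbs the extra field, so the difference changes even though neither g nor dispatch was touched.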

My previous patch enforces 16-byte alignment in lua_State, which is overkill. The following patch is a bit more efficient in that it saves 4 bytes in every lua_State object (from 64 bytes to 60 bytes) on 32-bit ARM systems:

diff --git a/src/lj_obj.h b/src/lj_obj.h
index 08612d7912..a89ea0d982 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -662,6 +662,10 @@ struct lua_State {
   void *cframe;                /* End of C stack frame chain. */
   MSize stacksize;     /* True stack size (incl. LJ_STACK_EXTRA). */
   void *exdata;                /* user extra data pointer. added by OpenResty */
+#if LJ_TARGET_ARM
+  uint32_t unused1;
+  uint32_t unused2;
+#endif
 };

 #define G(L)                   (mref(L->glref, global_State))

Because we are now padding the struct size ourselves, it is also more portable (no longer requires the gcc attribute support).

I've run the test suite using qemu-arm on my side and most of the tests are passing. I also checked that the failures are present when running the test suite with the official v2.1 branch too. Most of the failures are due to the lack of ARM versions of the gtk and mpc libraries on my system. The details of the failures are below:

https://gist.github.com/agentzh/1295b2ad913f1bcf32a65733460ab3f1

The only test failure that really deserves further investigation is the one in compare.lua:

/opt/luajit-arm/bin/luajit-2.1.0-beta3: compare.lua:226: assertion failed!
stack traceback:
	[C]: in function 'assert'
	compare.lua:226: in main chunk
	[C]: at 0x00014cb8
=== test/misc/argcheck.lua
=== test/misc/self.lua
Failed test when running /home/agentzh/git/luajit2-test-suite/arm-luajit compare.lua 1: 256

When I print out the expression value, it is indeed negative:

-2147483592

But for some reason, bit.tobit(i+0x7fffffff) < 0 does not evaluate to a true value. Not sure if it is also reproducible on a real ARM chip. Anyway, I'm closing this issue.
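
For reference, the expected wraparound can be modelled in C. This is a toy integer-only model of a 32-bit normalization, not LuaJIT's implementation. The input 57 is an inference from the printed value (57 + 0x7fffffff wraps to -2147483592); the actual loop variable in compare.lua is not shown in this thread.

```c
#include <stdint.h>

/* Reduce x modulo 2^32 and reinterpret the result as a signed 32-bit
 * value, without relying on implementation-defined conversions. */
int32_t toy_tobit(int64_t x)
{
  uint32_t u = (uint32_t)x;  /* modulo 2^32, well defined for unsigned */
  if (u & 0x80000000u)
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
  return (int32_t)u;
}
```

In this model `toy_tobit(57 + 0x7fffffffLL)` is -2147483592, so a `< 0` comparison on it should hold, which is what the failing assertion expects.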

@agentzh
Member

agentzh commented Jan 30, 2019

Patch is already committed.

@agentzh
Member

agentzh commented Jan 31, 2019

I've already got all the tests passing with cross-compiled arm and aarch64 builds of LuaJIT on my Linux x86_64 system using qemu user mode. The only remaining issue is the failure in compare.lua for the ARM build, as mentioned above. Created issue #38 to record this.

siddhesh referenced this issue in moonjit/moonjit Aug 21, 2019
Buristan added a commit to tarantool/luajit that referenced this issue Oct 7, 2020
This patch introduces the following counters:
  - overall amount of allocated tables, cdata and udata objects
  - number of incremental GC steps grouped by GC state
  - number of string hashes hits and misses
  - amount of allocated and freed memory
  - number of trace aborts, number of traces and restored snapshots

Also this patch fixes alignment for 64-bit architectures.

NB: MSize and BCIns are the only fixed types that equal 32 bits. GCRef,
MRef and GCSize sizes depend on LJ_GC64 define.

struct GCState is terminated by three fields: GCSize estimate, MSize
stepmul and MSize pause, which are aligned. The introduced size_t
fields do not violate the alignment too.

vmstate 32-bit field goes right after GCState field within global_State
structure. The next field tmpbuf consists of several MRef fields that
have 64-bit size each. This issue can be solved by moving vmstate field
below. However DynASM doesn't work well with unaligned memory access on
64-bit bigendian MIPS, so vmstate should be aligned to a 64-bit
boundary.

Furthermore field order has been changed to be able to compile code by
DynASM for 32-bit ARM too (see also
openresty/luajit2#37 (comment)).

Interfaces to obtain these metrics via both Lua and C API are
introduced in the next patch.

Part of tarantool/tarantool#5187
Buristan added a commit to tarantool/luajit that referenced this issue Oct 7, 2020
Buristan added a commit to tarantool/luajit that referenced this issue Oct 8, 2020
kyukhin pushed a commit to tarantool/luajit that referenced this issue Oct 13, 2020
kyukhin pushed a commit to tarantool/luajit that referenced this issue Oct 13, 2020
igormunkin pushed a commit to tarantool/luajit that referenced this issue Jun 16, 2022