The graalvm-example fails with graalvm 1.0.0-rc13 #1031

Closed
wkozaczuk opened this issue Mar 8, 2019 · 5 comments

Comments

@wkozaczuk
Collaborator

One can easily reproduce the issue by replacing rc12 with rc13 in the Makefile of the graalvm-example app and then building the image like so (I prefer ROFS and turn networking off to limit noise):

./scripts/build fs=rofs image=graalvm-example
./scripts/run.py -c 1 --nics 0
OSv v0.52.0-54-ga191c10e
initialization error

If one goes to ./apps/graalvm-example and runs the app directly on the host:

LD_LIBRARY_PATH=. ./main.so
Hello, World from GraalVM on OSv!

it works fine.

Also, the same example works just fine with graalvm 1.0.0-rc12.

I spent some time digging and here is what I have found so far:

  1. The main.c exits after a failed call to graal_create_isolate.
  2. After applying this patch:
diff --git a/graalvm-example/main.c b/graalvm-example/main.c
index 90cb558..74baccc 100644
--- a/graalvm-example/main.c
+++ b/graalvm-example/main.c
@@ -6,6 +6,7 @@
 //
 #include <stdlib.h>
 #include <stdio.h>
+#include <errno.h>

 #include <libhello.h>

@@ -13,7 +14,11 @@ int main(int argc, char **argv) {
   graal_isolate_t *isolate = NULL;
   graal_isolatethread_t *thread = NULL;

-  if (graal_create_isolate(NULL, &isolate, &thread) != 0) {
+  int ret = graal_create_isolate(NULL, &isolate, &thread);
+  if (ret != 0) {
+    printf("Ret: %d\n", ret);
+    printf("Errno: %d\n", errno);
+    perror("graal_create_isolate failed!");
     fprintf(stderr, "initialization error\n");
     return 1;
   }

one can see this output:

OSv v0.52.0-54-ga191c10e
Ret: 9
Errno: 22
graal_create_isolate failed!: Invalid argument
initialization error
  3. The errno 22 (EINVAL) indicates that a wrong argument was passed somewhere, possibly to an OSv system function, but we do not know which one. The ret value of 9 is what graal_create_isolate() returns from inside the GraalVM-generated native image (libhello.so <- Hello.java), which is called from the thin main.so wrapper. I do not know for sure what 9 means, but I have found this in the GraalVM sources:
@Description("Setting the protection of the heap memory failed.") //
    public static final int PROTECT_HEAP_FAILED = 9;
  4. When searching for where PROTECT_HEAP_FAILED is used, I found it in 5 places across 2 source files:
    - LinuxImageHeapProvider
    - CopyingImageHeapProvider

I have also discovered that LinuxImageHeapProvider is new and was introduced in rc13, probably to optimize "heap handling". I am not 100% sure, but I think that something fails in this section of LinuxImageHeapProvider. The code is actually pretty easy to read and has many comments that will hopefully lead to some good hints from more experienced OSv developers :-)
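To make the failure output easier to read, the return code and errno can be turned into text. The following is only a minimal sketch: the helper names are hypothetical, the value 9 and its description are copied from the GraalVM snippet quoted above, treating 0 as success follows the `!= 0` check in main.c, and no other codes are mapped because they were not checked.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Value copied from the GraalVM snippet quoted in this issue. Other codes
 * are deliberately not listed because they were not verified. */
#define PROTECT_HEAP_FAILED 9

/* Hypothetical helper: describes a graal_create_isolate() return code,
 * or returns "unknown" for codes this sketch does not cover. */
static const char *describe_isolate_error(int ret) {
    switch (ret) {
    case 0:
        return "success";
    case PROTECT_HEAP_FAILED:
        return "Setting the protection of the heap memory failed.";
    default:
        return "unknown";
    }
}

/* Hypothetical helper: formats ret and a saved errno into buf.
 * The caller should copy errno right after the failing call and before any
 * printf(), because stdio calls are allowed to overwrite it. */
static int format_isolate_failure(char *buf, size_t len,
                                  int ret, int saved_errno) {
    return snprintf(buf, len, "ret=%d (%s), errno=%d (%s)",
                    ret, describe_isolate_error(ret),
                    saved_errno, strerror(saved_errno));
}
```

In main.c this would be used by saving `errno` into a local variable immediately after `graal_create_isolate()` returns and passing both values to `format_isolate_failure()`.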

  5. Finally, I captured tracepoints from the app built with GraalVM rc12 and rc13 and saw some interesting differences between them. The rc13 version of this app shows these calls:
0xffff800001a2d040 /main.so         0         0.210289942 vfs_pwritev          1 0x00002000001ffa70 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.210804418 vfs_pwritev          1 0x00002000001ffae0 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.211052340 vfs_pwritev          2 0x00002000001ffdf0 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.211534094 vfs_pwritev          2 0x00002000001ffe10 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.211599214 vfs_pwritev          2 0x00002000001ffe10 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.211640726 vfs_pwritev          2 0x00002000001ffdf0 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.211850051 vfs_pwritev          2 0x00002000001ffe10 0x2 0x-1
0xffff800001a2d040 /main.so         0         0.211893681 vfs_pwritev          2 0x00002000001ffe20 0x2 0x-1

and no mmap (though I think mmap is used internally when we load ELF objects, and somehow that does not get registered by the relevant tracepoints). Why would it try to write to a file? To an mmapped file? Is this what some comments in LinuxImageHeapProvider indicate?
The rc12 version does not show any vfs_pwritev calls; instead one can see one mmap:

0xffff800001a2c040 /main.so         0         0.350311380 memory_mmap          addr=0x0000000000000000, length=2097152, prot=3, flags=34, fd=-1, offset=0
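For reference, the numeric arguments in the rc12 trace line can be decoded with the standard Linux constants. The sketch below assumes Linux x86-64 values (other architectures may number the flags differently), and the function name is made up for illustration. It repeats the same call: a 2 MiB private anonymous read/write mapping.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper that repeats the mmap call seen in the rc12 trace.
 * Returns the new mapping, or MAP_FAILED on error. */
static void *map_like_rc12_trace(void) {
    const size_t length = 2097152;                  /* length=2097152, 2 MiB */
    const int prot = PROT_READ | PROT_WRITE;        /* prot=3  (1 | 2)       */
    const int flags = MAP_PRIVATE | MAP_ANONYMOUS;  /* flags=34 (0x02 | 0x20) */

    /* addr=0 lets the kernel pick the address; fd=-1 and offset=0 are the
     * usual arguments for an anonymous mapping. */
    return mmap(NULL, length, prot, flags, -1, 0);
}
```

So the rc12 image simply asks for anonymous memory, with no file involved in that particular call.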

Any ideas or hints on how to debug/troubleshoot it better?

Some other links:

@vjovanov

You could try with -H:-SpawnIsolates. I believe that the copy-on-write image heap is not working well on OSv for some reason. The problem on the Graal side is most likely in com.oracle.svm.core.posix.linux.LinuxImageHeapProvider, as it was never tested on OSv.

@wkozaczuk
Collaborator Author

@vjovanov Thanks for reaching out to us. Indeed, using -H:-SpawnIsolates makes the problem go away (I tested with RC 16). I was trying to find out what this advanced flag does but could not find any good info. Could you please explain what behavior of the produced native image the SpawnIsolates setting drives? Also, what is the difference between CopyingImageHeapProvider and LinuxImageHeapProvider? Is the latter more memory-efficient or faster? I wonder what it would mean for OSv, as it is intended to run a single executable only.

Our dynamic linker should be able to handle unmodified Linux ELFs, so I wonder if there is a bug in it on our side. One thing we had to change in the past to support GraalVM was enhancing the OSv linker to support non-fPIC shared objects (please see #1004). Relatedly, I wonder if GraalVM will at some point support position-independent executables (PIEs)? Currently OSv supports only shared libraries and PIEs, but I hope to support position-dependent executables at some point as well.

@vjovanov

vjovanov commented May 7, 2019

This article is the best documentation:

https://medium.com/graalvm/isolates-and-compressed-references-more-flexible-and-efficient-memory-management-for-graalvm-a044cc50b67e

-H:+SpawnIsolates first creates a heap by requesting a contiguous chunk of memory space from the OS and setting the isolate pointer (heap-base pointer) to the beginning of that space. Then, the initial image heap is copy-on-write memory mapped to the beginning of the newly provisioned heap. This step requires some low-level hacking to get the right positions. CopyingImageHeapProvider will copy the initial heap instead of copy-on-write mapping it, so its cost grows linearly with the size of the image heap (memory mapping is a constant-time operation). Further, compressed references (less memory consumption) require isolates to be enabled. This is, however, an implementation detail, as on HotSpot we have compressed references but no isolates.
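The two strategies described above can be illustrated with plain POSIX calls. This is a rough sketch under stated assumptions, not the code GraalVM actually uses: the function names, the 2 MiB reservation size, and the use of a regular file as the "image" are all made up for illustration.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define HEAP_RESERVE (2u * 1024 * 1024) /* arbitrary size for this sketch */

/* Reserve a contiguous address range without making it accessible.
 * The start of the range plays the role of the heap base. */
static void *reserve_heap(void) {
    return mmap(NULL, HEAP_RESERVE, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Variant A: map the image file copy-on-write at the heap base.
 * MAP_PRIVATE means writes land in private pages and the file itself is
 * never modified. Cost does not depend on how much data is touched later. */
static int map_image_cow(void *heap_base, int image_fd, size_t image_size) {
    void *p = mmap(heap_base, image_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED, image_fd, 0);
    return p == heap_base ? 0 : -1;
}

/* Variant B: make the start of the range writable and copy the image in.
 * The memcpy cost grows linearly with image_size. */
static int copy_image(void *heap_base, const void *image, size_t image_size) {
    if (mprotect(heap_base, image_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(heap_base, image, image_size);
    return 0;
}
```

Variant A relies on file-backed `MAP_PRIVATE | MAP_FIXED` mappings and Variant B on `mprotect` over a reserved range, so either one exercises more of the kernel's memory-mapping support than a plain anonymous mapping does.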

We support a mode where you will get a shared library with a run_main method. For this you need to pass --shared next to the main entry point.

Once this is produced you can simply call into run_main from the OSv entry point and provide the right arguments.
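As a rough illustration of calling into run_main, here is a sketch that loads the library at runtime and forwards the arguments. Several things here are assumptions: the `int run_main(int, char **)` signature should be checked against your own build, the wrapper name is hypothetical, and using dlopen/dlsym is just one option (linking against the library directly, as main.c does with libhello, would work too).

```c
#include <dlfcn.h>
#include <stdio.h>

/* Assumed signature of the run_main entry point; verify it against the
 * library produced by your native-image build. */
typedef int (*run_main_fn)(int argc, char **argv);

/* Hypothetical wrapper: loads the shared library at lib_path, looks up
 * run_main and forwards the arguments to it.
 * Returns -1 if the library or the symbol cannot be found. */
static int call_run_main(const char *lib_path, int argc, char **argv) {
    void *lib = dlopen(lib_path, RTLD_NOW);
    if (lib == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    run_main_fn run_main = (run_main_fn)dlsym(lib, "run_main");
    if (run_main == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(lib);
        return -1;
    }

    /* The library is intentionally left loaded after the call, since the
     * runtime inside it may still own threads or memory. */
    return run_main(argc, argv);
}
```

On older glibc versions this needs to be linked with `-ldl`.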

@wkozaczuk
Collaborator Author

wkozaczuk commented Jul 9, 2019

@vjovanov I have just committed a new version of the original example that works with LinuxImageHeapProvider as is, using version 19.1.0 of GraalVM and including full support for isolates. It turns out there were some trivial issues - this and that - that I had to fix in OSv to make it all work.
I also added a couple of new examples, most notably graalvm-netty-plot, which demonstrates the exact same example described in the article you mentioned above. All seems to work just fine.

A couple of questions, if you do not mind:

  1. I have noticed that by default native-image produces PIEs (position-independent executables) on my Ubuntu 19.04. I am almost certain that in the past it was producing position-dependent ones, but I wonder whether an older GraalVM version or an older version of Ubuntu was the cause of that. Could you please shed some light on how native-image uses the GCC toolchain? Does it produce GAS assembly that goes through the standard GCC chain?
  2. How different/similar are GraalVM isolates from Google V8 isolates? Could you point me to an article?

Lastly, I noticed the native-netty-plot example does not work with the latest GraalVM (I ended up using the RC15 community edition), as the Feature interface got moved to a different package.

@vjovanov

Cool, glad that it works. I'll answer in order:

  1. I think what changed is the isolates. If you use isolates, there is no need for relocations: the image heap is just a chunk of memory mapped at runtime. Before, we did not have isolates, so relocations were needed. All of the image building magic happens around NativeBootImageViaCC.
  2. The core concepts around isolates are the same. I think Native Image has some extra things: (1) the initial image heap and (2) multi-threading.

We had to move the feature for the official release. Well, just change the feature package and it will work.
