feat(core) EspressoVM compatibility #1003

Closed

@lewurm lewurm commented Aug 19, 2024

reserved0 is used by EspressoVM, thus it cannot serve as a "reference NULL" as on HotSpot: https://github.com/oracle/graal/blob/3149f62458029ffe92b8dcefe0b3e59612684cfa/espresso/src/com.oracle.truffle.espresso.mokapot/include/mokapot.h#L67-L73

The relevant config is MOKA_LATTE. reserved3 is NULL though.


EspressoVM aka. Java on Truffle: https://www.graalvm.org/latest/reference-manual/java-on-truffle/

I tested the fix with Espresso, but I'm not 100% sure about the implications for other implementations. So please understand this PR more like a bug report 🙂 In general, I saw that in #875 (comment) it is mentioned that you want to get rid of (ab)using reserved3, so I think that's the better fix going forward.

@lewurm lewurm changed the title feat(core) GraalVM Native Image compatibility feat(core) EspressoVM compatibility Aug 19, 2024
@lewurm lewurm force-pushed the reserved3-espressovm branch from ad4e87e to 2756241 Compare August 19, 2024 08:49
@lewurm lewurm force-pushed the reserved3-espressovm branch from d12cf34 to c8a8187 Compare August 19, 2024 13:13
@xxDark

xxDark commented Aug 24, 2024

I have run into this issue myself. What if all reserved slots are in use? Wouldn't it be better to copy JNIEnv with one more pointer in the function table? nsetupEnvData copies the JNI function table already. It's not like this could get any worse: some JVMTI agent might replace the function table with its own, completely overriding even the reserved slots. With one more function pointer, however, there is at least less of a chance of someone putting something in reservedN and overwriting LWJGL thread-local data.
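
To make the suggestion concrete, here is a minimal sketch of the idea in plain Java, with a long[] standing in for the native function table. The class and method names are hypothetical and the addresses are placeholders; this is not how nsetupEnvData is implemented.

```java
import java.util.Arrays;

final class ExtendedTableSketch {
    /** Copies the function table and appends one slot that holds the thread-local data pointer. */
    static long[] withExtraSlot(long[] functionTable, long threadLocalData) {
        long[] copy = Arrays.copyOf(functionTable, functionTable.length + 1);
        copy[functionTable.length] = threadLocalData;
        return copy;
    }

    /** Reads the thread-local data pointer back from the slot past the original entries. */
    static long threadLocalData(long[] extendedTable) {
        return extendedTable[extendedTable.length - 1];
    }

    public static void main(String[] args) {
        long[] vmTable = {0x10L, 0x20L, 0x30L, 0x40L}; // placeholder addresses, not real JNI entries
        long[] perThread = withExtraSlot(vmTable, 0xCAFEL);
        System.out.println(Long.toHexString(threadLocalData(perThread)));
    }
}
```

Because the extra slot sits past the entries the VM knows about, nothing that writes to reservedN in the original layout can clobber it. A replaced function table would still lose it, as noted above.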

@Spasi
Member

Spasi commented Sep 25, 2024

Hey @lewurm and @gilles-duboscq,

I'm not convinced this is necessary. It looks like EspressoVM uses the reserved fields of the JNIInvokeInterface struct (i.e. JavaVM) and not those of the JNINativeInterface struct (i.e. JNIEnv) which LWJGL (ab)uses.

Is the current implementation actually broken on EspressoVM? Could the issue be related to something else?

@Spasi
Member

Spasi commented Sep 26, 2024

I've now done some testing with EspressoVM and it indeed populates reserved0-3, just not with the values mentioned above. This is my current understanding of the situation in each VM:

VM        | HotSpot | GraalVM Native Image (#875)     | EspressoVM / Truffle (#1003)     |
----------+---------+---------------------------------+----------------------------------+
reserved0 | NULL    | UnimplementedWithJNIEnvArgument | struct NespressoEnv *            |
reserved1 | NULL    | UnimplementedWithJNIEnvArgument | struct MokapotNativeInterface_ * |
reserved2 | NULL    | UnimplementedWithJNIEnvArgument | unset_function_error             |
reserved3 | NULL    | UnimplementedWithJNIEnvArgument | unset_function_error             |

Based on the above, the following implementation seems to work:

private static final int CAPABILITIES_OFFSET = 3 * POINTER_SIZE;

private static final long RESERVED_NULL = memGetAddress(JNI_NATIVE_INTERFACE + CAPABILITIES_OFFSET);

public static void setFunctionMissingAddresses(int functionCount) {
    long ptr = JNI_NATIVE_INTERFACE + CAPABILITIES_OFFSET;

    long currentTable = memGetAddress(ptr);
    if (functionCount == 0) {
        if (currentTable == FUNCTION_MISSING_ABORT_TABLE && FUNCTION_MISSING_ABORT_TABLE != NULL) {
            FUNCTION_MISSING_ABORT_TABLE = NULL;
            getAllocator().free(currentTable);
            memPutAddress(ptr, RESERVED_NULL);
        }
    } else {
        if (currentTable != RESERVED_NULL) {
            throw new IllegalStateException("setFunctionMissingAddresses has been called already");
        }
        if (currentTable != NULL) {
            // check reserved0 to see if this is Native Image or EspressoVM. EspressoVM will not have the reserved NULL here.
            if (memGetAddress(JNI_NATIVE_INTERFACE) == RESERVED_NULL) {
                // silently abort on Native Image, the global JNIEnv object lives in read-only memory by default. (see #875)
                return;
            }
        }

        FUNCTION_MISSING_ABORT_TABLE = getAllocator().malloc(Integer.toUnsignedLong(functionCount) * POINTER_SIZE);
        for (int i = 0; i < functionCount; i++) {
            memPutAddress(FUNCTION_MISSING_ABORT_TABLE + Integer.toUnsignedLong(i) * POINTER_SIZE, FUNCTION_MISSING_ABORT);
        }

        memPutAddress(ptr, FUNCTION_MISSING_ABORT_TABLE);
    }
}
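
The check the method relies on can be isolated as a small classifier. This is only a sketch of my reading of the table above: the class, enum and method names are hypothetical, and the slot values are passed in as plain longs rather than read with memGetAddress.

```java
final class VmProbeSketch {
    enum Kind { HOTSPOT, NATIVE_IMAGE, ESPRESSO }

    /**
     * Classifies the VM from two reserved slots of the JNI function table.
     * Assumes the values observed in the table above hold in general.
     */
    static Kind classify(long reserved0, long reserved3) {
        if (reserved3 == 0L) {
            return Kind.HOTSPOT;      // all reserved slots are NULL
        }
        if (reserved0 == reserved3) {
            return Kind.NATIVE_IMAGE; // every slot holds the same "unimplemented" stub
        }
        return Kind.ESPRESSO;         // reserved0 holds a struct pointer, reserved3 an error stub
    }

    public static void main(String[] args) {
        System.out.println(classify(0L, 0L));
        System.out.println(classify(0x7000L, 0x7000L));
        System.out.println(classify(0x1000L, 0x7000L));
    }
}
```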

I can successfully run LWJGL applications with EspressoVM (with -truffle) on macOS. However, I'm having trouble on Windows that is unrelated to ThreadLocalUtil: the process crashes much earlier, without any output, when calling any native method of the WinBase class. The output with --log.java.level=FINEST looks like:

[...]
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_jni_JNINativeInterface_nNewDirectByteBuffer) -> NativePointer(140731942113088)
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1069
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1070
[java] FINEST: Initializing: org/lwjgl/system/MemoryUtil$$Lambda+1072
[java] FINEST: Initializing: java/util/function/LongPredicate
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1073
[java] FINEST: Initializing: org/lwjgl/system/Pointer
[java] FINEST: Initializing: org/lwjgl/system/MemoryAccessJNI
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_malloc) -> NativePointer(140731942110592)
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_calloc) -> NativePointer(140731942110608)
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_realloc) -> NativePointer(140731942110624)
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_free) -> NativePointer(140731942110640)
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_aligned_1alloc) -> NativePointer(140731942110656)
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_aligned_1free) -> NativePointer(140731942110688)
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_MemoryAccessJNI_getPointerSize) -> NativePointer(140731942110576)
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1075
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1076
[java] FINEST: Initializing: org/lwjgl/system/MemoryUtil$$Lambda+1077
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1078
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$MH+1079
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1080
[java] FINEST: Initializing: java/lang/invoke/LambdaForm$DMH+1081
[java] FINEST: Initializing: org/lwjgl/system/MemoryUtil$$Lambda+1082
[java] FINEST: Initializing: org/lwjgl/system/FunctionProvider
[java] FINEST: Initializing: org/lwjgl/system/NativeResource
[java] FINEST: Initializing: org/lwjgl/system/MemoryStack
[java] FINEST: Initializing: org/lwjgl/system/MemoryStack$$Lambda+1087
[java] FINEST: Initializing: org/lwjgl/BufferUtils
[java] FINEST: Initializing: org/lwjgl/system/windows/WinBase
[java::VM] FINEST: JVM_FindLibraryEntry(LibFFILibrary(140731941978112), Java_org_lwjgl_system_windows_WinBase_nGetModuleHandle) -> NativePointer(140731942118960)

The process simply dies after the last line (and before the native method returns). I have no idea what might be causing this.

@Spasi
Member

Spasi commented Oct 4, 2024

After more fixes and testing, it turns out that abusing JNIEnv * for thread-local storage is fundamentally incompatible with the JNI implementation in EspressoVM:

  • On HotSpot, each thread uses a different JNIEnv *, pointing to the corresponding field of its JavaThread instance. This is what allows LWJGL to inject JNIEnv copies and use the reserved fields for storage. Also, all threads normally point to the same, global JNIEnv struct, which is never deallocated. This simplifies the cleanup LWJGL has to do on thread exit.

  • On EspressoVM, JNI calls use a single, global JNIEnv *, allocated dynamically and reused by all threads. Therefore, abusing it for thread-local storage is not possible. Cleanup is also complicated because the JNIEnv it points to is freed on VM exit.
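
The difference between the two models can be shown with a small pure-Java simulation. It is only an illustration: Env, EnvProvider and the other names are hypothetical stand-ins, and no real JNIEnv is touched.

```java
final class EnvSlotDemo {
    /** Stand-in for a JNIEnv function table with a single reserved slot. */
    static final class Env {
        volatile long reserved;
    }

    /** Models how a VM hands a JNIEnv to the calling thread. */
    interface EnvProvider {
        Env current();
    }

    /** HotSpot-like model: every thread ends up with its own Env copy. */
    static EnvProvider perThread() {
        ThreadLocal<Env> envs = ThreadLocal.withInitial(Env::new);
        return envs::get;
    }

    /** EspressoVM-like model: one Env instance is shared by all threads. */
    static EnvProvider shared() {
        Env global = new Env();
        return () -> global;
    }

    /** The caller stores 1, a second thread stores 2, then the caller reads its slot back. */
    static long storeThenRead(EnvProvider provider) {
        provider.current().reserved = 1L;
        Thread other = new Thread(() -> { provider.current().reserved = 2L; });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return provider.current().reserved;
    }

    public static void main(String[] args) {
        System.out.println("per-thread env: " + storeThenRead(perThread())); // prints 1
        System.out.println("shared env:     " + storeThenRead(shared()));    // prints 2
    }
}
```

With a shared env, the second thread's write overwrites the caller's value, which is why the reserved slots cannot serve as thread-local storage in that model.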

As a partial fix, I've replaced the thread-local storage of errno/GetLastError() with virtual output parameters, (optionally) passed explicitly by the caller. This approach is similar to Linker.Option.CaptureCallState in Project Panama's FFM API. With this change, LWJGL no longer uses JNIEnv::reserved2 and there is no need for JNIEnv copy injections outside of OpenGL(ES).
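
As a rough illustration of the calling convention (not the actual LWJGL signatures), a binding with an optional output parameter might look like the sketch below. The method name, the simulated lookup and the returned descriptor value are all made up for the example.

```java
import java.nio.IntBuffer;

final class CaptureErrnoSketch {
    static final int ENOENT = 2; // "No such file or directory" on Linux

    /**
     * Hypothetical stand-in for a native call. If errnoOut is non-null, the
     * simulated error code is written to it; passing null skips the capture.
     */
    static int openFile(IntBuffer errnoOut, String path) {
        boolean exists = path.equals("/known"); // simulated lookup, no real system call
        if (errnoOut != null) {
            // Simplification: this sketch writes 0 on success instead of leaving errno untouched.
            errnoOut.put(errnoOut.position(), exists ? 0 : ENOENT);
        }
        return exists ? 3 : -1; // made-up descriptor, or -1 on failure
    }

    public static void main(String[] args) {
        IntBuffer err = IntBuffer.allocate(1);
        int fd = openFile(err, "/missing");
        System.out.println("fd=" + fd + " errno=" + err.get(0));
    }
}
```

The caller owns the buffer, so the error code never has to live in thread-local state between the native call and the read.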

2b12f5e

After the above commit, LWJGL tests pass and demos that do not use OpenGL work correctly on EspressoVM. It is a breaking change, but it should affect very few users: only those who make system calls under org.lwjgl.system.linux.*, org.lwjgl.system.windows.* and org.lwjgl.opengl.WGL.

@Spasi Spasi closed this Oct 4, 2024
@Spasi Spasi added the Type: Bug label Oct 4, 2024
@lewurm
Author

lewurm commented Oct 16, 2024

  • On EspressoVM, JNI calls use a single, global JNIEnv *, allocated dynamically and reused by all threads. Therefore, abusing it for thread-local storage is not possible.

ahhh 😅

Thanks for the investigation and for the fix @Spasi!
