Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crucible-jvm: Investigate bugs when verifying JDK 9+ code that uses String #641

Open
RyanGlScott opened this issue Feb 5, 2021 · 6 comments
Labels

Comments

@RyanGlScott
Copy link
Contributor

There are still some yet-to-be-diagnosed issues with using crucible-jvm in tandem with JDK 9 or later. I'll copy-paste a relevant excerpt from GaloisInc/saw-script#861 (comment), which describes the challenge:

  • [...] After bumping the submodule to commit GaloisInc/jvm-parser@d440e6b, which contains a fix for that issue, I instead get a different error when running the crucible-jvm test suite:

    javac -g Str.java
    Compiled from "Str.java"
    class Str {
      Str();
        Code:
           0: aload_0
           1: invokespecial #1                  // Method java/lang/Object."<init>":()V
           4: return
    
      public static void main(java.lang.String[]);
        Code:
           0: ldc           #2                  // String 1234567890
           2: astore_1
           3: new           #3                  // class java/lang/StringBuffer
           6: dup
           7: aload_1
           8: invokespecial #4                  // Method java/lang/StringBuffer."<init>":(Ljava/lang/String;)V
          11: astore_2
          12: aload_2
          13: iconst_2
          14: ldc           #5                  // String abcd
          16: invokevirtual #6                  // Method java/lang/StringBuffer.insert:(ILjava/lang/String;)Ljava/lang/StringBuffer;
          19: pop
          20: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
          23: aload_2
          24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
          27: return
    }
    [Crux] Checking "Str.java"
    starting executeCrucibleJVM
    initializeClass java/lang/String  (start)
    initializeClass java/lang/Object  (start)
    special java/lang/Object/<clinit>
    initializeClass java/lang/Object  (finish)
    special java/lang/String/<clinit>
    initializeClass java/lang/String  (finish)
    [Crux] resolveField: Field FieldId {fieldIdClass = ClassName "java/lang/Object", fieldIdName = "value", fieldIdType = char[]} not found
    CallStack (from HasCallStack):
      error, called at src/Lang/Crucible/JVM/Translation/Monad.hs:61:15 in crucible-jvm-0.2-inplace:Lang.Crucible.JVM.Translation.Monad
      jvmFail, called at src/Lang/Crucible/JVM/Translation/Class.hs:836:23 in crucible-jvm-0.2-inplace:Lang.Crucible.JVM.Translation.Class
    

    This time, the problem lies in the fact that the API for the String class has changed. In JDK 8, the private value field in String has type char[], but in recent JDKs, such as JDK11, its type has changed to byte[]. crucible-jvm, on the other hand, hard-codes the assumption that value is of type char[] here:

    -- It'd be preferable to use createInstance here, but the amount of
    -- infrastructure needed to create strings via the Java runtime is significant
    -- (thread local variables, character encodings, builtin unsafe operations,
    -- etc.), so we cheat and just forcibly set the (private) instance fields.
    -- We'll want want to REVISIT this in the future.
    obj1 <- setInstanceFieldValue
    obj
    (J.FieldId "java/lang/String" "value" J.charArrayTy)
    (RValue (App (JustValue knownRepr arrRef)))

  • As a quick experiment, I tried to see what would happen if I updated this assumption to use the modern JDK's type for value:

    diff --git a/crucible-jvm/src/Lang/Crucible/JVM/Translation/Class.hs b/crucible-jvm/src/Lang/Crucible/JVM/Translation/Class.hs
    index 3c09eb72..821ab6cd 100644
    --- a/crucible-jvm/src/Lang/Crucible/JVM/Translation/Class.hs
    +++ b/crucible-jvm/src/Lang/Crucible/JVM/Translation/Class.hs
    @@ -911,7 +911,7 @@ refFromString s =  do
       -- We'll want want to REVISIT this in the future.
       obj1 <- setInstanceFieldValue
         obj
    -    (J.FieldId "java/lang/String" "value" J.charArrayTy)
    +    (J.FieldId "java/lang/String" "value" J.byteArrayTy)
         (RValue (App (JustValue knownRepr arrRef)))
     
       obj2 <- setInstanceFieldValue

    This appeared to fix the previous issue, but it uncovered another issue:

    javac -g Str.java
    Compiled from "Str.java"
    class Str {
      Str();
        Code:
           0: aload_0
           1: invokespecial #1                  // Method java/lang/Object."<init>":()V
           4: return
    
      public static void main(java.lang.String[]);
        Code:
           0: ldc           #2                  // String 1234567890
           2: astore_1
           3: new           #3                  // class java/lang/StringBuffer
           6: dup
           7: aload_1
           8: invokespecial #4                  // Method java/lang/StringBuffer."<init>":(Ljava/lang/String;)V
          11: astore_2
          12: aload_2
          13: iconst_2
          14: ldc           #5                  // String abcd
          16: invokevirtual #6                  // Method java/lang/StringBuffer.insert:(ILjava/lang/String;)Ljava/lang/StringBuffer;
          19: pop
          20: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
          23: aload_2
          24: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
          27: return
    }
    Warning: Parsing the index cache failed (Data.Binary.Get.runGet at position
    16: Non-matching structured hashes: f46da61e7afa58a5e8fd1d2b6fb79899;
    expected: a257ca064dfb5e0cb74f74e64a975b9e). Trying to regenerate the index
    cache...
    Resolving dependencies...
    [Crux] Checking "Str.java"
    starting executeCrucibleJVM
    initializeClass java/lang/String  (start)
    initializeClass java/lang/Object  (start)
    special java/lang/Object/<clinit>
    initializeClass java/lang/Object  (finish)
    special java/lang/String/<clinit>
    initializeClass java/lang/String  (finish)
    new ClassName "java/lang/StringBuffer" (start)
    fields are [FieldId {fieldIdClass = ClassName "java/lang/AbstractStringBuilder", fieldIdName = "value", fieldIdType = byte[]},FieldId {fieldIdClass = ClassName "java/lang/AbstractStringBuilder", fieldIdName = "coder", fieldIdType = byte},FieldId {fieldIdClass = ClassName "java/lang/AbstractStringBuilder", fieldIdName = "count", fieldIdType = int},FieldId {fieldIdClass = ClassName "java/lang/AbstractStringBuilder", fieldIdName = "EMPTYVALUE", fieldIdType = byte[]},FieldId {fieldIdClass = ClassName "java/lang/AbstractStringBuilder", fieldIdName = "MAX_ARRAY_SIZE", fieldIdType = int},FieldId {fieldIdClass = ClassName "java/lang/StringBuffer", fieldIdName = "toStringCache", fieldIdType = java.lang.String},FieldId {fieldIdClass = ClassName "java/lang/StringBuffer", fieldIdName = "serialVersionUID", fieldIdType = long},FieldId {fieldIdClass = ClassName "java/lang/StringBuffer", fieldIdName = "serialPersistentFields", fieldIdType = java.io.ObjectStreamField[]}]
    new ClassName "java/lang/StringBuffer" (finish)
    initializeClass java/lang/StringBuffer  (start)
    initializeClass java/lang/AbstractStringBuilder  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/AbstractStringBuilder  (finish)
    special java/lang/StringBuffer/<clinit>
    initializeClass java/lang/StringBuffer  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/String  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    initializeClass java/lang/String  (finish)
    invoke: java/lang/StringBuffer/insert
    finish invoke:java/lang/StringBuffer/insert
    initializeClass java/lang/System  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    Initializing System.out static field
    Finished initializing System.out
    Initializing System.err static field
    Finished initializing System.err
    initializeClass java/lang/System  (finish)
    invoke: java/io/PrintStream/println
    finish invoke:java/io/PrintStream/println
    initializeClass java/lang/AbstractStringBuilder  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/AbstractStringBuilder  (finish)
    invoke: java/lang/String/length
    finish invoke:java/lang/String/length
    initializeClass java/lang/AbstractStringBuilder  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/AbstractStringBuilder  (finish)
    invoking static method with return type UnitRepr
    invoke: java/lang/StringBuffer/append
    finish invoke:java/lang/StringBuffer/append
    getfield "java/lang/String.value"
    invoke: java/lang/String/coder
    finish invoke:java/lang/String/coder
    initializeClass java/lang/String  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    initializeClass java/lang/String  (finish)
    getfield "java/lang/String.coder"
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/String  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    initializeClass java/lang/String  (finish)
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    invoking static method with return type MaybeRepr (ReferenceRepr (RecursiveRepr JVM_object []))
    putfield "java/lang/AbstractStringBuilder.value"
    putfield "java/lang/AbstractStringBuilder.coder"
    putfield "java/lang/AbstractStringBuilder.value"
    putfield "java/lang/AbstractStringBuilder.coder"
    invoke: java/lang/Class/desiredAssertionStatus
    finish invoke:java/lang/Class/desiredAssertionStatus
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    invoking static method with return type BVRepr 32
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    initializeClass java/lang/StringUTF16  (start)
    initializeClass java/lang/Object  (start)
    initializeClass java/lang/Object  (finish)
    invoking static method with return type UnitRepr
    initializeClass java/lang/StringUTF16  (finish)
    Abort due to false assumption:
      internal: error: in java/lang/StringUTF16.<clinit>
      null dereference
      In java/lang/StringUTF16.<clinit> at internal
      When calling <clinit>
      In java/lang/AbstractStringBuilder.<init> at internal
      When calling <init>
      In java/lang/StringBuffer.<init> at internal
      When calling <init>
      In Str.main at internal
      When calling main
    [Crux] Attempting to prove verification conditions.
    [Crux] Simulation complete.
    [Crux] Found counterexample for verification goal
    
    [Crux] Goal status:
    [Crux]   Total: 1
    [Crux]   Proved: 0
    [Crux]   Disproved: 1
    [Crux]   Incomplete: 0
    [Crux]   Unknown: 0
    [Crux] Overall status: Invalid.
    

    A null dereference! I couldn't tell at a glance what was causing it, but it's quite possible that this arises due to other hard-coded assumptions in crucible-jvm that no longer hold in modern JDKs.

RyanGlScott added a commit that referenced this issue Feb 5, 2021
This adds basic functionality for `crucible-jvm` to deal with JDK 9 or later,
which packages its standard library not in a JAR file, but in a JIMAGE file.
Extracting `.class` files from JIMAGE files proves to be surprisingly tricky,
and I've carefully documented the intricacies of doing so in
`Note [Loading classes from JIMAGE files]` in `Lang.JVM.Codebase`.
This is part of a fix for GaloisInc/saw-script#861.

In general, support for JDK 9 or later is still experimental, as there are
still unresolved bugs to diagnose. See #641.
RyanGlScott added a commit that referenced this issue Feb 5, 2021
This adds basic functionality for `crucible-jvm` to deal with JDK 9 or later,
which packages its standard library not in a JAR file, but in a JIMAGE file.
Extracting `.class` files from JIMAGE files proves to be surprisingly tricky,
and I've carefully documented the intricacies of doing so in
`Note [Loading classes from JIMAGE files]` in `Lang.JVM.Codebase`.
This is part of a fix for GaloisInc/saw-script#861.

In general, support for JDK 9 or later is still experimental, as there are
still unresolved bugs to diagnose. See #641.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this issue Feb 8, 2021
This bumps the `crucible` submodule to include GaloisInc/crucible#638, which
adds basic support for handling JDK 9 or later. JDK 9+ packages its standard
library not in a JAR file, but in a JIMAGE file. For more details on how
`crucible-jvm` handles JIMAGE files, refer to
`Note [Loading classes from JIMAGE files]` in `Lang.JVM.Codebase`.

This fixes #861, although there are still unsolved issues that arise when
using modern JDKs with certain classes, such as `String`. As a result, I have
decided to label support for JDK 9+ as experimental:

* I have updated the SAW documentation to mention these shortcomings.
* I have opened GaloisInc/crucible#641 to track the remaining issues.

Other things:

* GaloisInc/crucible#636 and GaloisInc/crucible#638 upstreamed the code from
  `SAWScript.JavaTools` and `SAWScript.ProcessUtils` into `crucible-jvm`, so
  we can remove these modules in favor of importing `Lang.JVM.JavaTools` and
  `Lang.JVM.ProcessUtils` from `crucible-jvm`.

* I removed the dependency on the `xdg-basedir`, as it was unused. This
  dependency was likely added quite some time ago, and it appears that
  `saw-script` switched over to using XDG-related functionality from the
  `directory` library since then. I opted to use `directory` to find the
  `.cache` directory as well, so I have made that clear in the `.cabal` file.

* The `biJavaCodebase :: Codebase` field of `BuiltinContext` is completely
  unused, which I noticed when making changes to the `Codebase` type. Let's
  just remove it.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this issue Feb 11, 2021
This bumps the `crucible` submodule to include GaloisInc/crucible#638, which
adds basic support for handling JDK 9 or later. JDK 9+ packages its standard
library not in a JAR file, but in a JIMAGE file. For more details on how
`crucible-jvm` handles JIMAGE files, refer to
`Note [Loading classes from JIMAGE files]` in `Lang.JVM.Codebase`.

This fixes #861, although there are still unsolved issues that arise when
using modern JDKs with certain classes, such as `String`. As a result, I have
decided to label support for JDK 9+ as experimental:

* I have updated the SAW documentation to mention these shortcomings.
* I have opened GaloisInc/crucible#641 to track the remaining issues.

Other things:

* GaloisInc/crucible#636 and GaloisInc/crucible#638 upstreamed the code from
  `SAWScript.JavaTools` and `SAWScript.ProcessUtils` into `crucible-jvm`, so
  we can remove these modules in favor of importing `Lang.JVM.JavaTools` and
  `Lang.JVM.ProcessUtils` from `crucible-jvm`.

* I removed the dependency on the `xdg-basedir`, as it was unused. This
  dependency was likely added quite some time ago, and it appears that
  `saw-script` switched over to using XDG-related functionality from the
  `directory` library since then. I opted to use `directory` to find the
  `.cache` directory as well, so I have made that clear in the `.cabal` file.

* The `biJavaCodebase :: Codebase` field of `BuiltinContext` is completely
  unused, which I noticed when making changes to the `Codebase` type. Let's
  just remove it.
@robdockins robdockins added the jvm label Feb 16, 2021
@weaversa
Copy link

weaversa commented Feb 9, 2023

Just a note to say that we bumped up against this today.

@RyanGlScott
Copy link
Contributor Author

Indeed, this is a pretty notable limitation of crucible-jvm's support for JDK9+. In order to make this work, I think the first step would be to characterize all of the differences in how String.class is implemented in JDK9+ versus older JDKs and audit which of the changes affect our model of the class.

@WeeknightMVP
Copy link

Thanks! The same issue is encountered in (my installation of) OpenJDK 8 unless specifying -b $JAVA_BIN, which might be related...

@RyanGlScott
Copy link
Contributor Author

Hm, I am surprised to hear that this happens even with OpenJDK 8. Just to be sure, are you putting OpenJDK 8 first on your PATH before any other JDK versions?

@WeeknightMVP
Copy link

Hm, I am surprised to hear that this happens even with OpenJDK 8. Just to be sure, are you putting OpenJDK 8 first on your PATH before any other JDK versions?

Aha. I did not modify PATH when trying different versions, so this would have remained OpenJDK 17 throughout:

$ which javac
/usr/bin/javac
$ readlink -f $(which javac)
/usr/lib/jvm/java-17-openjdk-amd64/bin/javac

@RyanGlScott
Copy link
Contributor Author

Right, the PATH is where SAW looks by default when --java-bin-dirs is not set. See this section of the SAW manual (under "PATH").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants