psuter edited this page Mar 29, 2012 · 9 revisions

Abstract Bytecodes

Loading constant values

Constant values can be pushed on the stack in several different ways. Integers in particular can be pushed using the various ICONST_n codes, BIPUSH, SIPUSH, and LDC or LDC_W for larger values, which are loaded from the constant pool.

The abstract bytecode:

  • Ldc(arg)

...can be used instead. It can take as parameter an integer, a float, a double, a long or a string, and will generate the appropriate code. For instance:

codeHandler << Ldc(1.0f)

...is equivalent to adding the bytecode FCONST_1, and:

codeHandler << Ldc(35042)

...will add the value 35042 to the constant pool and emit the appropriate LDC operation.
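The selection among these instructions follows from the operand ranges in the JVM specification. The following is a minimal sketch of that choice for integers; the object and method names are hypothetical and this is not Cafebabe's actual implementation:

```scala
object IntConstSketch {
  // Hypothetical helper: returns the mnemonic of the shortest JVM
  // instruction able to push the integer `v` on the stack.
  def instructionFor(v: Int): String =
    if (v == -1) "ICONST_M1"
    else if (v >= 0 && v <= 5) s"ICONST_$v"                    // one-byte shortcuts
    else if (v >= Byte.MinValue && v <= Byte.MaxValue) s"BIPUSH $v"   // signed byte operand
    else if (v >= Short.MinValue && v <= Short.MaxValue) s"SIPUSH $v" // signed short operand
    else s"LDC <constant pool entry for $v>"                   // anything larger
}
```

With this sketch, 35042 falls outside the signed 16-bit range, which is why it ends up in the constant pool.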

Loading and storing locals

All Java bytecodes XLOAD, XLOAD_n, XSTORE and XSTORE_n where X is one of { A, D, F, I, L } and n is in 0...3 can be replaced by their more general counterparts XLoad(n) and XStore(n), where the appropriate instructions are automatically generated (shortcut, standard or "wide" mode).
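The three modes can be illustrated with a small sketch. It assumes the usual JVM encoding (one-byte index for the standard form, two-byte index behind a WIDE prefix); the names are hypothetical and do not come from Cafebabe:

```scala
object LocalLoadSketch {
  // Hypothetical helper: `prefix` is one of "A", "D", "F", "I", "L".
  def loadFor(prefix: String, n: Int): String = {
    require(n >= 0 && n <= 65535, "local index out of range")
    if (n <= 3) s"${prefix}LOAD_$n"          // shortcut: index encoded in the opcode
    else if (n <= 255) s"${prefix}LOAD $n"   // standard: one-byte index operand
    else s"WIDE ${prefix}LOAD $n"            // wide: two-byte index operand
  }
}
```

The same reasoning applies to the XSTORE family.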

Loading method parameters

You can use the abstract bytecode ArgLoad(n) to load a method argument. In this case, n corresponds to the position of the argument in the argument list (0-indexed for static methods, 1-indexed for non-static methods, with 0 representing this). This will compute the slot position and use the appropriate XLOAD operation.
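The slot position differs from the argument position because long and double values occupy two local-variable slots. A rough sketch of the computation, with hypothetical names and argument types given as JVM descriptors:

```scala
object ArgSlotSketch {
  // Longs ("J") and doubles ("D") take two slots, everything else takes one.
  def slotsOf(descriptor: String): Int =
    if (descriptor == "J" || descriptor == "D") 2 else 1

  // Slot of the argument at `position`. For non-static methods,
  // position 0 is `this` and the declared arguments start at position 1.
  def slotOf(argTypes: List[String], position: Int, isStatic: Boolean): Int =
    if (isStatic) argTypes.take(position).map(slotsOf).sum
    else if (position == 0) 0
    else 1 + argTypes.take(position - 1).map(slotsOf).sum
}
```

For a non-static method taking a long followed by an int, the int is at position 2 but lives in slot 3.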

IInc

  • IInc(varIndex, increment)

...can be used instead of the standard bytecode IINC. It automatically switches to the "wide" mode when the increment or the variable index does not fit in one byte.
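In the JVM encoding of IINC, the variable index is an unsigned byte and the increment is a signed byte. A sketch of the resulting test, using a hypothetical helper name:

```scala
object IIncSketch {
  // True when IINC must be emitted behind a WIDE prefix:
  // the index exceeds 255 or the increment is outside -128..127.
  def needsWide(varIndex: Int, increment: Int): Boolean =
    varIndex > 255 || increment < Byte.MinValue || increment > Byte.MaxValue
}
```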

Accessing fields

Accessing fields is achieved through the following abstract bytecodes:

  • GetField(className, fieldName, fieldType)
  • PutField(className, fieldName, fieldType)
  • GetStatic(className, fieldName, fieldType)
  • PutStatic(className, fieldName, fieldType)

The correspondence to Java bytecodes should be obvious. The operands must be prepared on the stack as for the actual bytecodes. All names are given as strings. Thus, to put the value 42 in a static field answer of type int in a class Universe of a package questions, for instance, we write:

codeHandler << Ldc(42) << PutStatic("questions/Universe", "answer", "I")

Note that the field type must be encoded according to the convention for type descriptors in the JVM (see top of this page). The class name corresponds to the fully qualified Java name for the class, where dots are replaced by slashes.
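As an illustration of the encoding, the sketch below converts Java-style type names to JVM descriptors and class names to their slash-separated form. The helper names are hypothetical; Cafebabe itself expects the already-encoded strings:

```scala
object DescriptorSketch {
  // Descriptor letters for primitive types, as defined by the JVM specification.
  private val primitives = Map(
    "boolean" -> "Z", "byte" -> "B", "char" -> "C", "short" -> "S",
    "int" -> "I", "long" -> "J", "float" -> "F", "double" -> "D", "void" -> "V")

  // "questions.Universe" becomes "questions/Universe".
  def internalName(javaName: String): String = javaName.replace('.', '/')

  // Arrays add a leading '[', classes are wrapped as "L<name>;".
  def descriptor(javaType: String): String =
    if (javaType.endsWith("[]")) "[" + descriptor(javaType.dropRight(2))
    else primitives.getOrElse(javaType, "L" + internalName(javaType) + ";")
}
```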

Method calls

The abstract bytecodes:

  • InvokeVirtual(className, methodName, methodSignature),
  • InvokeStatic(className, methodName, methodSignature) and
  • InvokeSpecial(className, methodName, methodSignature)

...can be used to call methods. The receiver (for non-static methods) and the arguments must be prepared on the stack as for the corresponding bytecodes, but all relevant information will be added to the constant pool automatically. Again, the method signature must be encoded following the JVM convention. Run javap -c -v on some class files to see examples, or read this page.
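To make the shape of a method signature concrete, here is a small sketch that splits one into its argument descriptors and return descriptor. It assumes a well-formed signature, and the names are hypothetical:

```scala
import scala.collection.mutable.ListBuffer

object SignatureSketch {
  // Splits e.g. "(ILjava/lang/String;[D)V" into (List("I", "Ljava/lang/String;", "[D"), "V").
  def parse(sig: String): (List[String], String) = {
    require(sig.startsWith("("), "signature must start with '('")
    val close = sig.indexOf(')')
    require(close > 0, "signature must contain ')'")
    val args = ListBuffer.empty[String]
    var i = 1
    while (i < close) {
      val start = i
      while (sig.charAt(i) == '[') i += 1                // skip array dimensions
      if (sig.charAt(i) == 'L') i = sig.indexOf(';', i)  // class types end at ';'
      i += 1
      args += sig.substring(start, i)
    }
    (args.toList, sig.substring(close + 1))
  }
}
```

The number of argument descriptors tells you how many values (plus the receiver, for non-static calls) must be on the stack before the invoke instruction.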

Creating objects and arrays

There is an abstract bytecode to create new instances of classes using their default constructor:

  • DefaultNew(className) creates and initializes a new object of class className and leaves it on the stack

Similarly for arrays:

  • NewArray(className) creates a new array containing objects of type className (the size has to be on top of the stack)
  • NewArray(primitiveTypeCode) creates a new array of a primitive type. The integer primitiveTypeCode should correspond to the codes specified in the JVM for the NEWARRAY instruction.
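For reference, the type codes that the JVM specification defines for NEWARRAY can be written down as a lookup table. The object and value names below are hypothetical; only the numeric codes come from the specification:

```scala
object NewArrayCodes {
  // Operand values of the NEWARRAY instruction (T_BOOLEAN = 4 ... T_LONG = 11).
  val typeCode: Map[String, Int] = Map(
    "boolean" -> 4, "char" -> 5, "float" -> 6, "double" -> 7,
    "byte" -> 8, "short" -> 9, "int" -> 10, "long" -> 11)
}
```

Under this table, an array of int would be created by passing 10 as primitiveTypeCode, with the size on top of the stack.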

Labels and control flow

All control instructions have equivalent abstract bytecodes which refer to their target using strings rather than offsets. Labels can be inserted which are, well, labeled with these strings. For example, the following:

codeHandler << Ldc(42) << Goto("after") << POP << Ldc(41) << Label("after") << IRETURN

...will generate code for a method which always returns 42, as the control flow "jumps over" the instructions which would replace 42 by 41.
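Turning label strings into offsets is typically a two-pass process: first record where each label sits, then compute each jump's offset relative to its own position. The sketch below models this with simplified, hypothetical instruction types; it is not Cafebabe's internal representation:

```scala
import scala.collection.mutable.ListBuffer

object LabelSketch {
  sealed trait Instr { def size: Int }
  case class Op(name: String, size: Int) extends Instr
  case class Jump(name: String, target: String) extends Instr { val size = 3 } // opcode + 2-byte offset
  case class Label(id: String) extends Instr { val size = 0 }                  // emits no bytes

  // First pass: byte offset of every label.
  def labelOffsets(code: List[Instr]): Map[String, Int] = {
    var pc = 0
    var offsets = Map.empty[String, Int]
    for (instr <- code) {
      instr match {
        case Label(id) => offsets += (id -> pc)
        case _         => ()
      }
      pc += instr.size
    }
    offsets
  }

  // Second pass: relative offset of each jump (label position minus jump position).
  // An unknown label name fails with NoSuchElementException.
  def jumpOffsets(code: List[Instr]): List[Int] = {
    val labels = labelOffsets(code)
    var pc = 0
    val out = ListBuffer.empty[Int]
    for (instr <- code) {
      instr match {
        case Jump(_, target) => out += labels(target) - pc
        case _               => ()
      }
      pc += instr.size
    }
    out.toList
  }

  // The "always returns 42" example, with Ldc assumed to expand to BIPUSH.
  val example: List[Instr] = List(
    Op("BIPUSH 42", 2), Jump("GOTO", "after"), Op("POP", 1),
    Op("BIPUSH 41", 2), Label("after"), Op("IRETURN", 1))
}
```

Label names are compared as plain strings, so the name given to a jump must match the label exactly, including case.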

The exhaustive list of control flow abstract bytecodes is given here. The correspondence with actual Java bytecodes should be obvious.

  • Goto(target)
  • IfEq(target)
  • IfNe(target)
  • IfLt(target)
  • IfLe(target)
  • IfGt(target)
  • IfGe(target)
  • IfNull(target)
  • IfNonNull(target)
  • If_ICmpEq(target)
  • If_ICmpNe(target)
  • If_ICmpLt(target)
  • If_ICmpLe(target)
  • If_ICmpGt(target)
  • If_ICmpGe(target)
  • If_ACmpEq(target)
  • If_ACmpNe(target)

Line numbers

Class files can store positional information. This is used for instance when the virtual machine needs to print a stack trace, or by debuggers. To store such information with Cafebabe, one can use the LineNumber(n) abstract bytecode. Just like labels and comments, it does not translate into any bytecode, but rather is compiled into a table that is part of the class file format and thus understood by the virtual machine. A LineNumber(n) abstract bytecode indicates that everything that follows was compiled from code found at line n. For instance, for the following code:

i = i + j; // line 3 in source
j = 0;     // line 4 in source
...

...one could write:

codeHandler << LineNumber(3) << ILoad(1) << ILoad(2) << IADD << IStore(1)
codeHandler << LineNumber(4) << Ldc(0) << IStore(2)
...

Note that attaching line information is not so useful if one does not attach the source name, using the .setSourceFile(...) method on the ClassFile instance.
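The table mentioned above pairs a starting bytecode offset with a source line. A minimal sketch of how such a table could be derived from a stream of instructions, using hypothetical types and assuming one-byte instructions for the example:

```scala
import scala.collection.mutable.ListBuffer

object LineTableSketch {
  sealed trait Item
  case class LineNumber(line: Int) extends Item       // emits no bytes
  case class Op(name: String, size: Int) extends Item // a concrete instruction

  // Returns (start offset, source line) pairs in program order.
  def lineTable(code: List[Item]): List[(Int, Int)] = {
    var pc = 0
    val table = ListBuffer.empty[(Int, Int)]
    for (item <- code) item match {
      case LineNumber(n) => table += ((pc, n))
      case Op(_, size)   => pc += size
    }
    table.toList
  }

  // The two-line example: i = i + j (line 3), j = 0 (line 4).
  val example: List[Item] = List(
    LineNumber(3), Op("ILOAD_1", 1), Op("ILOAD_2", 1), Op("IADD", 1), Op("ISTORE_1", 1),
    LineNumber(4), Op("ICONST_0", 1), Op("ISTORE_2", 1))
}
```

In this sketch, line 3 starts at offset 0 and line 4 at offset 4.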
