-
Notifications
You must be signed in to change notification settings - Fork 4
Tour
This page demonstrates Attestor's state space generation by means of a small example using singly-linked lists (SLL). In particular, we cover all necessary inputs to execute the analysis and explore the generated state space step by step.
We assume a basic understanding of concepts from model-checking and static program analysis. Moreover, we recommend to have a look at the architecture page and the glossary to clarify concepts and terminology. Further details are found in the publications section.
We will often graphically illustrate the heap configurations occurring within program states. Hence, it is recommended to get accustomed with our graphical notation first.
Further information on using the graphical user interface are found in the manual.
The state space obtained for our example can be graphically explored online. We recommend to follow the analysis described below while retracing the individual states in the graphical user interface.
We provide an archive containing a binary of Attestor and all example files considered on this page here.
Please notice that Attestor requires Java JDK 1.8 in order to run (cf. system requirements).
Moreover, please make sure that JAVA_HOME
is set correctly.
Further information on installing other versions of Attestor are found here.
After unpacking the archive with examples, each individual example can be executed as follows:
java -jar attestor.jar -l FILE.attestor
where FILE.attestor
specifies the example that should be executed. The graphical user interface generated for the provided example file is located in a directory FILE/
after execution.
This section serves as a reference for the individual steps considered in our example. We recommend to read the first to sections of the setup section to get accustomed with the data structure and the source code that is analyzed. After that, feel free to sections and come back whenever you want to improve your understanding of some components of the analysis.
In this section we discuss all inputs passed to Attestor:
- The data structure under consideration
- The source code that is analyzed
- The graph grammar used to guide abstraction and concretization
- The custom initial state
- Running Attestor
In this section, we explore the state space generated by Attestor and discuss a few steps in detail.
- The JIMPLE code obtained from the source code
- The initial state
We first provide an overview of the full state space generated by Attestor. After that, we discuss individual steps:
This section discusses a few modifications of our running example to explore more features of Attestor.
In this section, we briefly discuss the data structure that is analyzed throughout the example. Furthermore, we consider all inputs passed to Attestor to perform the analysis.
In our running example, we consider program manipulating singly-linked lists (SLL). We will assume the following class definition of an SLL in Java:
public class SLList {
public SLList next;
public SLList( SLList next ){
this.next = next;
}
}
As explained on the page on graphical notation, we model the heap as a graph, where objects are represented by nodes and pointers, for example x.next, are represented by edges between the source object of a pointer and its target object.
In the above case, instances of class SLL will be represented by nodes and outgoing edges of such nodes correspond to class variable next
. The source of an edge labeled with next
is thus the object whose class variable is considered. Its target is the value of class variable edge
for its source object.
An exemplary list is depicted below, where the shaded node represents the special location null
, i.e. the last element of the list points to null
.
Throughout our example analysis, we consider the following Java class:
public class SLList {
public SLList next;
public SLList( SLList next ){
this.next = next;
}
public static SLList prependSLList( SLList tail ){
SLList first = new SLList( tail );
SLList curr = first;
for( int i = 0; i < 10; i++ ){ //for Attestor this is non-deterministic
SLList tmp = new SLList( null ); //for demonstration: setting next to null first
curr.next = tmp;
curr = tmp;
}
return first;
}
// this is just a dummy such that the file compiles
public static void main(String[] args) {
prependSLList( null );
}
}
Our goal is to analyze the method prependSLList
.
It takes a singly linked list as input and prepends a number of new elements to it, starting with the new head of the list. So for example, if we start with the list [a,b,c], in the first iteration we would get [1,a,b,c] in the second
[1,2,a,b,c], etc.
Notice that Attestor does not account for arithmetic (yet). Hence, the guard i < 10
in the above loop will be interpreted nondeterministically.
We have to pass the classpath, the class name and the method to analyze to Attestor as follows (assuming the above class SLList.java
is located in directory code
):
--classpath "code"
--class "SLList"
--method "prependSLList"
Attestor requires a graph grammar to guide abstraction and concretization. The grammar used in our example describes all singly-linked list segments and is defined as follows:
Here, the third rule does not increase expressiveness, i.e. all graphs generated by the grammar can also be generated without the third rule. However, we need the third rule to obtain a backward confluent grammar.
Although Attestor has a built-in graph grammar for singly-linked lists, we define our own grammar for educational purposes. The grammar is then passed to Attestor as follows:
--grammar "grammar/SLList.json"
The full grammar file specifying the above grammar is found below. Further information about writing customized graph grammars are found on the corresponding page.
[
{
"nonterminal": "SLL",
"rank": 2,
"rules": [
{
"nodes": [
{
"type": "SLList",
"number": 2
}
],
"externals": [
0,
1
],
"variables": [],
"selectors": [
{
"label": "next",
"origin": 0,
"target": 1
}
],
"hyperedges": []
},
{
"nodes": [
{
"type": "SLList",
"number": 3
}
],
"externals": [
0,
2
],
"variables": [],
"selectors": [
{
"label": "next",
"origin": 0,
"target": 1
}
],
"hyperedges": [
{
"label": "SLL",
"tentacles": [
1,
2
]
}
]
},
{
"nodes": [
{
"type": "SLList",
"number": 3
}
],
"externals": [
0,
2
],
"variables": [],
"selectors": [],
"hyperedges": [
{
"label": "SLL",
"tentacles": [
0,
1
]
},
{
"label": "SLL",
"tentacles": [
1,
2
]
}
]
}
]
}
]
Since the method considered in our example takes a (relevant) parameter, we provide a custom initial state that determines the (abstract) value of this parameter. In our case, we would like the parameter to initially point to the head of an arbitrary singly-linked list. This is achieved by specifying a heap configuration where one node has an attached variable that is labeled with a parameter reference.
The heap configuration used in our example is illustrated below.
To pass such a heap configuration to Attestor, we use the option
--initial "initial/list.json"
where list.json
is a description of the above heap configuration in JSON.
The full specification of the initial state is provided below. Further information on writing custom initial states are found on the corresponding page.
{
"nodes":[
{
"type":"NULL",
"number":1
},{
"type":"SLList",
"number":1
}
],
"externals":[],
"variables":[
{
"name":"null",
"target":0
},{
"name":"@parameter0:",
"target":1
}
],
"selectors":[],
"hyperedges":[
{
"label":"SLL",
"tentacles":[1,0]
}
]
}
Putting all input files together, we end up with the following command-line options passed to Attestor:
--description "tour example 'prepend SLL'"
--export "example"
--classpath "code"
--class "SLList"
--method "prependSLList"
--grammar "grammar/SLList.json"
--initial "initial/list.json"
Here, --description
is optional and just provides a short description of the task that is, for example, also passed to the graphical user interface. Furthermore, --export
tells Attestor to export a report that enables graphical state space exploration.
The remaining options have been discussed in previous sections and serve to pass the class and method to analyze to Attestor together with the grammar and initial state used within the analysis. A complete reference of all command-line options of Attestor is found here.
We can either pass these options directly to Attestor or store them in a settings file, say example.attestor
, that is then fed into Attestor using the --load
option:
java -jar attestor -l example.attestor
The console output should look as follows:
java -jar attestor.jar -l example.attestor
[Version] attestor - version 0.3.8
Analyzed method: SLList.prependSLList
Scenario: tour example 'prepend SLL'
+-------------------------+------------------+
[Summary] | Generated states | Number of states |
+-------------------------+------------------+
| w/ procedure calls | 56 |
| w/o procedure calls | 38 |
| final states | 1 |
+-------------------------+------------------+
[Summary] Exports:
Report exported to example
+-----------------------------+--------------+
[Summary] | Phase | Runtime |
+-----------------------------+--------------+
| Command Line Interface | 0.011 s |
| Parse program | 0.603 s |
| Parse grammar | 0.020 s |
| Parse input | 0.001 s |
| Parse contracts | 0.000 s |
| Marking generation | 0.001 s |
| Grammar Refinement | 0.000 s |
| Abstraction preprocessing | 0.101 s |
| Interprocedural Analysis | 0.057 s |
| Model checking | 0.000 s |
| Counterexample generation | 0.000 s |
| Report generation | 0.065 s |
+-----------------------------+--------------+
| Total verification time | 0.058 s |
| Total runtime | 0.861 s |
+-----------------------------+--------------+
Moreover, the state space has been exported for graphical exploration to directory example/
.
In the remainder of this page, we discuss the results and individual steps performed by Attestor for our example setup.
The program semantics of Attestor is defined for Jimple programs, an intermediate language for Java that unifies several Bytecode statements and hence requires less semantics transformers. The translation from Java (Bytecode) programs is performed using soot. When exploring a state space generated by Attestor, the statement of each state thus corresponds to a Jimple statement.
For instance, the Jimple code of procedure prependSLList
is found below.
public static SLList prependSLList(SLList){
SLList tail, temp$0, curr, temp$1;
int i, temp$3;
1 tail := @parameter0: SLList;
2 temp$0 = new SLList;
3 specialinvoke temp$0.<SLList: void <init>(SLList)>(tail);
4 curr = temp$0;
5 i = 0;
label1:
6 if i < 10 goto label2;
7 goto label3;
label2:
8 temp$1 = new SLList;
9 specialinvoke temp$1.<SLList: void <init>(SLList)>(null);
10 curr.<SLList: SLList next> = temp$1;
11 curr = temp$1;
12 temp$3 = i + 1;
13 i = temp$3;
14 goto label1;
label3:
15 return temp$0;
In our example, we set the option --export "example"
in order to generate a report (stored in directory example
) that allows graphical state space exploration.
For our example, the graphical user interface can also be accessed online.
Further details regarding the graphical user interface are found in the manual.
Let us have a look at the generated state space. Please find below a screenshot of the generated state space in which we grouped several states into a single block.
The topmost state is the initial state, i.e. the state of the heap before the execution of the program started. The green states correspond to the if-statement in line 6. We observe that both states have two outgoing transitions, because the condition of the if-statement has been resolved by non-deterministic branching.
The pink states before the first branching state correspond to the statements before the loop.
The two blue segments each correspond to the statements in the loop body. They start after an evaluation of the if-statement and end in another evaluation of the if-statement.
For the segment in light blue (the lower one) we see, that the evaluation results in an actual loop indicating that Attestor found a fixed point and hence a loop invariant.
Finally, the yellow statements correspond to the program statements after the loop. There is a single final state. Notice, however, that analyzing other programs could result in multiple final states.
In the following, we consider a few states in detail to highlight the essential steps of the analysis.
The initial state (ID: 0) of the state space is depicted below.
Notice that this state it is not identical to the custom initial state.
This is due to the fact that Attestor has automatically inserted missing constants, such as true
and false
.
The new
statement in Jimple code does not include the call to the constructor.
It only creates the new object and sets its fields to the default values.
The constructor is then invoked by the next instruction, see method call.
Our example contains several new
statements.
Let us consider the statement in line 2.
It is executed on the state with ID 1, namely:
The resulting state (ID: 2) adds a new object (in the picture this is node 11), assigns it to variable temp$0
and sets the field next
to the null-node.
Attestor then directly canonicalizes the state resulting in an abstract list between nodes 11 and 0.
The method calls in this example are all constructor calls.
Howeover, notice that Attestor is capable of analyzing arbitrary method calls, e.g. static, non-static, and recursive calls.
Directly after the new statement described above the call
specialinvoke temp$0.<SLList: void <init>(SLList)>(tail);
has to be executed (state with ID 2).
Here, <init>
is the constructor,
temp$0
, the base value, is the newly created object (node 11 in the pictures) and tail
is the argument
(attached to node 1).
After identifying tail
as the argument, Attestor automatically removes the variable tail
as it is no longer used (using Soot's Live Variable Analysis).
This is an optimization to allow for a more aggressive canonicalization.
The constructor sets the next
-field of the base value to the argument.
After executing the return
statement, Attestor automatically canonicalizes the state. Since the variable tail
pointing to node 1 was removed, it can be abstracted.
Therefore, the resulting state contains only a single list segment:
Attestor does only manipulate concrete parts of the heap. Whenever it needs to access a field which is hidden in a nonterminal edge, it first materializes it (in all possible ways) and then performs the semantics on the concrete elements instead. In the state space materialization is indicated by red transitions. For our running example, materialization is performed at the state with ID 11.
The next statement to be executed is curr.next = temp$1;
. Node 11 is referenced by variable curr
and its
next
-field is hidden in the adjacent nonterminal edge. Attestor thus uses its grammar to materialize the edge,
resulting in states with IDs 12 and 13, respectively:
In one of the resulting states, the first grammar rule is applied such that the next field directly points to null
.
In the other state, the
second rule is applied resulting in an additional node (3) to which the next field points and that is attached to a new nonterminal edge.
We observe that the second materialization is actually spurious, because the next-field of curr
was explicitly set to null
in the previous statement.
In other words, the abstraction is too aggressive.
Attestor performs a less aggressive abstraction strategy if the option --admissible-abstraction
is enabled.
We have not enabled this option to highlight the effects of materialization.
Note that Attestor does not apply unnecessary rules. This means that
- It does not materialize nonterminal edges when it does not need a field from the adjacent nodes (here: the edge adjacent to node 13 is not materialized) and
- It does not use rules which do not result in the requested field (here: the third rule is not used since it would not generate a field
next
at node 11 anyway).
Assignment statements set variables or fields.
For example, the effect of line 10 curr.<SLList: SLList next> = temp$1;
on the state with ID 13 is to set the
next-field of node 11 from null
to node 13.
Return statements are resolved by identifying the node to be returned and attaching a special variable with a
return label to it.
For example in the state with ID 8, the next statement to be executed is return temp$0;
.
Here, node 11 is referenced by temp$0
. The resulting state (ID: 9) thus references this node via @return
.
Attestor abstracts states directly after computing the effects of a statement.
For instance, the state with ID 33 is canonicalized after its associated statement, goto 5
, has been executed.
Since this statement does not affect the heap, all changes in the heap are due to canonicalization.
The state with ID 33 contains two abstract list segments and a next pointer from node 13 to 14:
Its successor state (ID 23) does not contain this next edge. The state is obtained by first embedding the first grammar rule between nodes 13 and 14 and then using the third rule to abstract the two adjacent nonterminal edges from 11 to 13 (already there) and from 13 to 14 (resulting from the previous step).
Note that, as the resulting state is merged with an existing one, the node identifiers may have changed.
For example, node 13
in the resulting state should not be seen as identical to the one previous to canonicalization.
After following the above steps, a reader should be familiar with generating state spaces with Attestor and exploring them using the generated graphical user interface. In this section, we look present a few options that you might want to try out next.
As we observed when considering the new statement, Attestor may abstract too much. This results in unncessary materialization steps and thus larger state spaces and run times. In particular, over-abstraction can be observed if parts of the heap are abstracted and immediately accessed in the next statement. Attestor can counter this kind of over-abstraction by performing abstraction only, if the embedding used for abstraction does not violate admissibility.
To prevent abstraction from violating admissibility, it suffices to set the option --admissible-abstraction
.
Run the above example again and explore the changes in the resulting state space:
java -jar attestor -l admissible.attestor
It is possible to supply LTL specifications to Attestor that are checked after state space generation. There are three possible outcomes of the verification process: Either Attestor finds a proof that the specification is satisfied or states that the specification is violated. In the second case, Attestor produces a counterexample which can be explored similarly to state spaces.
Notice that a specification might be violated due to the over-approximation performed by Attestor to obtain a finite state space. To check whether a property is really violated or provided counterexamples are spurious, the option --canonical
can additionally be supplied. In this case, Attestor attempts to find an input that is guaranteed to violate the specification.
Specifications are supplied using the option -mc <spec>
, where <spec>
is a linear temporal logic formula as defined here. Please find below a few example specifications to experiment with. For each specification, the example archive contains a corresponding .attestor
file.
Assume we want to express that for every point in the program execution, the heap will - at some point in the future - consist of a singly-linked list again and of nothing else but constants. A corresponding formula looks as follows:
GF { L(SLL) }
Please find below the relevant output of Attestor when running our example with the above formula:
java -jar attestor -l model-checking-01.attestor
...
[Summary] Model checking results: All provided LTL formulae are satisfied.
[SAT] GF { L(SLL) }
...
One might guess that the heap is not only eventually a list, but it actually consists of a list and nothing else during the whole execution of our example program. We can formalize this property as follows:
G { L(SLL) }
Please find below the relevant output of Attestor when running our example with the above formula:
java -jar attestor -l model-checking-02.attestor
...
[Warning] Verification of potentially spurious counterexamples is disabled.
[Warning] It is advised to rerun with option '--canonical'.
...
[Summary] Model checking results: Some provided LTL formulae could not be verified.
[UNSAT] G { L(SLL) }
[Summary] Detected counterexamples are not verified.
G { L(SLL) }
Counterexample trace: [0, 1, 2]
...
As we can see, Attestor fails to verify this property! In fact, it produces a counterexample, the trace [0,1,2]
, that leads to a state in which the heap does not consist exactly of a singly-linked list.
Since Attestor over-approximates the concrete program semantics, this counterexample might be spurious. Hence, a corresponding warning is produced. We can tell Attestor to check whether our counterexample [0,1,2]
is spurious, by running the above example again with option --canonical
:
java -jar attestor.jar -l model-checking03.attestor
...
[Summary] Model checking results: Some provided LTL formulae could not be verified.
[UNSAT] G { L(SLL) }
[Summary] Detected a non-spurious counterexample for all violated LTL formulae.
G { L(SLL) }
Counterexample trace: [0, 1, 2]
...
As stated above, Attestor verified that the counterexample is a realistic one. The heap thus does not always consist exactly of a singly-linked list.
We collect various running examples including source code, grammars, LTL properties, and initial states in a separate example repository. Feel free to try out the analyses specified in this repository and modify them to analyze your own code.