Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi dimensional arrays or maybe partial modification of pointer values...not really sure. #517

Open
weaversa opened this issue Jul 29, 2019 · 8 comments
Labels
performance Issues that involve or include performance problems subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm topics: memory model Issues that relate to the LLVM and/or Crucible model of pointers and memory blocks type: support Issues that are primarily support requests
Milestone

Comments

@weaversa
Copy link
Contributor

Consider the following contrived C function.

void f(int A[10][9][8], int val, int pos) {
  for(int i = 0; i < 10; i++) {
    for(int j = 0; j < 9; j++) {
      A[i][j][pos] = val;
    }
  }
}

Here is my attempt to say something about it with saw

let f_spec = do {
  A_p <- crucible_alloc (llvm_array 10 (llvm_array 9 (llvm_array 8 (llvm_int 32))));
  val <- crucible_fresh_var "val" (llvm_int 32);
  pos <- crucible_fresh_var "pos" (llvm_int 32);

  crucible_precond {{ pos <= 7 }};

  crucible_execute_func [ A_p, crucible_term val, crucible_term pos ];

  A' <- crucible_fresh_var "A'" (llvm_array 10 (llvm_array 9 (llvm_array 8 (llvm_int 32))));
  crucible_points_to A_p (crucible_term {{ A' }});

  crucible_postcond
    {{ [ [ aij@pos == val | aij <- ai ] | ai <- A' ] == ~zero }};
};

m <- llvm_load_module "test.bc";
crucible_llvm_verify m "f" [] false f_spec z3;
clang -emit-llvm -c test.c -o test.bc && saw test.saw

Unfortunately, saw gets stuck at the crucible_points_to line, eating up all the memory on my machine. Am I doing this kind of proof right? I'm not really sure what the issue is here...Any help would be greatly appreciated!

Also, I picked 10, 9, and 8 out of thin air...

@langston-barrett langston-barrett added the subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm label Jul 30, 2019
@weaversa weaversa added type: support Issues that are primarily support requests type: question Issues that are primarily asking questions labels Aug 13, 2019
@langston-barrett
Copy link
Contributor

@andreistefanescu As someone who has recently been working on SAW/crucible-llvm performance improvements, do you have any ideas as to what might be going on here? Also cc'ing @robdockins

@weaversa Some of Andrei's improvements happened in our What4 library, which can simplify some symbolic expressions. It's probably worth a shot to use w4_z3 in place of the z3 tactic.

@brianhuffman
Copy link
Contributor

First of all, this proof script should fail, as the crucible_points_to in the post-condition asserts that the entire allocated region should be initialized, which is not the case.

Secondly, this example looks like it would stress the LLVM memory model pretty hard, as the function performs 900 writes to the same memory region, all of them at symbolic offsets.

However, I would expect the runtime to degrade predictably as the array dimensions increase, but it seems like it falls off a cliff. Reducing the dimensions from A[10][9][8] to A[1][4][8] makes it finish in less than a second, then A[1][5][8] takes about 10 seconds, and with A[1][6][8] I killed it after a few minutes.

I made a profiling build and tested the six-element version, letting it run for a few minutes. It looks like one of the expensive bits is the function treeToPredicate in module Land.Crucible.CFG.Extension.Safety, which calls collapseAT in What4.Partial.AssertionTree. We should have a closer look at these.

@brianhuffman
Copy link
Contributor

Looking at the proof obligations that come out of the symbolic simulator, I can see that the final one (asserting that the post-condition read of the whole array is valid) gets really big, really fast.

In the presence of writes at symbolic offsets, the formula that we generate to assert the validity of memory is just way more complicated than it needs to be. If we have an array of integers and do some writes at symbolic offsets (at 'i' and 'j', say) and then try to read at offset 'k', the validity predicate should be pretty simple: i = k \/ j = k. The size of this formula should scale linearly with the number of writes.

But instead, we currently produce a huge mux tree that enumerates separate cases for each of the possible values of i and j. The formula size grows exponentially in the number of writes, and the base of the exponent apparently depends on the size of the array. I think we need to sit down and redesign some chunks of the memory model code.

@weaversa
Copy link
Contributor Author

weaversa commented Sep 6, 2019

Here is another example that I think exercises the same underlying issue. The first proof goes through in about 1 second. But, simply adding an extra field to the struct causes the proof to fall into an abyss.

Another oddity is this --- changing the prover for the second proof from abc to z3 still exhibits the slow down, but if I change the solver to offline_smtlib2, z3 can solve the resulting smt2 files in less than a second.

#include <stdint.h>

typedef struct example_struct_t {
  uint32_t a[4];
} example_struct_t;

typedef struct example_struct_2_t {
  uint32_t a[4];
  uint32_t b[4];
} example_struct_2_t;

uint32_t mult(uint32_t data[], example_struct_t *s, uint32_t i) {
  return data[i] * s->a[i];
}

uint32_t mult2(uint32_t data[], example_struct_2_t *s, uint32_t i) {
  return data[i] * s->a[i];
}
let alloc_init var_type v = do {
  p <- crucible_alloc var_type;
  crucible_points_to p v;
  return p;
};

let ptr_to_fresh n var_type = do {
  x <- crucible_fresh_var n var_type;
  p <- alloc_init var_type (crucible_term x);
  return (x, p);
};

let {{

  type State = [4][32]
  type example_struct_t = { a : [4][32] }
  type example_struct_2_t = { a: [4][32], b : [4][32] }

  mult : State -> example_struct_t -> [32] -> [32]
  mult data s i = data@i * s.a@i

  mult2 : State -> example_struct_2_t -> [32] -> [32]
  mult2 data s i = data@i * s.a@i

}};


let mult_spec = do {
  (data, data_p) <- ptr_to_fresh "data" (llvm_array 4 (llvm_int 32));

  (a, a_p) <- ptr_to_fresh "a" (llvm_array 4 (llvm_int 32));
  i <- crucible_fresh_var "i" (llvm_int 32);

  crucible_precond {{ (i >= 0) /\ (i <= 3) }};

  let s = crucible_struct [crucible_term a];
  s_p <- alloc_init (llvm_struct "struct.example_struct_t") s;

  crucible_execute_func [data_p, s_p, crucible_term i];

  crucible_return (crucible_term {{ mult data {a=a} i }});
};

let mult2_spec = do {
  (data, data_p) <- ptr_to_fresh "data" (llvm_array 4 (llvm_int 32));

  (a, a_p) <- ptr_to_fresh "a" (llvm_array 4 (llvm_int 32));
  (b, b_p) <- ptr_to_fresh "b" (llvm_array 4 (llvm_int 32));
  i <- crucible_fresh_var "i" (llvm_int 32);

  crucible_precond {{ (i >= 0) /\ (i <= 3) }};

  let s = crucible_struct [crucible_term a, crucible_term b];
  s_p <- alloc_init (llvm_struct "struct.example_struct_2_t") s;

  crucible_execute_func [data_p, s_p, crucible_term i];

  crucible_return (crucible_term {{ mult2 data {a=a, b=b} i }});
};

mult_bc <- llvm_load_module "mult.bc";

mult_ov <- crucible_llvm_verify mult_bc "mult" [] false mult_spec abc;

mult2_ov <- crucible_llvm_verify mult_bc "mult2" [] false mult2_spec abc;
$ make
clang -emit-llvm -c mult.c -o mult.bc
saw mult.saw
[15:53:57.297] Loading file "test/mult.saw"
[15:53:57.335] Verifying mult ...
[15:53:57.347] Simulating mult ...
[15:53:57.348] Checking proof obligations mult ...
[15:53:58.433] Proof succeeded! mult
[15:53:58.454] Verifying mult2 ...
[15:53:58.474] Simulating mult2 ...
[15:53:58.476] Checking proof obligations mult2 ...

@weaversa
Copy link
Contributor Author

weaversa commented Nov 4, 2019

bump

@brianhuffman
Copy link
Contributor

I will be doing some work on LLVM memory model performance over the next couple of weeks. I will probably get some significant speedup on this example, but there's no quick fix; I expect it will take some time.

@weaversa
Copy link
Contributor Author

weaversa commented Nov 5, 2019

@brianhuffman I don't believe the issues with the second example are due to inefficiencies in the LLVM memory model.

@brianhuffman
Copy link
Contributor

You're right. The second example is a completely separate issue, and it should have its own separate ticket.

@robdockins robdockins added the topics: memory model Issues that relate to the LLVM and/or Crucible model of pointers and memory blocks label Jul 16, 2021
@sauclovian-g sauclovian-g added performance Issues that involve or include performance problems and removed type: question Issues that are primarily asking questions labels Oct 24, 2024
@sauclovian-g sauclovian-g added this to the Someday milestone Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues that involve or include performance problems subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm topics: memory model Issues that relate to the LLVM and/or Crucible model of pointers and memory blocks type: support Issues that are primarily support requests
Projects
None yet
Development

No branches or pull requests

5 participants