Multi dimensional arrays or maybe partial modification of pointer values...not really sure. #517

weaversa · 2019-07-29T23:41:57Z

Consider the following contrived C function.

void f(int A[10][9][8], int val, int pos) {
  for(int i = 0; i < 10; i++) {
    for(int j = 0; j < 9; j++) {
      A[i][j][pos] = val;
    }
  }
}

Here is my attempt to say something about it with saw

let f_spec = do {
  A_p <- crucible_alloc (llvm_array 10 (llvm_array 9 (llvm_array 8 (llvm_int 32))));
  val <- crucible_fresh_var "val" (llvm_int 32);
  pos <- crucible_fresh_var "pos" (llvm_int 32);

  crucible_precond {{ pos <= 7 }};

  crucible_execute_func [ A_p, crucible_term val, crucible_term pos ];

  A' <- crucible_fresh_var "A'" (llvm_array 10 (llvm_array 9 (llvm_array 8 (llvm_int 32))));
  crucible_points_to A_p (crucible_term {{ A' }});

  crucible_postcond
    {{ [ [ aij@pos == val | aij <- ai ] | ai <- A' ] == ~zero }};
};

m <- llvm_load_module "test.bc";
crucible_llvm_verify m "f" [] false f_spec z3;

clang -emit-llvm -c test.c -o test.bc && saw test.saw

Unfortunately, saw gets stuck at the crucible_points_to line, eating up all the memory on my machine. Am I doing this kind of proof right? I'm not really sure what the issue is here...Any help would be greatly appreciated!

Also, I picked 10, 9, and 8 out of thin air...

The text was updated successfully, but these errors were encountered:

langston-barrett · 2019-08-28T20:19:57Z

@andreistefanescu As someone who has recently been working on SAW/crucible-llvm performance improvements, do you have any ideas as to what might be going on here? Also cc'ing @robdockins

@weaversa Some of Andrei's improvements happened in our What4 library, which can simplify some symbolic expressions. It's probably worth a shot to use w4_z3 in place of the z3 tactic.

brianhuffman · 2019-08-28T21:31:41Z

First of all, this proof script should fail, as the crucible_points_to in the post-condition asserts that the entire allocated region should be initialized, which is not the case.

Secondly, this example looks like it would stress the LLVM memory model pretty hard, as the function performs 900 writes to the same memory region, all of them at symbolic offsets.

However, I would expect the runtime to degrade predictably as the array dimensions increase, but it seems like it falls off a cliff. Reducing the dimensions from A[10][9][8] to A[1][4][8] makes it finish in less than a second, then A[1][5][8] takes about 10 seconds, and with A[1][6][8] I killed it after a few minutes.

I made a profiling build and tested the six-element version, letting it run for a few minutes. It looks like one of the expensive bits is the function treeToPredicate in module Land.Crucible.CFG.Extension.Safety, which calls collapseAT in What4.Partial.AssertionTree. We should have a closer look at these.

brianhuffman · 2019-08-29T18:28:18Z

Looking at the proof obligations that come out of the symbolic simulator, I can see that the final one (asserting that the post-condition read of the whole array is valid) gets really big, really fast.

In the presence of writes at symbolic offsets, the formula that we generate to assert the validity of memory is just way more complicated than it needs to be. If we have an array of integers and do some writes at symbolic offsets (at 'i' and 'j', say) and then try to read at offset 'k', the validity predicate should be pretty simple: i = k \/ j = k. The size of this formula should scale linearly with the number of writes.

But instead, we currently produce a huge mux tree that enumerates separate cases for each of the possible values of i and j. The formula size grows exponentially in the number of writes, and the base of the exponent apparently depends on the size of the array. I think we need to sit down and redesign some chunks of the memory model code.

weaversa · 2019-09-06T16:03:19Z

Here is another example that I think exercises the same underlying issue. The first proof goes through in about 1 second. But, simply adding an extra field to the struct causes the proof to fall into an abyss.

Another oddity is this --- changing the prover for the second proof from abc to z3 still exhibits the slow down, but if I change the solver to offline_smtlib2, z3 can solve the resulting smt2 files in less than a second.

#include <stdint.h>

typedef struct example_struct_t {
  uint32_t a[4];
} example_struct_t;

typedef struct example_struct_2_t {
  uint32_t a[4];
  uint32_t b[4];
} example_struct_2_t;

uint32_t mult(uint32_t data[], example_struct_t *s, uint32_t i) {
  return data[i] * s->a[i];
}

uint32_t mult2(uint32_t data[], example_struct_2_t *s, uint32_t i) {
  return data[i] * s->a[i];
}

let alloc_init var_type v = do {
  p <- crucible_alloc var_type;
  crucible_points_to p v;
  return p;
};

let ptr_to_fresh n var_type = do {
  x <- crucible_fresh_var n var_type;
  p <- alloc_init var_type (crucible_term x);
  return (x, p);
};

let {{

  type State = [4][32]
  type example_struct_t = { a : [4][32] }
  type example_struct_2_t = { a: [4][32], b : [4][32] }

  mult : State -> example_struct_t -> [32] -> [32]
  mult data s i = data@i * s.a@i

  mult2 : State -> example_struct_2_t -> [32] -> [32]
  mult2 data s i = data@i * s.a@i

}};


let mult_spec = do {
  (data, data_p) <- ptr_to_fresh "data" (llvm_array 4 (llvm_int 32));

  (a, a_p) <- ptr_to_fresh "a" (llvm_array 4 (llvm_int 32));
  i <- crucible_fresh_var "i" (llvm_int 32);

  crucible_precond {{ (i >= 0) /\ (i <= 3) }};

  let s = crucible_struct [crucible_term a];
  s_p <- alloc_init (llvm_struct "struct.example_struct_t") s;

  crucible_execute_func [data_p, s_p, crucible_term i];

  crucible_return (crucible_term {{ mult data {a=a} i }});
};

let mult2_spec = do {
  (data, data_p) <- ptr_to_fresh "data" (llvm_array 4 (llvm_int 32));

  (a, a_p) <- ptr_to_fresh "a" (llvm_array 4 (llvm_int 32));
  (b, b_p) <- ptr_to_fresh "b" (llvm_array 4 (llvm_int 32));
  i <- crucible_fresh_var "i" (llvm_int 32);

  crucible_precond {{ (i >= 0) /\ (i <= 3) }};

  let s = crucible_struct [crucible_term a, crucible_term b];
  s_p <- alloc_init (llvm_struct "struct.example_struct_2_t") s;

  crucible_execute_func [data_p, s_p, crucible_term i];

  crucible_return (crucible_term {{ mult2 data {a=a, b=b} i }});
};

mult_bc <- llvm_load_module "mult.bc";

mult_ov <- crucible_llvm_verify mult_bc "mult" [] false mult_spec abc;

mult2_ov <- crucible_llvm_verify mult_bc "mult2" [] false mult2_spec abc;

$ make
clang -emit-llvm -c mult.c -o mult.bc
saw mult.saw
[15:53:57.297] Loading file "test/mult.saw"
[15:53:57.335] Verifying mult ...
[15:53:57.347] Simulating mult ...
[15:53:57.348] Checking proof obligations mult ...
[15:53:58.433] Proof succeeded! mult
[15:53:58.454] Verifying mult2 ...
[15:53:58.474] Simulating mult2 ...
[15:53:58.476] Checking proof obligations mult2 ...

weaversa · 2019-11-04T18:33:43Z

bump

brianhuffman · 2019-11-05T18:14:15Z

I will be doing some work on LLVM memory model performance over the next couple of weeks. I will probably get some significant speedup on this example, but there's no quick fix; I expect it will take some time.

weaversa · 2019-11-05T18:19:38Z

@brianhuffman I don't believe the issues with the second example are due to inefficiencies in the LLVM memory model.

brianhuffman · 2019-11-05T19:16:30Z

You're right. The second example is a completely separate issue, and it should have its own separate ticket.

langston-barrett added the subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm label Jul 30, 2019

weaversa added type: support Issues that are primarily support requests type: question Issues that are primarily asking questions labels Aug 13, 2019

weaversa mentioned this issue Nov 5, 2019

Unused C struct fields causing saw to enter possible infinite loop #581

Open

robdockins added the topics: memory model Issues that relate to the LLVM and/or Crucible model of pointers and memory blocks label Jul 16, 2021

sauclovian-g added performance Issues that involve or include performance problems and removed type: question Issues that are primarily asking questions labels Oct 24, 2024

sauclovian-g added this to the Someday milestone Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi dimensional arrays or maybe partial modification of pointer values...not really sure. #517

Multi dimensional arrays or maybe partial modification of pointer values...not really sure. #517

weaversa commented Jul 29, 2019

langston-barrett commented Aug 28, 2019

brianhuffman commented Aug 28, 2019

brianhuffman commented Aug 29, 2019

weaversa commented Sep 6, 2019 •

edited

Loading

weaversa commented Nov 4, 2019

brianhuffman commented Nov 5, 2019

weaversa commented Nov 5, 2019

brianhuffman commented Nov 5, 2019

Multi dimensional arrays or maybe partial modification of pointer values...not really sure. #517

Multi dimensional arrays or maybe partial modification of pointer values...not really sure. #517

Comments

weaversa commented Jul 29, 2019

langston-barrett commented Aug 28, 2019

brianhuffman commented Aug 28, 2019

brianhuffman commented Aug 29, 2019

weaversa commented Sep 6, 2019 • edited Loading

weaversa commented Nov 4, 2019

brianhuffman commented Nov 5, 2019

weaversa commented Nov 5, 2019

brianhuffman commented Nov 5, 2019

weaversa commented Sep 6, 2019 •

edited

Loading