[analyzer][Solver] Early return if sym is concrete on assuming #115579

danix800 · 2024-11-09T04:33:11Z

This lets the solver deduce the values of some complex symbols derived from simpler ones whose values can be constrained to a concrete value during execution, thus reducing the number of overconstrained states.

This commit also fixes a unix.StdCLibraryFunctions crash caused by these overconstrained states being added to the graph, where they are marked as sink nodes (PosteriorlyOverconstrained). The 'assume' API is used in non-dual style, so the checker should defensively test whether these newly added nodes are actually impossible.

The crash: https://godbolt.org/z/8KKWeKb86
The equivalent constraints the solver needs to solve: https://godbolt.org/z/ed8WqsbTh
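
As a rough illustration of the early-return rule in this patch, here is a toy model rather than the analyzer's real types (`Verdict` and `assumeSimplified` are hypothetical names introduced only for this sketch). It shows how a symbol that simplifies to a constant can be decided without consulting the range solver: a constant is compatible with assuming "true" only if it is non-zero.

```cpp
#include <optional>

// Hypothetical three-way result used only in this sketch.
enum class Verdict { Feasible, Infeasible, NeedsSolver };

// Constant: the value the symbol simplified to, or std::nullopt if it is
// still symbolic. Assumption: whether the symbol is assumed true (non-zero).
Verdict assumeSimplified(std::optional<long long> Constant, bool Assumption) {
  if (!Constant)
    return Verdict::NeedsSolver; // Not concrete: fall through to the solver.
  bool IsZero = (*Constant == 0);
  // Zero is only compatible with assuming "false"; non-zero only with "true".
  bool Feasible = IsZero ? !Assumption : Assumption;
  return Feasible ? Verdict::Feasible : Verdict::Infeasible;
}
```

In the real patch the infeasible case corresponds to returning a null state and the feasible case to returning the state unchanged.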

llvmbot · 2024-11-09T04:33:44Z

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Ding Fei (danix800)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/115579.diff

6 Files Affected:

(modified) clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp (+2)
(modified) clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp (+1-1)
(modified) clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp (+8-1)
(added) clang/test/Analysis/solver-sym-simplification-on-assumption.c (+31)
(added) clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c (+33)
(modified) clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp (+3-3)

diff --git a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
index 4f30b2a0e7e7da..5faaf9cf274531 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
@@ -1354,6 +1354,8 @@ void StdLibraryFunctionsChecker::checkPreCall(const CallEvent &Call,
             if (BR.isInteresting(ArgSVal))
               OS << Msg;
           }));
+      if (NewNode->isSink())
+        break;
     }
   }
 }
diff --git a/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp
index c0b3f346b654df..2b77167fab86f2 100644
--- a/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp
@@ -74,7 +74,7 @@ ConstraintManager::assumeDualImpl(ProgramStateRef &State,
       // it might happen that a Checker uncoditionally uses one of them if the
       // other is a nullptr. This may also happen with the non-dual and
       // adjacent `assume(true)` and `assume(false)` calls. By implementing
-      // assume in therms of assumeDual, we can keep our API contract there as
+      // assume in terms of assumeDual, we can keep our API contract there as
       // well.
       return ProgramStatePair(StInfeasible, StInfeasible);
     }
diff --git a/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp
index 4bbe933be2129e..cc2280faa6f730 100644
--- a/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp
@@ -23,7 +23,14 @@ RangedConstraintManager::~RangedConstraintManager() {}
 ProgramStateRef RangedConstraintManager::assumeSym(ProgramStateRef State,
                                                    SymbolRef Sym,
                                                    bool Assumption) {
-  Sym = simplify(State, Sym);
+  SVal SimplifiedVal = simplifyToSVal(State, Sym);
+  if (SimplifiedVal.isConstant()) {
+    bool Feasible = SimplifiedVal.isZeroConstant() ? !Assumption : Assumption;
+    return Feasible ? State : nullptr;
+  }
+
+  if (SymbolRef SimplifiedSym = SimplifiedVal.getAsSymbol())
+    Sym = SimplifiedSym;
 
   // Handle SymbolData.
   if (isa<SymbolData>(Sym))
diff --git a/clang/test/Analysis/solver-sym-simplification-on-assumption.c b/clang/test/Analysis/solver-sym-simplification-on-assumption.c
new file mode 100644
index 00000000000000..941c584c598c52
--- /dev/null
+++ b/clang/test/Analysis/solver-sym-simplification-on-assumption.c
@@ -0,0 +1,31 @@
+// RUN: %clang_analyze_cc1 %s \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -verify
+
+void clang_analyzer_eval(int);
+
+void test_derived_sym_simplification_on_assume(int s0, int s1) {
+  int elem = s0 + s1 + 1;
+  if (elem-- == 0) // elem = s0 + s1
+    return;
+
+  if (elem-- == 0) // elem = s0 + s1 - 1
+    return;
+
+  if (s0 < 1) // s0: [1, 2147483647]
+    return;
+  if (s1 < 1) // s1: [1, 2147483647]
+    return;
+
+  if (elem-- == 0) // elem = s0 + s1 - 2
+    return;
+
+  if (s0 > 1) // s0: [-2147483648, 1] intersected with [1, 2147483647] => s0 = 1
+    return;
+
+  if (s1 > 1) // s1: [-2147483648, 1] intersected with [1, 2147483647] => s1 = 1
+    return;
+
+  // elem = s0 + s1 - 2 should be 0
+  clang_analyzer_eval(elem); // expected-warning{{FALSE}}
+}
diff --git a/clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c b/clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c
new file mode 100644
index 00000000000000..3b39bbe32dfc21
--- /dev/null
+++ b/clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c
@@ -0,0 +1,33 @@
+// RUN: %clang_analyze_cc1 %s \
+// RUN:   -analyzer-checker=unix.StdCLibraryFunctions \
+// RUN:   -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -triple x86_64-unknown-linux-gnu \
+// RUN:   -verify
+
+// expected-no-diagnostics
+
+#include "Inputs/std-c-library-functions-POSIX.h"
+
+void _add_one_to_index_C(int *indices, int *shape) {
+  int k = 1;
+  for (; k >= 0; k--)
+    if (indices[k] < shape[k])
+      indices[k]++;
+    else
+      indices[k] = 0;
+}
+
+void PyObject_CopyData_sptr(char *i, char *j, int *indices, int itemsize,
+    int *shape, struct sockaddr *restrict sa) {
+  int elements = 1;
+  for (int k = 0; k < 2; k++)
+    elements += shape[k];
+
+  // no contradiction after 3 iterations when 'elements' could be
+  // simplified to 0
+  while (elements--) {
+    _add_one_to_index_C(indices, shape);
+    getnameinfo(sa, 10, i, itemsize, i, itemsize, 0);
+  }
+}
diff --git a/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp b/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp
index 679ed3fda7a7a7..3f34d9982e7c8a 100644
--- a/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp
+++ b/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp
@@ -28,13 +28,13 @@ void test(int a, int b, int c, int d) {
     return;
   clang_analyzer_printState();
   // CHECK:       "constraints": [
-  // CHECK-NEXT:    { "symbol": "(reg_$0<int a>) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
+  // CHECK-NEXT:    { "symbol": "((reg_$0<int a>) + (reg_$2<int c>)) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
   // CHECK-NEXT:    { "symbol": "reg_$1<int b>", "range": "{ [0, 0] }" },
   // CHECK-NEXT:    { "symbol": "reg_$2<int c>", "range": "{ [0, 0] }" }
   // CHECK-NEXT:  ],
   // CHECK-NEXT:  "equivalence_classes": [
-  // CHECK-NEXT:    [ "(reg_$0<int a>) != (reg_$3<int d>)" ],
-  // CHECK-NEXT:    [ "reg_$0<int a>", "reg_$3<int d>" ],
+  // CHECK-NEXT:    [ "((reg_$0<int a>) + (reg_$2<int c>)) != (reg_$3<int d>)" ],
+  // CHECK-NEXT:    [ "(reg_$0<int a>) + (reg_$2<int c>)", "reg_$3<int d>" ],
   // CHECK-NEXT:    [ "reg_$2<int c>" ]
   // CHECK-NEXT:  ],
   // CHECK-NEXT:  "disequality_info": null,

steakhal · 2024-11-09T13:20:35Z

Hi, thanks for the PR!

I'm slightly confused that the crash you refer to comes from the StdLibraryFunctions checker.
This suggests a checker problem to me, and it likely relates to the early return added to that checker.

However, this is also coupled with a solver change. Does improving the solver also make the checker avoid the crash? If so, we should have separate PRs, because we should harden the checker on one side and improve the solver on the other.

In any case, I'll have a look at this next week.

danix800 · 2024-11-10T01:36:13Z

Yes, these two are related.

The solver is the root cause and the checker crash is the effect. Either fix alone would cover this crash, but neither is sufficient without the other.

The solver is inherently incapable of solving some constraints, which introduces some overconstrained states. We can try to improve the solver's precision as much as possible, but in cases like this one the remaining impossible states would still crash the checker.

The checker should defensively test for impossible states, since they are inevitable anyway.

NagyDonat

Thanks for the commit, I'm satisfied with it :)

I actually like that these two related changes (the checker change and the constraint manager improvement) are handled together in a single commit -- this way somebody who browses the commit log can directly see the "other half" of the change (without following cumbersome links through "this commit is mentioned in the commit message of that one" or opening this github review). Also, the checker change is so trivial (+2 lines) that the full combined commit is still small enough to be easily understood.

steakhal

> Thanks for the commit, I'm satisfied with it :)
>
> I actually like that these two related changes (the checker change and the constraint manager improvement) are handled together in a single commit -- this way somebody who browses the commit log can directly see the "other half" of the change (without following cumbersome links through "this commit is mentioned in the commit message of that one" or opening this github review). Also, the checker change is so trivial (+2 lines) that the full combined commit is still small enough to be easily understood.

I agree.

The patch looks good to me. I made some recommendations for improving the tests.

I have one question. In the StdLibraryFn checker you added a test for isSink().
I thought the crash was that some assume(..., true|false) call returned a null State that we then dereferenced. Where was that assume call, and how did we unconditionally dereference it?

danix800 · 2024-11-11T17:00:53Z

> > Thanks for the commit, I'm satisfied with it :)
> > I actually like that these two related changes (the checker change and the constraint manager improvement) are handled together in a single commit -- this way somebody who browses the commit log can directly see the "other half" of the change (without following cumbersome links through "this commit is mentioned in the commit message of that one" or opening this github review). Also, the checker change is so trivial (+2 lines) that the full combined commit is still small enough to be easily understood.
>
> I agree.
>
> The patch looks good to me. I made some recommendations for improving the tests.
>
> I have one question. In the StdLibraryFn checker you added a test for isSink(). I thought that the crash was that some assume(..., true|false) returned a null State, that we dereferenced. Where was that assume call and how did we unconditionally dereference it?

It's not a null state being dereferenced, but an assertion failure. The checker operates on some impossible states and tries to add one of them back to the graph, which is rejected (the assertion fails).

steakhal · 2024-11-12T16:31:47Z

First, let me thank you for posting a high quality patch like this.

ProgramState::isPosteriorlyOverconstrained() was introduced to carry the fact that even the parent state was infeasible.
Checkers were written on the assumption that ProgramState::assume() will return at least one valid State. This was true for a really long time, until we improved the Solver with some simplification triggered by perfectly constraining symbols.
This meant that from that point on we could have situations where the Solver disproves both the true and the false states, thus forcing us to violate the contract of ProgramState::assume by returning two null states. This caused crashes.
To illustrate the issue, consider this:

// Let's say x could be either 0 or 1 here.
if (/*some fancy infeasible constraint dependent on x */) {
  if (x == 0) { } else { }
}

Let's pretend that at the first if we just barely couldn't prove that there is no value of x that would satisfy the constraint.
At the if (x == 0) we would split the path two ways: one where x is 0 and one where x is 1. The Solver would try to simplify the constraints and realize that the State where x is 1 has contradictions, and then later realize that the other State is also infeasible. That is only possible if even the parent State was infeasible.

So, we have to return a "valid" state just to discard everything the checker would do. This wasted work is the lesser evil.
Basically, this is the story of isPosteriorlyOverconstrained(), and the why behind it.

So, the fact that assume() never returns two null states is a blissful lie to ease checker writing.
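
The behaviour described above can be sketched with a toy solver. Everything here is hypothetical and only for illustration: `State` tracks a single integer range for x, and `hasContradiction` stands in for a "fancy" constraint that the solver can only refute once x is pinned to one value. It is not how RangedConstraintManager is implemented.

```cpp
#include <optional>
#include <utility>

// Toy state: x is known to lie in [Lo, Hi].
struct State {
  int Lo, Hi;
  bool PosteriorlyOverconstrained = false;
};

using StatePair = std::pair<std::optional<State>, std::optional<State>>;

// Stand-in for a hidden constraint "x * x == 2" that has no integer solution.
// The toy solver only notices this once the range collapses to one value.
bool hasContradiction(const State &S) {
  return S.Lo > S.Hi || (S.Lo == S.Hi && S.Lo * S.Lo != 2);
}

// Assume (x == V) is true or false; std::nullopt means "infeasible".
std::optional<State> assumeEq(const State &S, int V, bool Assumption) {
  State Next = S;
  if (Assumption) {
    if (V < S.Lo || V > S.Hi)
      return std::nullopt;
    Next.Lo = Next.Hi = V;
  } else {
    // The toy only shrinks the range when V is an endpoint.
    if (V == S.Lo)
      ++Next.Lo;
    else if (V == S.Hi)
      --Next.Hi;
  }
  if (hasContradiction(Next))
    return std::nullopt;
  return Next;
}

// If both branches are refuted, the parent must have been infeasible all
// along. Instead of two null states, hand back the parent marked as
// posteriorly overconstrained, so callers always get a non-null state.
StatePair assumeDual(const State &S, int V) {
  std::optional<State> StTrue = assumeEq(S, V, true);
  std::optional<State> StFalse = assumeEq(S, V, false);
  if (!StTrue && !StFalse) {
    State StInfeasible = S;
    StInfeasible.PosteriorlyOverconstrained = true;
    return StatePair(StInfeasible, StInfeasible);
  }
  return StatePair(StTrue, StFalse);
}
```

With x in [0, 1], both x == 0 and x != 0 are refuted in this toy, so `assumeDual` returns the marked parent on both sides.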

Now, why is this issue interesting? It's because the checkers can still observe the side effects of this machinery I shared when they chain ExplodedNodes returned by addTransition(). This breaks the illusion, leading to crashes as you shared.

Let's say we have a checker that adds some assumptions, and the assume call returned a single valid state. It would be surprising if, when I add a transition to it, it suddenly sinks. Even though that's technically the right thing to do, the checker did nothing that would imply sinking the path. This doesn't mean that the defensive isSink call in the StdLibFnChecker is unjustified. I'm okay with that. However, this would imply that the rest of the API uses of addTransition where we chain the ExplodedNodes are also left vulnerable to this issue. There are probably far fewer checkers doing this sort of chaining, so checking each call site may be a valid option. However, this would raise the bar for newcomers using this API, which I don't really like.

Consequently, I think this patch would mask the issue. We should rather prevent breaking the illusion by somehow allowing subsequent addTransitions even if isPosteriorlyOverconstrained() is true.

danix800 · 2024-11-13T03:21:03Z

> First, let me thank you for posting a high quality patch like this.
>
> ProgramState::isPosteriorlyOverconstrained() was introduced to carry the fact that even the parent state was infeasible. Checkers were written on the assumption that ProgramState::assume() will return at least one valid State. This was true for a really long time, until we improved the Solver with some simplification triggered by perfectly constraining symbols. This meant that from that point on we could have situations where the Solver disproves both the true and the false states, thus forcing us to violate the contract of ProgramState::assume by returning two null states. This caused crashes. To illustrate the issue, consider this:
>
> // Let's say x could be either 0 or 1 here.
> if (/*some fancy infeasible constraint dependent on x */) {
>   if (x == 0) { } else { }
> }
>
> Let's pretend that at the first if we just barely couldn't prove that there is no value of x that would satisfy the constraint. At the if (x == 0) we would split the path two ways: one where x is 0 and one where x is 1. The Solver would try to simplify the constraints and realize that the State where x is 1 has contradictions, and then later realize that the other State is also infeasible. That is only possible if even the parent State was infeasible.
>
> So, we have to return a "valid" state just to discard everything the checker would do. This wasted work is the lesser evil. Basically, this is the story of isPosteriorlyOverconstrained(), and the why behind it.

Thanks for these helpful details!

> So, the fact that assume() never returns two null states is a blissful lie to ease checker writing.
>
> Now, why is this issue interesting? It's because the checkers can still observe the side effects of this machinery I shared when they chain ExplodedNodes returned by addTransition(). This breaks the illusion, leading to crashes as you shared.
>
> Let's say we have a checker that adds some assumptions, and the assume call returned a single valid state. It would be surprising if, when I add a transition to it, it suddenly sinks. Even though that's technically the right thing to do, the checker did nothing that would imply sinking the path. This doesn't mean that the defensive isSink call in the StdLibFnChecker is unjustified. I'm okay with that. However, this would imply that the rest of the API uses of addTransition where we chain the ExplodedNodes are also left vulnerable to this issue. There are probably far fewer checkers doing this sort of chaining, so checking each call site may be a valid option. However, this would raise the bar for newcomers using this API, which I don't really like.

I think there are no clear boundaries between checkers and the engine. Checkers can decide whether the engine should stop, whether a state should be split or transitioned to a new one, or do nothing at all.

So contracts are important. I think we should clarify these contracts and add more if needed, rather than compromise for the sake of some not-well-written checkers. This also means that checkers should be aware of the existence of posteriorly-overconstrained states, especially when they try to chain those states.

> Consequently, I think this patch would mask the issue. We should rather prevent breaking the illusion by somehow allowing subsequent addTransitions even if isPosteriorlyOverconstrained() is true.

How about this: whenever a checker chains states by itself, it should be more careful about node validation. The engine responds with a failed assertion in this case, which drives us to spot the issue and fix it. If isPosteriorlyOverconstrained() nodes are allowed, then we might lose this chance to fix issues in both the checker and the solver.

Testing isSink() is sufficient for this case, but might not be clear enough in general. I was surprised by this node already being a sink too.

As an analogy, when a checker tries to transition to an unchanged state, the engine returns the predecessor as a safeguard. If the checker needs to chain states, it is wasteful for it to keep working on this state without testing whether the returned node is the predecessor (that is, nothing has changed).

We might change the engine to intentionally return a null node if it is marked as a sink by isPosteriorlyOverconstrained(). That would be clearer, but the checker would still need to test for it. Whenever we generate an error node for bug reporting, we test whether the new node was actually generated:

      if (ExplodedNode *N = C.generateErrorNode(NewState, NewNode))
        reportBug(Call, N, Constraint.get(), Summary, C);

This is another similar contract between checkers and the engine. In this sense, how about testing like this in this case:

      if (!NewNode || NewNode->isSink())
        break;
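
The defensive check proposed above can be illustrated with a small toy model of chained transitions. `Node`, `ToyGraph` and `applyChained` are hypothetical names for this sketch, and the assertion on the predecessor is an assumption that models the rejection described in this thread; it is not a copy of the engine's code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy exploded-graph node: only remembers whether it is a sink.
struct Node {
  bool Sink;
  bool isSink() const { return Sink; }
};

class ToyGraph {
  std::vector<std::unique_ptr<Node>> Nodes;

public:
  // Adds a successor of Pred. The new node is a sink when the new state is
  // (posteriorly) overconstrained. Extending a sink is rejected by an
  // assertion, modelling the crash discussed in this thread.
  Node *addTransition(bool StateIsOverconstrained, Node *Pred) {
    assert(!Pred->isSink() && "cannot add a transition from a sink node");
    Nodes.push_back(std::make_unique<Node>(Node{StateIsOverconstrained}));
    return Nodes.back().get();
  }
};

// Chains one transition per constraint and stops at the first node that is
// null or a sink. Returns how many transitions were attempted.
int applyChained(ToyGraph &G, Node *Start,
                 const std::vector<bool> &Overconstrained) {
  Node *Cur = Start;
  int Applied = 0;
  for (bool IsOverconstrained : Overconstrained) {
    Node *NewNode = G.addTransition(IsOverconstrained, Cur);
    ++Applied;
    if (!NewNode || NewNode->isSink())
      break; // Without this check the next iteration would hit the assert.
    Cur = NewNode;
  }
  return Applied;
}
```

If the second of three constraints produces an overconstrained state, the loop stops after two transitions instead of trying to extend the sink.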

> However, this would imply that the rest of the API uses of addTransition where we chain the ExplodedNodes are also left vulnerable to this issue. There are probably a lot less checkers doing this sort of chaining, so checking each callsite may be a valid option.

Maybe case-by-case fixing is enough.

steakhal · 2024-11-14T13:44:05Z

I see your point, but I'm still not convinced.
Anyways, that's partially beyond this PR. What we have here I can completely agree with. I just have the feeling it solved one particular case, and not a class of bugs - which is fine.

steakhal

LGTM now. Thank you for this high quality patch. This isn't the first time, I remember. Excellent track record.

danix800 · 2024-11-15T08:54:22Z

> LGTM now. Thank you for this high quality patch. This isn't the first time, I remember. Excellent track record.

Thanks for your reviews and all of your kindness! @steakhal @NagyDonat

llvm-ci · 2024-11-15T08:58:38Z

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-win running on sie-win-worker while building clang at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/7928

Here is the relevant piece of the build log for the reference

Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang :: Analysis/symbol-simplification-fixpoint-two-iterations.cpp' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe -cc1 -internal-isystem Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\20\include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Analysis\symbol-simplification-fixpoint-two-iterations.cpp    -analyzer-checker=core    -analyzer-checker=debug.ExprInspection    2>&1 | z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Analysis\symbol-simplification-fixpoint-two-iterations.cpp
# executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\clang.exe' -cc1 -internal-isystem 'Z:\b\llvm-clang-x86_64-sie-win\build\lib\clang\20\include' -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Analysis\symbol-simplification-fixpoint-two-iterations.cpp' -analyzer-checker=core -analyzer-checker=debug.ExprInspection
# executed command: 'z:\b\llvm-clang-x86_64-sie-win\build\bin\filecheck.exe' 'Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Analysis\symbol-simplification-fixpoint-two-iterations.cpp'
# .---command stderr------------
# | Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Analysis\symbol-simplification-fixpoint-two-iterations.cpp:31:17: error: CHECK-NEXT: expected string not found in input
# |  // CHECK-NEXT: { "symbol": "((reg_$0<int a>) + (reg_$2<int c>)) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
# |                 ^
# | <stdin>:26:18: note: scanning from here
# |  "constraints": [
# |                  ^
# | <stdin>:27:2: note: possible intended match here
# |  { "symbol": "(reg_$0<int a>) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
# |  ^
# | 
# | Input file: <stdin>
# | Check file: Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\test\Analysis\symbol-simplification-fixpoint-two-iterations.cpp
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |            1: "program_state": {
# |            2:  "store": null,
# |            3:  "environment": { "pointer": "0x18a8bcdd400", "items": [
# |            4:  { "lctx_id": 1, "location_context": "#0 Call", "calling": "test", "location": null, "items": [
# |            5:  { "stmt_id": 809, "kind": "ImplicitCastExpr", "pretty": "clang_analyzer_printState", "value": "&code{clang_analyzer_printState}" }
# |            6:  ]}
# |            7:  ]},
# |            8:  "constraints": [
# | check:17       ^~~~~~~~~~~~~~~~
# |            9:  { "symbol": "(((reg_$0<int a>) + (reg_$1<int b>)) + (reg_$2<int c>)) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
# | next:18        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           10:  { "symbol": "(reg_$2<int c>) + (reg_$1<int b>)", "range": "{ [0, 0] }" }
# | next:19        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           11:  ],
# | next:20        ^~
# |           12:  "equivalence_classes": [
# | next:21        ^~~~~~~~~~~~~~~~~~~~~~~~
# |           13:  [ "((reg_$0<int a>) + (reg_$1<int b>)) + (reg_$2<int c>)", "reg_$3<int d>" ]
# | next:22        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           14:  ],
# | next:23        ^~
# |           15:  "disequality_info": null,
# | next:24        ^~~~~~~~~~~~~~~~~~~~~~~~~
...

steakhal · 2024-11-15T09:54:44Z

@danix800 Could you please have a look at the failed test, such that we could reapply this PR?
I reverted this soon after I realized the broken test is from this PR.

danix800 · 2024-11-15T13:19:04Z

> @danix800 Could you please have a look at the failed test, such that we could reapply this PR? I reverted this soon after I realized the broken test is from this PR.

Working on it!

danix800 · 2024-11-19T03:54:39Z

> @danix800 Could you please have a look at the failed test, such that we could reapply this PR? I reverted this soon after I realized the broken test is from this PR.

The test fails randomly for an unknown reason on VS2019 through VS2022, after 1c154bd (which seems totally unrelated to this randomness). I'm not sure whether it's a compiler bug.

steakhal · 2024-11-19T08:53:58Z

> > @danix800 Could you please have a look at the failed test, such that we could reapply this PR? I reverted this soon after I realized the broken test is from this PR.
>
> The test randomly fails for unknown reason, on VS2019~2022, after 1c154bd (seems totally irrelevant to this randomness). Not sure if it's a compiler bug.

I'm not sure I see why. The logs I linked in the revert only showed a single test failure, in the new test we added in this PR. This suggests a close correlation. Maybe the content of the "constraints" map in the State is dumped in a non-deterministic order, and we just happened to step on it now?

danix800 · 2024-11-19T10:36:03Z

> I'm not sure I see why. The logs I linked in the revert only showed a single test failure. The new test we added in this PR. This suggests close correlations. Maybe the content of the "constraints" map in the State is dumped in a non-deterministic order? And we just happened to step on it now.

Yes, it's the test case touched by this PR that failed (not some other random test case in the repo).

I mean that clang behaves randomly.

The randomness might not have been introduced by this PR but by some earlier commit.

I'm working on it.

steakhal · 2024-11-19T11:18:57Z

> > I'm not sure I see why. The logs I linked in the revert only showed a single test failure. The new test we added in this PR. This suggests close correlations. Maybe the content of the "constraints" map in the State is dumped in a non-deterministic order? And we just happened to step on it now.
>
> Yes it's the testcase touched by this PR that failed (not other random testcase inside the repo).
>
> I mean that clang behaves randomly.
>
> The randomness might not be introduced by this PR but by some earlier commit point.
>
> I'm working on it.

Be aware that LLVM's DenseMap and DenseSet are basically open-addressing hash tables, and the hash function is seeded by the address of some known function (I don't know which). This means that these tables depend on where clang is loaded in memory. Consequently, address-space layout randomization could affect the reproducibility of some issues and lead to flaky cases.

The constraints are backed by llvm::ImmutableMap and llvm::ImmutableSet, which are some kind of AVL tree, if I recall correctly. I'm not sure how the ordering is implemented there, but if it is defined in terms of comparing pointer addresses, then we might have a problem.

When dumping the constraints, I think we just iterate the AVL tree from begin to end, which visits "less-than" elements first, but if the comparison depends on where the process is loaded, then the whole iteration order is runtime-dependent.
Could you please check if we sort before dumping the constraints?
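
The concern about address-dependent ordering can be illustrated with standard containers. This is only a sketch under the assumption raised above (that the ordering compares pointer addresses); it uses `std::set` rather than LLVM's immutable containers, and `Symbol`, `dumpInAddressOrder` and `dumpSorted` are hypothetical names.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Toy symbol: identified by pointer, printed by name.
struct Symbol {
  std::string Name;
};

// Ordered by address: iteration order depends on where the objects live in
// memory, so it may differ between runs, platforms and allocators.
using AddressOrderedSet = std::set<const Symbol *>;

// Dumps names in container order, which is address order here.
std::vector<std::string> dumpInAddressOrder(const AddressOrderedSet &Syms) {
  std::vector<std::string> Names;
  for (const Symbol *S : Syms)
    Names.push_back(S->Name);
  return Names;
}

// Dumps names sorted by a stable key (the name itself), so the output does
// not depend on addresses.
std::vector<std::string> dumpSorted(const AddressOrderedSet &Syms) {
  std::vector<std::string> Names = dumpInAddressOrder(Syms);
  std::sort(Names.begin(), Names.end());
  return Names;
}
```

A test that matches the dump line by line, as FileCheck's CHECK-NEXT does, is only stable with something like the second variant.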

danix800 · 2024-11-19T17:14:46Z

It is caused by unstable ordering of elements from DenseMap/Set.

This PR actually breaks the recursive fixpoint simplification algorithm of the equivalence classes.

One thing that still confuses me is why no randomness is observed when clang is compiled by gcc on Linux. Is that purely a coincidence?

=====

EDIT: It's ImmutableMap/Set, not DenseMap/Set.

danix800 · 2024-11-19T17:25:05Z

For this test case, two constraints are collected on the path:

(1) a + b + c == d
(2) b + c == 0

When assuming the third constraint, b == 0, either (1) or (2) is selected first, in random order, for simplifying the equivalence classes.

The fixpoint simplification algorithm recurses by re-evaluating an SVal with the top-level State->assume logic. The early return in this PR breaks this simplification.
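
To make the order dependence concrete, here is a toy model of the two constraints above. It is a deliberately simplified sketch, not the analyzer's equivalence-class algorithm: `SumConstraint`, `simplifyOnce` and `simplifyToFixpoint` are hypothetical names, and the only rules are "drop terms known to be zero" and "a single remaining term equal to 0 is itself zero".

```cpp
#include <set>
#include <string>
#include <vector>

// Toy constraint "sum of Terms == Rhs". Rhs is a symbol name, or the empty
// string when the right-hand side is the constant 0.
struct SumConstraint {
  std::set<std::string> Terms;
  std::string Rhs;
};

// One pass over the constraints in the given order. Drops terms known to be
// zero and learns "x is zero" from a constraint reduced to "x == 0".
// Returns true if anything changed.
bool simplifyOnce(std::vector<SumConstraint> &Cs,
                  std::set<std::string> &Zeros) {
  bool Changed = false;
  for (SumConstraint &C : Cs) {
    for (auto It = C.Terms.begin(); It != C.Terms.end();) {
      if (Zeros.count(*It)) {
        It = C.Terms.erase(It);
        Changed = true;
      } else {
        ++It;
      }
    }
    if (C.Rhs.empty() && C.Terms.size() == 1 &&
        Zeros.insert(*C.Terms.begin()).second)
      Changed = true;
  }
  return Changed;
}

// Repeats passes until nothing changes, which makes the final result
// independent of the order in which the constraints are visited.
void simplifyToFixpoint(std::vector<SumConstraint> &Cs,
                        std::set<std::string> &Zeros) {
  while (simplifyOnce(Cs, Zeros)) {
  }
}
```

In this toy, visiting (1) before (2) with b known to be zero leaves (1) as a + c == d after a single pass, while iterating to a fixpoint (or visiting (2) first) reduces it to a == d. Cutting the iteration short therefore exposes the visiting order in the result.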

steakhal · 2024-11-19T18:00:02Z

Thank you for your dedication. What are your plans?
Do you plan to continue pushing this?

Btw why did this test only fail on Windows?

danix800 · 2024-11-20T03:05:57Z

> Thank you for your dedication. What are your plans? Do you plan to continue pushing this?
>
> Btw why did this test only fail on Windows?

I'll investigate further whether it's possible to make a similar improvement to the solver in some other way.

llvmbot added the clang and clang:static analyzer labels Nov 9, 2024

danix800 assigned martong and unassigned martong Nov 9, 2024

danix800 requested review from martong and balazske November 9, 2024 04:41

steakhal self-requested a review November 9, 2024 13:20

NagyDonat reviewed Nov 11, 2024

clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp (outdated)

clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp

steakhal reviewed Nov 11, 2024

clang/test/Analysis/solver-sym-simplification-on-assumption.c (outdated)

clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c

dump more test details & simplify logic test expr

082e45a

steakhal changed the title ~~[StaticAnalyzer] early return if sym is concrete on assuming~~ [analyzer][Solver] Early return if sym is concrete on assuming Nov 12, 2024

fix insufficient test output matching

bae44da

steakhal reviewed Nov 14, 2024

clang/test/Analysis/solver-sym-simplification-on-assumption.c (outdated)

pin target for fixed width of int

3ee424f

steakhal approved these changes Nov 14, 2024

danix800 merged commit 4163136 into llvm:main Nov 15, 2024
6 of 8 checks passed

danix800 deleted the fix/clang-analyzer-range-simplify-before-assume branch November 15, 2024 08:54

steakhal added a commit that referenced this pull request Nov 15, 2024

Revert "[analyzer][Solver] Early return if sym is concrete on assuming (#115579)"

42c0948

This reverts commit 4163136.

steakhal mentioned this pull request Nov 15, 2024

Revert "[analyzer][Solver] Early return if sym is concrete on assuming" #116362

Merged

steakhal added a commit that referenced this pull request Nov 15, 2024

Revert "[analyzer][Solver] Early return if sym is concrete on assuming" (#116362)

8d43c88

Reverts #115579. This introduced a breakage: https://lab.llvm.org/buildbot/#/builders/46/builds/7928
