-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[analyzer][Solver] Early return if sym is concrete on assuming #115579
[analyzer][Solver] Early return if sym is concrete on assuming #115579
Conversation
This could deduce some complex syms derived from simple ones whose values could be constrainted to be concrete during execution, thus reducing some overconstrainted states. This commit also fix 'unix.StdCLibraryFunctions' crash due to these overconstrainted states being added to the graph, which is marked as sink node (PosteriorlyOverconstrained). The 'assume' API is used in non-dual style so the checker should protectively test whether these newly added nodes are actually impossible.
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-static-analyzer-1 Author: Ding Fei (danix800) ChangesThis could deduce some complex syms derived from simple ones whose values could be constrainted to be concrete during execution, thus reducing some overconstrainted states. This commit also fix
Full diff: https://github.com/llvm/llvm-project/pull/115579.diff 6 Files Affected:
diff --git a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
index 4f30b2a0e7e7da..5faaf9cf274531 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
@@ -1354,6 +1354,8 @@ void StdLibraryFunctionsChecker::checkPreCall(const CallEvent &Call,
if (BR.isInteresting(ArgSVal))
OS << Msg;
}));
+ if (NewNode->isSink())
+ break;
}
}
}
diff --git a/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp
index c0b3f346b654df..2b77167fab86f2 100644
--- a/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp
@@ -74,7 +74,7 @@ ConstraintManager::assumeDualImpl(ProgramStateRef &State,
// it might happen that a Checker uncoditionally uses one of them if the
// other is a nullptr. This may also happen with the non-dual and
// adjacent `assume(true)` and `assume(false)` calls. By implementing
- // assume in therms of assumeDual, we can keep our API contract there as
+ // assume in terms of assumeDual, we can keep our API contract there as
// well.
return ProgramStatePair(StInfeasible, StInfeasible);
}
diff --git a/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp
index 4bbe933be2129e..cc2280faa6f730 100644
--- a/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RangedConstraintManager.cpp
@@ -23,7 +23,14 @@ RangedConstraintManager::~RangedConstraintManager() {}
ProgramStateRef RangedConstraintManager::assumeSym(ProgramStateRef State,
SymbolRef Sym,
bool Assumption) {
- Sym = simplify(State, Sym);
+ SVal SimplifiedVal = simplifyToSVal(State, Sym);
+ if (SimplifiedVal.isConstant()) {
+ bool Feasible = SimplifiedVal.isZeroConstant() ? !Assumption : Assumption;
+ return Feasible ? State : nullptr;
+ }
+
+ if (SymbolRef SimplifiedSym = SimplifiedVal.getAsSymbol())
+ Sym = SimplifiedSym;
// Handle SymbolData.
if (isa<SymbolData>(Sym))
diff --git a/clang/test/Analysis/solver-sym-simplification-on-assumption.c b/clang/test/Analysis/solver-sym-simplification-on-assumption.c
new file mode 100644
index 00000000000000..941c584c598c52
--- /dev/null
+++ b/clang/test/Analysis/solver-sym-simplification-on-assumption.c
@@ -0,0 +1,31 @@
+// RUN: %clang_analyze_cc1 %s \
+// RUN: -analyzer-checker=debug.ExprInspection \
+// RUN: -verify
+
+void clang_analyzer_eval(int);
+
+void test_derived_sym_simplification_on_assume(int s0, int s1) {
+ int elem = s0 + s1 + 1;
+ if (elem-- == 0) // elem = s0 + s1
+ return;
+
+ if (elem-- == 0) // elem = s0 + s1 - 1
+ return;
+
+ if (s0 < 1) // s0: [1, 2147483647]
+ return;
+ if (s1 < 1) // s0: [1, 2147483647]
+ return;
+
+ if (elem-- == 0) // elem = s0 + s1 - 2
+ return;
+
+ if (s0 > 1) // s0: [-2147483648, 0] U [1, 2147483647] => s0 = 0
+ return;
+
+ if (s1 > 1) // s1: [-2147483648, 0] U [1, 2147483647] => s1 = 0
+ return;
+
+ // elem = s0 + s1 - 2 should be 0
+ clang_analyzer_eval(elem); // expected-warning{{FALSE}}
+}
diff --git a/clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c b/clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c
new file mode 100644
index 00000000000000..3b39bbe32dfc21
--- /dev/null
+++ b/clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c
@@ -0,0 +1,33 @@
+// RUN: %clang_analyze_cc1 %s \
+// RUN: -analyzer-checker=unix.StdCLibraryFunctions \
+// RUN: -analyzer-config unix.StdCLibraryFunctions:ModelPOSIX=true \
+// RUN: -analyzer-checker=debug.ExprInspection \
+// RUN: -triple x86_64-unknown-linux-gnu \
+// RUN: -verify
+
+// expected-no-diagnostics
+
+#include "Inputs/std-c-library-functions-POSIX.h"
+
+void _add_one_to_index_C(int *indices, int *shape) {
+ int k = 1;
+ for (; k >= 0; k--)
+ if (indices[k] < shape[k])
+ indices[k]++;
+ else
+ indices[k] = 0;
+}
+
+void PyObject_CopyData_sptr(char *i, char *j, int *indices, int itemsize,
+ int *shape, struct sockaddr *restrict sa) {
+ int elements = 1;
+ for (int k = 0; k < 2; k++)
+ elements += shape[k];
+
+ // no contradiction after 3 iterations when 'elements' could be
+ // simplified to 0
+ while (elements--) {
+ _add_one_to_index_C(indices, shape);
+ getnameinfo(sa, 10, i, itemsize, i, itemsize, 0);
+ }
+}
diff --git a/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp b/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp
index 679ed3fda7a7a7..3f34d9982e7c8a 100644
--- a/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp
+++ b/clang/test/Analysis/symbol-simplification-fixpoint-two-iterations.cpp
@@ -28,13 +28,13 @@ void test(int a, int b, int c, int d) {
return;
clang_analyzer_printState();
// CHECK: "constraints": [
- // CHECK-NEXT: { "symbol": "(reg_$0<int a>) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
+ // CHECK-NEXT: { "symbol": "((reg_$0<int a>) + (reg_$2<int c>)) != (reg_$3<int d>)", "range": "{ [0, 0] }" },
// CHECK-NEXT: { "symbol": "reg_$1<int b>", "range": "{ [0, 0] }" },
// CHECK-NEXT: { "symbol": "reg_$2<int c>", "range": "{ [0, 0] }" }
// CHECK-NEXT: ],
// CHECK-NEXT: "equivalence_classes": [
- // CHECK-NEXT: [ "(reg_$0<int a>) != (reg_$3<int d>)" ],
- // CHECK-NEXT: [ "reg_$0<int a>", "reg_$3<int d>" ],
+ // CHECK-NEXT: [ "((reg_$0<int a>) + (reg_$2<int c>)) != (reg_$3<int d>)" ],
+ // CHECK-NEXT: [ "(reg_$0<int a>) + (reg_$2<int c>)", "reg_$3<int d>" ],
// CHECK-NEXT: [ "reg_$2<int c>" ]
// CHECK-NEXT: ],
// CHECK-NEXT: "disequality_info": null,
|
Hi, thanks for the PR! I'm slightly confused that the compiler crash you refer to comes from the stdlibrary fn checker. However, this also couples with a solver change. Is this improving the solver that would also make the checker avoid the crash? If this is the case, then we should have separate PRs because we should harden the checker on one side, and also improve the solver in an other area. In any case, I'll have a look at this next week. |
Yes these two are related. Solver is the root cause, the checker crash is the effect. Either fix could cover the crash The solver is inherently incapable of solving some constraints, which would brings in The checker should protectively test against impossible states since they're inevitable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the commit, I'm satisfied with it :)
I actually like that these two related changes (the checker change and the constraint manager improvement) are handled together in a single commit -- this way somebody who browses the commit log can directly see the "other half" of the change (without following cumbersome links through "this commit is mentioned in the commit message of that one" or opening this github review). Also, the checker change is so trivial (+2 lines) that the full combined commit is still small enough to be easily understood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the commit, I'm satisfied with it :)
I actually like that these two related changes (the checker change and the constraint manager improvement) are handled together in a single commit -- this way somebody who browses the commit log can directly see the "other half" of the change (without following cumbersome links through "this commit is mentioned in the commit message of that one" or opening this github review). Also, the checker change is so trivial (+2 lines) that the full combined commit is still small enough to be easily understood.
I agree.
The patch looks good to me. I made some recommendations for improving the tests.
I have one question. In the StdLibraryFn checker you added a test for isSink()
.
I thought that the crash was that some assume(..., true|false)
returned a null State, that we dereferenced. Where was that assume call and how did we unconditionally dereference it?
clang/test/Analysis/std-c-library-functions-bufsize-nocrash-with-correct-solver.c
Show resolved
Hide resolved
It's not a null state being dereferenced, but an assertion failure crash. The checker operates on some impossible states |
First, let me thank you for posting a high quality patch like this.
Let's pretend that at the first So, we have to return a "valid" state just to discard everything the checker would do. This wasted work is the lesser evil. So, the fact that Now, why is this issue interesting? It's because the checkers can still observe the side effects of this machinery I shared when they chain ExplodedNodes returned by Let's say we have a checker that adds some assumptions, and the Consequently, I think this patch would mask the issue. We should rather prevent breaking the illusion by somehow allowing subsequent |
Thanks for these helpful details!
I think there's no clear boundaries between checkers and the engine. Checkers can decide whether the engine should So contracts are important, I think we should clarify these contracts and make more if needed, but not compromise to
How about: whenever a checker tries to chain states by itself, it should be more careful on node validation. Engine
As an anology, when checker try to transition to an unchanged state, engine will return the predecessor We might change the engine to return null node intentionaly if it's marked as sink by if (ExplodedNode *N = C.generateErrorNode(NewState, NewNode))
reportBug(Call, N, Constraint.get(), Summary, C); This is another similar contract between checkers and the engine. In this sense, how about testing like this in this case: if (!NewNode || NewNode->isSink())
break;
Maybe case-by-case fixing is enough. |
I see your point, but I'm still not convinced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now. Thank you for this high quality patch. This isn't the first time, I remember. Excellent track record.
Thanks for your reviews and all of your kindness! @steakhal @NagyDonat |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/46/builds/7928 Here is the relevant piece of the build log for the reference
|
…g" (#116362) Reverts #115579 This introduced a breakage: https://lab.llvm.org/buildbot/#/builders/46/builds/7928
@danix800 Could you please have a look at the failed test, such that we could reapply this PR? |
Working on it! |
…115579) This could deduce some complex syms derived from simple ones whose values could be constrainted to be concrete during execution, thus reducing some overconstrainted states. This commit also fix `unix.StdCLibraryFunctions` crash due to these overconstrainted states being added to the graph, which is marked as sink node (PosteriorlyOverconstrained). The 'assume' API is used in non-dual style so the checker should protectively test whether these newly added nodes are actually impossible. 1. The crash: https://godbolt.org/z/8KKWeKb86 2. The solver needs to solve equivalent: https://godbolt.org/z/ed8WqsbTh
…g" (llvm#116362) Reverts llvm#115579 This introduced a breakage: https://lab.llvm.org/buildbot/#/builders/46/builds/7928
The test randomly fails for unknown reason, on VS2019~2022, after 1c154bd (seems totally irrelevant to this randomness). Not sure if it's a compiler bug. |
I'm not sure I see why. The logs I linked in the revert only showed a single test failure. The new test we added in this PR. This suggests close correlations. Maybe the content of the "constraints" map in the State is dumped in a non-deterministic order? And we just happened to step on it now. |
Yes it's the testcase touched by this PR that failed (not other random testcase inside the repo). I mean that clang behaves randomly. The randomness might not be introduced by this PR but by some earlier commit point. I'm working on it. |
Be aware of that llvm DenseMap and sets are basically open-address hashtables, and the hash function is seeded by the address of some known function (IDK which). This means that these tables depend on where clang is loaded in memory. Consequently, address-space layout randomization could affect the reproducibility of some issues and lead to flaky cases. The constraints are backed by llvm::Immutable map and set. which is some AVL-tree if I recall. I'm not sure how the "ordering" is implemented there but if they are in terms of comparing pointer addresses then we might have a problem. When dumping the constraints, I think we just iterate the AVL-tree from begin to end, which will prefer "less-then" elements, but if those depend on where the process is loaded then the whole iteration order is RT dependent. |
It is unstable ordering of elements from DenseMap/Set. This PR actually breaks the recursive fixpoint simplification algorithm of the eq class. One thing left which confused me is that why no randomness is observed when compiled ===== EDIT: It's ImmutableMap/Set, not DenseMap/Set. |
For this testcase, two constrainst are collected on the path: (1) a + b + c == d
(2) b + c = 0 when assuming the third constraint The fixedpoint simplification algorithm will recurse by |
Thank you for your dedication. What are your plans? Btw why did this test only fail on Windows? |
I'll further investigate if it's possilbe to do similar improvement on the solver in other ways. |
This could deduce some complex syms derived from simple ones whose values could be constrainted to be concrete during execution, thus reducing some overconstrainted states.
This commit also fix
unix.StdCLibraryFunctions
crash due to these overconstrainted states being added to the graph, which is marked as sink node (PosteriorlyOverconstrained). The 'assume' API is used in non-dual style so the checker should protectively test whether these newly added nodes are actually impossible.