Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounds context: validate [4/n] #853

Merged
merged 69 commits into from
Jul 9, 2020
Merged

Bounds context: validate [4/n] #853

merged 69 commits into from
Jul 9, 2020

Conversation

kkjeer
Copy link
Contributor

@kkjeer kkjeer commented Jun 11, 2020

This PR validates the updated observed bounds context after every top-level CFG statement by checking that, for each variable in the observed bounds context, the observed bounds imply the declared bounds of the variable.

The PR currently targets bounds-context-assignments since it is dependent on the changes in #836.

Updated diagnostic messages

Since variable bounds are checked after each top-level statement, rather than at each individual assignment or initialization, this PR makes some changes to the way that errors, warnings, and notes are emitted for variable bounds.

More general wording
Before this PR, when bounds were checked after a single assignment or initialization, the error messages could differentiate between an assignment or initialization. Now that bounds are checked after a top-level statement, which could be an assignment, an initialization, or a more complex statement that may contain one or more assignments or initializations, the error messages use the word "statement" rather than "assignment" or "initialization". For example, an error message would say "declared bounds for 'p' are invalid after statement" rather than "declared bounds for p are invalid after assignment".

Unknown source bounds
In an assignment where the RHS has unknown bounds:

void f(array_ptr<int> p : count(1), array_ptr<int> q : bounds(unknown) {
    p = q;
}

Before this PR, the error message would be "expression has unknown bounds, right-hand side of assignment expected to have bounds because the left-hand side has bounds".
After this PR, the error message would be "inferred bounds for 'p' are unknown after statement".
The reason is that now there are multiple reasons why the inferred/observed bounds of a variable can be unknown after evaluating a statement, and assigning an expression with unknown bounds to the variable is only one of those ways.

More context for unknown observed bounds

There are now multiple reasons that the observed bounds of a variable p can be unknown after evaluating a statement. This PR introduces new note messages to help provide more context to the user as to why the unknown bounds error occurred.

Losing the value of a variable used in bounds
One way is for an assignment to be made to a variable 'w' used in the declared bounds of 'p', where 'w' had no original value. For example:

void f(array_ptr<int> p : count(len), int len) {
    // Initial observed bounds B of p: bounds(p, p + len)
    // len has no original value
    // After replacing len with null in B: observed bounds of p are bounds(unknown)
    len = 0;
}

The error message emitted here will be:

  • "error: inferred bounds for 'p' are unknown after statement"
  • "note: lost the value of the variable 'len', which is used in the (expanded) inferred bounds 'bounds(p, p + len)' of 'p'"

Assigning an expression with unknown bounds to a variable
Within a top-level statement, one or more expressions can be assigned to p which have unknown bounds. For example:

void f(array_ptr<int> p : count(1),
          array_ptr<int> q : bounds(unknown), 
          array_ptr<int> r : bounds(unknown) {
    p = q, p = r;
}

The error message emitted here will be:

  • "error: inferred bounds for 'p' are unknown after statement"
  • "note: assigned expression 'q' with unknown bounds to 'p'"
  • "note: assigned expression 'r' with unknown bounds to 'p'"

Target/source equality

This PR currently introduces a regression where the compiler is not able to prove the validity of bounds due to missing equality information recorded between the target and source of an initialization (see issue #845). A fix for this will be explored in a separate PR.

kakje added 30 commits April 11, 2020 00:24
…ility check to GetOriginalValue"

This reverts commit 2dba020.
kkjeer added 2 commits June 25, 2020 10:22
* Initial implementation with test updates

* Clean up

* Remove more expected warnings

* Use auto const reference in to iterate over State.TargetSrcEquality

* Replace additional uses of Pair with auto const &Pair in for loops
* Add EquivExprs parameter to ProveMemoryAcessInRange, CheckBoundsAtMemoryAccess, and AddBoundsCheck

* Add tests for memory access within multi-assignment statement
Base automatically changed from bounds-context-assignments to master June 30, 2020 19:59
@kkjeer kkjeer changed the title [WIP] Validate bounds context Bounds context: validate [4/n] Jun 30, 2020
@kkjeer kkjeer marked this pull request as ready for review June 30, 2020 21:18
@@ -13,7 +13,7 @@ void f1() {
{}

if (*p) {
p++; // expected-error {{declared bounds for p are invalid after assignment}}
p++; // expected-error {{declared bounds for 'p' are invalid after statement}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the statement p++, p no longer has widened bounds (the modification of p kills the widened bounds of p).

p has the original value p - 1. At the end of the statement p++, the updated observed bounds of p are bounds(p - 1, p - 1). These bounds do not imply the declared bounds (p, p).

Copy link
Member

@dtarditi dtarditi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks great! Thank you. It is excellent to see that you included diagnostic notes to explain cases where a different error message is produced. I have some suggestions for improving and clarifying the code, and a request for follow-up clean up in a separate check-in.

clang/lib/Sema/SemaBounds.cpp Outdated Show resolved Hide resolved
@@ -13,7 +13,7 @@ void f1() {
{}

if (*p) {
p++; // expected-error {{declared bounds for p are invalid after assignment}}
p++; // expected-error {{declared bounds for 'p' are invalid after statement}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you file an issue about this? This is an existing issue that is unrelated to your PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create issue #872 to track this.

clang/lib/Sema/SemaBounds.cpp Show resolved Hide resolved
clang/lib/Sema/SemaBounds.cpp Show resolved Hide resolved
// For all other binary operators `e1 @ e2`, use the SameValue sets for
// `e1` and `e2` stored in SubExprSameValueSets to update
// State.SameValue for `e1 @ e2`.
UpdateSameValue(E, SubExprSameValueSets, State.SameValue);
}

if (E->isAssignmentOp()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the code in this block confusing to follow, especially given that there is already a block testing if (E->isAssignmentOp())at line 2809, about 40 lines earlier. The code is now doing two different things: checking that the bounds are non-modifying for both variable and non-variable expressions and (2) implementing checking of bounds declarations for assignments to non-variables.

I suggest that in a follow-up check-in, you clean up the code and combine those two blocks to make the code easier to follow and understand. The current PR -in is already large, so I don't recommend cleaning things up as part of that.

Could you add a TODO about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue #873 to track this.

// Only emit an error for assignments to a non-variable.
// Assignments to variables will be checked after checking the
// current top-level CFG statement.
if (!LHSVar)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the check for if (!LHSVar) should be moved outward to line 2880 and its bound should include the code from line 2881 to 2897. It makes it easier to see that all of this code is handling the non-variable case.

Copy link
Member

@dtarditi dtarditi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants