Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Compiler V2] Check recursive struct definitions #12257

Merged
merged 44 commits into from
Mar 16, 2024

Conversation

fEst1ck
Copy link
Contributor

@fEst1ck fEst1ck commented Feb 27, 2024

Description

This PR adds check for recursive struct definitions in the model builder. That is, no struct S contains a descendant field of type S. See checking/typing/recursive_struct.move for examples.

TODO: add that we can't have recursive structs in the move book.

Fixes #12119

Test Plan

checking/typing/recursive_struct.move

Copy link

trunk-io bot commented Feb 27, 2024

⏱️ 27h 25m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 12h 6m 🟥🟥
rust-unit-tests 12h 6m 🟥🟥
windows-build 1h 19m 🟩🟩🟩🟩
rust-move-unit-coverage 38m 🟩🟩
run-tests-main-branch 27m 🟥🟥🟥🟥
rust-lints 14m 🟥🟥
check 14m 🟩🟩🟩
check-dynamic-deps 8m 🟩🟩🟩🟩
general-lints 7m 🟩🟩🟩
semgrep/ci 2m 🟩🟩🟩🟩
file_change_determinator 57s 🟩🟩🟩🟩
file_change_determinator 51s 🟩🟩🟩🟩
permission-check 19s 🟩🟩🟩🟩
permission-check 14s 🟩🟩🟩🟩
permission-check 14s 🟩🟩🟩🟩
permission-check 13s 🟩🟩🟩🟩

🚨 4 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
rust-move-tests 6h 19m +1781%
rust-unit-tests 6h 31m +1049%
run-tests-main-branch 7m 4m +69%
rust-move-unit-coverage 16m 27m -41%

settingsfeedbackdocs ⋅ learn more about trunk.io

@fEst1ck fEst1ck force-pushed the check-recursive-struct1 branch from 6082975 to ed9e378 Compare February 27, 2024 10:06
Copy link

codecov bot commented Feb 27, 2024

Codecov Report

Attention: Patch coverage is 98.69281% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 64.0%. Comparing base (1b43e0e) to head (f25596c).
Report is 3 commits behind head on main.

Files Patch % Lines
...e/move-compiler-v2/src/recursive_struct_checker.rs 98.6% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #12257    +/-   ##
========================================
  Coverage    63.9%    64.0%            
========================================
  Files         816      817     +1     
  Lines      180914   181062   +148     
========================================
+ Hits       115779   115933   +154     
+ Misses      65135    65129     -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

4 │ │ }
│ ╰─────^
= S contains T...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could actually add locations to where T is defined. Check out error_with_hints e.g. in the reference_saftey_processor how to do it. The current approach is also good, but in case you have not seen this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


/// Gets the display name of the struct type
/// Requires: the type is a struct type
fn get_struct_type_name(&self, ty: &Type) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use TypeDisplayContext for this, then you do not need those functions, and it prints the instantiation. For now it would be a fully qualified name, but the name will become relative in the local module once my error message PR lands.

For this, you construct a q: QualifiedInstId<StructId>, then call q.to_type().display(context). To construct the context, you'd put the reverse_struct_table and the type parameter names of the type in question into the context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

&self,
path: &Vec<Type>,
this_struct_id: QualifiedId<StructId>
) -> Vec<String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return Vec<(Loc, String)> and stick this into error_with_hints. It would be nice if Loc would be that of the field which contributes to the recursion.

Copy link
Contributor Author

@fEst1ck fEst1ck Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now included the name of the field in the error messages. I didn't use error_with_hints because the trace is not shown in order. See this commit 6c83893 where I used error_with_hints

let child_name = self.get_struct_type_name(&path[i + 1]);
loop_notes.push(format!("{} contains {}...", parent_name, child_name));
}
loop_notes.push(format!("{} contains {}, which forms a loop.", self.get_struct_type_name(path.last().unwrap()), self.get_struct_display_name_simple(this_struct_id)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please align with other error messages. Our convention is to put programming symbols into backticks, hence: "\backquote{}\backquote contains..."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Wolfgang means literal backquotes:

  "`{}` contains"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// checks if `ty` occurs in `path`
for parent in path.iter() {
if let Type::Struct(mid, sid, _) = parent {
let parent_id = QualifiedId { module_id: *mid, id: *sid };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mid.qualified(*sid)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if parent_id == this_struct_id {
if checking == this_struct_id {
let loop_notes = self.gen_error_msg_for_fields_loop(path, this_struct_id);
self.error_with_notes(loc_checking, &format!("recursive definition {}", self.get_struct_display_name_simple(this_struct_id)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't that formatted correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return false;
}
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If thats unreachable, the idiomatic way would be

let Struct(...) = parent else { panic!("unexpected") };
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

let child_name = self.get_struct_type_name(&path[i + 1]);
loop_notes.push(format!("{} contains {}...", parent_name, child_name));
}
loop_notes.push(format!("{} contains {}, which forms a loop.", self.get_struct_type_name(path.last().unwrap()), self.get_struct_display_name_simple(this_struct_id)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Wolfgang means literal backquotes:

  "`{}` contains"


/// Checks for recursive definition of structs and adds diagnostics
pub fn check_resursive_struct(&self, struct_entry: &StructEntry) {
let params = (0..struct_entry.type_params.len()).map(|i| Type::TypeParameter(i as u16)).collect_vec();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a bit more readable if you pull out the len calculation:

let num_params = struct_entry.type_params.len()
let params = (0..num_params).map(|i| Type::Parameter(i as u16)).collect_vec();

but I guess that's still pretty ugly. Oh, well..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored a bit

@fEst1ck fEst1ck force-pushed the check-recursive-struct1 branch 2 times, most recently from edcffce to b474938 Compare March 6, 2024 10:24
@fEst1ck fEst1ck requested review from brmataptos and wrwg March 6, 2024 10:42
@fEst1ck
Copy link
Contributor Author

fEst1ck commented Mar 6, 2024

All comments addressed. PTAL @wrwg @brmataptos

33 │ │ }
│ ╰─────^
= `type_param::S` contains field `f: type_param::G<type_param::S>`...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not displayed correctly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Teng already has a fix for this. Just remember to come back to here once his fix landed, can yours land in the meantime

Copy link
Contributor

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice error messages now, thanks for trying out the other way to display errors. Yes, those hints are ordered in source order, which maybe confusing for the cycle problem.

33 │ │ }
│ ╰─────^
= `type_param::S` contains field `f: type_param::G<type_param::S>`...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Teng already has a fix for this. Just remember to come back to here once his fix landed, can yours land in the meantime

}

/// Checks for recursive definition of structs and adds diagnostics
pub fn check_resursive_struct(&self, struct_entry: &StructEntry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn check_resursive_struct(&self, struct_entry: &StructEntry) {
pub fn check_recursive_struct(&self, struct_entry: &StructEntry) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

));
}
loop_notes.push(format!(
"`{}` contains field `{}: {}`, which forms a loop.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please change "loop" to "cycle"? (here and in all the func/var names).

IMO "loop" in this context seems very weird.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
loop_notes.push(format!(
"`{}` contains field `{}: {}`, which forms a loop.",
self.get_struct_type_name(&path.last().unwrap().2, this_struct_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use expect instead of unwrap.

Could you please add a note in the function doc that path is expected to be non-empty, else this function will panic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

&self,
path: &Vec<(Loc, Symbol, Type)>,
this_struct_id: QualifiedId<StructId>,
_loc: Loc,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have _loc as a parameter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to use it in the error messages.

loop_notes
}

/// Checks if a field type occurs in its access path,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please format the comments to have reasonable number of columns.
This line ends at 58, the next line ends at 132.

Rust fmt does not introduce line breaks in comments, so unfortunately you cannot rely on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

{
path.push((ty_loc.clone(), *field_name, ty.clone()));
let field_ty_instantiated = field_ty_uninstantiated.instantiate(insts);
// short-curcuit upon first recursive occurence found
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// short-curcuit upon first recursive occurence found
// short-circuit upon first recursive occurrence found

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we are not doing what this comment claims. It is because the return values are flipped.

Please add this test case (or similar), which demonstrates the issue:

struct X {
    f: Y,
    g: Y,
}

struct Y {
    f: X,
    g: X
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

),
loop_notes,
);
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this match the function doc on return value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated

}
}
} else {
panic!("field access path contains non-struct")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
panic!("field access path contains non-struct")
unreachable!("field access path contains non-struct")

Copy link
Contributor Author

@fEst1ck fEst1ck Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated

self.error_with_notes(
loc_checking,
&format!(
"recursive definition {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"recursive definition {}",
"recursive definition of struct `{}`",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or something similar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I change the name to "cyclic data" as in V1, since the problem is not always recursive definition

}
}
if let Some(fields) = &this_struct_entry.fields {
// check for descendant fields recursively
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we consider the code:

struct X {
    a: Y,
    b: Y,
    c: Y,
    ..
    z: Y
}

where Y is a struct with several other fields (including possibly other structs, but none of them recursive), then we will explore a multiplicative number of paths (most of them unnecessarily)?

That is, if X has 10 fields that are Ys, and Y has 10 fields that are Zs (and Z is just a wrapper struct), then we recursively call this function 100 times, where as we only have to explore path X -any-field-> Y -any-field-> Z to show there is no recursive struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solved in latest commit

@vineethk
Copy link
Contributor

vineethk commented Mar 7, 2024

@fEst1ck Also, note that "Aptos Move Test for Compiler v2" is failing (but it is not a required test), because it is taking too much time - it should finish with much less time. Please investigate before merging in (the performance issue I pointed out may be related, may be not).

@fEst1ck fEst1ck force-pushed the check-recursive-struct1 branch 2 times, most recently from ff19aac to 7c486de Compare March 15, 2024 06:40
@fEst1ck fEst1ck requested a review from vineethk March 15, 2024 07:56
Copy link
Contributor

@vineethk vineethk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have minor comments - please address them before merging.

LGTM otherwise - I thought this rewrite was much easier to reason about, and is certainly more performant than the previous version. Nice work!

// Copyright © Aptos Foundation
// SPDX-License-Identifier: Apache-2.0

//! Implements an AST pass that checks any struct `S` cannot contain descendant of type `S`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention if there are any prerequisites (other AST passes that must be run before it, cross module cyclic struct references have already been flagged as errors, etc.) or any side-effects, including if there are none. This would allow us to re-order passes if need be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//! Implements an AST pass that checks any struct `S` cannot contain descendant of type `S`.
//! Implements an AST pass that checks any struct `m::S` cannot contain descendant of type `m::S`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(either this, or mention that this check is only done within each module)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

checked: &mut BTreeSet<StructId>,
) {
// check for cyclic dependencies
for (i, (_ancestor_loc, _ancecsotr_name, ancestor)) in path.iter().enumerate() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (i, (_ancestor_loc, _ancecsotr_name, ancestor)) in path.iter().enumerate() {
for (i, (_ancestor_loc, _ancestor_name, ancestor)) in path.iter().enumerate() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
},
Type::Primitive(_) | Type::TypeParameter(_) => {},
_ => panic!("invalid field type"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_ => panic!("invalid field type"),
_ => unreachable!("invalid field type"),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

path.push((loc.clone(), field_name, struct_id));
match field_env.get_type() {
Type::Struct(field_mod_id, field_struct_id, insts) => {
// make sure the field struct has been checked
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// make sure the field struct has been checked
// make sure the field struct has not been checked already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
// check the type parameters of the fields cannot contain the struct we are checking
if insts.iter().any(|ty| {
self.ty_contains_struct(path, &ty, loc.clone(), struct_id, checked)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.ty_contains_struct(path, &ty, loc.clone(), struct_id, checked)
self.ty_contains_struct(path, ty, loc.clone(), struct_id, checked)

fn report_invalid_field(&self, struct_env: &StructEnv, field_env: &FieldEnv) {
let struct_name = self.get_struct_name(struct_env.get_id());
let note = format!(
"invalid field {} of {} containing {} itself",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"invalid field {} of {} containing {} itself",
"invalid field `{}` of `{}` containing `{}` itself",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1,25 @@
address 0x42 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: having test file names like recursive_structs and recursive_struct (where both contains multiple recursive structs) can make them difficult to maintain or refer to. More descriptive names would be ideal, but even numbering them recursive_structs_[1,2] would be better than existing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@vineethk
Copy link
Contributor

From the PR description:

TODO: add that we can't have recursive structs in the move book.

We already say this in the move book, see: https://aptos.dev/move/book/structs-and-resources/#defining-structs. Did you mean something else?

@fEst1ck
Copy link
Contributor Author

fEst1ck commented Mar 16, 2024

From the PR description:

TODO: add that we can't have recursive structs in the move book.

We already say this in the move book, see: https://aptos.dev/move/book/structs-and-resources/#defining-structs. Did you mean something else?

Also found this https://aptos.dev/move/book/generics/#recursive-structs

@fEst1ck fEst1ck force-pushed the check-recursive-struct1 branch from 1a6837b to 3932ffb Compare March 16, 2024 01:59
@fEst1ck fEst1ck enabled auto-merge (squash) March 16, 2024 01:59

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on f25596c17edf9aa458ad65eea5eb07cad83b92e0

two traffics test: inner traffic : committed: 7703 txn/s, latency: 5092 ms, (p50: 5000 ms, p90: 6000 ms, p99: 10400 ms), latency samples: 3328020
two traffics test : committed: 100 txn/s, latency: 1928 ms, (p50: 1900 ms, p90: 2200 ms, p99: 6600 ms), latency samples: 1820
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.232, avg: 0.204", "QsPosToProposal: max: 0.282, avg: 0.265", "ConsensusProposalToOrdered: max: 0.493, avg: 0.448", "ConsensusOrderedToCommit: max: 0.335, avg: 0.318", "ConsensusProposalToCommit: max: 0.780, avg: 0.766"]
Max round gap was 1 [limit 4] at version 1630281. Max no progress secs was 4.637211 [limit 15] at version 1630281.
Test Ok

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite compat success on aptos-node-v1.9.5 ==> f25596c17edf9aa458ad65eea5eb07cad83b92e0

Compatibility test results for aptos-node-v1.9.5 ==> f25596c17edf9aa458ad65eea5eb07cad83b92e0 (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 6768 txn/s, latency: 4901 ms, (p50: 4800 ms, p90: 8400 ms, p99: 9300 ms), latency samples: 236900
2. Upgrading first Validator to new version: f25596c17edf9aa458ad65eea5eb07cad83b92e0
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1815 txn/s, latency: 16469 ms, (p50: 18300 ms, p90: 22500 ms, p99: 23200 ms), latency samples: 94380
3. Upgrading rest of first batch to new version: f25596c17edf9aa458ad65eea5eb07cad83b92e0
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 246 txn/s, submitted: 441 txn/s, expired: 194 txn/s, latency: 42737 ms, (p50: 41400 ms, p90: 56400 ms, p99: 71800 ms), latency samples: 22461
4. upgrading second batch to new version: f25596c17edf9aa458ad65eea5eb07cad83b92e0
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2538 txn/s, latency: 12021 ms, (p50: 13100 ms, p90: 14400 ms, p99: 15600 ms), latency samples: 109140
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> f25596c17edf9aa458ad65eea5eb07cad83b92e0 passed
Test Ok

@fEst1ck fEst1ck merged commit b69a7a3 into aptos-labs:main Mar 16, 2024
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Need check for structs not recursive
4 participants