Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix assess with the MIR linker by changing reachability modes #1816

Merged
merged 5 commits into from
Nov 1, 2022

Conversation

tedinski
Copy link
Contributor

Description of changes:

Introduces (hidden, unstable) --codegen-pub-fns to change linker reachability behavior, and uses it from assess.

This fixes the assess feature now that MIR linker is the default.

Resolved issues:

Resolves #1796

Can still be run with legacy linker, but defaults to mir and works now.

  • cargo kani --enable-unstable assess
  • cargo kani --enable-unstable --legacy-linker assess

Call-outs:

  1. Using the MIR linker significantly increases the "problem sizes" and verification times for my goto test crate (rand). We go from usually <1k properties and ~10s common verification times to 50k properties and 60s verification times for each test harness. I have not yet investigated why.
  2. Two tests from that crate go from unwind failures to unsupported_construct failures (try and inline asm). It's curious that this happened as a result (presumably) of linking the standard library. I think this also should be investigated.

I initially thought I'd block this change on these investigations, but actually this does apparently work and fixes the currently-broken behavior with MIR linker (now the default), so I decided let's merge and then follow up.

Testing:

  • How is this change tested? ci

  • Is this a refactor change? no

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@tedinski tedinski requested a review from a team as a code owner October 28, 2022 21:10
Copy link
Contributor

@celinval celinval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried that we are adding switches to overcome architecture issues. I think we might need a better way to handle the subcommand structure. Have you explored any other route?

edit: After taking another careful look, I see that you are setting the argument manually inside assess. I really don't think that's a good pattern. We shouldn't be changing arguments.

@tedinski
Copy link
Contributor Author

I don't see a significant problem with adding switches. Or at least, this switch.

It is very much a better approach than trying to discover we're in "assess mode" inside call_cargo and doing things differently. (Concrete playback, for instance, made this design mistake, and it needs to be fixed eventually. It leads to unexpected behavior, is fragile, and causes issues any time things needs to be changed.)

I do think the way we structure arguments needs refactoring, but it's not that high on my priority list. It would be nice if there were a clean way of giving different defaults in different modes (which is really what we want here), but I'm not aware of a good way to do that with clap, so I'm punting on that problem for now to future me, basically.

@adpaco-aws
Copy link
Contributor

When it's about unsupported features, I thought cargo kani assess should perform no reachability analysis at all - just codegen everything. I revisited the MIR linker RFC and I see there's a none option, but that doesn't generate a goto-program. It's not clear to me why (maybe it ought to be changed), but an option like that is probably the closest to what we currently report for unsupported features.

@tedinski
Copy link
Contributor Author

Good point about that change in behavior. I do see a change in numbers there (crate count is broken, but we see different instance counts too).

I do think for assess we care about unsupported features that might actually impact the crate being analyzed. So this change is actually a good thing there. It means we won't complain about unsupported features that couldn't possibly be hit from this crate.

But it is an obstacle to exactly reproducing our old numbers. As an alternative we could look into splitting the table into a different organization. We could report "instances in THIS crate" vs "instances in dependencies reachable from this crate".

We should then be able to reproduce the old numbers by running assess on all the crates, then aggregating the "in THIS crate" numbers, right?

@celinval
Copy link
Contributor

celinval commented Oct 31, 2022

When it's about unsupported features, I thought cargo kani assess should perform no reachability analysis at all - just codegen everything. I revisited the MIR linker RFC and I see there's a none option, but that doesn't generate a goto-program. It's not clear to me why (maybe it ought to be changed), but an option like that is probably the closest to what we currently report for unsupported features.

None is meant to skip everything. @adpaco-aws, maybe what you are looking for is an all option which just blindly considers every symbol being reachable.

Note that the legacy linker still has reachability analysis in it. A dead function with an unsupported construct would not be reported in Kani in the first place.

@celinval
Copy link
Contributor

I don't see a significant problem with adding switches. Or at least, this switch.

It is very much a better approach than trying to discover we're in "assess mode" inside call_cargo and doing things differently. (Concrete playback, for instance, made this design mistake, and it needs to be fixed eventually. It leads to unexpected behavior, is fragile, and causes issues any time things needs to be changed.)

I do think the way we structure arguments needs refactoring, but it's not that high on my priority list. It would be nice if there were a clean way of giving different defaults in different modes (which is really what we want here), but I'm not aware of a good way to do that with clap, so I'm punting on that problem for now to future me, basically.

Why do we need a switch though? They are meant to reflect the user input not some processing information. Can it be an internal field for now?

Yes, I do agree that the business logic of call_cargo shouldn't be checking features one by one.

@tedinski
Copy link
Contributor Author

I created #1819 to track the unsupported features issue.

Why do we need a switch though? They are meant to reflect the user input not some processing information. Can it be an internal field for now?

I don't think we need a switch, but there isn't anything else that's an "internal field" right now. It's a hidden unstable, if we want to refactor it later, we can right?

@adpaco-aws
Copy link
Contributor

None is meant to skip everything. @adpaco-aws, maybe what you are looking for is an all option which just blindly considers every symbol being reachable.

Note that the legacy linker still has reachability analysis in it. A dead function with an unsupported construct would not be reported in Kani in the first place.

Thanks for the clarification, they way I interpreted none is that no analysis is done so all functions are considered reachable (i.e., no slicing occurs).

I need to do some thinking about #1819 - Will write my thoughts there.

@celinval
Copy link
Contributor

Can you please add an internal field instead of the fake switch? Is the effort that much higher?

It is just the right thing to do. Why should we add a switch that we will ignore its value? It is very confusing when we start interfering with the value of options that the user requested.

We should also add proper validation for options that are conflicting.

I tried to use assess the other day, and all the tests were failing because of the unwind assertion. So I ran kani assess --default-unwind 10 which for my surprise didn't fix the problem. It just quietly ignored what I requested. So I had to go read to code to understand what was going on.

@tedinski
Copy link
Contributor Author

It's easy enough to write, my concern is only that it's a regression in code quality. But, patched added.

It gains not adding an unstable/hidden switch (which... why not? The cost of that seems nil), and loses that we now spread configuration across two sources (args and session) in a completely one-off way.

But works for me if this is what you want. Both are equally refactorable in the future in my eyes.

@celinval
Copy link
Contributor

In terms of architecture, you want the information flow to be clear. It is much easier to maintain and understand. The command arguments represent things that reflect the user input. It should be fairly dumb and just reflect the user wishes. The kind of logic that usually makes sense to be inside arguments is validation logic. That's about it.

I do agree that we probably want to introduce a layer between Kani driver business logic and command arguments. In my mind, this was the actual role of Kani session (and that's how it works on rustc too). This new layer basically represent the implications of the user request and other possible variables. For example, the number of threads Kani may depend on some business logic that looks at the user input, the current system and the subcommand that the user invoked.

Initially, kani-driver wasn't doing complicated stuff, so it made sense to skip the second layer for simplicity. But our software is evolving, and kani-driver is getting more complex. The concrete playback and assess are two features that do more than just invoke a few commands in a specific order.

You are correct, we don't want to keep using the pattern that concrete playback did, which is basically checking if that feature is enabled to determine how we invoke kani-compiler or CBMC. It makes it hard to figure out the implications of enabling the feature.

I do think that what we have we assess today is even more confusing. Changing the value of arguments is counter intuitive. It goes against the expected flow and it makes it really hard to understand what's going on. More than a code quality regression, this is a big UX regression. As I mentioned, in the unwind case, we are literally ignoring the user choice. I really don't want us to make a pattern out of this.

@tedinski
Copy link
Contributor Author

I do think that what we have we assess today is even more confusing. Changing the value of arguments is counter intuitive. It goes against the expected flow and it makes it really hard to understand what's going on.

Ah, this might help then: this "arguments before the subcommand" thing is an awkward side-effect of how our argument parser works currently, and should be removed when we restructure/refactor this code.

I see assess as currently taking no arguments and having no configuration options at all (yet), and this ability to prepend arguments to the subcommand is a hack that needs to go away.

Copy link
Contributor

@celinval celinval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for changing this. We should talk later about how we can improve the driver architecture to handle more complex feature configurations.

} else {
pkg_args.push("--reachability=harnesses".into());
match self.reachability_mode() {
crate::session::ReachabilityMode::Legacy => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: import crate::session::ReachabilityMode

// e.g. successs 6
// unwind 234
println!("{}", build_failure_reasons_table(&results));

// TODO: Should add another interesting table: Count the actually hit constructs (e.g. 'try', 'InlineAsm', etc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please add the link for #1819?

@tedinski tedinski changed the title Add --codegen-pub-fns and use it from assess Fix assess with the MIR linker by changing reachability modes Nov 1, 2022
@tedinski tedinski merged commit 22b78b0 into model-checking:main Nov 1, 2022
@tedinski tedinski deleted the assess-mir branch November 1, 2022 21:13
@tedinski tedinski added this to the Create Kani Assess tool milestone Jan 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Assess requires user to pass --legacy-linker
3 participants