Fix `assess` with the MIR linker by changing reachability modes #1816

tedinski · 2022-10-28T21:10:11Z

Description of changes:

Introduces (hidden, unstable) --codegen-pub-fns to change linker reachability behavior, and uses it from assess.

This fixes the assess feature now that MIR linker is the default.

Resolved issues:

Resolves #1796

Can still be run with legacy linker, but defaults to mir and works now.

cargo kani --enable-unstable assess
cargo kani --enable-unstable --legacy-linker assess

Call-outs:

Using the MIR linker significantly increases the "problem sizes" and verification times for my goto test crate (rand). We go from usually <1k properties and ~10s common verification times to 50k properties and 60s verification times for each test harness. I have not yet investigated why.
Two tests from that crate go from unwind failures to unsupported_construct failures (try and inline asm). It's curious that this happened as a result (presumably) of linking the standard library. I think this also should be investigated.

I initially thought I'd block this change on these investigations, but actually this does apparently work and fixes the currently-broken behavior with MIR linker (now the default), so I decided let's merge and then follow up.

Testing:

How is this change tested? ci
Is this a refactor change? no

Checklist

Each commit message has a non-empty body, explaining why the change was made
Methods or procedures are documented
Regression or unit tests are included, or existing tests cover the modified code
My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

celinval

I am a bit worried that we are adding switches to overcome architecture issues. I think we might need a better way to handle the subcommand structure. Have you explored any other route?

edit: After taking another careful look, I see that you are setting the argument manually inside assess. I really don't think that's a good pattern. We shouldn't be changing arguments.

tedinski · 2022-10-31T14:53:23Z

I don't see a significant problem with adding switches. Or at least, this switch.

It is very much a better approach than trying to discover we're in "assess mode" inside call_cargo and doing things differently. (Concrete playback, for instance, made this design mistake, and it needs to be fixed eventually. It leads to unexpected behavior, is fragile, and causes issues any time things needs to be changed.)

I do think the way we structure arguments needs refactoring, but it's not that high on my priority list. It would be nice if there were a clean way of giving different defaults in different modes (which is really what we want here), but I'm not aware of a good way to do that with clap, so I'm punting on that problem for now to future me, basically.

adpaco-aws · 2022-10-31T15:24:39Z

When it's about unsupported features, I thought cargo kani assess should perform no reachability analysis at all - just codegen everything. I revisited the MIR linker RFC and I see there's a none option, but that doesn't generate a goto-program. It's not clear to me why (maybe it ought to be changed), but an option like that is probably the closest to what we currently report for unsupported features.

tedinski · 2022-10-31T16:58:39Z

Good point about that change in behavior. I do see a change in numbers there (crate count is broken, but we see different instance counts too).

I do think for assess we care about unsupported features that might actually impact the crate being analyzed. So this change is actually a good thing there. It means we won't complain about unsupported features that couldn't possibly be hit from this crate.

But it is an obstacle to exactly reproducing our old numbers. As an alternative we could look into splitting the table into a different organization. We could report "instances in THIS crate" vs "instances in dependencies reachable from this crate".

We should then be able to reproduce the old numbers by running assess on all the crates, then aggregating the "in THIS crate" numbers, right?

celinval · 2022-10-31T17:51:34Z

When it's about unsupported features, I thought cargo kani assess should perform no reachability analysis at all - just codegen everything. I revisited the MIR linker RFC and I see there's a none option, but that doesn't generate a goto-program. It's not clear to me why (maybe it ought to be changed), but an option like that is probably the closest to what we currently report for unsupported features.

None is meant to skip everything. @adpaco-aws, maybe what you are looking for is an all option which just blindly considers every symbol being reachable.

Note that the legacy linker still has reachability analysis in it. A dead function with an unsupported construct would not be reported in Kani in the first place.

celinval · 2022-10-31T17:56:00Z

I don't see a significant problem with adding switches. Or at least, this switch.

It is very much a better approach than trying to discover we're in "assess mode" inside call_cargo and doing things differently. (Concrete playback, for instance, made this design mistake, and it needs to be fixed eventually. It leads to unexpected behavior, is fragile, and causes issues any time things needs to be changed.)

I do think the way we structure arguments needs refactoring, but it's not that high on my priority list. It would be nice if there were a clean way of giving different defaults in different modes (which is really what we want here), but I'm not aware of a good way to do that with clap, so I'm punting on that problem for now to future me, basically.

Why do we need a switch though? They are meant to reflect the user input not some processing information. Can it be an internal field for now?

Yes, I do agree that the business logic of call_cargo shouldn't be checking features one by one.

tedinski · 2022-10-31T19:19:06Z

I created #1819 to track the unsupported features issue.

Why do we need a switch though? They are meant to reflect the user input not some processing information. Can it be an internal field for now?

I don't think we need a switch, but there isn't anything else that's an "internal field" right now. It's a hidden unstable, if we want to refactor it later, we can right?

adpaco-aws · 2022-10-31T20:03:43Z

None is meant to skip everything. @adpaco-aws, maybe what you are looking for is an all option which just blindly considers every symbol being reachable.

Note that the legacy linker still has reachability analysis in it. A dead function with an unsupported construct would not be reported in Kani in the first place.

Thanks for the clarification, they way I interpreted none is that no analysis is done so all functions are considered reachable (i.e., no slicing occurs).

I need to do some thinking about #1819 - Will write my thoughts there.

celinval · 2022-10-31T20:25:11Z

Can you please add an internal field instead of the fake switch? Is the effort that much higher?

It is just the right thing to do. Why should we add a switch that we will ignore its value? It is very confusing when we start interfering with the value of options that the user requested.

We should also add proper validation for options that are conflicting.

I tried to use assess the other day, and all the tests were failing because of the unwind assertion. So I ran kani assess --default-unwind 10 which for my surprise didn't fix the problem. It just quietly ignored what I requested. So I had to go read to code to understand what was going on.

tedinski · 2022-10-31T21:00:59Z

It's easy enough to write, my concern is only that it's a regression in code quality. But, patched added.

It gains not adding an unstable/hidden switch (which... why not? The cost of that seems nil), and loses that we now spread configuration across two sources (args and session) in a completely one-off way.

But works for me if this is what you want. Both are equally refactorable in the future in my eyes.

celinval · 2022-10-31T21:43:07Z

In terms of architecture, you want the information flow to be clear. It is much easier to maintain and understand. The command arguments represent things that reflect the user input. It should be fairly dumb and just reflect the user wishes. The kind of logic that usually makes sense to be inside arguments is validation logic. That's about it.

I do agree that we probably want to introduce a layer between Kani driver business logic and command arguments. In my mind, this was the actual role of Kani session (and that's how it works on rustc too). This new layer basically represent the implications of the user request and other possible variables. For example, the number of threads Kani may depend on some business logic that looks at the user input, the current system and the subcommand that the user invoked.

Initially, kani-driver wasn't doing complicated stuff, so it made sense to skip the second layer for simplicity. But our software is evolving, and kani-driver is getting more complex. The concrete playback and assess are two features that do more than just invoke a few commands in a specific order.

You are correct, we don't want to keep using the pattern that concrete playback did, which is basically checking if that feature is enabled to determine how we invoke kani-compiler or CBMC. It makes it hard to figure out the implications of enabling the feature.

I do think that what we have we assess today is even more confusing. Changing the value of arguments is counter intuitive. It goes against the expected flow and it makes it really hard to understand what's going on. More than a code quality regression, this is a big UX regression. As I mentioned, in the unwind case, we are literally ignoring the user choice. I really don't want us to make a pattern out of this.

tedinski · 2022-10-31T22:23:10Z

I do think that what we have we assess today is even more confusing. Changing the value of arguments is counter intuitive. It goes against the expected flow and it makes it really hard to understand what's going on.

Ah, this might help then: this "arguments before the subcommand" thing is an awkward side-effect of how our argument parser works currently, and should be removed when we restructure/refactor this code.

I see assess as currently taking no arguments and having no configuration options at all (yet), and this ability to prepend arguments to the subcommand is a hack that needs to go away.

celinval

Thanks for changing this. We should talk later about how we can improve the driver architecture to handle more complex feature configurations.

celinval · 2022-10-31T21:49:47Z

kani-driver/src/call_cargo.rs

-            } else {
-                pkg_args.push("--reachability=harnesses".into());
+        match self.reachability_mode() {
+            crate::session::ReachabilityMode::Legacy => {


nit: import crate::session::ReachabilityMode

celinval · 2022-11-01T17:13:51Z

kani-driver/src/assess.rs

    //    e.g.  successs   6
    //          unwind     234
    println!("{}", build_failure_reasons_table(&results));
+
+    // TODO: Should add another interesting table: Count the actually hit constructs (e.g. 'try', 'InlineAsm', etc)


can you please add the link for #1819?

tedinski requested a review from a team as a code owner October 28, 2022 21:10

celinval requested changes Oct 28, 2022

View reviewed changes

tedinski added 2 commits October 31, 2022 20:40

Add --codegen-pub-fns and use it from assess

5c2fe99

Move codegen_pub_fns from Args to Session

a102c57

tedinski force-pushed the assess-mir branch from 06a289e to a102c57 Compare October 31, 2022 20:59

tedinski mentioned this pull request Nov 1, 2022

Investigate assess performance regression with MIR linker #1828

Open

celinval approved these changes Nov 1, 2022

View reviewed changes

Merge remote-tracking branch 'origin/main' into assess-mir

de9669c

tedinski changed the title ~~Add --codegen-pub-fns and use it from assess~~ Fix assess with the MIR linker by changing reachability modes Nov 1, 2022

tedinski added 2 commits November 1, 2022 18:09

address PR comments

14e4738

Merge branch 'main' into assess-mir

14387ba

tedinski merged commit 22b78b0 into model-checking:main Nov 1, 2022

tedinski deleted the assess-mir branch November 1, 2022 21:13

tedinski added this to the Create Kani Assess tool milestone Jan 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix `assess` with the MIR linker by changing reachability modes #1816

Fix `assess` with the MIR linker by changing reachability modes #1816

tedinski commented Oct 28, 2022

celinval left a comment •

edited

Loading

tedinski commented Oct 31, 2022

adpaco-aws commented Oct 31, 2022

tedinski commented Oct 31, 2022

celinval commented Oct 31, 2022 •

edited

Loading

celinval commented Oct 31, 2022

tedinski commented Oct 31, 2022

adpaco-aws commented Oct 31, 2022

celinval commented Oct 31, 2022

tedinski commented Oct 31, 2022

celinval commented Oct 31, 2022

tedinski commented Oct 31, 2022

celinval left a comment

celinval Oct 31, 2022

celinval Nov 1, 2022

Fix assess with the MIR linker by changing reachability modes #1816

Fix assess with the MIR linker by changing reachability modes #1816

Conversation

tedinski commented Oct 28, 2022

Description of changes:

Resolved issues:

Call-outs:

Testing:

Checklist

celinval left a comment • edited Loading

Choose a reason for hiding this comment

tedinski commented Oct 31, 2022

adpaco-aws commented Oct 31, 2022

tedinski commented Oct 31, 2022

celinval commented Oct 31, 2022 • edited Loading

celinval commented Oct 31, 2022

tedinski commented Oct 31, 2022

adpaco-aws commented Oct 31, 2022

celinval commented Oct 31, 2022

tedinski commented Oct 31, 2022

celinval commented Oct 31, 2022

tedinski commented Oct 31, 2022

celinval left a comment

Choose a reason for hiding this comment

celinval Oct 31, 2022

Choose a reason for hiding this comment

celinval Nov 1, 2022

Choose a reason for hiding this comment

Fix `assess` with the MIR linker by changing reachability modes #1816

Fix `assess` with the MIR linker by changing reachability modes #1816

celinval left a comment •

edited

Loading

celinval commented Oct 31, 2022 •

edited

Loading