-
Notifications
You must be signed in to change notification settings - Fork 236
Large scale separate verification & compilation
This is a summary of a lengthly discussion between Cédric Fournet, (jet-lagged) Tahina Ramananandro, and Jonathan Protzenko (current author).
New: the new behavior has been implemented! See What is verified, what is extracted?
With Project Everest making steady progress, we are now facing the problem of large-scale F* developments, and how to achieve some degree of modularity. This is important for:
- performance (parsing and lax-checking a huge dependency graph can make verifying even a standalone file difficult)
- collaboration (we want people working on, say, HACL/lib, to be independent from people working on, say, miTLS)
- proofs (we want to say that we can verify miTLS without having access to any of the low-level modules from HACL/lib).
The running example is miTLS, which wants to verify against HACL/lib (née low-level/crypto
), but really only uses a handful of functions, not the whole pyramid of modules.
Consider the following dependency graph. Arrows are found by the dependency discovery mechanism, starting from miTLS as the root of the traversal.
`-,
\ Crypto.fsti <--- miTLS (contains "Crypto.foo")
C1 `-, <--, ^ |
\ \ | | !!dubious arrow!!
C2 | <--'----,----- Crypto.fst <-------------'
| ____,|
C3 / <-' /
C4 ,-' <------'
/
| (a dependency graph -- artist's view)
,-'
The gist of the proposal is to add an option to control whether the "dubious arrow" exists or not in the dependency graph. In other words, what it means to "have a dependency on Crypto".
The situation is as follows:
- there are series of cryptographic modules C1 .. C4;
- miTLS only needs a handful of functions exported by Crypto.fsti;
- Crypto.fsti does NOT depend on any of the crypto modules C1 ...C4; rather, it uses a combination of abstract types and type abbreviations to expose a standalone cryptographic interface
- the implementation of Crypto.fsti, that is, Crypto.fst, provides concrete functions and contains a myriad of dependencies against each of the crypto modules C1 through C4.
This means that we can split verification, extraction and build in two steps:
- verify C1 .. C4, Crypto.fst and Crypto.fsti together
- verify miTLS against Crypto.fsti ONLY.
The latter step is a net improvement, because we operate in a special mode where:
- the dependency graph is cut off (the dubious arrow from miTLS to Crypto.fst is gone), meaning that we ignore all the C1 .. C4 files
- we have a bulletproof argument that miTLS won't traverse the abstraction to use some unsafe low-level crypto primitive (F* doesn't even read the files)
- people can work on C1 .. C4 separately.
Right now, when faced with both Crypto.fst and Crypto.fsti, F* generates a dependency on both files. We introduce a new option, --partial
, that controls whether the arrow from miTLS to Crypto.fst exists in the dependency graph or not.
We have:
-
--verify_all
: this implies NOT(--partial
), because we want to verify all the project -
--verify_module Mi
: this implies--partial
, because we're only interested in specific modules Mi; the dependency graph may have several connected components, but each Mi MUST be in the graph (JP: check that this is indeed a hard error) - default behavior: verify modules passed on the command-line only; rely on the user-provided
--partial
to determine whether mentioning "Crypto" generates a dependency onCrypto.fsti
only ORCrypto.fsti
andCrypto.fst
Workflow:
- Verifying the Crypto part means invoking
fstar Crypto.fst
- Verifying miTLS against Crypto means invoking
fstar miTLSMain.fst --partial
, which stops at theCrypto.fsti
boundary.
Implementation note: this is a two-liner because the code for --partial
is still there with a note saying that this is the --stratified
behavior.
Corner cases:
-
fstar --partial foo.fsti foo.fst
(still verifies both) -
fstar --partial --verify_module Foo foo.fsti foo.fst
(must verify the interleaving of the two) -
fstar --interactive
(implies--partial
) -
fstar foo.fsti
(does NOT pick upfoo.fst
because the arrow is from thefst
to thefsti
)
We have:
- `--no_extract``
--codegen (OCaml|FSharp|Kremlin)
Workflow:
-
fstar --codegen OCaml Crypto.fst
for the crypto library -
fstar --partial --codegen OCaml miTLSMain.fst
for miTLS
Since the latter invocation "sees" only Crypto.fsti
, and since having only an fsti
in the dependency graph means "realized module, don't extract" (see F★ Interfaces (simple version) for a reminder), this means that the latter invocation of F* generates ML modules that assume a properly-implemented Crypto.ml
, which is great because we produced it using the first invocation of F*.
We can even go further, and build a crypto.cmxa
(OCaml) and crypto.a
(Kremlin) for the crypto library, and never rebuild it.
--partial
controls what it means to "take a dependency" on Crypto
. The dependency arrow from an fst to an fsti always exists, meaning that fstar foo.fst
picks up foo.fsti
(and interleaves) while fstar foo.fsti
does NOT pick up foo.fst
and verifies a standalone interface.
Implementation note (JP): this is a known incorrect behavior that has had a corresponding TODO item on my whiteboard for several months now.
This is all fine, but:
- no one enforces that
Crypto.fsti
is up-to-date with regards to the Crypto modules. That is, miTLS may keep verifying against a modified Crypto.fsti that no one verified against Crypto.fst - proliferation of files
Cédric suggests (for the long-term), a fstx
package that contains:
Crypto.fsti
crypto.cmxa
crypto.a