diff --git a/doc/manual/galois-whitepaper.cls b/doc/manual/galois-whitepaper.cls index a4b29bc2e7..50bcbb878a 100644 --- a/doc/manual/galois-whitepaper.cls +++ b/doc/manual/galois-whitepaper.cls @@ -82,7 +82,7 @@ \rhead{\sffamily \textit{\@title{}}} \makeatother -\lfoot{\sffamily\color{lightgray}\small\copyright{} 2019 Galois, Inc.} +\lfoot{\sffamily\color{lightgray}\small\copyright{} 2021 Galois, Inc.} \rfoot{ % \sffamily \color{galoislightgray}\small diff --git a/doc/manual/manual.md b/doc/manual/manual.md index c1c958edf4..efc8e5ba14 100644 --- a/doc/manual/manual.md +++ b/doc/manual/manual.md @@ -74,7 +74,7 @@ command-line options: `-b path, --java-bin-dirs` - ~ Specfy a colon-delimited list of paths to search for a Java + ~ Specify a colon-delimited list of paths to search for a Java executable. `-d num, --sim-verbose=num` @@ -498,7 +498,7 @@ understand the role of the Cryptol language in SAW. # Cryptol and its Role in SAW -Cyptol is a domain-specific language originally designed for the +Cryptol is a domain-specific language originally designed for the high-level specification of cryptographic algorithms. It is general enough, however, to describe a wide variety of programs, and is particularly applicable to describing computations that operate on @@ -1082,7 +1082,7 @@ Whether proving small lemmas (in the form of rewrite rules) or a top-level theorem, the process builds on the idea of a *proof script* that is run by one of the top level proof commands. -* `prove_print : ProofScript SatResult -> Term -> TopLevel Theorem` +* `prove_print : ProofScript () -> Term -> TopLevel Theorem` takes a proof script (which we'll describe next) and a `Term`. The `Term` should be of function type with a return value of `Bool` (`Bit` at the Cryptol level). It will then use the proof script to attempt to @@ -1090,11 +1090,11 @@ show that the `Term` returns `True` for all possible inputs. If it is successful, it will print `Valid` and return a `Theorem`. If not, it will abort. -* `sat_print : ProofScript SatResult -> Term -> TopLevel ()` is similar +* `sat_print : ProofScript () -> Term -> TopLevel ()` is similar except that it looks for a *single* value for which the `Term` evaluates to `True` and prints out that value, returning nothing. -* `prove_core : ProofScript SatResult -> String -> TopLevel Theorem` +* `prove_core : ProofScript () -> String -> TopLevel Theorem` proves and returns a `Theorem` from a string in SAWCore syntax. ## Automated Tactics @@ -1197,11 +1197,11 @@ provers is to unfold everything before sending a goal to a prover. However, with some provers it is possible to indicate that specific named subterms should be represented as uninterpreted functions. -* `unint_cvc4 : [String] -> ProofScript SatResult` +* `unint_cvc4 : [String] -> ProofScript ()` -* `unint_yices : [String] -> ProofScript SatResult` +* `unint_yices : [String] -> ProofScript ()` -* `unint_z3 : [String] -> ProofScript SatResult` +* `unint_z3 : [String] -> ProofScript ()` The list of `String` arguments in these cases indicates the names of the subterms to leave folded, and therefore present as uninterpreted @@ -1218,7 +1218,7 @@ In addition to the built-in automated provers already discussed, SAW supports more generic interfaces to other arbitrary theorem provers supporting specific interfaces. -* `external_aig_solver : String -> [String] -> ProofScript SatResult` +* `external_aig_solver : String -> [String] -> ProofScript ()` supports theorem provers that can take input as a single-output AIGER file. The first argument is the name of the executable to run. The second argument is the list of command-line parameters to pass to that @@ -1226,7 +1226,7 @@ executable. Any element of this list equal to `"%f"` will be replaced with the name of the temporary AIGER file generated for the proof goal. The output from the solver is expected to be in DIMACS solution format. -* `external_cnf_solver : String -> [String] -> ProofScript SatResult` +* `external_cnf_solver : String -> [String] -> ProofScript ()` works similarly but for SAT solvers that take input in DIMACS CNF format and produce output in DIMACS solution format. @@ -1237,15 +1237,15 @@ until a later time, there are functions to write the current goal to a file in various formats, and then assume that the goal is valid through the rest of the script. -* `offline_aig : String -> ProofScript SatResult` +* `offline_aig : String -> ProofScript ()` -* `offline_cnf : String -> ProofScript SatResult` +* `offline_cnf : String -> ProofScript ()` -* `offline_extcore : String -> ProofScript SatResult` +* `offline_extcore : String -> ProofScript ()` -* `offline_smtlib2 : String -> ProofScript SatResult` +* `offline_smtlib2 : String -> ProofScript ()` -* `offline_unint_smtlib2 : [String] -> String -> ProofScript SatResult` +* `offline_unint_smtlib2 : [String] -> String -> ProofScript ()` These support the AIGER, DIMACS CNF, shared SAWCore, and SMT-Lib v2 formats, respectively. The shared representation for SAWCore is @@ -1259,23 +1259,26 @@ listed in its first argument as uninterpreted functions. Some proofs can be completed using unsound placeholders, or using techniques that do not require significant computation. -* `assume_unsat : ProofScript SatResult` indicates that the current goal -should be assumed to be unsatisfiable. At the moment, -`jvm_verify` and `llvm_verify` (described below) run -their proofs in a satisfiability-checking (negated) context, so -`assume_unsat` indicates that the property being checked should be -assumed to be true. This is likely to change in the future. +* `assume_unsat : ProofScript ()` indicates that the current goal +should be assumed to be unsatisfiable. This is an alias for +`assume_valid`. Users should prefer to use `admit` instead. -* `assume_valid : ProofScript ProofResult` indicates that the current -goal should be assumed to be valid. +* `assume_valid : ProofScript ()` indicates that the current +goal should be assumed to be valid. Users should prefer to +use `admit` instead -* `quickcheck : Int -> ProofScript SatResult` runs the goal on the given +* `admit : String -> ProofScript ()` indicates that the current +goal should be assumed to be valid without proof. The given +string should be used to record why the user has decided to +assume this proof goal. + +* `quickcheck : Int -> ProofScript ()` runs the goal on the given number of random inputs, and succeeds if the result of evaluation is always `True`. This is unsound, but can be helpful during proof development, or as a way to provide some evidence for the validity of a specification believed to be true but difficult or infeasible to prove. -* `trivial : ProofScript SatResult` states that the current goal should +* `trivial : ProofScript ()` states that the current goal should be trivially true (i.e., the constant `True` or a function that immediately returns `True`). It fails if that is not the case. @@ -1354,9 +1357,6 @@ operate on AIGs. "blasting" all of its primitive operations (things like bit-vector addition) down to the level of individual bits. -* `cec : AIG -> AIG -> TopLevel ProofResult` compares two `AIG` values, -returning a `ProofResult` representing whether the two are equivalent. - * `load_aig : String -> TopLevel AIG` loads an `AIG` from an external AIGER file. @@ -2757,3 +2757,223 @@ let main : TopLevel () = do { ~~~~ [^4]: https://en.wikipedia.org/wiki/Salsa20 + + +# Extraction to the Coq theorem prover + +In addition to the (semi-)automatic and compositional proof modes +already discussed above, SAW has experimental support for exporting +Cryptol and SAWCore values as terms to the Coq proof assistant[^5]. +This is intended to support more manual proof efforts for properties +that go beyond what SAW can support (for example, proofs requiring +induction) or for connecting to preexisting formalizations in Coq +of useful algorithms (e.g. the fiat crypto library[^6]). + +This support consists of two related pieces. The first piece is a library of +formalizations of the primitives underlying Cryptol and SAWCore and +various supporting concepts that help bridge the conceptual gap +between SAW and Coq. The second piece is a term translation +that maps the syntactic forms of SAWCore onto corresponding +concepts in Coq syntax, designed to dovetail with the concepts +defined in the support library. SAWCore is a quite similar language +to the core calculus underlying Coq, so much of this translation is +quite straightforward; however, the languages are not exactly +equivalent, and there are some tricky cases that mostly arise from +Cryptol code that can only be partially supported. We will note +these restrictions later in the manual. + +We expect this extraction process to work with a fairly wide range of Coq +versions, as we are not using bleeding-edge Coq features. It has been +most fully tested with Coq version 8.13.2. + +## Support Library + +In order to make use of SAW's extraction capabilities, one must first +compile the support library using Coq so that the included definitions +and theorems can be referenced by the extracted code. From the top +directory of the SAW source tree, the source code for this support +library can be found in the `saw-core-coq/coq` subdirectory. +In this subdirectory you will find a `_CoqProject` and a `Makefile`. +A simple `make` invocation should be enough to compile +all the necessary files, assuming Coq is installed and `coqc` is +available in the user's `PATH`. HTML documentation for the support +library can also be generated by `make html` from the same directory. + +Once the library is compiled, the recommended way to import +it into your subsequent development is by adding the following +lines to your `_CoqProject` file: + +``` +-Q /saw-core-coq/coq/generated/CryptolToCoq CryptolToCoq +-Q /saw-core-coq/coq/handwritten/CryptolToCoq CryptolToCoq +``` + +Here `` refers to the location on your system where the +SAWScript source tree is checked out. This will add the relevant +library files to the `CryptolToCoq` namespace, where the extraction +process will expect to find them. + +The support library for extraction is broken into two parts: those +files which are handwritten, versus those that are automatically +generated. The handwritten files are generally fairly readable and +are reasonable for human inspection; they define most of the +interesting pipe-fitting that allows Cryptol and SAWCore definitions +to connect to corresponding Coq concepts. In particular the +file `SAWCoreScaffolding.v` file defines most of the bindings of base +types to Coq types, and the `SAWCoreVectorsAsCoqVectors.v` defines the +core bitvector operations. The automatically generated files are direct +translations of the SAWCore source files +(`saw-core/prelude/Prelude.sawcore` and +`cryptol-saw-core/saw/Cryptol.sawcore`) that correspond to the +standard libraries for SAWCore and Cryptol, respectively. + +The autogenerated files are intended to be kept up-to-date with +changes in the corresponding `sawcore` files, and end users should +not need to generate them. Nonetheless, if they are out of sync for some +reason, these files may be regenerated using the `saw` executable +by running `(cd saw-core-coq/coq; saw saw/generate_scaffolding.saw)` +from the top-level of the SAW source directory before compiling them +with Coq as described above. + +You may also note some additional files and concepts in the standard +library, such as `CompM.v`, and a variety of lemmas and definitions +related to it. These definitions are related to the "heapster" system, +which form a separate use-case for the SAWCore to Coq translation. +These definitions will not be used for code extracted from Cryptol. + +## Cryptol module extraction + +There are several modes of use for the SAW to Coq term extraction +facility, but the easiest to use is whole Cryptol module extraction. +This will extract all the definitions in the given Cryptol module, +together with it's transitive dependencies, into a single Coq module +which can then be compiled and pulled into subsequence developments. + +Suppose we have a Cryptol source file named `source.cry` and we want +to generate a Coq file named `output.v`. We can accomplish this by +running the following commands in saw (either directly from the `saw` +command prompt, or via a script file) + +``` +enable_experimental; +write_coq_cryptol_module "source.cry" "output.v" [] []; +``` + +In this default mode, identifiers in the Cryptol source will be +directly translated into identifiers in Coq. This may occasionally +cause problems if source identifiers clash with Coq keywords or +preexisting definitions. The third argument to +`write_coq_cryptol_module` can be used to remap such names if +necessary by giving a list of `(in,out)` pairs of names. The fourth +argument is a list of source identifiers to skip translating, if +desired. Authoritative online documentation for this command can be +obtained directly from the `saw` executable via `:help +write_coq_cryptol_module` after `enable_experimental`. + +The resulting "output.v" file will have some of the usual hallmarks of +computer-generated code; it will have poor formatting and, explicit +parenthesis and fully-qualified names. Thankfully, once loaded into +Coq, the Coq pretty-printer will do a much better job of rendering +these terms in a somewhat human-readable way. + +## Proofs involving uninterpreted functions + +It is possible to write a Cryptol module that references uninterpreted +functions by using the `primitive` keyword to declare them in your +Cryptol source. Primitive Cryptol declarations will be translated into +Coq section variables; as usual in Coq, uses of these section +variables will be translated into additional parameters to the +definitions from outside the section. In this way, consumers of +the translated module can instantiate the declared Cryptol functions +with corresponding terms in subsequent Coq developments. + +Although the Cryptol interpreter itself will not be able to compute +with declared but undefined functions of this sort, they can be used +both to provide specifications for functions to be verified with +`llvm_verify` or `jvm_verify` and also for Coq extraction. + +For example, if I write the following Cryptol source file: + +``` +primitive f : Integer -> Integer + +g : Integer -> Bool +g x = f (f x) > 0 +``` + +After extraction, the generated term `g` will have Coq type: + +``` +(Integer -> Integer) -> Integer -> Bool +``` + +## Translation limitations and caveats + +Translation from Cryptol to Coq has a number of fundamental +limitations that must be addressed. The most severe of these is that +Cryptol is a fully general-recursive language, and may exhibit runtime +errors directly via calls to the `error` primitive, or via partial +operations (such as indexing a sequence out-of-bounds). The internal +language of Coq, by contrast, is a strongly-normalizing language of +total functions. As such, our translation is unable to extract all +Cryptol programs. + +### Recursive programs + +The most severe of the currently limitations for our system +is that the translation is unable to translate any recursive Cryptol +program. Doing this would require us to attempt to find some +termination argument for the recursive function sufficient to satisfy +Coq; for now, no attempt is made to do so. if you attempt to +extract a recursive function, SAW will produce an error +about a "malformed term" with `Prelude.fix` as the head symbol. + +Certain limited kinds of recursion are available via the +`foldl` Cryptol primitive operation, which is translated directly +into a fold operation in Coq. This is sufficient for many basic +iterative algorithms. + +### Type coercions + +Another limitation of the translation system is that Cryptol uses SMT +solvers during its typechecking process and uses the results of solver +proofs to justify some of its typing judgments. When extracting these +terms to Coq, type equality coercions must be generated. Currently, we +do not have a good way to transport the reasoning done inside +Cryptol's typechecker into Coq, so we just supply a very simple `Ltac` +tactic to discharge these coercions (see `solveUnsafeAssert` in +`CryptolPrimitivesForSAWCoreExtra.v`). This tactic is able to +discover simple coercions, but for anything nontrivial it may +fail. The result will be a typechecking failure when compiling the +generated code in Coq when the tactic fails. If you encounter this +problem, it may be possible to enhance the `solveUnsafeAssert` tactic +to cover your use case. + +### Error terms + +A final caveat that is worth mentioning is that Cryptol can sometimes +produce runtime errors. These can arise from explicit calls to the +`error` primitive, or from partially defined operations (e.g., +division by zero or sequence access out of bounds). Such instances +are translated to occurrences of an unrealized Coq axiom named `error`. +In order to avoid introducing an inconsistent environment, the `error` +axiom is restricted to apply only to inhabited types. All the types +arising from Cryptol programs are inhabited, so this is no problem +in principle. However, collecting and passing this information around +on the Coq side is a little tricky. + +The main technical device we use here is the `Inhabited` type class; +it simply asserts that a type has some distinguished inhabitant. We +provide instances for the base types and type constructors arising +from Cryptol, so the necessary instances ought to be automatically +constructed when needed. However, polymorphic Cryptol functions add +some complications, as type arguments must also come together with +evidence that they are inhabited. The translation process takes care +to add the necessary `Inhabited` arguments, so everything ought to +work out. However, if Coq typechecking of generated code fails with +errors about `Inhabited` class instances, it likely represents some +problem with this aspect of the translation. + + +[^5]: https://coq.inria.fr +[^6]: https://github.com/mit-plv/fiat-crypto diff --git a/doc/manual/manual.pdf b/doc/manual/manual.pdf index 4cac1d62e6..0b42365742 100644 Binary files a/doc/manual/manual.pdf and b/doc/manual/manual.pdf differ